tradytics / surpriver

Find big moving stocks before they move using machine learning and anomaly detection
https://www.tradytics.com/
GNU General Public License v3.0

Same dictionary different results #10

Open shaggy63 opened 4 years ago

shaggy63 commented 4 years ago

Is there some sort of time decay in the score factor? If I run it now and again an hour later using the same dictionary, I get different symbols, and the symbols that appear in both runs have different scores. What's going on?

python detection_engine.py --top_n 25 --min_volume 5000 --data_granularity_minutes 60 --history_to_use 14 --is_load_from_dictionary 1 --data_dictionary_path 'dictionaries/data_dict.npy' --is_save_dictionary 0 --is_test 0 --future_bars 0 --output_format JSON

        [latest_date] => 2020-09-03 15:30:00-04:00
        [Symbol] => SHLO
        [Anomaly Score] => -0.090577053896931
        [Today Volume] => 191.8M
        [Average Volume 5d] => 4.4M
        [Average Volume 20d] => 1.6M
        [Volatility 5bars] => 0.039062185261644
        [Volatility 20bars] => 0.16062981405984

VS

        [latest_date] => 2020-09-03 15:30:00-04:00
        [Symbol] => SHLO
        [Anomaly Score] => -0.089904823370707
        [Today Volume] => 191.8M
        [Average Volume 5d] => 4.4M
        [Average Volume 20d] => 1.6M
        [Volatility 5bars] => 0.039062185261644
        [Volatility 20bars] => 0.16062981405984

This was number 4 the first time:

        [latest_date] => 2020-09-03 15:30:00-04:00
        [Symbol] => SBPH
        [Anomaly Score] => -0.064138750572796
        [Today Volume] => 178.03K
        [Average Volume 5d] => 129.7K
        [Average Volume 20d] => 102.2K
        [Volatility 5bars] => 0.0067650775893322
        [Volatility 20bars] => 0.090846754019149

7th the second time:

        [latest_date] => 2020-09-03 15:30:00-04:00
        [Symbol] => SBPH
        [Anomaly Score] => -0.054286909962896
        [Today Volume] => 178.03K
        [Average Volume 5d] => 129.7K
        [Average Volume 20d] => 102.2K
        [Volatility 5bars] => 0.0067650775893322
        [Volatility 20bars] => 0.090846754019149

tradytics commented 4 years ago

One thing I can think of is the random initialization in the IsolationForest model, which can cause some differences in results. Other than that, this shouldn't be happening. If the data changes between runs, then different results make sense.
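To illustrate the random-initialization point, here is a minimal sketch using scikit-learn's `IsolationForest` on synthetic data. The feature matrix, the `anomaly_scores` helper, and the seed value are assumptions made for this example only; this is not the project's own pipeline, and whether or how `detection_engine.py` seeds its model is not shown in this thread. The idea is simply that fitting twice on identical data without a fixed `random_state` can give slightly different scores, while fixing the seed makes the scores repeatable.

```python
import numpy as np
from sklearn.ensemble import IsolationForest

# Synthetic stand-in for per-symbol feature vectors (hypothetical data,
# not the contents of data_dict.npy).
rng = np.random.default_rng(0)
features = rng.normal(size=(200, 8))


def anomaly_scores(data, seed=None):
    """Fit an IsolationForest and return its decision_function scores.

    Lower scores mean more anomalous. With seed=None the trees are built
    from NumPy's global random state, so each fit can differ.
    """
    model = IsolationForest(n_estimators=100, random_state=seed)
    model.fit(data)
    return model.decision_function(data)


# Same data, no fixed seed: scores may drift slightly from fit to fit.
unseeded_a = anomaly_scores(features)
unseeded_b = anomaly_scores(features)

# Same data, fixed seed: scores are reproducible.
seeded_a = anomaly_scores(features, seed=42)
seeded_b = anomaly_scores(features, seed=42)

print("unseeded runs identical:", np.array_equal(unseeded_a, unseeded_b))
print("seeded runs identical:  ", np.array_equal(seeded_a, seeded_b))
print("max unseeded difference:", np.abs(unseeded_a - unseeded_b).max())
```

If the score differences between two runs on the same dictionary are of the same small order as the unseeded difference printed here, model randomness is a plausible explanation. Much larger differences, or different bar timestamps, would point to the data instead.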

alg0trader commented 3 years ago

I see different results when loading from the saved data_dict.npy file. For example, truncating to 200 tickers:

Downloading Data

Surpriver has been initialized...
Data engine has been initialized...
Technical Indicator Engine has been initialized
Loading data for all stocks...
  0%|          | 0/200 [00:00<?, ?it/s]
100%|██████████| 200/200 [00:31<00:00,  6.43it/s]
Last Bar Time: 2020-12-11 15:45:00-05:00
Symbol: ASAN
Anomaly Score: -0.055
Today Volume: 175.83K
Average Volume 5d: 225.24K
Average Volume 20d: 183.27K
Volatility 5bars: 0.201
Volatility 20bars: 0.264
----------------------
Last Bar Time: 2020-12-11 15:45:00-05:00
Symbol: EGHT
Anomaly Score: -0.047
Today Volume: 364.82K
Average Volume 5d: 258.57K
Average Volume 20d: 198.04K
Volatility 5bars: 0.223
Volatility 20bars: 0.395
----------------------
Last Bar Time: 2020-12-11 16:00:00-05:00
Symbol: BABA
Anomaly Score: -0.004
Today Volume: 547.97K
Average Volume 5d: 578.64K
Average Volume 20d: 554.24K
Volatility 5bars: 0.120
Volatility 20bars: 0.532
----------------------
Last Bar Time: 2020-12-11 15:45:00-05:00
Symbol: AEG
Anomaly Score: -0.000
Today Volume: 105.98K
Average Volume 5d: 123.84K
Average Volume 20d: 120.14K
Volatility 5bars: 0.007
Volatility 20bars: 0.011
----------------------
Last Bar Time: 2020-12-11 16:30:00-05:00
Symbol: AMC
Anomaly Score: 0.003
Today Volume: 645.2K
Average Volume 5d: 459.47K
Average Volume 20d: 653.37K
Volatility 5bars: 0.012
Volatility 20bars: 0.027
----------------------

Loading npy Data

Surpriver has been initialized...
Data engine has been initialized...
Technical Indicator Engine has been initialized
Loading data from dictionary (surpriver/dictionaries/data_dict.npy)
Last Bar Time: 2020-12-08 15:15:00-05:00
Symbol: AMOV
Anomaly Score: -0.106
Today Volume: 300
Average Volume 5d: 222.4
Average Volume 20d: 2.07K
Volatility 5bars: 0.854
Volatility 20bars: 0.746
----------------------
Last Bar Time: 2020-12-08 15:45:00-05:00
Symbol: ATV
Anomaly Score: -0.056
Today Volume: 100
Average Volume 5d: 446.8
Average Volume 20d: 2.18K
Volatility 5bars: 0.186
Volatility 20bars: 2.559
----------------------
Last Bar Time: 2020-12-11 15:45:00-05:00
Symbol: Y
Anomaly Score: -0.025
Today Volume: 4.39K
Average Volume 5d: 2.42K
Average Volume 20d: 6.83K
Volatility 5bars: 3.192
Volatility 20bars: 4.413
----------------------
Last Bar Time: 2020-12-11 15:45:00-05:00
Symbol: ASIX
Anomaly Score: 0.010
Today Volume: 10.96K
Average Volume 5d: 13.55K
Average Volume 20d: 10.93K
Volatility 5bars: 0.072
Volatility 20bars: 0.153
----------------------
Last Bar Time: 2020-12-11 15:45:00-05:00
Symbol: ALTG
Anomaly Score: 0.017
Today Volume: 2.75K
Average Volume 5d: 1.14K
Average Volume 20d: 3.9K
Volatility 5bars: 0.131
Volatility 20bars: 0.213
----------------------