Find high moving stocks before they move using anomaly detection and machine learning. Surpriver uses machine learning to look at volume + price action and infer unusual patterns which can result in big moves in stocks.
Path | Description |
---|---|
surpriver | Main folder. |
└ dictionaries | Folder to save data dictionaries for later use. |
└ figures | Figures for this GitHub repository. |
└ stocks | List of all the stocks that you want to analyze. |
data_loader.py | Module for loading data from Yahoo Finance. |
detection_engine.py | Main module for running anomaly detection on data and finding stocks with most unusual price and volume patterns. |
feature_generator.py | Generates price and volume return features as well as plenty of technical indicators. |
You will need to install the packages listed in requirements.txt to train and test the models.
You can install all of them using the following command. Please note that the script was written using Python 3.
pip install -r requirements.txt
You can also use Docker if you are familiar with it. Here are the steps to run the tool with Docker.
docker build . -t surpriver
Replace the placeholder below with the directory you are working in.
<C:\\path\\to\\this\\dir>
Then start the container.
docker-compose up -d
To run any of the commands below inside the container, add the following prefix to your command line.
docker exec -it surpriver
If you want to go ahead and directly get the most anomalous stocks for today, you can simply run the following command to get the stocks with the most unusual patterns. We will dive deeper into the command in the following sections.
python detection_engine.py --top_n 25 --min_volume 5000 --data_granularity_minutes 60 --history_to_use 14 --is_load_from_dictionary 0 --data_dictionary_path 'dictionaries/data_dict.npy' --is_save_dictionary 1 --is_test 0 --future_bars 0
This command will give you the 25 most anomalous stocks (the ones with the lowest anomaly scores) over the last 14 bars of 60-minute candles. It will also store all the data that it used to make predictions in the file dictionaries/data_dict.npy. If the data dictionary has already been saved, you can load it instead of fetching the data again, as in the following command.
python detection_engine.py --top_n 25 --min_volume 5000 --data_granularity_minutes 60 --history_to_use 14 --is_load_from_dictionary 1 --data_dictionary_path 'dictionaries/data_dict.npy' --is_save_dictionary 0 --is_test 0 --future_bars 0 --output_format 'CLI'
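Notice the change in is_save_dictionary and is_load_from_dictionary: this run loads the saved dictionary (is_load_from_dictionary 1) and does not write it again (is_save_dictionary 0).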
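To make the flags in the commands above easier to read, here is a minimal sketch of how they could be declared with argparse. It is not the parser from detection_engine.py: the function names `build_parser` and `is_backtest` are hypothetical, the defaults are simply copied from the example commands, and the 0/1 restriction on the `is_*` flags is an assumption based on how they are used here.

```python
import argparse


def build_parser():
    """Hypothetical parser mirroring the flags seen in the example commands."""
    p = argparse.ArgumentParser(description="Sketch of the detection_engine.py flags")
    p.add_argument("--top_n", type=int, default=25)
    p.add_argument("--min_volume", type=int, default=5000)
    p.add_argument("--data_granularity_minutes", type=int, default=60)
    p.add_argument("--history_to_use", type=int, default=14)
    # The is_* flags are passed as 0 or 1 in every example command.
    p.add_argument("--is_load_from_dictionary", type=int, choices=[0, 1], default=0)
    p.add_argument("--data_dictionary_path", type=str,
                   default="dictionaries/data_dict.npy")
    p.add_argument("--is_save_dictionary", type=int, choices=[0, 1], default=1)
    p.add_argument("--is_test", type=int, choices=[0, 1], default=0)
    p.add_argument("--future_bars", type=int, default=0)
    p.add_argument("--output_format", type=str, default="CLI")
    return p


def is_backtest(args):
    """Testing mode as described in this README: is_test = 1 and future_bars above 5."""
    return args.is_test == 1 and args.future_bars > 5


# Parse an explicit argument list so the sketch does not depend on sys.argv.
args = build_parser().parse_args(
    ["--is_load_from_dictionary", "1", "--is_save_dictionary", "0"]
)
print(args.top_n, args.is_load_from_dictionary, args.is_save_dictionary)
```

Keeping the load and save switches as separate flags is what lets the first run download and save, and later runs load without rewriting the file.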
Here is what the output for a single prediction looks like. Please note that negative scores indicate more anomalous and unusual patterns, while positive scores indicate normal patterns. The lower the better.
Last Bar Time: 2020-08-25 11:30:00-04:00
Symbol: SPI
Anomaly Score: -0.029
Today Volume (Today = Date Above): 313.94K
Average Volume 5d: 206.53K
Average Volume 20d: 334.14K
Volatility 5bars: 0.013
Volatility 20bars: 0.038
Future Absolute Sum Price Changes: 72.87
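If you want to post-process this output yourself, the sketch below parses a printed prediction, converts volumes such as 313.94K into numbers, and ranks symbols so that the lowest score comes first. The helper names are hypothetical, the "M" and "B" suffixes are assumed (only "K" appears in the sample), and filtering min_volume against today's volume is a guess at how the tool applies that flag. The second symbol, XYZ, is made up for the example.

```python
SUFFIXES = {"K": 1e3, "M": 1e6, "B": 1e9}  # only "K" is in the sample; "M"/"B" are assumed

SAMPLE = """\
Last Bar Time: 2020-08-25 11:30:00-04:00
Symbol: SPI
Anomaly Score: -0.029
Today Volume (Today = Date Above): 313.94K
Average Volume 5d: 206.53K
Average Volume 20d: 334.14K
Volatility 5bars: 0.013
Volatility 20bars: 0.038
Future Absolute Sum Price Changes: 72.87
"""


def parse_volume(text):
    """Turn a string such as '313.94K' into a float number of shares."""
    text = text.strip()
    if text and text[-1].upper() in SUFFIXES:
        return float(text[:-1]) * SUFFIXES[text[-1].upper()]
    return float(text)


def parse_prediction(block):
    """Split one printed prediction into a {label: value} dict of strings."""
    fields = {}
    for line in block.strip().splitlines():
        key, sep, value = line.partition(": ")  # split on the first ": " only
        if sep:
            fields[key.strip()] = value.strip()
    return fields


def rank(predictions, top_n, min_volume):
    """Keep liquid symbols, then return the top_n with the lowest score."""
    liquid = [
        p for p in predictions
        if parse_volume(p["Today Volume (Today = Date Above)"]) >= min_volume
    ]
    return sorted(liquid, key=lambda p: float(p["Anomaly Score"]))[:top_n]


pred = parse_prediction(SAMPLE)
other = {**pred, "Symbol": "XYZ", "Anomaly Score": "0.041"}  # made-up normal symbol
top = rank([other, pred], top_n=1, min_volume=5000)
print(top[0]["Symbol"], float(top[0]["Anomaly Score"]) < 0)
```

Sorting in ascending order is what encodes "the lower the better": a symbol with a negative score always ranks ahead of one with a positive score.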
If you are suspicious of the use of Machine Learning and Artificial Intelligence in trading, you can test the predictions from this tool on historical data. The two most important command line arguments for testing are is_test and future_bars. If the former is set to 1 and the latter to anything more than 5, the tool holds out that many bars for analysis and uses the data prior to them for anomaly predictions. Next, it looks at the held-out data to see how well the predictions did. Here is an example of a scatter plot from the following command.
python detection_engine.py --top_n 25 --min_volume 5000 --data_granularity_minutes 60 --history_to_use 14 --is_load_from_dictionary 0 --data_dictionary_path 'dictionaries/data_dict.npy' --is_save_dictionary 1 --is_test 1 --future_bars 25
If you have already generated the data dictionary, you can use the following command where we set is_load_from_dictionary to 1 and is_save_dictionary to 0.
python detection_engine.py --top_n 25 --min_volume 5000 --data_granularity_minutes 60 --history_to_use 14 --is_load_from_dictionary 1 --data_dictionary_path 'dictionaries/data_dict.npy' --is_save_dictionary 0 --is_test 1 --future_bars 25
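The hold-out that is_test and future_bars describe can be pictured as a split of each symbol's bar series. The sketch below is only an illustration: the function names are hypothetical, and the definition of the future absolute change (sum of absolute bar-to-bar percentage moves over the held-out bars) is an assumption, since this README does not spell out the formula the tool uses. The closing prices are invented.

```python
def split_for_backtest(closes, history_to_use, future_bars):
    """Return (history, future): bars used for scoring and bars held out."""
    if future_bars <= 0:
        # Live mode: nothing is held out, score on the most recent bars.
        return closes[-history_to_use:], []
    usable = closes[:-future_bars]          # everything before the hold-out
    return usable[-history_to_use:], closes[-future_bars:]


def future_abs_change(last_known_close, future):
    """Sum of absolute bar-to-bar percentage changes (assumed definition)."""
    total = 0.0
    prev = last_known_close
    for price in future:
        total += abs(price - prev) / prev * 100.0
        prev = price
    return total


# Invented closing prices for one symbol, oldest first.
closes = [100.0, 101.0, 99.0, 102.0, 104.0, 103.0, 108.0, 105.0]
history, future = split_for_backtest(closes, history_to_use=4, future_bars=3)
print(history)   # the 4 bars the detector would see
print(future)    # the 3 bars kept back for evaluation
print(round(future_abs_change(history[-1], future), 2))
```

The point of the split is that the detector never sees the held-out bars, so the future change is measured on data that could not have influenced the score.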
As you can see in the image above, the anomalous stocks (score < 0) usually have a higher absolute change in the future on average. This suggests that the tool is flagging stocks that moved more than average in the next few hours or days. One question arises here: what if the tool is just picking the highest-volatility stocks, since those would yield a high future absolute change anyway? To show that this is not the case, here is a more detailed description of the stats you get from the above command.
--> Future Performance
Correlation between future absolute change vs anomalous score (lower is better, range = (-1, 1)): **-0.23**
Total absolute change in future for Anomalous Stocks: **89.660**
Total absolute change in future for Normal Stocks: **43.000**
Average future volatility of Anomalous Stocks: **0.332**
Average future volatility of Normal Stocks: **0.585**
Historical volatility for Anomalous Stocks: **2.528**
Historical volatility for Normal Stocks: **2.076**
You can see that historical volatility for normal vs. anomalous stocks is not that different (2.076 vs. 2.528). However, the total absolute future change for anomalous stocks (89.660) is roughly double that of normal stocks (43.000).
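For readers who want to recompute this kind of summary from their own results, here is a small sketch. It assumes you have one (anomaly score, future absolute change) pair per stock, treats score < 0 as anomalous as described above, and uses a plain Pearson correlation; whether the tool computes its correlation exactly this way is an assumption. The function names are hypothetical and the numbers in `toy` are invented, not output from the tool.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (spread_x * spread_y)


def summarize(results):
    """results: list of (anomaly_score, future_abs_change) pairs."""
    scores = [score for score, _ in results]
    changes = [change for _, change in results]
    anomalous = [change for score, change in results if score < 0]
    normal = [change for score, change in results if score >= 0]
    return {
        "correlation": pearson(scores, changes),
        "total_anomalous": sum(anomalous),
        "total_normal": sum(normal),
    }


# Invented data: lower scores paired with larger future moves.
toy = [(-0.05, 30.0), (-0.02, 22.0), (0.01, 12.0), (0.04, 9.0), (0.08, 10.0)]
stats = summarize(toy)
print(stats["total_anomalous"], stats["total_normal"])
print(stats["correlation"] < 0)
```

A negative correlation means that lower (more anomalous) scores go with larger future moves, which is why the stats above describe lower as better.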
You can now specify which data source and which stock list you would like to use, via the data_source and stock_list arguments.
python detection_engine.py --top_n 25 --min_volume 500 --data_granularity_minutes 60 --history_to_use 14 --is_load_from_dictionary 0 --data_dictionary_path 'dictionaries/feature_dict.npy' --is_save_dictionary 1 --is_test 0 --future_bars 0 --data_source binance --stock_list cryptos.txt
We will try to post the top 25 results for a single set of parameters every week.
The tool only finds stocks that have some unusual behavior in their price and volume action combined. It does not predict which direction the stock is going to move. That might be a feature that I'll implement in the future, but for right now you'll need to look at the charts and do your own due diligence (DD) to figure that out.
A product by Tradytics
Copyright (c) 2020-present, Tradytics.com