renatovotto / Nostradamus

Backtesting an algorithmic trading strategy using Machine Learning and Sentiment Analysis.

Next day forecast #1

Open saeed7733 opened 2 years ago

saeed7733 commented 2 years ago

Hi Renato, thanks for sharing your work, it's a great concept. I have a question: can you advise me on how I can get the next day's forecast? I tried to fit the model and predict with it, but wasn't able to. I appreciate your help. Thanks.

saeed7733 commented 2 years ago

I noticed a mistake in the compute_signals function: it takes a price from the future to conclude something about the past (lookahead bias). I corrected it, and the smooth equity curve seems to go away.

CAPSAL2000 commented 2 years ago

Do you mean that the final results are not correct?

saeed7733 commented 2 years ago

Obviously

CAPSAL2000 commented 2 years ago

What an answer (idio.). You just need to say yes or no.

renatovotto commented 2 years ago

Hi @saeed7733, thanks for your feedback, and sorry for the delay in replying to your questions.

1) Predicting the next day's price (from new input data, i.e. current prices and Twitter posts): this is not implemented in the code at the moment, as the project is just a study and there is no implementation for live trading. However, with a bit of development effort it shouldn't take too long to build. You basically need to:

a) Call the pipeline class to retrieve the data up to the current date. Take care to give a start date early enough to cover the longest rolling window of any technical indicator (in the original code that is probably a 200-day moving average), or just append rows for the more recent days to an older data set if you retrieved one recently.

b) Call a random forest classifier with the optimised parameters (included in the repo files, if you do not wish to run the optimisation yourself), fit it, and predict the output signal on the latest row. That gives you the buy or sell signal you want.

Note that the last row of prices and tweets (the most recent ones) must be retrieved at a time before the market close if you wish to trade before the close. Alternatively, wait for the market close and use the signal in an auction, in which case you can use the daily close, which the program uses by default.

2) Could you show me the change you brought into the compute_signals function to fix the issue you mentioned?
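Steps a) and b) of point 1 could be sketched roughly as follows. This is only an illustration under stated assumptions, not the repo's code: the data frame is synthetic (a stand-in for whatever the pipeline class returns), the column names `Sentiment`, `SMA_200` and `Target` are hypothetical, the label is assumed to be the direction of the next day's close, and the forest uses arbitrary parameters rather than the optimised ones shipped with the repo.

```python
import numpy as np
import pandas as pd
from sklearn.ensemble import RandomForestClassifier

# Synthetic stand-in for the pipeline output (hypothetical columns).
rng = np.random.default_rng(0)
n = 300
dates = pd.bdate_range("2022-01-03", periods=n)
df = pd.DataFrame(
    {
        "Close": 100 + rng.normal(0, 1, n).cumsum(),
        "Sentiment": rng.uniform(-1, 1, n),
    },
    index=dates,
)
# Longest rolling window: the first 199 rows have no value for it.
df["SMA_200"] = df["Close"].rolling(200).mean()

# Assumed label: direction of the NEXT day's close versus today's close.
# The last row has no next day yet, so its label is NaN.
df["Target"] = np.sign(df["Close"].shift(-1) - df["Close"])

features = ["Close", "Sentiment", "SMA_200"]
usable = df.dropna(subset=features)        # rows with a full indicator window
train = usable.dropna(subset=["Target"])   # rows whose outcome is already known
latest = usable.iloc[[-1]]                 # most recent row, outcome unknown

# Arbitrary parameters; swap in the optimised ones if available.
model = RandomForestClassifier(n_estimators=200, random_state=0)
model.fit(train[features], train["Target"].astype(int))

next_day_signal = int(model.predict(latest[features])[0])
print(latest.index[0].date(), next_day_signal)
```

The point of the `usable`/`train`/`latest` split is that the most recent row is never part of the fit: it is the only row whose label is not yet known, and it is the one the prediction is made for.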

Meanwhile, thanks for your attention on my project. Regards, Renato

saeed7733 commented 2 years ago

Hi @renatovotto, thank you for getting back to me. First of all, I want to thank you for sharing your hard work; you have written very clean and structured code.

1 & 2 - I used a simple approach to derive the next day's signal: I fitted the model to the training data, took the last row of the features (X_test, which includes all the indicators and the tweet scores), and used the predict function to get the predicted y (that is, the next day's signal). Since your code uses today's and the next day's close to calculate the signal, the last signal (y_test[-1]) changes depending on how the day closes, and the sign of the signal for the last day changes as well. I tested it and the signal changed multiple times. The smooth equity curve is the result of that: the backtest is calculated incorrectly because it uses the same day's signal and returns. What I think your model might be able to do is predict multiple days ahead instead; that way we might get some outperformance, provided the backtest returns are calculated correctly.

Thanks and I hope you are enjoying your holiday :) Saeed.

renatovotto commented 2 years ago

Hi @saeed7733,

I am not sure I understand your point, but I will try to answer. In the compute_signals function, the relevant part, which looks like this:

df['Signals'] = np.where(df['Close'] > df['Close'].shift(1), 1, 0)
df['Signals'] = np.where(df['Close'] < df['Close'].shift(1), -1, df['Signals'])

compares the current row, df['Close'] (current day), with the previous row, df['Close'].shift(1) (previous day), and assigns the result of the comparison (the signal) to the current row. The signal for the current row is therefore given by comparing today's price with yesterday's. Am I missing something?
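As a quick check of that reading, here are the same two lines run on a handful of made-up closing prices (the numbers are invented for illustration only):

```python
import numpy as np
import pandas as pd

# Made-up closes: up, flat, down, up.
df = pd.DataFrame({"Close": [100.0, 101.0, 101.0, 99.5, 100.2]})

# +1 if today's close is above yesterday's, otherwise 0 ...
df["Signals"] = np.where(df["Close"] > df["Close"].shift(1), 1, 0)
# ... then -1 if today's close is below yesterday's.
df["Signals"] = np.where(df["Close"] < df["Close"].shift(1), -1, df["Signals"])

print(df["Signals"].tolist())  # [0, 1, 0, -1, 1]
```

The first row has no previous day, so both comparisons against NaN are false and the signal stays 0. Each later value depends on that same row's close, so the signal for a given day is only fixed once that day's close is known.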

Please let me know if you think something is still off. Thanks, Renato

saeed7733 commented 2 years ago

Hi Renato, that is correct. However, as I noted, a calculation probably related to the backtest applies the same day's return to that day's forecast. I ran the code at several points in the day: before the open, during market hours, and after the close. The signal changes during the day, which is fine as long as the last signal is used with the next day's return to calculate the performance. But the backtest performance changes too. If we hold yesterday's signal and the market goes against the forecast, we should see a decline in performance, yet we do not. Have you noticed such a thing?
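To make the alignment question concrete, here is a toy comparison on made-up prices. It is not a claim about what the repo's backtest actually does: it uses the close-to-close sign itself as the position (an assumption for illustration) and only shows how much pairing a signal with the same day's return, rather than the next day's, can change the result.

```python
import numpy as np
import pandas as pd

# Made-up closes that alternate up and down.
close = pd.Series([100.0, 102.0, 101.0, 103.0, 102.0, 104.0])
ret = close.pct_change()        # daily return, known only at the close
signal = np.sign(close.diff())  # sign of today's move versus yesterday's

# Same-day pairing: the position already "knows" the return it earns.
same_day = (signal * ret).dropna()

# Next-day pairing: yesterday's signal is held through today's return.
next_day = (signal.shift(1) * ret).dropna()

print("same-day total:", (1 + same_day).prod() - 1)
print("next-day total:", (1 + next_day).prod() - 1)
```

With same-day pairing every daily return is non-negative by construction, which gives a smooth upward curve regardless of the prices. With the one-day shift, the same signal loses on every day of this alternating series.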

Thanks, Saeed.

renatovotto commented 2 years ago

I did not test on live data, so I cannot really comment on checks made at different times of the day. What is certain is that you will see the daily return go down if your position is on the wrong side of the price change, but again, you need to provide the closing price to calculate the realised return. What I am saying is that you get a new return point only once you provide that closing price, be it at 12am or 4pm.