byunsy / financial-forecasting

Uses a combination of LSTM (Long Short-Term Memory) and CNN (Convolutional Neural Network) to accurately predict a stock's closing price based on a time-window of 20 business days.

Model predicts current value instead of future value #1

Open ghost opened 3 years ago

ghost commented 3 years ago

Hey, thank you for sharing this project! It works extremely well in all other respects, except that it appears to predict the current price instead of the future price. This is evident in the graph in the Jupyter notebook and becomes more obvious with more epochs. I can't seem to figure out where the problem lies, as the targets in the windowed dataset appear to be correct and it doesn't seem to be a chart alignment issue either.

byunsy commented 3 years ago

Hi, thanks for opening an issue! I'm currently a student learning about time series data and LSTMs, so I might not be able to provide the most accurate answer, but I will try my best to help you in some ways.

Firstly, the project here uses quite a simple model that takes a time-window of 20 business days to predict the next time step. This means that we take, for example, the closing prices from 2020-06-01 to 2020-06-26 (20 weekdays in total) and predict the closing price for 2020-06-29 (the next weekday). Hence, the model is using 'past' prices to predict 'future' prices.
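
To make that concrete, the windowing looks roughly like this (a minimal sketch with illustrative names, not the exact code from the notebook):

```python
import numpy as np

def make_windows(prices, window_size=20):
    """Slice a 1-D array of closing prices into (input window, next-day target) pairs."""
    X, y = [], []
    for i in range(len(prices) - window_size):
        X.append(prices[i:i + window_size])   # closing prices for days t .. t+19
        y.append(prices[i + window_size])     # closing price for day t+20 (the next business day)
    return np.array(X), np.array(y)
```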

I think you might be referring to the fact that this model does not actually forecast the future beyond the dataset we currently have (e.g., predicting closing prices from 2021-02-01 to 2021-02-05, which have yet to come). I'm also aware of this limitation, and I'm still learning and searching for the best way to implement a model that can perform such forecasts.

My first naïve guess is that we could keep appending our predicted prices to the historical dataset. We could then use these appended data points to recursively predict values beyond the data we currently have. However, this will most likely accumulate error over time, since we would be feeding predicted values back in to predict subsequent values.
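
Very roughly, the recursive idea would look something like this (a sketch only; `model` and the input shape are placeholders for whatever the notebook actually uses, and any scaling/unscaling is omitted):

```python
import numpy as np

def recursive_forecast(model, last_window, steps):
    """Extend the forecast beyond the known data by feeding predictions back in."""
    window = list(last_window)                        # the last 20 known closing prices
    forecasts = []
    for _ in range(steps):
        x = np.array(window[-20:]).reshape(1, 20, 1)  # shape the model expects: (batch, time, features)
        next_price = float(model.predict(x, verbose=0)[0, 0])
        forecasts.append(next_price)
        window.append(next_price)                     # the predicted value becomes part of the next window
    return forecasts
```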

I hope this helps.

Thank you!

ghost commented 3 years ago

Thank you for taking the time to answer so thoroughly, and my apologies for not describing the problem clearly enough! Let me try to express the issue better.

Based on your explanation, my understanding is that if you feed the model data such as [1, 2, 3, 4, 5 ... 20], we would expect it to forecast 21. However, it currently appears to forecast 20, the last value of the input window. I noticed this when I used the trained model to predict a future value and got results that were one step late. I believe this can be seen in the chart as the prediction always lagging one step behind the actual value.
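
For what it's worth, here is roughly how I convinced myself of the lag (just a sketch; `preds` and `actuals` stand for the model's predictions and the corresponding true closing prices, aligned as 1-D arrays):

```python
import numpy as np

def lag_check(preds, actuals):
    """Compare prediction error against the same-day actual vs. the previous day's actual."""
    err_same_day = np.mean(np.abs(preds[1:] - actuals[1:]))   # prediction vs. the value it should forecast
    err_prev_day = np.mean(np.abs(preds[1:] - actuals[:-1]))  # prediction vs. the last value it was shown
    return err_same_day, err_prev_day

# If err_prev_day comes out much smaller than err_same_day, the model is
# essentially echoing the last value in its input window.
```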

[Chart: the predicted closing prices lag one step behind the actual prices]

Replacing np.asarray(y_test)[20:] with np.asarray(y_test)[19:] results in perfect alignment of the graph (which also demonstrates how good your code is otherwise). How could we achieve the same end result by correcting the underlying cause rather than adjusting the graph? I will admit, though, that there's a good chance I have simply misunderstood something.

By the way, if you were ever interested in looking further into forecasting beyond the next step, a TensorFlow tutorial seems to have a surprisingly simple solution for that: https://www.tensorflow.org/tutorials/structured_data/time_series#multi-step_models
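
For example, the tutorial's single-shot approach just widens the output layer so the model emits several future steps at once. Something along these lines (layer sizes here are illustrative, not taken from your repo):

```python
import tensorflow as tf

WINDOW = 20      # past business days fed to the model
OUT_STEPS = 5    # future business days predicted in one shot

multi_step_model = tf.keras.Sequential([
    tf.keras.layers.LSTM(32, input_shape=(WINDOW, 1)),
    tf.keras.layers.Dense(OUT_STEPS),          # one output unit per future step
    tf.keras.layers.Reshape((OUT_STEPS, 1)),   # -> (batch, OUT_STEPS, 1)
])
multi_step_model.compile(loss="mse", optimizer="adam")
```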

byunsy commented 3 years ago

Thank you for the clarification! I see what you mean now.

You definitely have a good point, and it seems like something I didn't think through carefully enough. I initially thought it was just a matter of visual representation, like an off-by-one error between the 20th and 21st indices of the actual/prediction arrays. But after looking more closely at what you described, I now think something may actually be wrong with the model itself. My best guess is that the model is overfitting and predicting the next value based almost entirely on the previous time step, which would produce the lag you've observed.
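
One way I plan to test that guess is to compare the model against a naive "persistence" baseline that simply repeats the last value of each window (a sketch; `X_test` and `y_test` are assumed to follow the 20-day windowing described earlier):

```python
import numpy as np

def compare_to_persistence(model, X_test, y_test):
    """Check whether the LSTM actually beats 'tomorrow equals today'."""
    model_preds = model.predict(X_test, verbose=0).ravel()
    persistence_preds = X_test[:, -1].ravel()       # last closing price in each window
    y_true = np.asarray(y_test).ravel()
    mae_model = np.mean(np.abs(model_preds - y_true))
    mae_persistence = np.mean(np.abs(persistence_preds - y_true))
    print(f"model MAE:       {mae_model:.4f}")
    print(f"persistence MAE: {mae_persistence:.4f}")

# If the two MAEs are nearly identical, the model has mostly learned to copy
# the previous time step, which would explain the one-step lag.
```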

I think the problem is very similar to: https://stackoverflow.com/questions/54368686/lstm-having-a-systematic-offset-between-predictions-and-ground-truth. Unfortunately, I couldn't find a quick answer to resolve this problem.

I will definitely spend some time exploring other models and tweaking hyperparameters to build something that mitigates this kind of behavior. If I find a better answer, I will give you an update.

Thank you so much for this issue; otherwise I wouldn't have learned about this problem. Also, thank you for the link about multi-step models! That's something I will surely look into.

ghost commented 3 years ago

Excellent, I'm glad you were able to pinpoint the source of the problem! I will have a look at other solutions as well and check back to see if you've made further progress. No pressure though. Thank you for your effort!