Getting NaN backtest results for DDPG and TD3 algorithm for a single stock

AI4Finance-Foundation / FinRL

FinRL: Financial Reinforcement Learning. 🔥

https://ai4finance.org

MIT License


Getting NaN backtest results for DDPG and TD3 algorithm for a single stock #258

Athe-kunal closed this issue 3 years ago

Athe-kunal commented 3 years ago

I tried out a few individual stocks such as TSLA, GOOGL and AMZN, but I am getting NaN backtest results for them.

Going through the results while training, I found this

Here the actor loss is extremely high. Am I getting something wrong? I trained it on GOOGL, but I didn't get this issue with TSLA or the Dow Jones 30. I also tried a bunch of other stocks like AAPL, AMZN, etc. I got these NaN results because the final and initial portfolio values are the same.

rayrui312 commented 3 years ago

Thanks for pointing it out. It seems to be caused by the hyperparameters. You could lower the learning rate and see whether there are still NaN values, e.g., you can set the learning rate of DDPG to 1e-5 with:

model_ddpg = agent.get_model("ddpg", model_kwargs={'learning_rate':0.00001})

Athe-kunal commented 3 years ago

Yeah, after re-running the experiments for the individual stock AAPL, DDPG gave results but TD3 still gave NaN values, so tuning the learning rate probably helps with this issue. Thank you. I have attached the results. Changing the learning rate for TD3 did not solve the issue, so it would probably need further tuning: https://colab.research.google.com/drive/1uu2_2v05kQKygleYKGbECw11rS4O4fre?usp=sharing

rayrui312 commented 3 years ago

Yeah. Basically, it is hard to get good results for different data using constant parameters. We are further tuning the hyperparameters now.

Athe-kunal commented 3 years ago

Hmm, it surely does require a lot of training. Also, if we add a try/except block around the Sharpe ratio calculation, we can get rid of the NaN values and the ZeroDivisionError during training. Something like: try: sharpe = ..... except: sharpe = 0.0. In my testing I was getting a zero division error while training, so we could probably include this.
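For anyone landing here later, the guard suggested above can be sketched as follows. When the portfolio value never changes, every period return is zero, so both the mean and the standard deviation of the returns are zero and the Sharpe ratio becomes 0 / 0. This is only a minimal standalone sketch, not FinRL's actual code: the function name `safe_sharpe`, the `fallback` parameter, and the 252 trading periods per year are all assumptions made for illustration.

```python
import math
import statistics


def safe_sharpe(account_values, periods_per_year=252, fallback=0.0):
    """Annualized Sharpe ratio of a series of portfolio values.

    Hypothetical helper (not part of FinRL). Returns `fallback` whenever
    the ratio is undefined instead of raising or returning NaN.
    """
    # A non-positive starting value would make the return undefined.
    if any(value <= 0 for value in account_values[:-1]):
        return fallback

    # Simple period-over-period returns.
    returns = [
        curr / prev - 1.0
        for prev, curr in zip(account_values, account_values[1:])
    ]
    # The sample standard deviation needs at least two returns.
    if len(returns) < 2:
        return fallback

    mean_return = statistics.fmean(returns)
    volatility = statistics.stdev(returns)

    # A flat portfolio has zero volatility, so the ratio would be 0 / 0.
    if volatility == 0.0:
        return fallback

    sharpe = math.sqrt(periods_per_year) * mean_return / volatility
    return sharpe if math.isfinite(sharpe) else fallback


# Flat portfolio: initial and final value are the same, as in this issue.
flat = safe_sharpe([1_000_000.0] * 5)
# Portfolio whose value actually moves.
moving = safe_sharpe([100.0, 101.0, 100.5, 102.0, 103.0])
print(flat, moving)
```

Checking for zero volatility explicitly, rather than wrapping the calculation in a bare `except`, keeps unrelated bugs visible. Note that a fallback of 0.0 only hides the symptom: a flat portfolio still means the agent never traded, so the learning-rate tuning discussed above is still needed.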