It's good to be skeptical of everything, but even John Schulman of OpenAI has said that RL currently has a problem where papers are very hard to reproduce due to subtle differences. Meanwhile, the author has put up their code. It's possible that large institutions have come up with things but kept them as commercial secrets. Large hedge funds certainly try.
So I can't say for sure either way right now. If I had to guess then I think it works, although I'm not sure about extremely large returns.
Great, my teammates can't do shit and I probably don't have more time to keep going down this rabbit hole (I learned a lot though). Sorry for intruding into your repo, you can close this issue any time. But I thought about the problems:
One other thing I noticed is that going through the data in sequential order seems to help. Perhaps it helps reinforce temporary patterns.
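To make the sequential-versus-shuffled comparison concrete, here is a minimal sketch of the two orderings over sliding windows of a price series. The names `window_starts` and `iter_windows` are made up for illustration and are not from the repo or the paper; the repo's actual sampling scheme may well differ.

```python
import random


def window_starts(n_steps, window, sequential=True, seed=0):
    """Return the start index of every full window over n_steps observations.

    sequential=True keeps chronological order; otherwise the same
    start indices are returned in a seeded random order.
    """
    starts = list(range(0, n_steps - window + 1))
    if not sequential:
        random.Random(seed).shuffle(starts)
    return starts


def iter_windows(prices, window, sequential=True, seed=0):
    """Yield training windows (slices of `prices`) in the chosen order."""
    for start in window_starts(len(prices), window, sequential, seed):
        yield prices[start:start + window]


if __name__ == "__main__":
    prices = [1.0, 2.0, 3.0, 4.0, 5.0]  # toy data, not real prices
    print(list(iter_windows(prices, 3, sequential=True)))
    print(list(iter_windows(prices, 3, sequential=False)))
```

Both orderings visit exactly the same windows, so any difference in training results would come from the order alone.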
Yeah, it's a frustrating problem! And RL is quite tricky since there aren't many Kaggle competitions to learn tricks from.
I'm currently doing a course project and the paper has very similar ideas to ours. However our network is not learning much at all. Of course this might be because we trained our network on 110 stocks with 10 in each batch and tested it on another 20 stocks. Training data and test data are all from the S&P 500 over the past 3 years. We have built a complete evaluation pipeline and we are getting crappy results.
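For reference, the split described above could look roughly like the sketch below: 110 training tickers in batches of 10, plus 20 disjoint test tickers. The helper names, the placeholder ticker symbols, and the seeded random split are all assumptions for illustration, not our actual pipeline code.

```python
import random


def split_tickers(tickers, n_train=110, n_test=20, seed=0):
    """Split tickers into disjoint train and test sets of the given sizes."""
    if len(tickers) < n_train + n_test:
        raise ValueError("not enough tickers for the requested split")
    shuffled = list(tickers)
    random.Random(seed).shuffle(shuffled)
    return shuffled[:n_train], shuffled[n_train:n_train + n_test]


def make_batches(items, batch_size=10):
    """Group items into consecutive batches of batch_size (last may be shorter)."""
    return [items[i:i + batch_size] for i in range(0, len(items), batch_size)]


if __name__ == "__main__":
    # Placeholder symbols; real runs would use S&P 500 tickers.
    universe = [f"TICKER_{i}" for i in range(130)]
    train, test = split_tickers(universe)
    batches = make_batches(train)
    print(len(train), len(test), len(batches))
```

Note that the test tickers never appear in training, so the network cannot rely on anything it memorized about individual assets.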
Although our results are quite preliminary, could it be that, since the paper operated on a fixed set of cryptocurrencies, the network more or less remembers the scores (e.g. how well each currency tends to perform), and they got very good results from this 'memory' stored in the last few fc layers?
I'm not sure though, since I'm new to deep learning and this is my first time using deep reinforcement learning. But it seems to me that data in the finance world has too much noise, and that the output of a neural network, with its complex model and (please allow me to put it this way, even though this is a continuous setting rather than a classification/discrete problem) decision boundaries, can easily be swayed by it. If neural networks could solve the portfolio management problem this well, a few students at a less well-known university probably would not be the first to come up with a good application in finance.
I'll keep you posted on any later developments as our project progresses. In the meantime, if you'd like, I am very interested to hear your thoughts on the subject. Thank you!