jd0713 · closed 7 years ago
Are you using one of my notebooks or your own? And what do you mean by the test set? Are you applying it to recent data?
This happened for me during training when:
As for your list of ideas, I don't think 2 and 5 could be the cause, since those would make it easier for the model to give varied predictions. It shouldn't be 3, since the model shouldn't be learning on the test set.
But yeah, I reckon it could be 1, 4, or 6.
I used your notebook and the data you uploaded. What I mean is that there is only a slight difference between the equal-weighted portfolio and the learned policy. Also, almost no reward is gained during training. I think the portfolio weights and the accumulated return should look different if the model has learned 'something', but the results don't seem to show that. I am wondering if there is a way to make the model learn 'something'.
In short, why is there no episode reward?
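For reference, here is roughly how I am comparing the two. This is a minimal sketch with made-up price data, not the repo's actual environment or data:

```python
import numpy as np

# Hypothetical price-relative matrix: rows are time steps, columns are assets.
# Each entry is close_t / close_{t-1} for one asset (made-up data, not the repo's).
rng = np.random.default_rng(0)
price_relatives = 1.0 + 0.01 * rng.standard_normal((250, 4))

# Equal-weighted baseline: hold 1/n of the portfolio in every asset at every step.
n_assets = price_relatives.shape[1]
equal_weights = np.full(n_assets, 1.0 / n_assets)
step_growth = price_relatives @ equal_weights   # portfolio growth factor per step
baseline_value = np.cumprod(step_growth)        # cumulative portfolio value

# A learned policy should output weights that differ from equal_weights and
# beat this curve; in my runs it tracks it almost exactly, with near-zero reward.
print("equal-weight final value:", baseline_value[-1])
```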
I am also curious whether this phenomenon comes from a limitation of the observation space: since the observation is prices divided by the last open price, its values can exceed 1.
Because I'm a newbie at deep learning, my question might be vague. Thanks.
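To make the observation-space point concrete, here is a tiny sketch with toy numbers of my own (not the repo's code):

```python
import numpy as np

# Toy close prices for one asset over a short window (made-up numbers).
closes = np.array([100.0, 102.0, 105.0, 103.0, 107.0])
last_open = 101.0  # open price of the most recent period

# Observation as I understand it: prices divided by the last open price.
obs = closes / last_open
print(obs)  # ~[0.990, 1.010, 1.040, 1.020, 1.059]: values go above 1

# So the observation is centered near 1 but not bounded above by it, which is
# what I meant by the observation space being able to go over 1.
```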
I ran into similar problems. I found that the DDPG model was less likely to learn, but the VPG notebook was more stable. So my advice would be to try the VPG notebook, and if that doesn't work, maybe start off with a simpler project. Deep reinforcement learning is pretty new, lots of libraries are broken, and lots of algorithms are unstable. So maybe start with image classification projects if this is frustrating you?
If you're determined to do this though, it's probably worth reading the papers belonging to the algorithms, since they explain all the parameters and limitations.
This project didn't get good results for me (although another guy did a PyTorch implementation and got decent results, but I'm not sure if he plans on publishing it). I'm wondering if there is a problem in my data or my environment, since I tried many algorithms but kept the data and environment constant. Or maybe it's just a difficult dataset.
Thanks for your nice comment. I also read the conversation between you and goodulusaurs after your comment. I should probably use DPG and ensemble the models, the same as in the paper. I will update if I get good results! Thank you again!
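Just to spell out what I mean by ensembling, here is a rough sketch. The agents and their `predict` method are hypothetical stand-ins, not code from this repo or the paper:

```python
import numpy as np

class DummyAgent:
    """Hypothetical stand-in for one trained DPG policy."""
    def __init__(self, seed, n_assets=4):
        self.rng = np.random.default_rng(seed)
        self.n_assets = n_assets

    def predict(self, obs):
        # A real agent would map the observation to portfolio weights;
        # here we just return random weights that sum to 1.
        w = self.rng.random(self.n_assets)
        return w / w.sum()

def ensemble_weights(agents, obs):
    # Average the weights proposed by each agent, then renormalize to sum to 1.
    weights = np.mean([agent.predict(obs) for agent in agents], axis=0)
    return weights / weights.sum()

agents = [DummyAgent(seed) for seed in range(5)]
print(ensemble_weights(agents, obs=None))
```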
I'd be interested to know if you get any results using DPG! Good luck.
As you know, the portfolio weights appear to be static on the test set. Why didn't the model learn much?
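Here is the kind of quick check I used to see how static they are. This is a sketch: `test_weights` is a hypothetical array of the policy's outputs, not something the repo provides directly:

```python
import numpy as np

# test_weights: the weights the policy output at each test step,
# shape (n_steps, n_assets). Placeholder data here for illustration.
test_weights = np.full((500, 4), 0.25)

# If the policy learned anything, the weights should move over time.
# A per-asset standard deviation near zero means they are effectively static.
print("per-asset std of weights:", test_weights.std(axis=0))
print("mean |step-to-step change|:",
      np.abs(np.diff(test_weights, axis=0)).mean())
```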
I am assuming several possible reasons.
Because the authors said that EIIE is the core idea of the paper, I think that if I solve 1 and 2 there will be some improvement.
Would you share some ideas of yours?
Thank you very much.