Closed: AlexZou66 closed this issue 2 years ago
When the stock dimension is large, the hyperparameters need to be re-tuned. Training may be hard to converge because there are more states and actions.
One possible method is to split the stocks into smaller sets. I do not know whether you used the same parameters for both stock sizes, but you can give it a try.
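As a rough illustration of the splitting idea, here is a minimal sketch that partitions a ticker list into smaller sets. The function name `split_into_subsets`, the placeholder tickers, and the subset size of 15 are assumptions made for this example, not part of the repository; how each subset is then fed to the environment and agent is left open.

```python
from typing import List, Sequence


def split_into_subsets(tickers: Sequence[str], max_size: int) -> List[List[str]]:
    """Partition tickers into consecutive subsets of at most max_size items.

    Hypothetical helper for illustration only.
    """
    if max_size <= 0:
        raise ValueError("max_size must be a positive integer")
    return [list(tickers[i:i + max_size]) for i in range(0, len(tickers), max_size)]


# Placeholder ticker names (assumed); replace with the real stock list.
tickers = [f"STOCK_{i:02d}" for i in range(50)]

# 15 is chosen only because the thread reports stable training below that size.
subsets = split_into_subsets(tickers, max_size=15)

for idx, subset in enumerate(subsets):
    print(f"subset {idx}: {len(subset)} stocks")
```

Each subset could then get its own environment instance and its own hyperparameter search, so that every run stays in the range where training was reported to behave well.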
It looks like the StockTradingEnv class in env_stocktrading_China_A_shares.py produces very high obs and reward values when extreme actions are used with over 50 stocks. This is not a problem when the stock size is under 15, even if the initial action choice is extreme. The first picture is for a stock size of 50 and the second picture is for a stock size of 15. "r" represents the reward; since it is a batch of data, I calculate the max, min, std, and mean over the batch. The high obs and reward values make training really hard. I tried tuning the hyperparameters, but it is still hard to converge. Does anyone have thoughts on how to handle a bigger stock size?
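For anyone who wants to reproduce the comparison, the batch statistics mentioned above could be computed along these lines. This is only a sketch using the Python standard library: the helper name `batch_stats` and the sample reward values are made up for illustration, and the std here is the population standard deviation, which may differ from what was used for the pictures.

```python
import statistics
from typing import Dict, Sequence


def batch_stats(values: Sequence[float]) -> Dict[str, float]:
    """Return max, min, population std, and mean of a batch of values.

    Hypothetical diagnostic helper, not part of the environment.
    """
    if len(values) == 0:
        raise ValueError("batch must not be empty")
    return {
        "max": float(max(values)),
        "min": float(min(values)),
        "std": statistics.pstdev(values),
        "mean": statistics.fmean(values),
    }


# Made-up rewards standing in for the "r" values of one batch.
rewards = [1.0, 2.0, 3.0, 4.0]
stats = batch_stats(rewards)
print(stats)
```

Running the same helper on the observations and rewards for the 15-stock and 50-stock setups gives a like-for-like view of how much the scale grows with stock size.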