AI4Finance-Foundation / FinRL-Meta

FinRL-Meta: Dynamic datasets and market environments for FinRL.
https://ai4finance.org
MIT License

Obs and reward explode when initializing with unreasonable actions and a stock size over 50 #174

AlexZou66 closed this issue 2 years ago

AlexZou66 commented 2 years ago

It looks like the StockTradingEnv class in env_stocktrading_China_A_shares.py gives very high obs and reward values when extreme actions are used with more than 50 stocks. This is not a problem when the stock size is under 15, even if the initial action choice is extreme.

The first picture shows a stock size of 50 and the second shows a stock size of 15. "r" represents the reward; since it is a batch of data, I calculate the max, min, std, and mean over the batch.

[image] [image (1)]

This makes training really hard, since the obs and rewards are so large. I tried tuning the hyperparameters, but training still struggles to converge. Does anyone have thoughts on how to handle a larger stock size?
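For readers who want to reproduce the batch summary described above, here is a minimal sketch of computing the max, min, std, and mean of a reward batch. It assumes the rewards are available as a flat sequence of floats; the helper name `batch_stats` and the sample values are hypothetical and are not taken from the issue or from FinRL-Meta.

```python
import statistics


def batch_stats(values):
    """Return the max, min, population std, and mean of a batch of numbers."""
    data = [float(v) for v in values]
    if not data:
        raise ValueError("batch must contain at least one value")
    return {
        "max": max(data),
        "min": min(data),
        "std": statistics.pstdev(data),  # population standard deviation
        "mean": statistics.fmean(data),
    }


# Hypothetical reward batch, for illustration only (not data from this issue).
rewards = [1.0, -2.0, 3.0, 6.0]
stats = batch_stats(rewards)
print(stats)
```

Running the same summary on batches collected with 15 and with 50 stocks would give a like-for-like comparison of how far the reward scale drifts.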

zhumingpassional commented 2 years ago

When the stock dimension is large, the hyperparameters need to be re-tuned. Training may be hard to converge, since the state and action spaces are larger.

One possible approach is to split the stocks into smaller sets. I do not know whether you used the same parameters in both runs. It is worth a try.
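The splitting idea could be sketched roughly as follows. This is only an illustration under assumptions: the helper `split_into_subsets`, the placeholder ticker names, and the subset size of 15 (chosen because the report above says 15 stocks behaved well) are all hypothetical. How each subset is then turned into its own environment is not shown, since the thread does not specify it.

```python
def split_into_subsets(tickers, max_size=15):
    """Split a list of tickers into consecutive chunks of at most max_size."""
    if max_size < 1:
        raise ValueError("max_size must be at least 1")
    return [tickers[i:i + max_size] for i in range(0, len(tickers), max_size)]


# Placeholder ticker names, for illustration only.
tickers = [f"STOCK_{i:02d}" for i in range(50)]
subsets = split_into_subsets(tickers, max_size=15)

for idx, subset in enumerate(subsets):
    # Each subset would be used to build a separate, smaller trading environment.
    print(idx, len(subset), subset[0], subset[-1])
```

With 50 tickers and a subset size of 15, this yields three sets of 15 and one of 5, so the last set may need separate handling if equal sizes matter.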