AI4Finance-Foundation / FinRL

FinRL: Financial Reinforcement Learning. 🔥
https://ai4finance.org
MIT License

Question about the entropy loss #940

Open caserzer opened 1 year ago

caserzer commented 1 year ago

Using the Stock_NeurIPS2018_SB3.ipynb notebook with default parameters, the results seem to be OK [image], but after checking the TensorBoard log I found something confusing: [image] [image]

  1. Why is the entropy loss negative, and why does it keep growing?
  2. Does train/reward indicate that the agent is not learning anything useful?
  3. After checking the agent's actions on the trade dataset, the actions are almost all the same: buy some shares and keep holding them.
zhumingpassional commented 1 year ago
  1. It may be caused by a lack of normalization. It also depends on the distribution of the datasets.
  2. If most stocks are decreasing, the reward may decrease as well.
  3. It depends on the stock trends and on hyperparameter tuning. If you set a different training/trading period, e.g., one in which most stocks are decreasing, the result may be different.
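
On question 1, a negative entropy loss is not necessarily a bug. Here is a minimal sketch, assuming a continuous action space with a diagonal Gaussian policy, and assuming the logged value is the negative mean policy entropy (to my knowledge this is the sign convention Stable-Baselines3 uses for `entropy_loss` in PPO and A2C). The function names are hypothetical and the numbers are illustrative, not taken from the notebook:

```python
import math

def diag_gaussian_entropy(log_stds):
    """Differential entropy of a diagonal Gaussian, summed over action dimensions."""
    per_dim_const = 0.5 * math.log(2.0 * math.pi * math.e)
    return sum(per_dim_const + log_std for log_std in log_stds)

def entropy_loss(batch_log_stds):
    """Negative mean entropy over a batch (assumed sign convention)."""
    entropies = [diag_gaussian_entropy(ls) for ls in batch_log_stds]
    return -sum(entropies) / len(entropies)

# Two hypothetical 2-dimensional policies.
wide = entropy_loss([[0.0, 0.0]])      # std = 1 per dimension -> loss is negative
narrow = entropy_loss([[-3.0, -3.0]])  # std ~ 0.05 per dimension -> loss is positive
print(wide, narrow)
```

Under these assumptions the sign of the loss only reflects how wide the action distribution is, and the magnitude scales with the number of action dimensions (one per stock in this environment).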
caserzer commented 1 year ago

Thanks for your comment.

  1. Data normalization: the StockTradingEnv uses the close price to calculate the reward, so should I normalize the technical indicators?
  2. Understood, but what I meant is the convergence problem of train/reward.
zhumingpassional commented 1 year ago
  1. No. I mean that the policy is not normalized.
  2. With respect to convergence, we generally look at the cumulative reward over an epoch, not the per-step reward.
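
The cumulative-reward view in point 2 can be sketched as follows, assuming you have logged the per-step rewards together with the environment's end-of-episode flags. The helper name `episode_returns` is hypothetical and is not part of FinRL:

```python
def episode_returns(rewards, dones):
    """Sum per-step rewards into one cumulative reward per finished episode."""
    returns = []
    running = 0.0
    for reward, done in zip(rewards, dones):
        running += reward
        if done:
            # Episode finished: record its total and start a new sum.
            returns.append(running)
            running = 0.0
    return returns

# Two hypothetical episodes of two steps each.
totals = episode_returns([1.0, 2.0, 3.0, 4.0], [False, True, False, True])
print(totals)
```

Plotting these per-episode totals against training time usually gives a clearer convergence signal than the raw per-step reward curve.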
caserzer commented 1 year ago

Would you please explain what "the policy is not normalized" means?