Question about the entropy loss

AI4Finance-Foundation / FinRL

FinRL: Financial Reinforcement Learning. 🔥

https://ai4finance.org

MIT License

Question about the entropy loss #940

Open caserzer opened 1 year ago

caserzer commented 1 year ago

Using the Stock_NeurIPS2018_SB3.ipynb notebook with default parameters, the results seem OK, but after checking the TensorBoard log I found a few things confusing:

Why is the entropy loss negative, and why does it keep growing?
Does train/reward indicate that the agent is not learning anything useful?
After checking the agent's actions on the trade dataset, the actions are almost all the same: buy some shares and keep holding.
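
On the first question above, a minimal sketch may help. It assumes the agent uses a diagonal Gaussian policy over continuous actions and that the logged entropy loss is the negated mean policy entropy, as in Stable-Baselines3's PPO and A2C. The 30 action dimensions and the log-std values are hypothetical and not taken from the notebook.

```python
import math

# Entropy of a 1-D Gaussian with std sigma is 0.5 * ln(2*pi*e) + ln(sigma).
GAUSSIAN_ENTROPY_CONST = 0.5 * math.log(2 * math.pi * math.e)


def diag_gaussian_entropy(log_stds):
    """Differential entropy (nats) of a diagonal Gaussian policy.

    log_stds: per-action-dimension log standard deviations (assumed inputs).
    """
    return sum(GAUSSIAN_ENTROPY_CONST + ls for ls in log_stds)


def entropy_loss(log_stds):
    """Negated entropy, i.e. the quantity assumed to be logged as the entropy loss."""
    return -diag_gaussian_entropy(log_stds)


if __name__ == "__main__":
    n_actions = 30  # hypothetical: one action dimension per stock
    for log_std in (0.5, 0.0, -2.0):
        loss = entropy_loss([log_std] * n_actions)
        print(f"log_std={log_std:+.1f} -> entropy_loss={loss:.3f}")
```

Under these assumptions, a negative entropy loss only means the policy's entropy is positive, which is expected when the action standard deviation is around 1. The value moves further below zero as the policy's standard deviation widens and rises toward (or above) zero as the policy becomes more deterministic.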

zhumingpassional commented 1 year ago

It may be caused by a lack of normalization. It also depends on the distribution of the datasets.
If most stocks are decreasing, the reward may decrease as well.
It depends on the stock trends and on hyperparameter tuning. If you set a different training/trading period, e.g., one in which most stocks are decreasing, the result may be different.

caserzer commented 1 year ago

Thanks for your comment.

Data normalization: StockTradingEnv uses the close price to calculate the reward, so should I normalize the technical indicators?
Understood, but what I am asking about is the convergence of train/reward.

zhumingpassional commented 1 year ago

No, I mean the policy is not normalized.
With respect to convergence, we generally look at the cumulative reward over an epoch, not the reward itself.
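
To illustrate the distinction between the reward and the cumulative reward, here is a small sketch that sums per-step rewards into one return per episode. The function name, the `dones` flags, and the sample numbers are hypothetical; this is not FinRL's or Stable-Baselines3's logging code.

```python
def episode_returns(rewards, dones):
    """Sum per-step rewards into one cumulative return per finished episode.

    rewards: per-step rewards in the order they were collected.
    dones:   per-step flags, True on the last step of an episode.
    A trailing, unfinished episode is ignored.
    """
    returns = []
    total = 0.0
    for reward, done in zip(rewards, dones):
        total += reward
        if done:
            returns.append(total)
            total = 0.0
    return returns


if __name__ == "__main__":
    # Hypothetical data: two episodes of 3 and 2 steps.
    rewards = [1.0, -0.5, 2.0, 0.5, 0.5]
    dones = [False, False, True, False, True]
    print(episode_returns(rewards, dones))
```

Plotting these per-episode sums over training, rather than individual step rewards, gives a less noisy signal for judging convergence.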

caserzer commented 1 year ago

Would you please explain what "the policy is not normalized" means?