_calculate_reward modify

AminHP / gym-anytrading

The most simple, flexible, and comprehensive OpenAI Gym trading environment (Approved by OpenAI Gym)

MIT License

2.09k stars, 459 forks

    trade = False
    if .....(cut it short)
        if self._position == Positions.Short:
            step_reward += -price_diff * 10000
        elif self._position == Positions.Long:
            step_reward += price_diff * 10000
    return step_reward
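The branch above rewards a trade by the signed price difference, scaled by 10000. A minimal standalone sketch of just that branch follows. The `Positions` enum here is a stand-in, and `trade_reward` with its parameters is a hypothetical helper for illustration, not part of gym-anytrading:

```python
from enum import Enum


class Positions(Enum):
    # Stand-in enum for illustration; values are an assumption.
    Short = 0
    Long = 1


def trade_reward(position, current_price, last_trade_price, scale=10000):
    """Hypothetical helper: signed, scaled price difference for a trade."""
    price_diff = current_price - last_trade_price
    if position == Positions.Short:
        # A short position profits when the price falls.
        return -price_diff * scale
    if position == Positions.Long:
        # A long position profits when the price rises.
        return price_diff * scale
    return 0.0


long_reward = trade_reward(Positions.Long, 1.5, 1.25)
short_reward = trade_reward(Positions.Short, 1.5, 1.25)
```

With the same price move, the long and short rewards have equal magnitude and opposite sign.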

I ran into the same problems: https://github.com/AminHP/gym-anytrading/pull/86#issuecomment-1483605097

After this change I could train SB3 models again (tested with 'stocks-v0'): https://github.com/AminHP/gym-anytrading/pull/86/commits/7288a1e3f7089b477caf846ddcc80b60c3829b7c#diff-5d3f71bdaa90f138b62b611d8a6a0e90090f893152c2e313975a7bd6c43d8238R38

I found other problems: I could not always see learning progress (tested with interday stock data). The SB3 tips page (https://stable-baselines3.readthedocs.io/en/master/guide/rl_tips.html) recommends:

"always normalize your observation space when you can, i.e., when you know the boundaries"
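The normalization advice quoted above can be illustrated with a small sketch that maps values from a known range to [-1, 1]. The function name `scale_to_unit_range` and the bounds used are hypothetical, chosen only for this example:

```python
def scale_to_unit_range(values, low, high):
    """Hypothetical helper: map values from the known range [low, high] to [-1, 1]."""
    if high <= low:
        raise ValueError("high must be greater than low")
    span = high - low
    return [2.0 * (v - low) / span - 1.0 for v in values]


# Assumed bounds of 0 and 100, for illustration only.
obs = scale_to_unit_range([0.0, 50.0, 100.0], low=0.0, high=100.0)
```

When the boundaries are not known in advance, a running normalization such as SB3's `VecNormalize` wrapper may be a better fit.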

I have only tried using 'diff' without 'prices':

    signal_features = diff  # np.column_stack((prices, diff))
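A rough sketch of building diff-only features is shown below. It assumes the environment expects a 2-D array of shape (ticks, features), so the differences are kept as a single column. `diff_only_features` is a hypothetical helper, not a gym-anytrading function:

```python
import numpy as np


def diff_only_features(prices):
    """Hypothetical helper: (ticks, 1) feature matrix of price differences."""
    prices = np.asarray(prices, dtype=np.float32)
    # The first tick has no previous price, so its difference is set to 0.
    diff = np.insert(np.diff(prices), 0, 0)
    # Keep a 2-D shape (assumption: one row per tick, one column per feature).
    return diff.reshape(-1, 1).astype(np.float32)


features = diff_only_features([10.0, 10.5, 10.25, 11.0])
```

Dropping the raw prices removes the feature with the largest and least bounded scale, which may be why training behaves better.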

It works much better: PPO achieves a higher average reward for me this way. (Image: sb3_predict)

_calculate_reward modify #87