Closed: yuvaleck closed this issue 3 years ago
Hey. Could you provide the full code to replicate the issue? This sounds like a custom gym environment issue, and we do not offer tech support for those unless there is a bug in the library itself or a proposal for an enhancement.
Is it possible to bound the observations somehow in stable-baselines? For example, between 0 and 1.
https://github.com/hill-a/stable-baselines/issues/1104
But your remark seems unrelated to the original issue...
You may also take a look at VecNormalize (cf doc).
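For the 0-1 bounding question, here is a minimal sketch of one way to do it by hand, assuming the per-component lower and upper bounds of the observation are known in advance. The function name scale_to_unit and the example bounds are made up for illustration and are not part of stable-baselines. Note that VecNormalize works differently: as far as I know it normalizes with running mean/std statistics and then clips, rather than mapping to a fixed [0, 1] range.

```python
from typing import List, Sequence


def scale_to_unit(obs: Sequence[float],
                  low: Sequence[float],
                  high: Sequence[float]) -> List[float]:
    """Rescale each observation component from [low_i, high_i] to [0, 1].

    Hypothetical helper, not a stable-baselines API. Values outside the
    given bounds are clipped so the result always stays within [0, 1].
    """
    if not (len(obs) == len(low) == len(high)):
        raise ValueError("obs, low and high must have the same length")

    scaled = []
    for x, lo, hi in zip(obs, low, high):
        if hi <= lo:
            raise ValueError("each upper bound must be greater than its lower bound")
        frac = (x - lo) / (hi - lo)              # min-max scaling
        scaled.append(min(1.0, max(0.0, frac)))  # clip to [0, 1]
    return scaled


# Example with made-up bounds; the last component is out of range and gets clipped.
example = scale_to_unit([5.0, -1.0, 30.0],
                        low=[0.0, -1.0, 0.0],
                        high=[10.0, 1.0, 20.0])
print(example)
```

Something like this could be called at the end of the custom env's reset() and step() before returning the observation. Whatever scaling is used during training has to be applied in exactly the same way at evaluation time, otherwise the model sees differently scaled inputs.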
Hi, I am using a PPO2 model with a custom env. When training the model, it can achieve a positive average reward. This can be seen both in the TensorBoard episode_reward chart and by monitoring the env (from stable_baselines.bench import Monitor). However, when passing the same dataset through the trained model, the results are much worse (average around zero). Any possible explanation? R, yuvaleck