Closed a240160572 closed 2 years ago
Hello,
> None of them yields two identical training results.
This probably means that your env does not implement the seed() method correctly; we have tests that check exactly that for the built-in gym envs: https://github.com/DLR-RM/stable-baselines3/blob/master/tests/test_deterministic.py
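As a quick self-check along the lines of that test file, you can compare the resets produced by two independently seeded copies of the env. Everything below is a minimal sketch: ToyEnv and check_env_determinism are hypothetical names for illustration, not SB3 or gym APIs.

```python
import random

class ToyEnv:
    """Hypothetical stand-in for a custom env; only reset() is randomized."""

    def __init__(self):
        # per-instance RNG, so seeding one env never affects another
        self.rng = random.Random()

    def seed(self, seed=None):
        self.rng.seed(seed)
        return [seed]

    def reset(self):
        # the randomized part of the scenario
        return self.rng.random()

def check_env_determinism(make_env, seed=0, n_resets=5):
    """Return True if two freshly seeded envs produce identical resets."""
    traces = []
    for _ in range(2):
        env = make_env()
        env.seed(seed)
        traces.append([env.reset() for _ in range(n_resets)])
    return traces[0] == traces[1]
```

An env whose seed() correctly re-seeds its RNG passes this check; one that ignores the seed (or draws from the global RNG) fails it.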
Thank you for the quick reply. I checked my env with the test file. It is indeed not deterministic.
I am running a collision avoidance policy training. The starting point, goal, and obstacle positions are randomly generated for every rollout.
I tried to add a seed argument to the reset() signature, as def reset(self, seed: Optional[int] = None):, like the example environment from gym. However, it raises reset() got an unexpected keyword argument 'seed'. Perhaps that is the wrong way to do it.
SB3 works with gym 0.21 (for now), and the reset() method in gym 0.21 doesn't take any arguments. In gym 0.22 and above, reset() takes several arguments, including seed. You are probably referring to the docs of gym 0.24, aren't you?
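The difference between the two signatures can be sketched as follows (both classes are illustrative stand-ins, not gym source code):

```python
from typing import Optional

class EnvGym021:
    """gym 0.21 style: reset() takes no arguments; seeding goes through seed()."""

    def __init__(self):
        self._seed = None

    def seed(self, seed=None):
        self._seed = seed
        return [seed]

    def reset(self):
        return 0.0  # initial observation

class EnvGym022Plus:
    """gym 0.22+ style: reset() itself accepts an optional seed."""

    def reset(self, seed: Optional[int] = None):
        if seed is not None:
            pass  # re-seed the env's RNG here
        return 0.0
```

Calling EnvGym021().reset(seed=1) raises exactly the TypeError reported above, while EnvGym022Plus().reset(seed=1) is accepted.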
Sb3 works with gym 0.21 (for now). The reset method in gym 0.21 doesn't take any argument.
Ok, that makes sense then. I was referring to the code from https://github.com/openai/gym/blob/master/gym/envs/classic_control/cartpole.py, which targets gym 0.24.
If you want to use that version, we have instructions in the documentation for installing the associated PR (#780), but for now I would recommend implementing the seed() method first (we have backward compatibility, so your code won't break later when we merge the PR).
Closing as the issue comes from custom gym env and not SB3.
> Closing as the issue comes from custom gym env and not SB3.
Sorry for that.
> but for now, I would recommend implementing the seed() method first (we have backward compat so your code won't break later when we merge the PR).
May I ask for a hint on how to implement this with seed() in gym 0.21? I appreciate it.
I advise you to take inspiration from the implementations of the many environments bundled with gym 0.21: https://github.com/openai/gym/tree/c755d5c35a25ab118746e2ba885894ff66fb8c43
If you have difficulty doing this, I urge you to ask for help on this Discord.
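As a sketch of that gym 0.21 pattern: the class name below is hypothetical, and while gym's built-in envs create self.np_random via gym.utils.seeding.np_random(seed), Python's random.Random plays the same role here to keep the example dependency-free.

```python
import random

class CollisionAvoidanceEnv:
    """Hypothetical custom env following the gym 0.21 seeding pattern."""

    def __init__(self):
        # dedicated per-env RNG; gym built-ins use self.np_random created by
        # gym.utils.seeding.np_random(seed) instead
        self.rng = random.Random()
        self.seed()

    def seed(self, seed=None):
        # gym 0.21 convention: re-seed the env's RNG and return the seed list
        self.rng.seed(seed)
        return [seed]

    def reset(self):
        # draw every randomized scenario element (start, goal, obstacles)
        # from self.rng, never from the global random module
        start = (self.rng.uniform(-1, 1), self.rng.uniform(-1, 1))
        goal = (self.rng.uniform(-1, 1), self.rng.uniform(-1, 1))
        return start + goal
```

With this in place, two env instances given the same seed() and then reset() generate identical scenarios, which is what the SB3 determinism test expects.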
I encountered the same issue: indeed, env.seed() does nothing under gym==0.21 for a custom env, and we should implement it ourselves, as was done in the past. It could be nice to write something about this in the documentation :) Thank you
Important Note: We do not do technical support, nor consulting and don't answer personal questions per email. Please post your question on the RL Discord, Reddit or Stack Overflow in that case.
Question
Hey, I am working on an algorithm comparison for my custom environment. The environment generates a random scenario for every rollout. Therefore, training the algorithms based on the same data is preferred.
I have tested with set_random_seed(seed=1) or env.seed(seed=1) for the environment, and model = A2C("MlpPolicy", env, seed=1) for the model. None of them yields two identical training results.
I would appreciate it if you could show me how to set it properly.