Closed Rocamonde closed 1 year ago
Hello,
thanks for the proposal =), it is indeed a common mistake.
or set([id(env) for env in self.envs])) using a list comprehension which is probably more pythonic.
Yes, please go ahead =) (I would be more verbose for the error message though)
Sounds good! Added a PR. Maybe it's worth adding a test for this, should be quite simple but I'm not familiar with your fixture system, do you think you could do that?
Sounds good! Added a PR. Maybe it's worth adding a test for this, should be quite simple but I'm not familiar with your fixture system, do you think you could do that?
I'm handling the tests
Cool sounds good!
Hey
I was doing the same, which apparently was not correct!
Thanks for updating it. But the problem is that now I have no idea how to run the right code.
Is there any kind of sample that could help me use make_vec_env
it. But the problem is that now I have no idea how to run the right code.
the error raised should give you some hints https://github.com/DLR-RM/stable-baselines3/blob/master/stable_baselines3/common/vec_env/dummy_vec_env.py#L28-L35
Is there any kind of sample that could help me use make_vec_env
there are examples in the doc and in this issue, but if you don't give any code example, it's hard to help. Another way to (ugly) fix that issue is to use sub processes (but best would be to solve the root cause).
@Rocamonde Bless you for this issue and PR! This made me aware of some huge mistake I made when using this library.
Maybe other people have made the same mistake I did, so I want to leave this here:
environments = make_some_environments() # creates a list of environments
DummyVecEnv(
[
lambda: env for env in environments
]
)
This does not produce a list of lambda
s that each return one of the environments, but instead a list of lambda
s that all return the last environment in the list. See https://stackoverflow.com/a/34021333 for more details. To get the expected behavior do the following:
environments = make_some_environments() # creates a list of environments
DummyVecEnv(
[
(lambda env: lambda: env)(outer_env) for outer_env in environments
]
)
Or even better, do what is detailed in the issue.
Why is this code failing?
It's always returning a new object?
Why is this code failing?
It shouldn't. I can't reproduce the error you describe. If it persists, I suggest you open a dedicated new issue.
Why is this code failing?
It shouldn't. I can't reproduce the error you describe. If it persists, I suggest you open a dedicated new issue.
Yeah, I'm just trying to get it working, but I get exactly this value error introduced here. I'm currently trying to downgrade to run it.
@sebnapi did you ever create an issue to track this? I'm having the same issue and can't seem to find a way for the compiler to stop erroring:
def create_default_environment():
#1. Create base environemnt
newEnv = gym_super_mario_bros.make("SuperMarioBros-v0")
#2. Simplify controls
newEnv = JoypadSpace(newEnv, SIMPLE_MOVEMENT)
#3. Original state.shape is (240, 256, 3) where 240x256 are the dimensions and 3 is the color channels RGB
# keep_dim keeps one of the color channels to make (240, 256, 1) which is greyscale.
return GrayScaleObservation(newEnv, keep_dim=True)
env = DummyVecEnv([lambda: create_default_environment()])
The examples weren't very helpful and I have yet to find a concrete example of how to use this constructor.
🚀 Feature
TLDR: add a check to the result of env construction in
make_vec_env
that raises an error if all envs have the same memory ID. Happy to submit a PR for this.Motivation
Currently when one creates a vectorized environment using
make_vec_env
and passes a function to create an environment multiple times, there are no checks making sure that the function indeed returns a fresh environment every time.A priori it might seem that it's the user's responsibility to ensure this. However, if one is constructing an environment using hyperparameters and has a function that takes arguments to do this, e.g.
it is relatively easy to forget to wrap this as a closure. One should do:
or
but if one instead does
which is not something totally unreasonable to do at first sight, one will get totally undefined behavior where the same environment is being operated sequentially with actions that were calculated from multiple steps ago.
Same happens if you do
since it's fairly common to write
and to write
so mixing both up is not so unexpected. This has actually happened to me twice in the space of two months, and it took me several hours to realize of the errors. environments start behaving in totally bizarre ways, and it's hard to realize that it's not a policy training issue or a reward mis-specification issue unless you try to run with a single parallel environment and notice that it works fine.
Pitch
Raising an error if the same env instance is returned every time the env constructor is called is fairly straightforward and would avoid this type of error. I bet that no one would ever actually do this intentionally and it is almost surely always an error.
Just do in the DummyVecEnv initializer:
or
set([id(env) for env in self.envs]))
using a list comprehension which is probably more pythonic.Alternatives
No response
Additional context
No response
Checklist