Farama-Foundation / SuperSuit

A collection of wrappers for Gymnasium and PettingZoo environments (being merged into gymnasium.wrappers and pettingzoo.wrappers
Other
446 stars 56 forks source link

Fix for a) Seeding problems from gym env_checker b) infos API change #165

Closed jjshoots closed 1 year ago

jjshoots commented 2 years ago

There are two issues that the new gym release has brought to Supersuit.

a) Seeding issues

Because of the new env_checker in gym, when env_checker is enabled on gym.make() (which is done by default), the environment gets unintentionally seeded. As a result, any vec envs derived from single envs end up having the same seed even when the seed given to them was None. The current fix just reseeds any vec envs that are derived from single envs.

~~Although this works, I don't quite like this solution as it seems very hacky(?), though I may be educated otherwise. I think the better solution is for env_checker to reset the env's seed instead of having it be done outside, which is probably the expected behaviour of env_checker (checking without modifying).~~

b) Infos API change

Because of the recent infos API change on gym 0.24.0, the infos style for supersuit vec env wrappers are now no longer coherent to the infos style of gym vec envs.

If I understand things correctly, there are two wrappers that need to be fixed:

ConcatVecEnv

I've implemented a simple hacky fix for this, though I think my hacky fix is still pretty suboptimal and could take some ideas from people.

ProcConcatVecEnv

I've not touched this wrapper yet mostly because I don't really know what's going on with it yet.

jjshoots commented 2 years ago

@benblack769 @jkterry1 @RedTachyon @pseudo-rnd-thoughts I'd appreciate your feedback on this.

pseudo-rnd-thoughts commented 2 years ago

For the info tests, it looks like you just need to update the tests to reflect the updated info style

benblack769 commented 2 years ago

A couple comments:

1) I took a look at the env_checker, and didn't see anything that should affect seeding. Perhaps this is a broader issue that affects all environments? 2) I think the solution to seeds being the same is simply seeding with self.seed(None) inside __init__() in ConcatVecEnv and ProcConcatVecEnv

pseudo-rnd-thoughts commented 2 years ago

The sending issue will be removed in v25.

RedTachyon commented 2 years ago

@benblack769 The issue was e.g. at https://github.com/openai/gym/blob/master/gym/utils/env_checker.py#L72 But as @pseudo-rnd-thoughts says, it's currently being fixed.

Re 2: self.seed(...) is deprecated so depending on that is not the greatest idea. Nothing should be necessary to do when the env checker is fixed, but in case of a similar issue, it's probably best to just directly delete env._np_random

jjshoots commented 2 years ago

@benblack769 could you perhaps look at updating info styles for procconcatvecenv?

benblack769 commented 2 years ago

@jjshoots For the infos, you should be able to basically copy the approached used in gym. https://github.com/openai/gym/blob/master/gym/vector/vector_env.py#L238 https://github.com/openai/gym/blob/master/gym/vector/async_vector_env.py#L376 , basically just transforming the infos after the fact.

The apporpriate place to do this in proccatenvs would be here: https://github.com/Farama-Foundation/SuperSuit/blob/master/supersuit/vector/multiproc_vec.py#L202

To get reset() to return info, you will also need to extract the infos returned here https://github.com/Farama-Foundation/SuperSuit/blob/master/supersuit/vector/multiproc_vec.py#L179

jjshoots commented 2 years ago

Ok I give up. I have no idea what's going on, everything somehow depends on everything, not a single file has comments or docstrings or typehints, and there are cathartic lists of dicts of tuples of lists of stuff that's impossible to interpret because nothing has comments or is even remotely close to being explained.

If anyone wants to have a look and take a whack at it be my guest, otherwise it shall remain broken until we revamp all the wrappers and merge it into PZ.

benblack769 commented 2 years ago

@jjshoots Sorry about the lack of comments :( I don't have a lot of excuses for this.

As for these new changes, I don't really understand the new gym infos change either.

However, with my very poor understanding, I feel this is something that you can implement with a simple wrapper (maybe not perfectly efficiently, but functionally), without knowing anything about the implementation of the base vector environments other than their output format (the old list of dicts format)? Isn't it just a few data transformations and concatenations defined here?

jjshoots commented 2 years ago

Hey @benblack769, don't worry about it, it happens. :) Technically yes, it's just a transformation of the data, but as it stands I don't really understand how the code flow works (I haven't spent enough time with it probably), and the infos are quite densely packed (think lists of dicts of lists), which makes it hard to decipher. For now I think we may just leave this until we deprecate the old wrappers and move things into PettingZoo/Gym. Although if you're free to come on a call one day to explain what's happening, that's be great too.

WillDudley commented 1 year ago

https://github.com/Farama-Foundation/SuperSuit/issues/188

Related, I'm pretty sure.