RobertTLange / gymnax

RL Environments in JAX 🌍
Apache License 2.0

[Proposal] Gym conversion wrappers #33

Closed DavidSlayback closed 1 year ago

DavidSlayback commented 1 year ago

Would you be interested in a PR with wrappers to convert an Environment instance into Gym and VectorGym instances? This would be similar to how the Brax wrappers work (I helped write that PR), where the Gym environment keeps track of rng/state and vectorizes the underlying environment. I have rough implementations already and would be interested in contributing.
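For illustration, here is a minimal sketch of the kind of wrapper described above. It assumes a toy environment with a functional `reset`/`step` interface. `toy_reset`, `toy_step`, and `VectorGymWrapper` are hypothetical names made up for this example, not part of gymnax, Brax, or Gym. The wrapper owns the PRNG key and the batched state, and uses `jax.vmap` to vectorize the single-environment functions.

```python
import jax
import jax.numpy as jnp


# Hypothetical toy environment with a functional interface (an assumption):
#   reset(key) -> (obs, state)
#   step(key, state, action) -> (obs, state, reward, done)
def toy_reset(key):
    state = jax.random.uniform(key, (), minval=-1.0, maxval=1.0)
    return state, state  # the observation is simply the state here


def toy_step(key, state, action):
    noise = 0.01 * jax.random.normal(key, ())
    new_state = state + action + noise
    reward = -jnp.abs(new_state)
    done = jnp.abs(new_state) > 2.0
    return new_state, new_state, reward, done


class VectorGymWrapper:
    """Stateful, Gym-like facade that hides the PRNG key and batched state."""

    def __init__(self, reset_fn, step_fn, num_envs, seed=0):
        self.num_envs = num_envs
        self._key = jax.random.PRNGKey(seed)
        self._state = None
        # vmap batches the single-environment functions; jit compiles them.
        self._reset = jax.jit(jax.vmap(reset_fn))
        self._step = jax.jit(jax.vmap(step_fn))

    def _next_keys(self):
        # Keep the first key as the new carry; hand out one key per environment.
        keys = jax.random.split(self._key, self.num_envs + 1)
        self._key = keys[0]
        return keys[1:]

    def reset(self):
        obs, self._state = self._reset(self._next_keys())
        return obs

    def step(self, action):
        obs, self._state, reward, done = self._step(
            self._next_keys(), self._state, action
        )
        return obs, reward, done, {}


env = VectorGymWrapper(toy_reset, toy_step, num_envs=4, seed=0)
obs = env.reset()
obs, reward, done, info = env.step(jnp.zeros(4))
```

The caller never touches keys or state, which is the point of the conversion. This sketch leaves out auto-reset on `done` and observation/action spaces, both of which a real Gym wrapper would need.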

RobertTLange commented 1 year ago

Great recommendation. I have also been thinking about adding a set of wrappers (see #29) for conversion and minimizing the input specifications (e.g. hide explicit rng/env_params control if not desired). Again, I think this is a great point! Please feel free to open a PR. And if you are interested, we can also have a quick call to discuss!

DavidSlayback commented 1 year ago

I'm definitely interested, and it would be worth having a quick call when you're free! I'm trying to piece together various inspirations from this repo, Brax, and the new proposal for a functional Gym API into something that:

1) Works with compilation and auto-batching (JAX/functorch vmap(), numba vectorize() and jit)
2) Can be used for multi-agent environments (PettingZoo interface) and POMDPs
3) Can work with commonly-used stateful wrappers like RecordEpisodeStatistics and across-batch normalization
4) At least has an interface option for Gym/PettingZoo
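As a rough illustration of how points 1 and 3 can fit together, the sketch below keeps episode statistics in an explicit pytree instead of on a Python object, so the update is a pure function that works under `jax.vmap` and `jax.jit`. `EpisodeStats`, `init_stats`, and `update_stats` are hypothetical names for this example. This is not how Gym's RecordEpisodeStatistics is implemented, only one possible functional analogue.

```python
from typing import NamedTuple

import jax
import jax.numpy as jnp


class EpisodeStats(NamedTuple):
    episode_return: jnp.ndarray  # running return of the current episode
    episode_length: jnp.ndarray  # running length of the current episode
    last_return: jnp.ndarray     # return of the most recently finished episode
    last_length: jnp.ndarray     # length of the most recently finished episode


def init_stats():
    zero_f = jnp.zeros((), dtype=jnp.float32)
    zero_i = jnp.zeros((), dtype=jnp.int32)
    return EpisodeStats(zero_f, zero_i, zero_f, zero_i)


def update_stats(stats, reward, done):
    """Pure update: accumulate, then roll over into last_* when an episode ends."""
    new_return = stats.episode_return + reward
    new_length = stats.episode_length + 1
    return EpisodeStats(
        episode_return=jnp.where(done, 0.0, new_return),
        episode_length=jnp.where(done, 0, new_length),
        last_return=jnp.where(done, new_return, stats.last_return),
        last_length=jnp.where(done, new_length, stats.last_length),
    )


# Batch over three environments; the NamedTuple is treated as a pytree.
batched_update = jax.jit(jax.vmap(update_stats))
stats = jax.vmap(lambda _: init_stats())(jnp.arange(3))
stats = batched_update(
    stats,
    jnp.array([1.0, 2.0, 3.0]),
    jnp.array([False, True, False]),
)
```

Because the statistics travel alongside the environment state, a stateful Gym-style wrapper can still expose them through `info`, while the compiled inner loop stays free of side effects.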

Obviously this is a lot to ask, but I've been dealing with these issues long enough in my own work to want to contribute whatever I can to fixing them. I'll send an email. In the meantime, I'll try to extract a minimal set of commits from my work, covering just the POMDP-compatible Environment and the Gym conversion wrappers, and submit it as a PR.

RobertTLange commented 1 year ago

@DavidSlayback can I close this? I think you addressed most of the working points, right? Thank you again for your communication. P.S.: I am back now and will spend the next weeks developing ;)

DavidSlayback commented 1 year ago

Oh, sorry! Yeah, I'll close!