openai / gym

A toolkit for developing and comparing reinforcement learning algorithms.
https://www.gymlibrary.dev
Other
34.81k stars 8.61k forks source link

Plans for Future Maintenance of Gym #2259

Closed jkterry1 closed 3 years ago

jkterry1 commented 3 years ago

So OpenAI made me a maintainer of Gym. This means that all the installation issues will be fixed, the multi-year backlog of PRs will be resolved, and in general Gym will now be reasonably maintained.

I, or specific people I know, plan to do the following:

I am actively soliciting contributions for the following:

I am not interested in contributions for the following:

There are certain things that I would like to do, but I cannot immediately do them for various reasons (this may change in the future):

It is also important to note that beyond the described pieces of maintenance and merging PRs, I physically do not have the time to add new features or fix bugs myself right now. I also will likely not be able to generally respond to issues do to the fairly high number, though I highly encourage those in the community to do so and would be very appreciative of this. This is not because I do not want to, this is because I am only one person and have a tremendous number of other professional obligations in my life. In other words- if you want a feature added or a bug fixed, please submit a PR, and please submit the PR with a sufficiently detailed description that I can figure out what's happening in it at 2am when my brain is fried.

Edit: An astonishing number of people have messaged me about open source donations, so I created links. This money goes to me directly, and has no affiliation or connection to OpenAI (who does not pay or employ me).

https://liberapay.com/jkterry https://www.buymeacoffee.com/jkterry

vwxyzjn commented 3 years ago

This is super exciting stuff. Thanks so much for maintaining @jkterry1!

tristandeleu commented 3 years ago

First of all this is great to have you as a maintainer @jkterry1, and looking forward to what Gym will become. I will happily reopen my stale PRs.

It's also great to see the documentation being among the high priority items. This is long overdue, especially since some features such as the VectorEnv are totally undocumented, and this has lead to trust issues https://github.com/DLR-RM/stable-baselines3/issues/1#issuecomment-627321747 (rightfully so from the point of view of SB3, although this is a well-tested feature by now).

However regarding this change (which I understand is lower priority)

  • Remove the RAM observation based Atari environments because literally no one uses them

I am strongly in favor of keeping this feature. This does not seem to be a breaking feature, and it does not require any maintenance at all (the majority of which is done on the atari-py side, not on the Gym side). It is very dangerous to remove a feature like this, no matter how unlikely people may use it.

Edit: I forgot to mention that earlier, but saying that "literally no one uses them" is also not true. See for example the AtariARI environment at NeurIPS 2019.

jkterry1 commented 3 years ago

Fair enough; I can keep RAM observations as an argument if it comes to that.

ZachariahRosenberg commented 3 years ago

@jkterry1 I can certainly help with adapting the documentation to the new site.

If I recall, there were two sites for documentation:

https://gym.openai.com/docs/ https://github.com/openai/gym/wiki

I presume these would be consolidated and updated? Any additions?

You mention a current designer creating a Jekyll theme for this project. I've worked with Jekyll before and am happy to use that. Alternatively, I'm comfortable with any frontend template or code we want to use.

@jkterry1 Who would you suggest I coordinate with?

lebrice commented 3 years ago

Great plan! :+1: I have a suggestion: Do you think we could turn on GitHub's "Discussions" feature for this repo? I feel like we could perhaps get some wider discussions going, and split things off into different topics.

jkterry1 commented 3 years ago

@ZachariahRosenberg I'm very happy to hear that, please email me at jkterry@umd.edu

@lebrice I don't have the relevant permissions

SJCaldwell commented 3 years ago

@ openAI can you please give @jkterry1 the relevant permissions lest we be forced to make a discord

jkterry1 commented 3 years ago

We've just been using the general purpose RL discord: https://discord.gg/amy9KW4p. There's actually a private Gym maintainers channel.

agentydragon commented 3 years ago

Really good that this piece of common infra will get actively maintained again. Thanks for taking on the commitment <3

As a useful but easy task, maybe you could release a new version? The current version has https://github.com/openai/gym/issues/1925 still open, though the fix in https://github.com/openai/gym/pull/2151 is already in master.

jkterry1 commented 3 years ago

I'll release a nee release after CI is fixed, all the old PRs are sorted out and ideally once ALE-Py is merged. It'll probably take a week.

On Wed, Jul 28, 2021 at 3:33 AM Rai @.***> wrote:

Really good that this piece of common infra will get actively maintained again. Thanks for taking on the commitment <3

As a useful but easy task, maybe you could release a new version? The current version has #1925 https://github.com/openai/gym/issues/1925 still open, though the fix in #2151 https://github.com/openai/gym/pull/2151 is already in master.

— You are receiving this because you were mentioned. Reply to this email directly, view it on GitHub https://github.com/openai/gym/issues/2259#issuecomment-888084452, or unsubscribe https://github.com/notifications/unsubscribe-auth/AEUF33FWEJYOXP6ICVH54ZDTZ6XGNANCNFSM5BCWLALQ .

-- Thank you for your time, Justin Terry

araffin commented 3 years ago

This means that all the installation issues will be fixed, the multi-year backlog of PRs will be resolved, and in general Gym will now be reasonably maintained.

Good to hear =) (even though I wish OpenAI would have done that earlier)

While they will continue to work for the foreseeable future, we strongly recommend using SuperSuit instead:

Why such choice and not the opposite? (adding dependencies is something that needs to be carefully discussed, related to https://github.com/DLR-RM/stable-baselines3/issues/524)

EDIT: while reading the README, I stumbled upong "New release notes are being moved to releases page on GitHub", why not adding a CHANGELOG.md that is progressively modified?

tristandeleu commented 3 years ago

While they will continue to work for the foreseeable future, we strongly recommend using SuperSuit instead:

Why such choice and not the opposite? (adding dependencies is something that needs to be carefully discussed, related to DLR-RM/stable-baselines3#524)

I completely agree with @araffin. What is the goal of deprecating features (3133e99a64be1a9b794bf62cc3e5fb8c8abd21b9) that are already there in Gym in favor of a third-party library? This feels convoluted and, as far as I understand (I might be wrong, I'm not that familiar with SuperSuit), SuperSuit is missing features that Gym already has, e.g. all the components of the standard pre-processing for Atari, such as no-op reset and terminal on loss of life, while Gym has a all-in-one wrapper AtariPreprocessing already.

What features does SuperSuit have that Gym does not (currently), that would justify pushing users to install a third-party library? How hard would it be to move them to Gym?

araffin commented 3 years ago

One additional note: I understand and share your enthusiasm about finally having Gym maintained again, but I would highly recommend you to take it slow and make careful decisions about any modification (that is not documentation or fixing typos) to avoid bloating or breaking the package.

cclauss commented 3 years ago

What is the goal of deprecating features that are already there in Gym in favor of a third-party library?

The values of out-sourcing are broadly understood. Avoiding duplication of effort, focusing on real differentiation, moving complexity behind clearly defined interfaces, making the architecture easier for contributors & maintainers to understand.

jkterry1 commented 3 years ago

So for SuperSuit, like I keep saying, this will not break existing code using the internal wrappers. Too much depends on that. The SuperSuit API is such that using SuperSuit should also basically be drop in for existing Gym functionality. Moreover, SuperSuit has no onerous dependencies. It's a small and lightweight pure Python package. Gym would also not actually depend on SuperSuit in setup.py for installation.

There are several reasons why I'd like SuperSuit to be the mainstream solution: -PettingZoo is objectively the standard multi-agent RL API now, independent of anything involving me. It's the third most installed RL library on PyPI now, every MARL library supports it but Tianshou (where the addition is ongoing), and there's 20+ third party environments and many under development. Even Neural MMO is being moved over to it, without any involvement of encouragement on my part. SuperSuit wrappers share a bunch of logic for PettingZoo and Gym environments, reducing code duplication and making things more feasibly maintainable. -This is a more minor point, but on a philosophical level I'd argue having this be in it's own library makes more sense for the same reason that say RL baselines 3 and SB3 are separate libraries or the JAX ecosystem is made up of many libraries. -SuperSuit currently has very well factored, documented, modular and clean code with largely comprehensive good human readable error messages for bad inputs etc. and it has a bunch more useful wrappers over Gym. While everything SuperSuit has could be ported to Gym, introducing many important features SuperSuit has (e.g. versioning for reproducibility or actually useful lambda wrappers) would inherently require making breaking changes to Gym. However, this would actually take a very large amount of work. E.g. it took Ben Black more than a month on and off just to refactor SuperSuit into its current state, and unlike me he's actually an extraordinarily talented programmer.

In other words, upstreaming SuperSuit's functionality to Gym would create breaking changes to Gym's wrappers (which I don't want to do), it would result in a bunch of code duplication that would create maintenance overhead, and it would be a very large software engineering effort to do so (despite it seeming like a superficially simple one). And again, SuperSuits external API is basically the same as Gym's wrapper API and it's a lightweight pure Python library with no problematic dependencies. I essentially am of the opinion that switching to SuperSuit is the most conservative course of action here given all of this.

However, @tristandeleu is correct that SuperSuit does not have all the features Gym does on re-review. In addition to the two you pointed out, SuperSuit does not have the pixel observation, dictionary filtering, time aware observations wrappers or episode statistic recording wrappers and SuperSuit's frame stacking wrapper does not have lazy frame stacking.

I'll definitely need to add lazy frame stacking to SuperSuit. Beyond that, I'm currently of the opinion that the 3 missing wrappers I just mentioned are best left as things people can use lambda wrappers for because I managed to become the maintainer of Gym without knowing they even existed (though I could be persuaded otherwise if I'm missing something here). Regarding no-op reset and terminal on loss of life, per discussion with Ben Black having those as wrappers essentially looks like hacks to us? Learning code does not require wrappers to be able to properly function, and the old wrappers will be left in Gym for compatibility purposes with existing code basis that depend on having that logic on those wrappers to function. I'd be willing to add these to SuperSuit if there's an important reason to have this logic in wrappers that's I'm missing though.

jkterry1 commented 3 years ago

@araffin

Regarding the changelog, if there's some sort of reason to have changes there instead of with the GitHub releases I could? Putting the changelog in the readme was becoming excessive and it just had to go somewhere. Using the GitHub releases feature is fairly standard across open source projects and it's my personal preference is all.

Regarding "I would highly recommend you to take it slow and make careful decisions about any modification (that is not documentation or fixing typos) to avoid bloating or breaking the package." is there anything in particular you're thinking of here that I'm missing? This sounds like my expressed goal as well. The only things I intend to break are what I stated- the vector environment API, because all the RL library developers have a bunch of problems with it and recommending developers use different preprocessing wrappers.

tristandeleu commented 3 years ago

In other words, upstreaming SuperSuit's functionality to Gym would create breaking changes to Gym's wrappers (which I don't want to do), it would result in a bunch of code duplication that would create maintenance overhead, and it would be a very large software engineering effort to do so (despite it seeming like a superficially simple one).

I do not understand this, what are those breaking changes that you have in mind? Is it to maintain wrappers compatible for both Gym and Petting-Zoo? If so, this compatibility should be enforced by Petting-Zoo/SuperSuit, and should not require changes to Gym; putting it differently, the users of Gym that have always relied on those wrappers should not have to deal with these deprecations because of incompatible APIs in third-party libraries.

In order to get a better understanding, I made this comparison table (limited to the wrappers that can be applied to Gym environments, not the multi-agent ones).

SuperSuit (https://github.com/PettingZoo-Team/SuperSuit/commit/d43fcb7675f35a27f43b12eaaf02df963afdf454) Gym (e9d2c41f2b233864c59cd9f2d9240f40e14da8b9)
clip_reward_v0 #1484 (TransformReward)
clip_actions_v0 ClipAction
color_reduction_v0 GrayScaleObservation
dtype_v0 -
flatten_v0 FlattenObservation
frame_skip_v0 AtariPreprocessing
delay_observations_v0 -
sticky_actions_v0 -
frame_stack_v1 FrameStack
max_observation_v0 AtariPreprocessing
normalize_obs_v0 https://github.com/openai/gym/pull/1635
reshape_v0 -
resize_v0 ResizeObservation
nan_noop_v0 -
nan_zeros_v0 -
nan_random_v0 -
scale_actions_v0 RescaleAction
action_lambda_v1 -
observation_lambda_v0 TransformObservation
reward_lambda_v0 TransformReward

As you can see, a majority of the wrappers available in SuperSuit are already there in Gym, and the ones that are not can be easily added, or can be left as lambda wrappers as you said.

If I understand correctly, the maintenance overhead seems to come from SuperSuit being compatible with both Gym and Petting-Zoo, and trying to catch up with the Gym wrappers rather than the other way around: wrappers for Gym environments were added in https://github.com/PettingZoo-Team/SuperSuit/pull/2 in SuperSuit, while at this point all of these wrappers above (and more) had been added to Gym already by @zuoxingdong (prior to SuperSuit being created).

From a UX point of view, I also feel like Gym wrappers as they stand right now are nicer to work with than SuperSuit. For example, if I want to get started with Atari with standard pre-processing, this is readily available in Gym:

import gym
from gym.wrappers import AtariPreprocessing

env = gym.make('BreakoutNoFrameskip-v4')
env = AtariPreprocessing(env)

Whereas with SuperSuit:

import gym
from supersuit import color_reduction_v0, frame_skip_v1, resize_v0
# Eventually: from supersuit import no_ops_v0

env = gym.make('BreakoutNoFrameskip-v4')
env = no_ops_v0(env, 30)
env = frame_skip_v1(env)
env = resize_v0(env, 84, 84)
env = color_reduction_v0(env)

And again, SuperSuits external API is basically the same as Gym's wrapper API and it's a lightweight pure Python library with no problematic dependencies.

Currently, SuperSuit has a dependency on Petting-Zoo, which means that in order to get standard wrappers for Gym (e.g. the ones for Atari for example), then you'd need to install Petting-Zoo, which is probably unnecessary for any non-multi-agent project. This could be avoided by having an extra_requires on SuperSuit's side, but this is yet another change to make to SuperSuit (in addition to the missing wrappers).

I would understand the move to go for SuperSuit instead of Gym if SuperSuit was overwhelmingly used by the community over Gym, but that does not seem to be the case: for example, comparing Gym's AtariPreprocessing to SuperSuit's frame_skip_v0. Overall in my opinion, deprecating the wrappers in Gym in favor of SuperSuit is not a good move, but I'd be happy to hear what others think about this.

The only things I intend to break are what I stated- the vector environment API, because all the RL library developers have a bunch of problems with it and recommending developers use different preprocessing wrappers.

The vector environment API can definitely be improved! However I think that it could benefit the most from documentation first, to show that this API exists in the first place, rather than breaking it. I also hope that if and when the time comes to break this API, you will keep the community in the loop first before making those changes.

jkterry1 commented 3 years ago

-"I do not understand this, what are those breaking changes that you have in mind?" Wrapper versioning for reproducibility and fully featured lambda wrappers are the two main ones -That is not where the maintenance overhead comes from and I'm confused why you think that? -I philosophically disagree with you on the UX perspective.

So far no one other than you has raised major concerns about SuperSuit like has happened with other proposed changes (e.g. removing the Atari RAM wrappers). My plan for now is to privately discuss this with other maintainers of some the major open source DRL libraries and go from there. Your opinion will be taken seriously.

araffin commented 3 years ago

There are several reasons why I'd like SuperSuit to be the mainstream solution: PettingZoo is objectively the standard multi-agent RL API

If I understand correctly, the maintenance overhead seems to come from SuperSuit being compatible with both Gym and Petting-Zoo,

Gym is used by a lot of people (including me) who depend on its API and regularly extend those wrappers directly to suit their needs.

I'm not sure why PettingZoo should be involved there, as Gym is the main/most used package and other packages should rely on it (and not the other way around).

it's own library makes more sense for the same reason that say RL baselines 3 and SB3 are separate libraries

(RL baselines is not a library but a training framework (training/plotting scripts) that uses SB3 library) Gym wrappers are already in gym, they mostly missing a bit more care. And here, I think providing envs + assocated wrappers in once place makes sense.

versioning for reproducibility

this is what software version + proper changelog is for.

E.g. it took Ben Black more than a month on and off just to refactor SuperSuit into its current state, and unlike me he's actually an extraordinarily talented programmer.

I was not talking about making Gym wrappers look like SuperSuit but rather add the missing features/documentation that you consider very useful to the current wrappers (in fact, most gym wrappers are documented, mostly missing an autogenerated doc website with sphinx).

What important features do you consider missing from current wrappers? (to estimate the work needed) checking inputs? (for versioning, I wrote a short answer above) I'm pretty sure you could find people to help you on that (as @zuoxingdong already did).

In order to get a better understanding, I made this comparison table (limited to the wrappers that can be applied to Gym environments, not the multi-agent ones).

thanks for the table @tristandeleu =)

From a UX point of view, I also feel like Gym wrappers as they stand right now are nicer to work with than SuperSuit. -I philosophically disagree with you on the UX perspective.

Well, I would say that both work the same way to me: you do env = wrapper(env) everytime you want to transform/do something.

So far no one other than you has raised major concerns about SuperSuit

I already count three people ;)

My plan for now is to privately discuss this with other maintainers of some the major open source DRL libraries and go from there.

I would rather keep the discussion public for transparency.

==============

Regarding the changelog, if there's some sort of reason to have changes there instead of with the GitHub releases I could?

yes, as you don't do a release at every commit, it helps users that are based on master version to keep track of changes (or also helps when you are offline). It would also allow to have it in the documentation. From a maintainer point of view, it makes things easier when drafting a release: you just need to copy-paste the changelog entry to Github release and don't have to go through all what have been done between two releases.

This sounds like my expressed goal as well.

Good to hear. I was mostly worried when you merged all that PRs and started changing readme + add deprecation in only two days.

vwxyzjn commented 3 years ago

Want to again say I really appreciate you work in maintaining gym and the amount of effort that was put into supersuit, but I agree that the existing wrappers should not be depreciated.

In addition to the Atari wrappers that @tristandeleu mentioned, there are other super helpful wrappers such as gym.wrappers.RecordEpisodeStatistics(env) and gym.wrappers.Monitor(env); I use these two wrappers all the time in my home-brew cleanrl repository. There is also the TimeLimit wrapper that 90% of the gym environments rely on.

Philosophically, I agree with @araffin and prefer the foundational utilities to come from the same repo. I think it creates a better experience. When you see the documentation for the environment, you also see that related wrappers in the same documentation versus opening documentations of different repos.

araffin commented 3 years ago

openai deleted a comment from semitable 19 minutes ago

why is that? Looking at the comment from @semitable (received by email), it does not contain anything that would justify that (no insult, no off-topic reply, mostly expresses his point of view).

Note: this deletion should not be the topic of this issue but a short justification of it would be appreciated

ZachariahRosenberg commented 3 years ago

I think the discussions around PettingZoo and SuperSuit are and should be considered irrelevant. This is Gym. Decisions about the future of Gym should only consider the trajectory of Gym and not any other dependent/externality.

Change Log

yes, as you don't do a release at every commit, it helps users that are based on master version to keep track of changes (or also helps when you are offline). It would also allow to have it in the documentation. From a maintainer point of view, it makes things easier when drafting a release: you just need to copy-paste the changelog entry to Github release and don't have to go through all what have been done between two releases.

My understanding of Github releases is that it already bundles several commits with a provided narrative of the changes. What would the changelog have in addition to this?

@araffin, can you provide an example of an open source project you think does this well? I certainly agree with you that maintaining a clear paper-trail for what has changed between releases is really important.

Wrappers

As I stated in the beginning of this post, I don't believe decisions regarding gym should be based on the availability of an alternative, regardless of how viable and terrific it may be.

Further, the wrappers need to be maintained. The problem with the word, "wrapper", is it makes the concept sound like it's just a nice add-on. But anyone working with gym knows full well that these are essential functions. They should be considered part of the core of gym.

However, as @cclauss concisely put it, the question is whether gym and its gym.wrappers have the same development trajectory or would they benefit from having a bit of room. To me, it feels like the current wrappers weren't really planned out, but instead, were just thrown in. They could use some dedicated attention. The question is, would they be easier to maintain and update if they had some room of their own. For instance, you can have an "engine" separate from the interface. The interface can be based on a specific engine version, which allows both to develop fully without weighing each down.

I'd assume, maybe incorrectly, that wrappers will have different dependencies than the gym engine. We wouldn't want breaks in one to hold up development in the other.

@tristandeleu @araffin, as you pointed out, the wrappers are a pretty core part of interfacing with the gym. They should probably get more care and attention.

jkterry1 commented 3 years ago

I greatly appreciate everyone's comments here. I'm going to find another solution to addressing my concerns than deprecating Gym's wrappers for SuperSuit and create a new post about it when I have a plan. I'm locking this post because the comments have derailed from the original goal of this thread, please feel free to create new issues for things as they arise. I'll continue to discuss plans regarding the future of wrappers in the gym maintainers chat on discord before offering a new plan up to the public for comment.

jkterry1 commented 3 years ago

The wrapper deprecation will be reverted in this PR: https://github.com/openai/gym/pull/2273