Closed: anthony0727 closed this issue 1 year ago.
Hi @anthony0727, thanks for starting the discussion on this! The main value we get from gym (<=v0.21) is the familiar API for users, and the type inheritance for plugging in libraries like rllib. We already have a bunch of extensions on top of the base API (e.g. the datasets API, custom spaces, the extra args to `env.step()` and `env.reset()`) which make the coupling quite loose.
I think we have a few options:
I would love to get as much input from our users on this as possible. Personally, I would lean toward option (1) or (3), though I think there is a good case for (2) to be made as well.
Cheers, Chris
I agree with the adaptor pattern! I discussed this with the rllib authors, and they would likely go with (3) too!
We would just need to implement an adaptor whenever the API changes.
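To make the adaptor idea concrete, here is a minimal sketch using toy stand-in classes (`OldStyleEnv` and `NewAPIAdapter` are illustrative names, not CompilerGym or rllib code): an environment that speaks the old gym protocol (4-tuple `step()`, bare-observation `reset()`) is wrapped so it exposes the new gymnasium-style protocol.

```python
class OldStyleEnv:
    """Toy environment implementing the old gym API, for illustration only."""

    def reset(self):
        return 0  # old API: reset returns only the observation

    def step(self, action):
        obs, reward, done, info = action + 1, 1.0, action >= 2, {}
        return obs, reward, done, info  # old API: 4-tuple


class NewAPIAdapter:
    """Adapts an old 4-tuple env to the new 5-tuple step and (obs, info) reset."""

    def __init__(self, env):
        self.env = env

    def reset(self, *, seed=None, options=None):
        obs = self.env.reset()
        return obs, {}  # new API: reset returns (observation, info)

    def step(self, action):
        obs, reward, done, info = self.env.step(action)
        # The old `done` flag conflates termination and truncation; without
        # extra information we treat it as `terminated` and never truncate.
        return obs, reward, done, False, info


env = NewAPIAdapter(OldStyleEnv())
obs, info = env.reset()
obs, reward, terminated, truncated, info = env.step(2)
print(obs, reward, terminated, truncated)  # 3 1.0 True False
```

When the upstream API changes again, only the adaptor needs updating; the code behind it stays put.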
I'll leave this thread open until Nov. and then close it, if that's okay!
Thanks, Anthony
Hi @anthony0727, okay that sounds good, thanks for getting back to me. Is there an issue on the rllib repo to track their migration? I may follow their lead.
No need to close this. I'll use it to track the changes and continue discussion 🙂
Cheers, Chris
There's no specific issue thread for this, but looking at the commit history at https://github.com/ray-project/ray/commits/releases/2.2.0/rllib, it seems like it will take some time.
Here is a chat message from @sven1977:
> We are in the process of updating RLlib to be fully compliant with the new gymnasium (or gym>=0.26) APIs. This will include an updated RLlib pettingzoo adapter. It's a massive undertaking but will hopefully make it into Ray 2.2 (fingers crossed that we get all failing tests to pass by then 🙂)
Additionally, I'm getting back to compiler optimization with Programl, and I remember async envs working seamlessly (though I ran into an "unable to pickle lock object" issue with cbench; see https://github.com/facebookresearch/CompilerGym/issues/304).
Is there something I should be aware of, given this line from your reply?

> Would unlock more aggressive changes to CompilerGym APIs for things like performance (e.g. async / batched envs).

My env would be async-ed from this env:
```python
import compiler_gym
from compiler_gym.wrappers import CommandlineWithTerminalAction, RandomOrderBenchmarks
from ray.tune.registry import register_env

# RllibWrapper and ENV_ID are defined elsewhere in my code.

def register_all(config, model_cls=None):
    def make_env(dummy):
        env = compiler_gym.make('llvm-ic-v0', **config.env_config)
        train_benchmarks = []
        for i in ['cbench-v1', 'npb-v0', 'chstone-v0']:
            train_benchmarks.extend(
                list(env.datasets[i].benchmark_uris())
            )
        env = CommandlineWithTerminalAction(env)
        env = RandomOrderBenchmarks(env, train_benchmarks)
        env = RllibWrapper(env)
        # env = TimeLimit(env, max_episode_steps=128)
        return env

    register_env(
        ENV_ID,
        make_env,
    )
```
Thanks! Anthony
Hi Anthony,
Thanks for the extra pointers. I'm in no rush to make preemptive changes, especially compatibility-breaking ones, so I'll keep an eye on rllib developments.
> Is there something I should be aware of?
No, you'll be fine with the code snippet you posted. What I meant was that for things like vectorized environments or async environments, we may be able to more efficiently implement them as extensions to the C++ layer and then expose the functionality through new APIs. This wouldn't break any existing code.
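Purely as a hypothetical sketch of what such a new API could look like at the Python level (CompilerGym does not ship a `BatchedEnv` class; the efficient version discussed above would live in the C++ layer, and `CountEnv` is a toy stand-in):

```python
class BatchedEnv:
    """Hypothetical sketch: step several env copies with one call."""

    def __init__(self, make_env, n):
        self.envs = [make_env() for _ in range(n)]

    def reset(self):
        return [env.reset() for env in self.envs]

    def step(self, actions):
        # One action per env; results are transposed into per-field lists.
        results = [env.step(a) for env, a in zip(self.envs, actions)]
        obs, rewards, dones, infos = map(list, zip(*results))
        return obs, rewards, dones, infos


class CountEnv:
    """Toy single env used only to exercise the batched wrapper."""

    def __init__(self):
        self.t = 0

    def reset(self):
        self.t = 0
        return 0

    def step(self, action):
        self.t += 1
        return self.t, 1.0, self.t >= 2, {}


batch = BatchedEnv(CountEnv, 3)
obs = batch.reset()                              # [0, 0, 0]
obs, rewards, dones, infos = batch.step([0, 0, 0])
```

A C++-side implementation could run the per-env service calls concurrently instead of this sequential Python loop, which is where the performance win would come from.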
Cheers, Chris
@elliottower confirmed that rllib has migrated to gymnasium so I think this makes a good case for option (2) above. I'll close this in favor of #789 as the issue tracker for the migration.
Cheers, Chris
❓ Questions and Help
Should we decouple CompilerGym from (past) openai/gym?
Additional Context
I assume you are already aware of this: gym's maintainers have been making breaking API changes.
https://github.com/Farama-Foundation/Gymnasium/blob/1956e648f951eb64c45440997f8fe484ef3c7138/gymnasium/core.py#L71
I reckon the changes are quite reasonable, but the previous API (returning observation, reward, done, info) served as a quite rigorous protocol for reinforcement learning.
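For reference, the protocol change looks like this in an interaction loop. The toy environments below are stand-ins, not gym itself; only the tuple shapes match the two APIs:

```python
def run_old_api(env):
    # gym <= 0.25: reset returns obs; step returns (obs, reward, done, info)
    obs = env.reset()
    done, total = False, 0.0
    while not done:
        obs, reward, done, info = env.step(0)
        total += reward
    return total


def run_new_api(env):
    # gymnasium / gym >= 0.26: reset returns (obs, info); step returns
    # (obs, reward, terminated, truncated, info)
    obs, info = env.reset()
    terminated = truncated = False
    total = 0.0
    while not (terminated or truncated):
        obs, reward, terminated, truncated, info = env.step(0)
        total += reward
    return total


class ToyOldEnv:
    """Toy env with the old 4-tuple protocol."""
    def reset(self):
        self.t = 0
        return 0
    def step(self, action):
        self.t += 1
        return self.t, 1.0, self.t >= 3, {}


class ToyNewEnv:
    """Toy env with the new 5-tuple protocol."""
    def reset(self, *, seed=None, options=None):
        self.t = 0
        return 0, {}
    def step(self, action):
        self.t += 1
        return self.t, 1.0, self.t >= 3, False, {}
```

Every training loop written against one shape breaks against the other, which is why the change ripples through downstream libraries.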
Is there already on-going discussion regarding this?
TL;DR: Most of what CompilerGym inherits from openai/gym is for typing rather than functionality; openai/gym doesn't really have much functionality itself, so we could easily define our own `Space`, `Env`, etc. However, if we decouple, many RL libraries that depend on openai/gym might conflict; for example, `ray[rllib]` does strict type checking such as `isinstance(obs_space, gym.spaces.Space)`.
I hope we (CompilerGym users) can discuss this here to help the FAIR team.
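To illustrate the coupling problem with toy stand-in classes (`GymSpace` mimics `gym.spaces.Space`; none of this is rllib code): a homegrown space with an identical interface still fails a strict `isinstance` check, because the check keys on inheritance rather than behavior.

```python
class GymSpace:
    """Stand-in for gym.spaces.Space."""


class GymDiscrete(GymSpace):
    """A space that inherits from the gym base class."""
    def __init__(self, n):
        self.n = n


class HomegrownDiscrete:
    """A decoupled space with the same interface but no gym inheritance."""
    def __init__(self, n):
        self.n = n


def validate_obs_space(space):
    # Mirrors the strict check described above, e.g. rllib's
    # isinstance(obs_space, gym.spaces.Space).
    if not isinstance(space, GymSpace):
        raise TypeError(f"{type(space).__name__} is not a gym Space")
    return True


validate_obs_space(GymDiscrete(4))  # passes
try:
    validate_obs_space(HomegrownDiscrete(4))
except TypeError as e:
    print(e)  # HomegrownDiscrete is not a gym Space
```

So a clean decoupling would need either adaptors that re-wrap our types in gym/gymnasium subclasses, or upstream libraries relaxing their type checks.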