Farama-Foundation / Metaworld

Collections of robotics environments geared towards benchmarking multi-task and meta reinforcement learning
https://metaworld.farama.org/
MIT License
1.28k stars 274 forks source link

Add gym make support for Meta-World envs #498

Open reginald-mclean opened 3 months ago

reginald-mclean commented 3 months ago

This PR is built on top of #499 to add gym.make support for the environments in Meta-World. They are organized as follows:

And there are also gym.make_vec commands that return multiple environments wrapped in a sync or async wrapper:

rainx0r commented 3 months ago

Few things I noticed:

Also I think for simplicity it should be possible to just have a single definition of init_each_env that is used for both MT and ML that takes in env_cls, tasks, task_select_method, maybe seed, maybe max_episode_steps, maybe use_one_hot / env_id / num_tasks and has branching logic for OneHotWrapper and the task select method. MT envs just provide those one-hot parameters and use all tasks, while ML envs handle the task splitting a bit differently but they otherwise use the same wrappers and logic.

reginald-mclean commented 3 months ago

@rainx0r

It's not entirely clear to me why _make_single_env() has ML1-related code when _make_single_ml() exists.

Good catch, remnant of previous attempt at creating ML envs.

For MT envs, they should also have an AutoTerminateOnSuccess wrapper but it should be toggled to the initial state specified by the terminate_on_success flag as done

Will update

use_one_hot is set to False for all registered envs but it should be True for MT10 and MT50.

I don't know if I agree with this. We can include the wrapper for completeness and show examples of enabling it, but I think we shouldn't influence users to use the wrapper by default.

Maybe the seed passed into init_each_env should just be used on the env directly

Agreed

Also I think for simplicity it should be possible to just have a single definition of init_each_env

It would be possible but I think it might become a bit of a convoluted function to write/maintain. Unless there's a clean way of merging them, I think keeping them separate makes the most sense for maintenance reasons.

pseudo-rnd-thoughts commented 2 months ago

From experience, having tonnes of registered environment can make life easier but it can mean many more environments. For future proofing, an alternative approach is

We can keep, MetaWorld/env-name, this is good. Then for MT1, an alternative is gym.make("MetaWorld/MT1", env_name="env_name", mode="train/test"), this allows flexibility with env-name (to add more) and mode to easily specify if to train or test.

Similarly for MT10, we can use env_names=[...]

Is there any reason for 1 and 10 only, could we make it MultiTask with env_names that is flexible to any number?

I'm purely spitballing ideas, you don't need to take any of them

reginald-mclean commented 2 months ago

@pseudo-rnd-thoughts I actually implemented something along the lines of what you suggested but forgot to include it. In addition to the above gym.make_vec commands there is also:

Both of the above commands gives the user control over the environments that they want to use in a multi-task or meta-RL setting, instead of the predefined ones.

I agree, it is a LOT of environments to add, but there are also lots of different use cases of MW environments. Some of them are single environments (ie 'Meta-World/reach-v3'), some of them are the smaller MT/ML environments (ie 'Meta-World/ML-train-reach-v3'), and some of them are the pre-defined environment sets (MT10, MT50, ML10, ML45).

pseudo-rnd-thoughts commented 2 months ago

@reginald-mclean In my opinion, I would work to keep the number of environments to a minimal. Personally, I would only have gym.make for the individual environments, i.e., gym.make("Meta-World/reach-v3") Alongside the generic MultiTask and MetaTask single and vector environments. Then finally have the original gym.make("MetaWorld/MT50") to MT10, MT50, ML10, ML45

This reduces the mess you need to maintain and provides more opinions to the users on what they do. If a user wants a custom multi-task setup with env x, y, z then they can make it without a close but no equivalent version existing within the 100s of environment that could be registered.

Environment parameters are your friend here, minimising what you need to maintain while adding flexibility to users

I say this as I remove over 800 environment from Atari, currently there are over 1000 environments registered for only 100 games. For ALE, there are 14 environments registered for each game which in my opinion is crazy and very few people actually use the extra / special registered environments which can be accessed through parameters.

reginald-mclean commented 3 weeks ago

@pseudo-rnd-thoughts ok I have thought about it and I definitely agree with your comment about having minimal environments. I've moved the environments into their base environments (ie MT1, MT10, MT50), and allowed for the various different features via arguments. I've had a few extra commits slip into this PR, just trying to figure out how to remove them