Open reginald-mclean opened 3 months ago
Few things I noticed:
_make_single_env()
has ML1-related code when _make_single_ml()
exists.AutoTerminateOnSuccess
wrapper but it should be toggled to the initial state specified by the terminate_on_success
flag as done here.use_one_hot
is set to False for all registered envs but it should be True for MT10 and MT50.seed
is added to the TaskSelect wrappers could probably cause some weird issues as it would reinitialise the global numpy rng state multiple times during env instantiation, and in general it's not necessary so it should probably be removed. The wrappers use the underlying env's np_random
anyway so they don't really need to be seeded. Maybe the seed passed into init_each_env
should just be used on the env directly:
env = env_cls()
if seed:
env.seed(seed)
Also I think for simplicity it should be possible to just have a single definition of init_each_env
that is used for both MT and ML that takes in env_cls
, tasks
, task_select_method
, maybe seed
, maybe max_episode_steps
, maybe use_one_hot
/ env_id
/ num_tasks
and has branching logic for OneHotWrapper and the task select method. MT envs just provide those one-hot parameters and use all tasks, while ML envs handle the task splitting a bit differently but they otherwise use the same wrappers and logic.
@rainx0r
It's not entirely clear to me why _make_single_env() has ML1-related code when _make_single_ml() exists.
Good catch, remnant of previous attempt at creating ML envs.
For MT envs, they should also have an AutoTerminateOnSuccess wrapper but it should be toggled to the initial state specified by the terminate_on_success flag as done
Will update
use_one_hot
is set to False for all registered envs but it should be True for MT10 and MT50.
I don't know if I agree with this. We can include the wrapper for completeness and show examples of enabling it, but I think we shouldn't influence users to use the wrapper by default.
Maybe the seed passed into
init_each_env
should just be used on the env directly
Agreed
Also I think for simplicity it should be possible to just have a single definition of
init_each_env
It would be possible but I think it might become a bit of a convoluted function to write/maintain. Unless there's a clean way of merging them, I think keeping them separate makes the most sense for maintenance reasons.
From experience, having tonnes of registered environment can make life easier but it can mean many more environments. For future proofing, an alternative approach is
We can keep, MetaWorld/env-name
, this is good.
Then for MT1
, an alternative is gym.make("MetaWorld/MT1", env_name="env_name", mode="train/test")
, this allows flexibility with env-name (to add more) and mode to easily specify if to train or test.
Similarly for MT10
, we can use env_names=[...]
Is there any reason for 1 and 10 only, could we make it MultiTask
with env_names
that is flexible to any number?
I'm purely spitballing ideas, you don't need to take any of them
@pseudo-rnd-thoughts I actually implemented something along the lines of what you suggested but forgot to include it. In addition to the above gym.make_vec commands there is also:
Both of the above commands gives the user control over the environments that they want to use in a multi-task or meta-RL setting, instead of the predefined ones.
I agree, it is a LOT of environments to add, but there are also lots of different use cases of MW environments. Some of them are single environments (ie 'Meta-World/reach-v3'), some of them are the smaller MT/ML environments (ie 'Meta-World/ML-train-reach-v3'), and some of them are the pre-defined environment sets (MT10, MT50, ML10, ML45).
@reginald-mclean In my opinion, I would work to keep the number of environments to a minimal.
Personally, I would only have gym.make
for the individual environments, i.e., gym.make("Meta-World/reach-v3")
Alongside the generic MultiTask
and MetaTask
single and vector environments.
Then finally have the original gym.make("MetaWorld/MT50")
to MT10, MT50, ML10, ML45
This reduces the mess you need to maintain and provides more opinions to the users on what they do. If a user wants a custom multi-task setup with env x, y, z then they can make it without a close but no equivalent version existing within the 100s of environment that could be registered.
Environment parameters are your friend here, minimising what you need to maintain while adding flexibility to users
I say this as I remove over 800 environment from Atari, currently there are over 1000 environments registered for only 100 games. For ALE, there are 14 environments registered for each game which in my opinion is crazy and very few people actually use the extra / special registered environments which can be accessed through parameters.
@pseudo-rnd-thoughts ok I have thought about it and I definitely agree with your comment about having minimal environments. I've moved the environments into their base environments (ie MT1, MT10, MT50), and allowed for the various different features via arguments. I've had a few extra commits slip into this PR, just trying to figure out how to remove them
This PR is built on top of #499 to add gym.make support for the environments in Meta-World. They are organized as follows:
And there are also gym.make_vec commands that return multiple environments wrapped in a sync or async wrapper: