instadeepai / jumanji

🕹️ A diverse suite of scalable reinforcement learning environments in JAX
https://instadeepai.github.io/jumanji
Apache License 2.0

Create a naming convention for Jumanji envs #248

Open WiemKhlifi opened 1 month ago

WiemKhlifi commented 1 month ago

Is your feature request related to a problem? Please describe

Configuring environments in Jumanji involves manually setting parameters such as grid size and number of agents. This manual setup is not only time-consuming but also complicates integrating Jumanji envs with RL frameworks like MAVA. Each new scenario currently requires its own YAML configuration, as seen here in the MAVA repo.

Describe the solution you'd like

I propose implementing a standardised naming convention for Jumanji envs that succinctly encodes all necessary parameters into the environment's name. This approach would mimic the simplicity of classical Gym envs. For example:

# Initialize the Level-Based Foraging environment with concise identifiers.
env = gym.make("Foraging-2s-8x8-2p-2f-coop-v3")

A function to extract and apply this naming convention could look like this:

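def get_lbf_config(scenario):
    # Example formats: "2s-10x10-3p-3f-coop" (partial view) or "10x10-3p-3f" (full view)
    parts = scenario.split('-')
    # The leading sight token (e.g. "2s") is optional; remove it if present
    # so the remaining tokens always start at the grid size.
    sight = int(parts.pop(0).rstrip('s')) if parts[0].endswith('s') else None
    grid_size = int(parts[0].split('x')[0])
    # Without a sight token the agents see the whole grid.
    fov = sight if sight is not None else grid_size
    num_agents = int(parts[1].rstrip('p'))
    num_food = int(parts[2].rstrip('f'))
    force_coop = 'coop' in parts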

    return {
        "grid_size": grid_size,
        "fov": fov,
        "num_agents": num_agents,
        "num_food": num_food,
        "force_coop": force_coop,
        "max_agent_level": 2,  # Set as default
    }
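
As a stricter variant of the function above, the full environment id could be validated with a regular expression before any values are extracted. This is only a sketch: `LBF_ID_PATTERN` and `parse_lbf_id` are hypothetical names, and the rule that grids must be square is an assumption, not something the proposal specifies.

```python
import re

# Hypothetical pattern for ids such as "Foraging-2s-8x8-2p-2f-coop-v3".
# The sight token ("2s") and the "coop" token are both optional.
LBF_ID_PATTERN = re.compile(
    r"^Foraging"
    r"(?:-(?P<sight>\d+)s)?"
    r"-(?P<rows>\d+)x(?P<cols>\d+)"
    r"-(?P<agents>\d+)p"
    r"-(?P<food>\d+)f"
    r"(?P<coop>-coop)?"
    r"-v(?P<version>\d+)$"
)


def parse_lbf_id(env_id):
    """Return a config dict for a well-formed id, or raise ValueError."""
    match = LBF_ID_PATTERN.match(env_id)
    if match is None:
        raise ValueError(f"Unrecognised environment id: {env_id!r}")
    rows, cols = int(match["rows"]), int(match["cols"])
    if rows != cols:
        # Assumption: only square grids are allowed.
        raise ValueError(f"Expected a square grid, got {rows}x{cols}")
    sight = match["sight"]
    return {
        "grid_size": rows,
        # Without a sight token the agents see the whole grid.
        "fov": int(sight) if sight is not None else rows,
        "num_agents": int(match["agents"]),
        "num_food": int(match["food"]),
        "force_coop": match["coop"] is not None,
        "max_agent_level": 2,  # same default as in the function above
    }


config = parse_lbf_id("Foraging-2s-8x8-2p-2f-coop-v3")
```

Rejecting malformed names up front means a typo in the scenario string fails with a clear error instead of an `IndexError` deep in the parsing code.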

Describe alternatives you've considered

An alternative could involve establishing a registration system akin to Gym's, where we can register environments with predefined attributes:

import itertools

from gymnasium import register

# Example permutations for environment registration
for s, p, f, coop, po in itertools.product(
    range(5, 20), range(2, 10), range(1, 10), [True, False], [True, False]
):
    register(
        id=f"Foraging{'-2s' if po else ''}-{s}x{s}-{p}p-{f}f{'-coop' if coop else ''}-v3",
        entry_point="lbforaging.foraging:ForagingEnv",
        kwargs={
            "field_size": (s, s),
            "num_agents": p,
            "num_food": f,
            "sight": 2 if po else s,
            "force_coop": coop,
            "max_episode_steps": 50,
            "grid_observation": not po,
        },
    )
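
For a sense of scale, the ids produced by a loop over these ranges can be counted without registering anything. The sketch below assumes the hyphen-separated id format from the `gym.make` example ("Foraging-2s-8x8-2p-2f-coop-v3") and only builds the id strings.

```python
import itertools

# Build every id the registration loop would produce, without calling register().
ids = [
    f"Foraging{'-2s' if po else ''}-{s}x{s}-{p}p-{f}f{'-coop' if coop else ''}-v3"
    for s, p, f, coop, po in itertools.product(
        range(5, 20), range(2, 10), range(1, 10), [True, False], [True, False]
    )
]

print(len(ids))  # 15 * 8 * 9 * 2 * 2 = 4320
```

Each of these entries would have to live in the registry, which is the storage cost that parsing the name on demand avoids.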

Misc

sash-a commented 3 weeks ago

I think this would be very nice, but how would jumanji.make pick it up if it's not registered? I think make would then need to call this method, register the env if it's valid, and then make the env. I'd be interested in a POC for this if you have some capacity to implement it.

WiemKhlifi commented 2 weeks ago

I'd like to make a POC, probably in the upcoming weeks (I'll make progress depending on my other tasks). We could do something like the ones in matrax here, using the Jumanji register, which is similar to what gymnasium does and cleaner than the first method. Still, the first method is quicker and doesn't need to store anything like a registry. A POC using the Jumanji register (the second method) would probably be the way to check its feasibility and ensure we don't run into the same issues we had with gym. What do you think?

sash-a commented 2 days ago

Sounds good 😄