facebookresearch / BenchMARL

A collection of MARL benchmarks based on TorchRL
https://benchmarl.readthedocs.io/
MIT License
292 stars 42 forks source link

Action filter development: issue with `self.action_mask_spec` #131

Closed Giovannibriglia closed 2 months ago

Giovannibriglia commented 2 months ago

Hi :),

I'm currently developing an action filter for your models. I’ve successfully implemented and tested the filter, and it works well. However, I encountered an issue with the self.action_mask_spec variable. Specifically, I noticed that self.action_mask_spec remains None in the following lines:

even though in my MLP's forward method, I handle the action_mask as follows:

...
tensordict.set(self.out_key, res)

actions_mask = # computation
tensordict['agents'].set('action_mask', actions_mask)

return tensordict

Am I missing any necessary configuration steps here?

matteobettini commented 2 months ago

Have you implemented the function to provide the action mask spec in the task? action_mask_spec is a TensorSpec (not a bool) and tells benchmarl the shape of your actin mask https://benchmarl.readthedocs.io/en/latest/generated/benchmarl.environments.Task.html#benchmarl.environments.Task.action_mask_spec

matteobettini commented 2 months ago

if you are modifying an existing vmas taks you can

def get_action_mask_spec(env):
   # return your spec

VmasTask. action_mask_spec = get_action_mask_spec
Giovannibriglia commented 2 months ago

if you are modifying an existing vmas taks you can

def get_action_mask_spec(env):
   # return your spec

VmasTask. action_mask_spec = get_action_mask_spec

I believe I am not modifying vmas tasks; what do you mean?

Giovannibriglia commented 2 months ago

Have you implemented the function to provide the action mask spec in the task? action_mask_spec is a TensorSpec (not a bool) and tells benchmarl the shape of your actin mask https://benchmarl.readthedocs.io/en/latest/generated/benchmarl.environments.Task.html#benchmarl.environments.Task.action_mask_spec

No, I didn't do it :/ I believed action_mask_spec could have only one shape, linked to the algorithm's action space; I was wrong.

matteobettini commented 2 months ago

So let me maybe give more context.

The Task class (which is also an enum), defines the spec that will be needed for the task. A Spec is a torchrl object that tells what to expect in tensordicts.

One of such methods (linked above) is for defining the spec of the action mask. This is normally a spec containing one entry with key (group,"action_mask") and value a BinarySpec that has the shape (n_agents,n_actions_in_task).

So the shape of the action mask spec is usually the same of the action spec (n_actions_in_task) and has nothing to do with the algorithm.

In your case, if you want to create a mask in the model and not in the env, you still need to provide the action spec in the Task class by implementing this function https://benchmarl.readthedocs.io/en/latest/generated/benchmarl.environments.Task.html#benchmarl.environments.Task.action_mask_spec

It will be sufficient to look at the action spec when creating (this will be different for different tasks) and then in your model you can do exaclty like you are already doing

Giovannibriglia commented 2 months ago

Thank you for this.

I struggle on undestanding where the new code has to be inserted, my fault :/

I defined the function like this:

def get_action_mask_spec(env: EnvBase) -> CompositeSpec:

    num_envs = ...
    num_agents = ...
    num_actions = ...

    action_mask_spec = BinaryDiscreteTensorSpec(
        n=2,  # binary: 0 or 1 for each action (not available vs available)
        shape=(num_envs, num_agents, num_actions)
    )

    # Create the CompositeSpec with the key ("group", "action_mask")
    composite_spec = CompositeSpec({
        ("group", "action_mask"): action_mask_spec
    })

    return composite_spec

Is it correct? Where this function has to be called? For example in benchmarl/run.py and/or examples/running/run_experiment.py

Thank you so much