MeltingPot Envs with MarlGroupMapType.ONE_GROUP_PER_AGENT

gliese876b commented 10 hours ago

I would like to run an independent-learning experiment in a heterogeneous setting where each agent gets its own, different reward.

The default setting for group_map for MeltingPot envs is ALL_IN_ONE_GROUP.

When I switch it to ONE_GROUP_PER_AGENT (as suggested by TorchRL's documentation), MeltingPotTask.get_replay_buffer_transforms() raises an error. The method creates a transformation over the keys of all players, but that transformation is then applied to each player's TensorDict, where only that player's keys exist. For example, a transformation with the keys "player_0" and "player_1" is applied to a TensorDict that contains only the key "player_0".

For the moment, I have disabled these transformations.

I think this issue could be solved by applying the replay buffer transformations only to keys that exist, similar to ExcludeTransform.
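To make the suggestion concrete, here is a minimal sketch in plain Python. It is not TorchRL or BenchMARL code: ordinary dicts stand in for TensorDicts, and the names `apply_strict`, `apply_if_present`, and `double` are hypothetical helpers made up for illustration. It only shows the difference between a transformation that requires every key and one that skips missing keys.

```python
from typing import Any, Callable, Dict, Iterable


def apply_strict(
    data: Dict[str, Any], keys: Iterable[str], fn: Callable[[Any], Any]
) -> Dict[str, Any]:
    """Apply fn to every listed key; raises KeyError if a key is missing."""
    out = dict(data)
    for key in keys:
        out[key] = fn(data[key])  # fails when `key` is not in `data`
    return out


def apply_if_present(
    data: Dict[str, Any], keys: Iterable[str], fn: Callable[[Any], Any]
) -> Dict[str, Any]:
    """Apply fn only to the listed keys that actually exist in data."""
    out = dict(data)
    for key in keys:
        if key in out:
            out[key] = fn(out[key])
    return out


def double(values):
    """Stand-in for a real replay buffer transformation."""
    return [2 * v for v in values]


# The transformation is built with the keys of all players ...
all_keys = ["player_0", "player_1"]
# ... but with one group per agent, each group's data holds only its own key.
group_data = {"player_0": [1, 2, 3]}

try:
    apply_strict(group_data, all_keys, double)
    strict_failed = False
except KeyError:
    strict_failed = True

result = apply_if_present(group_data, all_keys, double)
```

In this sketch the strict version fails on the missing "player_1" key, while the lenient version transforms "player_0" and leaves the rest alone. Whether the actual fix should skip missing keys or build a separate transformation per group is a design choice this sketch does not settle.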

matteobettini commented 9 hours ago

Hey, thanks for opening this!

I think I got the issue, let me fix it!

matteobettini commented 9 hours ago

Can you check that #148 solves your issue?

gliese876b commented 8 hours ago

Yeap! That works. Thanks!

matteobettini commented 8 hours ago

Cool! Right now torchrl main is broken, so I'll merge it as soon as I can!

facebookresearch / BenchMARL

MeltingPot Envs with MarlGroupMapType.ONE_GROUP_PER_AGENT #147