FLAIROx / JaxMARL

Multi-Agent Reinforcement Learning with JAX
Apache License 2.0

Generalised STORM environment implementation. #114

Closed. ali-shihab closed this 2 months ago

ali-shihab commented 2 months ago

Generalised STORM implementation developed at FLAIR. Key changes:

  1. Allows for any number of agents to be specified.
  2. Allows for any grid size to be specified (though the grid must have more cells than the number of objects it contains).
  3. Allows for any number of coins (per type) to be specified.
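
The grid-size constraint in point 2 can be sketched as a simple check. This is a minimal illustration, not the code from this PR: the `StormConfig` class, its field names, the square-grid assumption, and the assumption that only agents and coins count as objects are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class StormConfig:
    """Hypothetical container for the three generalised settings."""

    num_agents: int
    grid_size: int            # side length of a square grid (assumption)
    num_coins_per_type: int
    num_coin_types: int = 2   # e.g. red/defect plus one other type (assumption)

    def num_objects(self) -> int:
        # Agents plus every coin of every type (assumed to be the only objects).
        return self.num_agents + self.num_coin_types * self.num_coins_per_type

    def validate(self) -> None:
        # The grid must have strictly more cells than objects to place.
        cells = self.grid_size * self.grid_size
        if self.num_objects() >= cells:
            raise ValueError(
                f"grid has {cells} cells but {self.num_objects()} objects; "
                "increase grid_size or reduce agents/coins"
            )


# Usage: 8 agents and 6 coins of each of 2 types on a 5x5 grid.
cfg = StormConfig(num_agents=8, grid_size=5, num_coins_per_type=6)
cfg.validate()  # passes: 20 objects fit in 25 cells
```

Validating once at construction time keeps an impossible layout from surfacing later as a failure during object placement.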

Plots from IPPO test runs (population sizes of 256, 512, and 1024) are provided below, showing (1) the mean proportion of red/defect coins held by agents in the environment and (2) mean returns. Note that in all cases the defection ratio converges to approximately 0.8, which is expected behaviour for IPPO.
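
For readers unfamiliar with the coin-ratio metric, one plausible way to compute it is sketched below. The exact definition used for the plots is not stated in this thread, so this assumes a per-agent average of red/defect coins over all coins held, skipping agents that hold no coins; the function name and input layout are hypothetical.

```python
from typing import Optional, Sequence, Tuple


def mean_defect_ratio(inventories: Sequence[Tuple[int, int]]) -> Optional[float]:
    """Mean proportion of defect coins across agents.

    `inventories` holds one (defect_coins, other_coins) pair per agent.
    Agents holding no coins are skipped (assumption) to avoid dividing by zero.
    Returns None if no agent holds any coin.
    """
    ratios = []
    for defect, other in inventories:
        total = defect + other
        if total > 0:
            ratios.append(defect / total)
    if not ratios:
        return None
    return sum(ratios) / len(ratios)


# Usage: two agents holding coins, one agent holding none.
ratio = mean_defect_ratio([(4, 1), (8, 2), (0, 0)])
```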

[Plots: 1024_agent_returns, 1024_agent_coin_ratio, 512_agent_returns, 512_agent_coin_ratio, 256_agent_returns, 256_agent_coin_ratio]

Aidandos commented 2 months ago

Can you also add a tutorial like this one: https://github.com/FLAIROx/JaxMARL/blob/main/jaxmarl/tutorials/storm_2p_introduction.py?

ali-shihab commented 2 months ago

> Can you also add a tutorial like this one: https://github.com/FLAIROx/JaxMARL/blob/main/jaxmarl/tutorials/storm_2p_introduction.py?

All done

Aidandos commented 2 months ago

Something seems to have broken the tests involving "InTheMatrix". Can you check what it is?

ali-shihab commented 2 months ago

Apologies - I forgot to update the imports & some parts of registration.py, hence the InTheMatrix import wasn't working. Should work now.
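
As a general illustration of the failure mode described above, here is a minimal name-based registry sketch. It is not JaxMARL's actual `registration.py`; the `register` and `make` helpers, the `_REGISTRY` dict, and the `ToyStormNP` class are all hypothetical. It shows why an environment that is not registered (or not imported where the registry is built) fails at lookup time.

```python
from typing import Callable, Dict

# Hypothetical registry mapping environment names to constructors.
_REGISTRY: Dict[str, Callable[..., object]] = {}


def register(name: str):
    """Decorator that records a constructor under `name`."""
    def decorator(ctor):
        if name in _REGISTRY:
            raise ValueError(f"environment {name!r} is already registered")
        _REGISTRY[name] = ctor
        return ctor
    return decorator


def make(name: str, **kwargs):
    """Build a registered environment, failing loudly on unknown names."""
    if name not in _REGISTRY:
        raise KeyError(
            f"unknown environment {name!r}; registered: {sorted(_REGISTRY)}"
        )
    return _REGISTRY[name](**kwargs)


@register("toy_storm_np")
class ToyStormNP:
    """Stand-in environment used only for this illustration."""

    def __init__(self, num_agents: int = 2):
        self.num_agents = num_agents


# Usage: lookup succeeds only for names that were registered.
env = make("toy_storm_np", num_agents=4)
```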

Aidandos commented 2 months ago

Can you also add a GIF to JaxMARL/docs/imgs?

amacrutherford commented 2 months ago

great stuff :)