google / dopamine

Dopamine is a research framework for fast prototyping of reinforcement learning algorithms.
https://github.com/google/dopamine
Apache License 2.0

[question] Applying Dopamine to MultiAgent Environments #175

Closed: rfali closed this issue 3 years ago

rfali commented 3 years ago

I like the compact implementations in Dopamine and was looking to apply it to the natural next step beyond the ALE: the multi-agent ALE from PettingZoo (https://www.pettingzoo.ml/atari). As far as I can tell, Dopamine does not support multiple agents, as per #23. However, the Hanabi Learning Environment [paper https://arxiv.org/pdf/1902.00506.pdf] [code https://github.com/deepmind/hanabi-learning-environment] [post http://www.marcgbellemare.info/blog/a-cooperative-benchmark-announcing-the-hanabi-learning-environment/] tackles a multi-agent problem, and the Rainbow agent it provides (https://github.com/deepmind/hanabi-learning-environment/blob/master/hanabi_learning_environment/agents/rainbow/rainbow_agent.py) is also based on Dopamine. The problem is that I am not sure whether multi-agent settings that use vectorized environments, where each agent runs in its own copy of the environment but all agents share the observation space, can use Dopamine's Rainbow agent off the shelf.

My questions are:

  1. If I want to adapt Dopamine's agents for use in a PettingZoo environment, should I look to Hanabi's Rainbow agent?
  2. If not, what changes would be required in Dopamine to make it work in a multi-agent setting? I can attempt this myself if the contributors can provide some guidelines. A relevant reference is perhaps #110 (as mentioned there, using a single environment makes training slow; libraries like Stable Baselines or RLlib, which support vectorized environments, are reportedly much faster).

As a side note, I have gone through the docs on modifying and extending agents (https://github.com/google/dopamine/tree/master/docs#modifying-and-extending-agents) and the associated colab (https://colab.research.google.com/github/google/dopamine/blob/master/dopamine/colab/agents.ipynb), but I am not sure whether they go down the same path I have in mind.

Thank you.

psc-g commented 3 years ago

hi, modifying dopamine for multi-agent settings is doable, but the specifics will depend on your use case. if you want all agents running on the same environment with a shared observation space, i would suggest modifying run_experiment.py so that you create and maintain multiple agents there.
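roughly, the shape of it is the sketch below. to be clear, this is only a sketch: the dict-keyed reset()/step() of the multi-agent environment is an assumption for illustration, not a real dopamine or pettingzoo api; the agents themselves are ordinary dopamine agents, which only need the begin_episode / step / end_episode interface.

```python
# sketch only: several Dopamine agents acting on one shared environment.
# the environment API (dict-keyed reset/step, a single `done` flag) is an
# assumption for illustration, not something Dopamine provides.

def run_one_multiagent_episode(env, agents):
  """Runs one episode with a dict of {agent_id: dopamine_agent} on `env`."""
  observations = env.reset()  # assumed to return {agent_id: observation}
  actions = {aid: agent.begin_episode(observations[aid])
             for aid, agent in agents.items()}
  returns = {aid: 0. for aid in agents}
  while True:
    observations, rewards, done, _ = env.step(actions)  # assumed API
    for aid in agents:
      returns[aid] += rewards[aid]
    if done:
      break
    actions = {aid: agent.step(rewards[aid], observations[aid])
               for aid, agent in agents.items()}
  for aid, agent in agents.items():
    agent.end_episode(rewards[aid])
  return returns
```

in a fuller version, a loop like this would replace the single-agent episode loop inside the Runner.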

rfali commented 3 years ago

thanks @psc-g, is there a resource already available that I could use as a guide for modifying run_experiment.py? I have gone through the Dopamine colabs (https://github.com/google/dopamine/tree/master/dopamine/colab) but I don't think it's covered there.

psc-g commented 3 years ago

we don't have any examples yet, but there are likely forks that have done something similar. perhaps this will be useful: for a paper of mine (https://arxiv.org/abs/1911.09291) i modified the Runner to fill in a replay buffer and then generate some visualizations: https://github.com/google-research/google-research/blob/master/bisimulation_aaai2020/dopamine/run_experiment.py
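the linked file does quite a bit more, but the underlying pattern is just subclassing the standard Runner and overriding its hooks. a bare-bones sketch (the step counter is only a placeholder for whatever bookkeeping you actually need, e.g. filling a replay buffer):

```python
# bare-bones sketch of customizing the Runner; the counter is a placeholder.
from dopamine.discrete_domains import run_experiment


class InstrumentedRunner(run_experiment.Runner):
  """Standard Runner plus a hook on every environment step."""

  def __init__(self, *args, **kwargs):
    super().__init__(*args, **kwargs)
    self._steps_seen = 0  # extra state for whatever you want to track

  def _run_one_step(self, action):
    # let the base class step the environment, then add custom bookkeeping
    step_result = super()._run_one_step(action)
    self._steps_seen += 1
    return step_result
```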

hope this helps!

rfali commented 3 years ago

thank you for the reference @psc-g, I will explore it soon. Following your suggestion, I will try to create two Runners in run_experiment.py, replace atari_lib.create_atari_environment with a different gym-compatible multi-agent environment, and go from there.
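For my own reference, the swap I have in mind is roughly the sketch below (not tested; the environment factory is a hypothetical placeholder I would still have to write, and a gin config with the agent name and hyperparameters would need to be loaded first via run_experiment.load_gin_configs):

```python
# sketch: building the standard Runner with a custom environment factory.
from dopamine.discrete_domains import run_experiment

def create_multiagent_environment():
  # Return a wrapper around the PettingZoo environment that exposes the
  # gym-style observation_space / action_space / reset() / step() that the
  # Runner and agents expect; this wrapper still has to be written.
  raise NotImplementedError

# run_experiment.load_gin_configs([...], [])  # agent name + hyperparameters
runner = run_experiment.Runner(
    base_dir='/tmp/multiagent_rainbow',
    create_agent_fn=run_experiment.create_agent,
    create_environment_fn=create_multiagent_environment)
runner.run_experiment()
```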

I am going to close this issue now, but if I build on this, or if you come across a Dopamine-based multi-agent implementation, please do share.

Thanks!