google-deepmind / meltingpot

A suite of test scenarios for multi-agent reinforcement learning.
Apache License 2.0

How many hours did it take to train agents in each substrate? #15

Closed · YetAnotherPolicy closed this issue 2 years ago

YetAnotherPolicy commented 2 years ago

Dear authors,

Thanks for building such ambitious environments for MARL research. In your paper, I found that the simulation runs for 10^9 steps per agent. To train the agents, how many rollout workers did you use, and how many hours did it take to obtain the final results in Table 1 (focal per-capita returns)?

Thank you.

duenez commented 2 years ago

Hello,

Estimating training time is very difficult, since it entirely depends on the training stack, available compute, etc. There is typically a fundamental tradeoff between wall-clock time and compute. On our side, we have tried two very different training stacks: one of them trained populations in a bit under a week, while the other took just one day. The number of workers was also quite different between the two stacks.

We recognise that compute is likely a limiting factor in training these populations, which is why we are actively working on improving the performance of the substrates, including reducing the time spent in Python by delegating to the underlying C++ engine (Lab2D) as early as possible.

Hope this helps

YetAnotherPolicy commented 2 years ago

Dear @duenez, thanks for the detailed and helpful reply. I appreciate your team's efforts to make MeltingPot a great testbed in MARL research.

ManuelRios18 commented 2 years ago

@YetAnotherPolicy I am curious to know how long it takes you to train these populations!

In my case, I can train 1e6 steps in almost exactly an hour using 4 RLlib workers on a 64 GB RAM machine with an Nvidia RTX 3060 GPU.
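
As a point of reference, the worker count above is just an RLlib config setting. Here is a minimal sketch, assuming an older dict-style RLlib config and a hypothetical `build_substrate_env` factory that wraps a Melting Pot substrate as a MultiAgentEnv (not code from this repo):

```python
import ray
from ray import tune
from ray.tune.registry import register_env

# build_substrate_env is a hypothetical factory that wraps a Melting Pot
# substrate as an RLlib MultiAgentEnv; the registered name below is arbitrary.
register_env("meltingpot_substrate", lambda env_config: build_substrate_env(env_config))

config = {
    "env": "meltingpot_substrate",
    "num_workers": 4,   # the 4 rollout workers mentioned above
    "num_gpus": 1,      # single learner GPU
    "framework": "torch",
}

ray.init()
tune.run("PPO", config=config, stop={"timesteps_total": 1_000_000})
```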

ManuelRios18 commented 2 years ago

@YetAnotherPolicy Could you please tell me your hardware specs? I mean, number of CPUs, RAM, GPU? Or do you train in the cloud?

YetAnotherPolicy commented 2 years ago

> @YetAnotherPolicy I am curious to know how long it takes you to train these populations!
>
> In my case, I can train 1e6 steps in almost exactly an hour using 4 RLlib workers on a 64 GB RAM machine with an Nvidia RTX 3060 GPU.

Hi, in my case I use 32 workers and it takes about 8 minutes to run 1M steps. Note that this depends on the simulation speed.

YetAnotherPolicy commented 2 years ago

> @YetAnotherPolicy Could you please tell me your hardware specs? I mean, number of CPUs, RAM, GPU? Or do you train in the cloud?

I use fairly common Intel CPUs, 40 in total. Since the observations are RGB images, I use an A100, which can be faster than a 3090. RAM is 256 GB.

ManuelRios18 commented 2 years ago

@YetAnotherPolicy Sorry! I am back with more questions!

Which algorithm are you using to train? I have noticed that in my case PPO is about 8 times slower than A3C. Have you experienced anything similar?

YetAnotherPolicy commented 2 years ago

> @YetAnotherPolicy Sorry! I am back with more questions!
>
> Which algorithm are you using to train? I have noticed that in my case PPO is about 8 times slower than A3C. Have you experienced anything similar?

Hi, I use PPO. Note that PPO runs an inner training loop on each update, reusing the collected batch for multiple gradient steps; see this link: https://github.com/openai/spinningup/blob/master/spinup/algos/pytorch/ppo/ppo.py#L265 (a rough sketch of that loop is included below). Please also check whether RLlib uses this trick.

Training with PPO takes about 1.5 days for 200M steps.
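
For context, the point about the inner loop is that PPO reuses each collected batch for several gradient epochs, which is where the extra per-update cost comes from. A minimal sketch of that pattern (PyTorch-style, with hypothetical policy/batch names, not the Spinning Up code itself):

```python
import torch

def ppo_policy_update(policy, optimizer, batch, train_iters=80, clip_ratio=0.2):
    """Sketch of PPO's inner loop: one collected batch is reused for many
    gradient steps, unlike A3C/A2C, which take a single step per batch."""
    obs, actions, advantages, old_logp = batch  # hypothetical batch layout
    for _ in range(train_iters):  # the inner loop referenced above
        logp = policy.log_prob(obs, actions)    # hypothetical policy interface
        ratio = torch.exp(logp - old_logp)
        clipped = torch.clamp(ratio, 1 - clip_ratio, 1 + clip_ratio) * advantages
        loss = -torch.min(ratio * advantages, clipped).mean()
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()
```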

yesfon commented 2 years ago

Hello @YetAnotherPolicy,

I got confused by your last message. I would like to know: did you use the RLlib library to train the workers?

YetAnotherPolicy commented 2 years ago

> Hello @YetAnotherPolicy,
>
> I got confused by your last message. I would like to know: did you use the RLlib library to train the workers?

Hi, @yesfon, I did not use RLlib.

yesfon commented 2 years ago

> Hello @YetAnotherPolicy, I got confused by your last message. I would like to know: did you use the RLlib library to train the workers?
>
> Hi, @yesfon, I did not use RLlib.

May I ask what you used instead?

YetAnotherPolicy commented 2 years ago

> Hello @YetAnotherPolicy, I got confused by your last message. I would like to know: did you use the RLlib library to train the workers?
>
> Hi, @yesfon, I did not use RLlib.
>
> May I ask what you used instead?

Hi, I use Python multiprocessing as well as Ray's remote actors to collect data. RLlib is also good, but it takes a lot of time to learn its APIs.
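
For anyone wondering what that looks like without RLlib, here is a minimal sketch of collecting rollouts with Ray remote actors, assuming a gym-style environment wrapper and a policy function passed in as placeholders (neither is from this repo):

```python
import ray

ray.init()

@ray.remote
class RolloutWorker:
    """One remote actor per environment copy; steps the env and returns a trajectory."""

    def __init__(self, env_builder):
        # env_builder is a placeholder for whatever constructs the substrate wrapper.
        self.env = env_builder()

    def rollout(self, policy_fn, num_steps):
        # policy_fn is a placeholder mapping an observation to an action.
        trajectory = []
        obs = self.env.reset()
        for _ in range(num_steps):
            action = policy_fn(obs)
            obs, reward, done, _ = self.env.step(action)
            trajectory.append((obs, action, reward))
            if done:
                obs = self.env.reset()
        return trajectory

# Fan out collection across 32 remote workers (as in the setup described above),
# then gather the batches for the learner.
workers = [RolloutWorker.remote(env_builder) for _ in range(32)]
batches = ray.get([w.rollout.remote(policy_fn, 1_000) for w in workers])
```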