allenai / cordial-sync

cordial-sync is a software package that can be used to reproduce the results from the paper "A Cordial Sync: Going Beyond Marginal Policies for Multi-Agent Embodied Tasks".

The training time. #2

Closed liuxz-cs closed 3 years ago

liuxz-cs commented 3 years ago

Excuse me, how long did it take to train the model for the example with max_ep 1000000? Are there any methods for improving the training speed, for example, training in an offline way?

unnat commented 3 years ago

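The training times vary across tasks:

  • FurnLift: This task is trained with a mix of imitation learning and reinforcement learning. Since there is dense supervision in the form of a cross-entropy loss, we can train agents effectively in 100k episodes, which is ~12 hours of training if we use ~45 workers for A3C. For FurnMove, calculating shortest-path experts is intractable (check our paper for details).
  • FurnMove-Gridworld: This is a great place to prototype your models. You can train agents for 1M episodes in 1 day.
  • FurnMove: The visual counterpart is obviously slower (due to rendering in visual-AI2-THOR) and took us 4 days.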
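
As a rough sanity check, the figures quoted in this thread (100k FurnLift episodes in ~12 hours with ~45 A3C workers, and 1M FurnMove-Gridworld episodes in 1 day) can be turned into throughput estimates. This is only back-of-the-envelope arithmetic: the helper name is made up for illustration, the inputs are the approximate numbers from the reply, and the per-worker figure assumes throughput divides evenly across workers, which a real A3C run may not do. The visual FurnMove run is left out because its duration (4 days) is given but its episode count is not.

```python
def episodes_per_hour(episodes, hours):
    """Average training throughput in episodes per hour."""
    return episodes / hours


# Approximate figures from the reply (all marked "~" there).
furnlift = episodes_per_hour(100_000, 12)      # FurnLift, ~45 A3C workers
gridworld = episodes_per_hour(1_000_000, 24)   # FurnMove-Gridworld, 1 day

# Assumption: throughput divides evenly across the ~45 workers.
furnlift_per_worker = furnlift / 45

print(f"FurnLift: ~{furnlift:,.0f} episodes/hour "
      f"(~{furnlift_per_worker:,.0f} per worker)")
print(f"FurnMove-Gridworld: ~{gridworld:,.0f} episodes/hour")
```

These numbers compare different tasks and setups, so they indicate orders of magnitude rather than a benchmark.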

I agree with your recommendation. It would be interesting, for FurnLift, to collect expert rollouts and train in an offline way using behavior cloning. Since defining an expert for the three-body task of FurnMove is difficult, it isn't as straightforward there.
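
To make the offline idea concrete, here is a minimal behavior-cloning sketch: supervised learning of a policy on recorded expert (observation, action) pairs with a cross-entropy loss. It is not taken from the cordial-sync code base. The linear softmax policy, the toy one-hot observations, and every name in it (`train_behavior_cloning`, `act`, `demos`) are assumptions made up for illustration.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def logits_for(weights, obs):
    """Linear policy: one weight row per action."""
    return [sum(w * x for w, x in zip(row, obs)) for row in weights]


def train_behavior_cloning(demos, n_features, n_actions, lr=0.5, epochs=50):
    """Fit a softmax policy to expert (obs, action) pairs by SGD on cross-entropy."""
    weights = [[0.0] * n_features for _ in range(n_actions)]
    for _ in range(epochs):
        for obs, expert_action in demos:
            probs = softmax(logits_for(weights, obs))
            for a in range(n_actions):
                # Gradient of cross-entropy w.r.t. logit a: p_a - 1[a == expert]
                grad = probs[a] - (1.0 if a == expert_action else 0.0)
                for j in range(n_features):
                    weights[a][j] -= lr * grad * obs[j]
    return weights


def act(weights, obs):
    """Greedy action of the cloned policy."""
    scores = logits_for(weights, obs)
    return scores.index(max(scores))


# Toy "expert rollouts" (hypothetical): one-hot observations and expert actions.
demos = [
    ([1.0, 0.0, 0.0], 2),
    ([0.0, 1.0, 0.0], 0),
    ([0.0, 0.0, 1.0], 1),
]
policy = train_behavior_cloning(demos, n_features=3, n_actions=3)
print([act(policy, obs) for obs, _ in demos])
```

In a real setup the observations would come from recorded environment rollouts and the policy would be a neural network, but the loop has the same shape and needs no environment interaction once the rollouts are collected.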

Let us know if you have any follow-ups.

liuxz-cs commented 3 years ago

Okay, thanks a lot.