Sequential Multi-agent PPO with DR

Hi, I have a few doubts wrt to implementing multiple agents in Isaac Gym (or Brax). (apologies if they are too trivial)

I want to use 2 or more agents in the same experiment (agents will have different environments, especially if Domain Randomisation is enabled) and train them sequentially (i.e. first Agent 1 gets trained via PPO, then Agent 2 and so on...)

How can I go about implementing this? I am not sure which files I should be modifying and how to configure train.py to support the above functionality.

Thanks!

Denys88 / rl_games

Sequential Multi-agent PPO with DR #136