google-deepmind / meltingpot

A suite of test scenarios for multi-agent reinforcement learning.
Apache License 2.0

Is there a way to remove old and add new instances of agents online without restarting the environment? #172

Closed: ninell-oldenburg closed this issue 10 months ago

ninell-oldenburg commented 11 months ago

Hi!

As the title says, I'm trying to remove agents and add new ones without restarting the environment.

I'm calling the clean_up substrate from an adapted version of examples/rllib/view_models.py and would like to remove one agent and add another while keeping the total number of agents in the game constant. It seems that removing and reappearing is possible, but only with the same instance of the agent, whereas I would like to create a new agent instance. Is there an elegant way of doing that without heavy engineering? That is, to change the indices of all affected agents in the Lua backend from the calling Python file: say I iteratively want to remove the agent with agent_idx = 1 and add an agent with agent_idx = 4, updating all other agents' indices accordingly while keeping their respective roles and policies.

For reference, I was looking into meltingpot/lua/levels/clean_up/components.lua and it seems indices are fixed from the start.

I guess the heavy-engineering solution would involve adding a function that does that for every module, but maybe there is an easier way?

Let me know if you need any more information.

Thanks in advance!

duenez commented 11 months ago

I am not aware of any way to do this in RLLib. Unfortunately, RLLib doesn't expose agent creation easily, which means episodes are something that just happens inside the workers.

However, I have a higher-order question: what are you actually trying to achieve? Learning agents would only ever see fragments of episodes anyway (of an unroll length, for the LSTM or for value bootstrapping). Many parallel episodes would be running and sending those partial trajectories to the learning agent to do updates on. So how would removing an agent and adding another one help?

One possible solution would be to create a meta-agent, i.e. an agent that has other agents inside it. This meta-agent would then receive an ID to choose which sub-agent to actually train. This is similar to the Neural Population Learning idea: https://www.deepmind.com/publications/neupl-neural-population-learning, but it is definitely not implemented out of the box in RLLib. A rough sketch of the idea is below.
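A minimal sketch of that meta-agent idea, assuming a generic RLlib-style compute_action interface. The class and method names here are illustrative, not part of RLLib or MeltingPot:

```python
# Illustrative sketch only: MetaAgent, sub_agents, and compute_action are
# hypothetical names, not RLLib or MeltingPot APIs.
class MetaAgent:
    """Holds a population of sub-agents; an ID selects which one acts/trains."""

    def __init__(self, sub_agents):
        self.sub_agents = list(sub_agents)
        self.active_id = 0  # which sub-agent currently receives experience

    def set_active(self, agent_id):
        # "Removing" an agent and "adding" a new one becomes a pure index
        # switch here, with no change to the underlying environment.
        self.active_id = agent_id

    def compute_action(self, observation):
        return self.sub_agents[self.active_id].compute_action(observation)
```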

duenez commented 11 months ago

Another thought: if you are not using RLLib or another RL training tool, it's trivially possible to hook up a player to whichever learning agent you want from one step of the substrate to the next. Of course, you'd have to be fully in control of your run-loop in this case, and handle what it means for one agent to stop getting experience and another to start getting it in the middle of an episode. See the sketch after this comment.
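For example, a hand-rolled run-loop along these lines could swap the policy behind a player slot mid-episode. This is a sketch under assumptions: make_policy and should_swap are hypothetical helpers, and the substrate is stepped through MeltingPot's dm_env-style interface:

```python
# Sketch only: make_policy and should_swap are hypothetical; the substrate
# calls follow MeltingPot's dm_env-style interface.
from meltingpot import substrate

roles = ['default'] * 7  # clean_up has seven player slots
env = substrate.build('clean_up', roles=roles)

# One policy object per player slot; any of these can be replaced mid-episode.
policies = [make_policy(slot) for slot in range(len(roles))]

timestep = env.reset()
for t in range(1000):
    # timestep.observation is a per-player sequence of observation dicts.
    actions = [pi.step(obs) for pi, obs in zip(policies, timestep.observation)]
    timestep = env.step(actions)
    if should_swap(t):
        # The player slot stays in the environment; only the "brain" that
        # controls it is replaced with a fresh, naive policy.
        policies[1] = make_policy(slot=1)
```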

ninell-oldenburg commented 10 months ago

Thanks, these are great suggestions!

I was trying to model something like a life span in agents: they die and respawn, but the respawned agent acts as a new, naïve agent. Indeed, I was not using RL tools at all, but a planning algorithm. I solved it now through a combination of resetting all relevant parameters in every agent to some initial parameter set and visually removing the agent by setting it to the respective waitState, roughly along the lines of the sketch below. It definitely works for our purpose now, but I might explore further how this can be done more elegantly!
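In case it helps anyone else, here is a rough sketch of that reset on the Python side. All names (INITIAL_PARAMS, rebirth, on_death, agent.waiting) are from our own planning code, not MeltingPot APIs; the actual waitState switch happens in the substrate's Lua components:

```python
# Sketch only: the parameter names and wait-state bookkeeping are illustrative
# names from our own planning code, not MeltingPot APIs.
INITIAL_PARAMS = {'beliefs': None, 'plan': [], 'age': 0}

def rebirth(agent):
    """Reset an agent in place so it behaves like a brand-new, naive agent."""
    for name, value in INITIAL_PARAMS.items():
        setattr(agent, name, value)

def on_death(agent):
    rebirth(agent)
    # Visually remove the avatar by putting it into its wait state
    # (done via the respective waitState in the Lua components in our case).
    agent.waiting = True
```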