ray-project / ray

Ray is a unified framework for scaling AI and Python applications. Ray consists of a core distributed runtime and a set of AI Libraries for accelerating ML workloads.
https://ray.io
Apache License 2.0

[rllib] MultiAgentVectorEnv #4938

Closed bencbartlett closed 3 years ago

bencbartlett commented 5 years ago

Hi all,

I started using ray for a reinforcement learning project I am working on, and it's a fantastic library! However, I've run into a (probably very niche) design issue and was wondering if someone could point me in the right direction.

The project I am working on is to use self-play methods to train models to interact with a complex multi-agent environment I have built around the Screeps RTS, which is somewhat similar to StarCraft. Interactions happen in "rooms", and there can be many rooms run on a server.

Most of the work I have done for the project is actually implementing the environment code. I have a node process which runs the server backend, and a ScreepsInterface Python class which talks to the node process over RPC. The server backend processes add a lot of overhead, so it is much faster to simulate 100 rooms on a single server than it is to run 100 servers each with 1 room. I run one server process per worker, and envs_per_worker rooms per server, which are abstracted to seem like separate environments, even though they all run on the same server.

I started off with baby steps by writing a single-agent version of the environment conforming to gym.Env which only supported one agent acting in one room. I then wrote a single-agent VectorEnv class which started num_envs environments but passed them the same ScreepsInterface instance, so that each env would have a different room which runs in the same server process (rather than starting up num_envs different server processes, which is what auto-vectorization would do). This increased my episode collection rate by about 10x!
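A minimal sketch of that sharing pattern (all names below are illustrative stubs, not the actual project code or the rllib `VectorEnv` base class): one backend object stands in for the single server process, and the vectorized wrapper maps sub-env index `i` to room `i` on that shared backend.

```python
class FakeScreepsInterface:
    """Stub for the RPC backend: one server process hosting many rooms."""

    def __init__(self):
        self.rooms = {}

    def reset_room(self, room_id):
        self.rooms[room_id] = 0
        return self.rooms[room_id]  # initial observation for that room

    def step_room(self, room_id, action):
        self.rooms[room_id] += action
        obs = self.rooms[room_id]
        reward = float(action)
        done = obs >= 10
        return obs, reward, done


class SharedBackendVectorEnv:
    """Mirrors the shape of rllib's VectorEnv methods, but all num_envs
    sub-envs are rooms on ONE shared backend instance."""

    def __init__(self, interface, num_envs):
        self.interface = interface
        self.num_envs = num_envs

    def vector_reset(self):
        return [self.interface.reset_room(i) for i in range(self.num_envs)]

    def reset_at(self, index):
        return self.interface.reset_room(index)

    def vector_step(self, actions):
        obs, rewards, dones, infos = [], [], [], []
        for i, action in enumerate(actions):
            o, r, d = self.interface.step_room(i, action)
            obs.append(o)
            rewards.append(r)
            dones.append(d)
            infos.append({})
        return obs, rewards, dones, infos
```

The point of the sketch is the constructor: because the wrapper receives an already-constructed `interface`, vectorizing does not spawn extra server processes the way auto-vectorization of a `gym.Env` would.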

I then wrote the multi-agent version of the non-vectorized environment conforming to MultiAgentEnv. However, I am stuck trying to figure out how to vectorize the multi-agent version analogously to the single-agent version. Ideally, there would be a MultiAgentVectorEnv class in rllib which would behave exactly as a multi-agent version of VectorEnv, allowing me to specify how parallel environments are constructed. However, I haven't found such a class, and since I need to be able to pass in an existing ScreepsInterface instance into the MultiAgentEnv constructor, I'm not sure how to vectorize the environment in the way I need to.
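To make the request concrete, here is a hypothetical sketch of what such a class could look like; no `MultiAgentVectorEnv` exists in rllib, and the `TwoAgentEnv` below is just a toy stand-in for a `MultiAgentEnv` whose observations, rewards, and actions are per-agent dicts. The key is again that `make_env` receives the sub-env index, so every sub-env can be constructed around one shared backend.

```python
class TwoAgentEnv:
    """Toy multi-agent env: dict-of-agents in, dict-of-agents out."""

    def reset(self):
        return {"a": 0, "b": 0}

    def step(self, action_dict):
        obs = {agent: act + 1 for agent, act in action_dict.items()}
        rewards = {agent: 1.0 for agent in action_dict}
        dones = {"__all__": False}
        return obs, rewards, dones, {}


class MultiAgentVectorEnvSketch:
    """Hypothetical: vectorizes multi-agent envs, so each element of the
    batched lists is itself an {agent_id: value} dict."""

    def __init__(self, make_env, num_envs):
        # make_env(i) lets all sub-envs share one pre-built backend.
        self.envs = [make_env(i) for i in range(num_envs)]

    def vector_reset(self):
        return [env.reset() for env in self.envs]

    def vector_step(self, action_dicts):
        # action_dicts: one {agent_id: action} dict per sub-env.
        results = [env.step(a) for env, a in zip(self.envs, action_dicts)]
        obs, rewards, dones, infos = map(list, zip(*results))
        return obs, rewards, dones, infos
```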

Any thoughts on how to customize vectorization for such a multi-agent environment would be appreciated!

ericl commented 5 years ago

Maybe try BaseEnv?

bencbartlett commented 5 years ago

@ericl I had considered that, but I also am hoping to be able to use QMIX with MultiAgentEnv.with_agent_groups(). Do you foresee any issues if I make a class which inherits from both BaseEnv and MultiAgentEnv?

ericl commented 5 years ago

I think you can implement the grouping method on top of BaseEnv (multi agent envs are already a specialization of BaseEnv). The current implementation of with_agent_groups doesn't work with BaseEnv directly.
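For context, `BaseEnv`'s core contract is an asynchronous `poll()` / `send_actions()` pair whose dicts are keyed first by env id and then by agent id. The toy class below illustrates a simplified version of that shape in plain Python (it does not import rllib, and the real `poll()` also returns a fifth element for off-policy actions); the backend state dict stands in for the shared server:

```python
class ToyBaseEnv:
    """Illustrates the poll()/send_actions() contract: nested dicts
    keyed env_id -> agent_id, over num_envs sub-envs of two agents."""

    def __init__(self, num_envs=2):
        self.state = {e: {"a": 0, "b": 0} for e in range(num_envs)}
        # Observations waiting to be polled; initially every env is ready.
        self.pending = {e: dict(agents) for e, agents in self.state.items()}

    def poll(self):
        obs = {e: dict(agents) for e, agents in self.pending.items()}
        rewards = {e: {a: 0.0 for a in agents}
                   for e, agents in self.pending.items()}
        dones = {e: {"__all__": False} for e in self.pending}
        infos = {e: {a: {} for a in agents}
                 for e, agents in self.pending.items()}
        self.pending = {}
        return obs, rewards, dones, infos

    def send_actions(self, action_dict):
        # action_dict: env_id -> {agent_id: action}; only the envs that
        # received actions become ready for the next poll().
        for e, acts in action_dict.items():
            for agent, act in acts.items():
                self.state[e][agent] += act
            self.pending[e] = dict(self.state[e])
```

Because nothing in this interface constructs the sub-envs for you, the class is free to route every (env_id, agent_id) pair to rooms on a single shared server, which is why it fits the use case above.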

stale[bot] commented 3 years ago

Hi, I'm a bot from the Ray team :)

To help human contributors focus on more relevant issues, I will automatically add the stale label to issues that have had no activity for more than 4 months.

If there is no further activity in the next 14 days, the issue will be closed!

You can always ask for help on our discussion forum or Ray's public slack channel.

stale[bot] commented 3 years ago

Hi again! The issue will be closed because there has been no further activity in the 14 days since the last message.

Please feel free to reopen or open a new issue if you'd still like it to be addressed.

Again, you can always ask for help on our discussion forum or Ray's public slack channel.

Thanks again for opening the issue!

tianyma commented 2 years ago

Hi @bencbartlett, is your Screeps multi-agent reinforcement learning environment project still in progress?

bencbartlett commented 2 years ago

@tianyma I'm no longer actively working on that project, but the code is still around in a private repository.

tianyma commented 2 years ago

> @tianyma I'm no longer actively working on that project, but the code is still around in a private repository

@bencbartlett I am currently trying to rewrite the server's source code to turn it into a reinforcement learning environment, but I am new to JavaScript and have run into some obstacles. Would you be willing to help? I'd be very thankful if you're interested. Contact me at himarsmty@gmail.com.