openai / neural-mmo

Code for the paper "Neural MMO: A Massively Multiagent Game Environment for Training and Evaluating Intelligent Agents"
https://openai.com/blog/neural-mmo/
MIT License

Integration with MADDPG #27

Closed kumarict closed 5 years ago

kumarict commented 5 years ago

I'm trying to integrate Neural MMO with MADDPG (https://github.com/openai/maddpg ; https://github.com/openai/multiagent-particle-envs ) and am running into the following issues:

Error encountered when running the VecEnv with no rendering

```
testlaw128 , NENT: 128 , NPOP: 8
testchaos128 , NENT: 128 , NPOP: 8
sample , NENT: 128 , NPOP: 8
2019-05-10 18:03:39,816 INFO node.py:423 -- Process STDOUT and STDERR is being redirected to /tmp/ray/session_2019-05-10_18-03-39_2786/logs.
2019-05-10 18:03:40,032 INFO services.py:363 -- Waiting for redis server at 127.0.0.1:64431 to respond...
2019-05-10 18:03:40,257 INFO services.py:363 -- Waiting for redis server at 127.0.0.1:28843 to respond...
2019-05-10 18:03:40,274 INFO services.py:760 -- Starting Redis shard with 6.86 GB max memory.
2019-05-10 18:03:40,455 INFO services.py:1384 -- Starting the Plasma object store with 10.29 GB memory using /dev/shm.
(pid=2818) pygame 1.9.5
(pid=2818) Hello from the pygame community. https://www.pygame.org/contribute.html
Traceback (most recent call last):
  File "Forge.py", line 74, in <module>
    example.run()
  File "Forge.py", line 48, in run
    self.step()
  File "Forge.py", line 44, in step
    self.envsObs, rews, dones, infos = self.env.step(actions)
  File "/home/kumar/Projects/neural-mmo/forge/trinity/smith.py", line 70, in step
    return self.env.step(actions)
  File "/home/kumar/Projects/neural-mmo/forge/trinity/smith.py", line 23, in step
    zip(self.envs, actions)])
  File "/home/kumar/.local/lib/python3.6/site-packages/ray/worker.py", line 2307, in get
    raise value
ray.exceptions.RayTaskError: ray_VecEnvRealm:step() (pid=2818, host=dvrs-10)
  File "/home/kumar/Projects/neural-mmo/forge/blade/core/realm.py", line 176, in step
    return pickle.dumps((stims, rews, None, None))
  File "/home/kumar/Projects/neural-mmo/forge/blade/entity/player.py", line 68, in __getattribute__
    return super().__getattribute__(name)
RecursionError: maximum recursion depth exceeded while calling a Python object
```
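For context on the RecursionError itself: the usual shape of this failure is a custom attribute hook whose own body performs attribute lookup, and pickling aggravates it because pickle probes many attributes and rebuilds objects without calling `__init__`. A minimal sketch of that failure mode (illustrative only, not the actual player.py code):

```python
class Entity:
    """Minimal sketch of this failure mode; not the actual player.py code."""

    def __init__(self):
        self.data = {'health': 10}

    def __getattr__(self, name):
        # Called only when normal lookup fails. If 'data' itself is missing,
        # e.g. because the object was rebuilt without __init__ the way pickle
        # does during deserialization, this lookup re-enters __getattr__
        # with no base case.
        return self.data[name]


e = Entity.__new__(Entity)  # bypass __init__, as pickle does when loading
try:
    e.health
except RecursionError as err:
    print(type(err).__name__)  # RecursionError
```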

kumarict commented 5 years ago

When running Neural MMO from the command line, it does not appear to work when the "--render" argument is omitted (i.e., when just "python Forge.py" is called). It stops after the "Hello from the pygame community" line.

Okay, this is actually working fine; it just does not display any messages. I added a print statement for every entity spawn, and it shows up whether or not --render is applied.

jsuarez5341 commented 5 years ago

Hi! I've been a bit slow on responses for the past couple of weeks -- I'm actually in the middle of a pretty heavy dev phase on the first major patch. It will take a while, but it should make a bunch of the stuff you mentioned a lot easier once it's done. The gist of the issue is that the VecEnv/Gym computation model doesn't really work in this setting: not only is there too much communication, but the input/output processing on the game state is more complicated than in most RL environments and isn't exactly easy to package into tensors.
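As a toy illustration of the packaging problem (shapes invented for this example): if each agent sees a different number of entities, the per-agent observations are ragged and cannot be stacked into the single fixed-shape batch a Gym-style VecEnv expects.

```python
import numpy as np

# Invented shapes: each agent observes a different number of entities,
# with a fixed feature vector per entity.
obs_per_agent = [
    np.zeros((3, 5)),   # agent 0: 3 visible entities, 5 features each
    np.zeros((7, 5)),   # agent 1: 7 visible entities
    np.zeros((1, 5)),   # agent 2: 1 visible entity
]

try:
    batch = np.stack(obs_per_agent)
except ValueError as err:
    print('cannot batch ragged observations:', err)
```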

The Native model is somewhere in between Gym and Rapid (the OpenAI infra for Dota 2 -- you can read more about it on their blog). The idea is to have persistent workers, one per CPU core, each with access to a copy of the environment. The Native API currently does this synchronously -- that is, you make decisions for one agent at a time. The VecEnv API is asynchronous and lets you make all decisions at once (pretty much the same as any other RL environment). This should allow you to centralize decision making across all the agents on that core, if you so desire. The VecEnv example in the repo provides inference code -- you can structure training however you please.
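A rough sketch of the two control flows described above, with invented names and methods (this is not the repo's actual API; the real entry points live under forge/trinity):

```python
from typing import List, Tuple

class ToyEnv:
    """Stand-in environment with invented methods, for illustration only."""

    def __init__(self, n_agents: int):
        self.agents = list(range(n_agents))
        self.t = 0
        self.pending = {}

    # Native-style hooks: observe and act one agent at a time, then tick.
    def observe(self, agent: int) -> int:
        return self.t + agent

    def queue_action(self, agent: int, action: int) -> None:
        self.pending[agent] = action

    def tick(self) -> None:
        self.t += 1
        self.pending.clear()

    # VecEnv-style hook: all observations and actions at once.
    def step(self, actions: List[int]) -> Tuple[List[int], List[float], bool, dict]:
        self.t += 1
        obs = [self.t + a for a in self.agents]
        return obs, [0.0] * len(self.agents), False, {}


def policy(ob: int) -> int:
    return ob % 2


env = ToyEnv(4)

# Native-style: synchronous, one decision per agent, then advance the world.
for agent in env.agents:
    env.queue_action(agent, policy(env.observe(agent)))
env.tick()

# VecEnv-style: batch all decisions, like a standard Gym step.
obs = [env.observe(a) for a in env.agents]
for _ in range(3):
    actions = [policy(ob) for ob in obs]
    obs, rewards, done, info = env.step(actions)
print('ran', env.t, 'world ticks')
```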

The upcoming patch will be merging the Native and VecEnv APIs in favor of the latter, with the infra on top adopting a variant of the Rapid computation model. Ideally, this should make a bunch more approaches feasible, including centralized training.

A few pointers in the meantime if you'd like to hack on the VecEnv API: the config and experiments files specify run parameters. You weren't getting any output because the default behavior is to load the model and run in evaluation mode -- the print statements only log epoch information, and in test mode there are none. Feel free to reopen if you run into more issues :)
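To make the evaluation-versus-training point concrete, here is a minimal sketch with hypothetical flag names (check the config and experiments files in the repo for the real parameters):

```python
# Hypothetical flag names for illustration only; the real parameters live
# in the repo's config and experiments files.

class RunConfig:
    LOAD = True   # restore a saved model from disk
    TEST = True   # evaluation mode: no training epochs, hence no epoch logs


def run(config) -> None:
    if config.TEST:
        # Evaluation just runs the policy; nothing is printed per epoch.
        return
    for epoch in range(3):
        # Training mode is where the epoch logging happens.
        print(f'epoch {epoch}: stats would be logged here')


run(RunConfig)           # silent, matching the behavior described above
RunConfig.TEST = False
run(RunConfig)           # now epoch logs appear
```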