openai / universe

Universe: a software platform for measuring and training an AI's general intelligence across the world's supply of games, websites and other applications.
https://universe.openai.com
MIT License

Universe envs inside Multiprocessing process do not respond to action steps #175

Closed: ethancaballero closed this issue 6 years ago

ethancaballero commented 7 years ago

Expected behavior

I expected a Universe env running in a multiprocessing process to respond to the action steps issued in that process.

Actual behavior

When I run this code, the environment's make and reset calls work, but the action step has no effect (e.g. the car controlled in NeonRace-v0 doesn't respond to actions).

import multiprocessing as mp

import gym
import universe

def worker(rank):
    env = gym.make('flashgames.NeonRace-v0')
    env.configure(remotes=1, client_id=rank)
    observation_n = env.reset()
    while True:
        action_n = [[('KeyEvent', 'ArrowUp', True)] for ob in observation_n]
        observation_n, reward_n, done_n, info = env.step(action_n)

num_processes = 1
processes = []
for rank in range(num_processes):
    p = mp.Process(target=worker, args=(rank,))
    p.start()
    processes.append(p)
for p in processes:
    p.join()

^You need to run this code on Linux. Running it on Mac OS X causes an error with Twisted, because OS X only uses file systems (not file descriptors) for multiprocessing.

Versions

Please include the result of running

$ uname -a ; python --version; pip show universe gym tensorflow numpy go-vncdriver Pillow
Linux 314e099d28d4 4.4.41-moby #1 SMP Wed Jan 11 01:09:58 UTC 2017 x86_64 x86_64 x86_64 GNU/Linux
Python 3.5.2
Name: gym
Version: 0.8.1
Summary: The OpenAI Gym: A toolkit for developing and comparing your reinforcement learning agents.
Home-page: https://github.com/openai/gym
Author: OpenAI
Author-email: gym@openai.com
License: UNKNOWN
Location: /usr/local/lib/python3.5/dist-packages
Requires: numpy, six, pyglet, requests
---
Name: numpy
Version: 1.11.0
Summary: NumPy: array processing for numbers, strings, records, and objects.
Home-page: http://www.numpy.org
Author: NumPy Developers
Author-email: numpy-discussion@scipy.org
License: BSD
Location: /usr/lib/python3/dist-packages
Requires:
---
Name: go-vncdriver
Version: 0.4.19
Summary: UNKNOWN
Home-page: UNKNOWN
Author: UNKNOWN
Author-email: UNKNOWN
License: UNKNOWN
Location: /usr/local/lib/python3.5/dist-packages
Requires: numpy
---
Name: Pillow
Version: 4.1.0
Summary: Python Imaging Library (Fork)
Home-page: https://python-pillow.org
Author: Alex Clark (Fork Author)
Author-email: aclark@aclark.net
License: Standard PIL License
Location: /usr/local/lib/python3.5/dist-packages
Requires: olefile

^I'm running this in the default Docker instance provided by the universe repo.

ethancaballero commented 7 years ago

Found a fix in this Stack Overflow answer: http://stackoverflow.com/a/11283425

It's possible you can avoid this issue by not loading any of Twisted until you've already created the child processes. This would turn your usage into a single-process use case as far as Twisted is concerned (in each process, it would be initially loaded, and then that process would not go on to fork at all, so there's no question of how fork and Twisted interact anymore). This means not even importing Twisted until after you've created the child processes.

So for multiprocessing + universe, that would mean this:

import multiprocessing as mp

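def worker(rank):
    # Import gym and universe here, inside the child process, so that Twisted
    # is first loaded only after the process has been created.
    import gym
    import universe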
    env = gym.make('flashgames.NeonRace-v0')
    env.configure(remotes=1, client_id=rank)
    observation_n = env.reset()
    while True:
        action_n = [[('KeyEvent', 'ArrowUp', True)] for ob in observation_n]
        observation_n, reward_n, done_n, info = env.step(action_n)

num_processes = 1
processes = []
for rank in range(num_processes):
    p = mp.Process(target=worker, args=(rank,))
    p.start()
    processes.append(p)
for p in processes:
    p.join()

^which works

You might want to add this to the "Solutions to common problems" wiki.
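
The fix depends on the parent process never importing Twisted before it starts the workers, which is easy to break later by adding a top-level import. A minimal sketch of a guard for that, using only the standard library: the helper name `loaded_too_early` and the `FORK_UNSAFE` tuple are hypothetical and not part of universe or gym; the tuple simply lists the two packages the fix moves into the worker, plus twisted itself.

```python
import sys

# Assumption: these are the packages that must not be loaded before the child
# processes exist (the two imports moved into worker(), plus twisted itself).
FORK_UNSAFE = ("twisted", "universe", "gym")


def loaded_too_early(packages=FORK_UNSAFE):
    """Return the names in `packages` that this process has already imported."""
    return [name for name in packages if name in sys.modules]
```

Calling `assert not loaded_too_early()` in the parent right before the loop that calls `p.start()` would fail fast if one of these packages had already been imported there, instead of leaving an environment that silently ignores actions.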