openai / universe-starter-agent

A starter agent that can solve a number of universe environments.
MIT License

Env compatibility with PyGame (PLE) #92

Closed: theweaklink closed this issue 7 years ago

theweaklink commented 7 years ago

Hello,

I have been trying to plug a PyGame environment into the A3C agent, but something odd is happening that I haven't found an explanation for. Maybe someone can point out what is wrong?

I used gym_ple to expose PLE as a gym env, then built an env processing pipeline similar to the Atari one:

In envs.py:

def create_env(env_id, client_id, remotes, **kwargs):
    spec = gym.spec(env_id)

    if spec.tags.get('flashgames', False):
        return create_flash_env(env_id, client_id, remotes, **kwargs)

    elif spec.tags.get('atari', False) and spec.tags.get('vnc', False):
        return create_vncatari_env(env_id, client_id, remotes, **kwargs)

    elif spec.tags.get('pygame', False):
        return create_pygame_env(env_id)

    else:
        # Assume atari.
        assert "." not in env_id  # universe environments have dots in names.
        return create_atari_env(env_id)

def create_pygame_env(env_id):
    env = gym.make(env_id)
    env = Vectorize(env)
    env = PyGameProcess(env)
    env = DiagnosticsInfo(env)
    env = Unvectorize(env)
    return env

def _process(frame):
    # pygame frame is 48x48x3, no need to resize and/or crop
    frame = frame.mean(2)
    frame = frame.astype(np.float32)
    frame *= (1.0 / 255.0)
    frame = np.reshape(frame, [48, 48, 1])
    return frame

class PyGameProcess(vectorized.ObservationWrapper):
    def __init__(self, env=None):
        super(PyGameProcess, self).__init__(env)
        self.observation_space = Box(0.0, 1.0, [48, 48, 1])

    def _observation(self, observation_n):
        return [_process(observation) for observation in observation_n]

[...]
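As a quick sanity check of the wrapper chain, something like the sketch below prints the processed observation shape. The env id 'PongPLE-v0' is only a placeholder here; use whichever id gym_ple registers for the game on your install.

import gym_ple  # importing gym_ple registers the PLE envs with gym
from envs import create_pygame_env

env = create_pygame_env('PongPLE-v0')  # placeholder id, see note above
obs = env.reset()
print(obs.shape)  # expected: (48, 48, 1) after PyGameProcess
obs, reward, done, info = env.step(env.action_space.sample())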

Here is what happens if I run the agent with Pong. The PyGame Pong frame is much simpler to process than Atari Pong: there is no score box, no border, nothing to clutter the image, and the game is just black and white instead of the brown background in Atari.

Here are the results for 6 workers on my machine (based on Tensorboard graphs and visualizing actual games):

I checked the frame that is sent from Gym PLE and everything looks good:

What am I missing?

I used Pong as an example because I can directly compare the Atari game with the PyGame one, but none of the simple PyGame games I have tried (Pixelcopter, Catcher) seems to be working well either, hence my conclusion: there is something wrong with my code linking PyGame to the A3C agent.

Any insight? I assume adding support for other types of envs could be useful to other people too, right?

thanks!!

tlbtlbtlb commented 7 years ago

One possibility is the frame rate. The gym-Atari and vnc-Atari envs run at 15 fps and 10 fps respectively. If the PyGame env runs at 60 fps (meaning, 60 action/observations per second of game time, not talking about wall time here) then the starter-agent may not be able to learn it. Try adding, in the wrapper's .step function, 4 calls to the underlying .step to get to 15 fps.
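An untested sketch of what I mean is below; depending on your gym version you may need to override step/reset instead of _step/_reset:

import gym

class FrameSkip(gym.Wrapper):
    """Repeat each chosen action `skip` times, so a 60 fps game is seen
    by the agent at roughly 15 decisions per second of game time."""
    def __init__(self, env, skip=4):
        super(FrameSkip, self).__init__(env)
        self.skip = skip

    def _step(self, action):
        total_reward, done, info = 0.0, False, {}
        for _ in range(self.skip):
            obs, reward, done, info = self.env.step(action)
            total_reward += reward  # sum the reward over the skipped frames
            if done:
                break
        return obs, total_reward, done, info

    def _reset(self):
        return self.env.reset()

In create_pygame_env you would then wrap the raw env before vectorizing, e.g. env = FrameSkip(gym.make(env_id), skip=4).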

theweaklink commented 7 years ago

Thank you Trevor, you are touching on a key point here. Reducing the fps is just the beginning: the game itself (ball velocity and movement speed) must be adjusted accordingly, otherwise the agent still seems to be overwhelmed.
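For concreteness, the kind of adjustment I mean looks roughly like the sketch below, constructing the PLE game directly. The speed-ratio keyword names are assumptions on my part and may differ between PLE versions (check ple/games/pong.py), and feeding a customized game back through gym_ple may need a small change to its env constructor.

from ple.games.pong import Pong

# Slow the game down so the ball and paddles do not travel too far
# between two consecutive agent decisions at ~15 fps.
# Keyword names below are assumed -- verify against your PLE version.
game = Pong(
    ball_speed_ratio=0.5,     # assumed parameter: slower ball
    players_speed_ratio=0.3,  # assumed parameter: slower player paddle
    cpu_speed_ratio=0.4,      # assumed parameter: slower CPU paddle
)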

I haven't found the best set of parameters yet for 15 fps, but it is heading in the right direction. There are also some differences between Atari Pong and PyGame Pong which I hadn't paid much attention to but which clearly impact the result:

Bottom line:

Thank you very much Trevor, your insight was very helpful!