openai / retro

Retro Games in Gym
MIT License

Determinism #114

Open christopherhesse opened 5 years ago

christopherhesse commented 5 years ago

Issue summary

Stella set_state seems to be a little odd:

import retro

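for game in sorted(retro.data.list_games()):
    # Only test Atari 2600 games, which run on the Stella core.
    if not game.endswith('-Atari2600'):
        continue

    env = retro.make(game=game)

    act = env.action_space.sample()

    # Run 1: save and immediately restore the initial state, then step once.
    env.reset()
    initial_state = env.em.get_state()
    env.em.set_state(initial_state)
    env.step(act)
    ram1 = env.get_ram()

    # Run 2: step once from a plain reset, with no save/restore.
    env.reset()
    env.step(act)
    ram2 = env.get_ram()

    # Both runs should leave the emulator RAM identical.
    if not (ram1 == ram2).all():
        print('failed', game)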

    env.close()

This prints:

failed Jamesbond-Atari2600
failed MontezumaRevenge-Atari2600

I would expect this to not fail, but I might be using set_state() incorrectly.
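
When a check like this fails, it can help to see which RAM addresses differ rather than only that they differ. A minimal sketch, using plain byte sequences so it runs without retro installed; `diff_ram` is a hypothetical helper name, not part of retro:

```python
def diff_ram(ram_a, ram_b):
    """Return (address, value_a, value_b) for every byte that differs."""
    if len(ram_a) != len(ram_b):
        raise ValueError("RAM snapshots have different sizes")
    return [(i, a, b) for i, (a, b) in enumerate(zip(ram_a, ram_b)) if a != b]


# Stand-in snapshots; with retro these would come from env.get_ram().
ram1 = bytes([0, 17, 42, 255])
ram2 = bytes([0, 18, 42, 254])

mismatches = diff_ram(ram1, ram2)
for addr, a, b in mismatches:
    print(f"addr {addr:#06x}: {a} != {b}")
```

The helper only iterates and compares elements, so a one-dimensional array from `env.get_ram()` should presumably work in place of the `bytes` objects, though that is untested here.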

christopherhesse commented 5 years ago

@endrift is this likely to be some bug in retro_serialize/retro_unserialize in libretro-stella? I noticed some stella workarounds in emulator.cpp: https://github.com/openai/retro/blob/71241e73c9cbdb7cc0a842f324ec437a81ebb359/src/emulator.cpp#L149

Actually, it also fails on AsterixAndTheGreatRescue-Genesis. Am I missing something, and is this actually not supposed to work?

christopherhesse commented 5 years ago

I think I might be mis-using set_state(), closing for now.

christopherhesse commented 5 years ago

Actually, @endrift, it looks like this is the only reasonable way to restore states. Am I missing something?

christopherhesse commented 5 years ago

I made a more elaborate test script: https://gist.github.com/christopherhesse/f5c8cda9ab40e62cdddbc31ff9802594

Here are the failures, ignoring other known issues:

failed: AddamsFamily-GameBoy (failed ram)
failed: BakuretsuSenshiWarrior-GameBoy (failed ram)
failed: Jamesbond-Atari2600 (failed ram)
failed: MontezumaRevenge-Atari2600 (failed ram)
failed: Pong-Atari2600 (failed ram)
failed: SuperMarioWorld2-Snes (failed ram)

christopherhesse commented 5 years ago

According to endrift, updating Stella might fix this.

endrift commented 5 years ago

There's a crash in libretro being tracked by https://github.com/libretro/stella-libretro/issues/44. I can try fixing that to see if it also fixes this.

christopherhesse commented 5 years ago

Thanks for pushing that fix, but the RAM values are still not what my script predicts they should be, so either my script is broken or there's some Stella non-determinism going on.

christopherhesse commented 5 years ago

I updated the script to compare to the ALE versions: https://gist.github.com/christopherhesse/c55c1b2e130eea06e45080d092e78951

The results show that ALE passes this test fine, while retro does not:

python retro-get-state-set-state.py --pattern "*pong*"
Pong-Atari2600
failed ram
ale:pong
failed: Pong-Atari2600
python retro-get-state-set-state.py --pattern "*montezuma*"
MontezumaRevenge-Atari2600
failed ram
failed ram
failed ram
ale:montezuma_revenge
failed: MontezumaRevenge-Atari2600

It looks like Elevator Action has some known issues with ALE: https://github.com/openai/gym/blob/master/gym/envs/__init__.py#L445

There are two different kinds of state restore used in ALE: https://github.com/openai/gym/blob/master/gym/envs/atari/atari_env.py#L161

endrift commented 5 years ago

The second one only works due to them messing around with the internals of Stella, which we haven't done. I looked into this at one point.

christopherhesse commented 5 years ago

Ah good to know, so do you think it's likely that all of these are issues with state not being entirely saved by the individual cores?

endrift commented 5 years ago

Could be. Or the state is not being loaded properly.
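
To make the "state not entirely saved" hypothesis concrete, here is a toy sketch of how a core that leaves one internal field out of its save state produces exactly this kind of RAM mismatch. `ToyCore` and its `_cycle` counter are invented for illustration and do not model Stella or any real libretro core:

```python
class ToyCore:
    """Toy emulator core whose save state deliberately omits `_cycle`."""

    def __init__(self):
        self.reset()

    def reset(self):
        self.ram = [0, 0]
        self._cycle = 0  # internal timing state, NOT part of the save state

    def step(self, action):
        self._cycle += 1
        self.ram[0] = (self.ram[0] + action) % 256
        self.ram[1] = self._cycle % 256  # RAM depends on the hidden counter

    def serialize(self):
        return list(self.ram)  # incomplete: forgets _cycle

    def unserialize(self, state):
        self.ram = list(state)  # _cycle keeps whatever value it had


core = ToyCore()
for _ in range(3):
    core.step(1)

state = core.serialize()

# Reference run: keep stepping without any restore.
core.step(5)
expected = list(core.ram)

# Restored run: load the save state, then take the same step.
core.unserialize(state)
core.step(5)
actual = list(core.ram)

print("expected", expected, "actual", actual, "match", expected == actual)
```

The two runs diverge because the hidden counter was never rolled back. A real core could fail in the same way on either side: by not writing a field when saving, or by not applying it when loading.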

christopherhesse commented 5 years ago

ALE may not be fully deterministic either: https://github.com/mgbellemare/Arcade-Learning-Environment/issues/246

christopherhesse commented 4 years ago

Here's an idea for a deterministic wrapper that may work for non-lua-based scenarios: https://gist.github.com/christopherhesse/fd8a6593df61bafb4af6ef8dcdca11b2
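
One generic way to sidestep incomplete save states, which is not necessarily what the linked gist does, is to treat the action history as the state and restore by resetting and replaying it. The sketch below assumes that `reset()` is itself deterministic and that there is no other source of randomness; `ReplayRestore` and `ToyEnv` are hypothetical names, and the toy environment returns only an observation from `step()` rather than the full Gym tuple:

```python
class ToyEnv:
    """Stand-in environment with a deterministic reset and step."""

    def reset(self):
        self.total = 0
        self.cycle = 0
        return (self.total, self.cycle)

    def step(self, action):
        self.cycle += 1
        self.total += action * self.cycle
        return (self.total, self.cycle)


class ReplayRestore:
    """Restore by resetting and replaying recorded actions, not set_state."""

    def __init__(self, env):
        self.env = env
        self._actions = []

    def reset(self):
        self._actions = []
        return self.env.reset()

    def step(self, action):
        self._actions.append(action)
        return self.env.step(action)

    def get_state(self):
        # The "state" is just the action history since the last reset.
        return tuple(self._actions)

    def set_state(self, actions):
        obs = self.reset()
        for action in actions:
            obs = self.step(action)
        return obs


env = ReplayRestore(ToyEnv())
env.reset()
env.step(2)
env.step(3)
saved = env.get_state()

expected = env.step(4)   # continue from the saved point
env.step(1)              # wander off somewhere else

env.set_state(saved)     # reset and replay [2, 3]
actual = env.step(4)

print(expected == actual)
```

The trade-off is that restoring costs one emulator step per recorded action, so this is only practical for short episodes or infrequent restores.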

christopherhesse commented 4 years ago

Added this to the examples: https://github.com/openai/retro/commit/3d153b9601f5b71b4ca6ed41653ad5f8cff3e4b6

It's unclear why Stella is so slow to restore a state, but GameBoy games hanging is a known issue: https://github.com/openai/retro/issues/116