openai / atari-py

A packaged and slightly-modified version of https://github.com/bbitmaster/ale_python_interface
GNU General Public License v2.0

atari-py is based on an old version of ALE #2

Closed: sygi closed this issue 7 years ago

sygi commented 8 years ago

The ALE code included in the atari-py package is based on an old version of ALE. One difference I observed is in the file ale_interface/src/games/supported/Breakout.cpp, which was changed in ALE nearly a year ago, but I am sure there are more discrepancies. I am hesitant to change this one file, as the two repos would then diverge and things would only get worse. Could you think of a way to keep the two repositories synchronized (e.g. by including ALE as a git submodule and updating it every half a year or so)?

ppwwyyxx commented 8 years ago

Agreed. It is not wise to keep an independent branch and not update it with upstream; this discards a lot of options available in ALE. For example, ALE recently introduced the option "repeat_action_probability", which is used by some of the latest papers. Until this year, DeepMind papers treated a lost life as the end of an episode to make learning easier, but the latest papers no longer use this hack.
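For concreteness, a minimal sketch of setting this option, assuming an ALE build and Python binding recent enough to expose it; the ROM path and the 0.25 value are illustrative, not taken from this thread:

```python
from atari_py import ALEInterface  # or the equivalent ale_python_interface import

ale = ALEInterface()
# Probability of repeating the previous action instead of the requested one
# ("sticky actions"); only available in ALE versions that ship the option.
ale.setFloat(b'repeat_action_probability', 0.25)
ale.loadROM('breakout.bin')  # placeholder ROM path
```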

Same thing for doom-py, it's not tracking upstream either.

gdb commented 8 years ago

I agree there's a real and somewhat unfortunate issue with snapshotting upstream.

However, there's a deeper issue here, which is that upstream changes will sometimes change the environment's semantics. This could break comparability across runs of the same environment version.

I'm not sure what the right balance is here. Perhaps we should publish different libraries (such as "atari_py_1" or something) corresponding to frozen versions of upstream. @joschu what do you think?

joschu commented 8 years ago

I think there's no reason to update ALE, as it already works. Doom is being actively developed, so that's more debatable. Maybe doom-py should be put in the gym codebase.

shelhamer commented 7 years ago

I have two arguments for updating ALE (at least once, but presumably every now and again):

For mechanism, I think a submodule is reasonable. This is what I've been doing in my own projects and it hasn't been any trouble. I'd be happy to help with this switch-over. Furthermore, the submodule preserves the versioning and authorship information from upstream, and so gives full attribution to the work.

@joschu @gdb what are your latest thoughts on updating the ALE?

gdb commented 7 years ago

@shelhamer I like it. We can always keep the current version of atari-py around for comparison with existing algorithms. (Worst case, we end up creating multiple atari-py packages, so it is easy to have multiple versions installed at once.)

If you're up for doing the work, I'd be very happy to support you. (Happy to give you commit bits!)

gdb commented 7 years ago

(Started doing some work in the above commit, but it's definitely not working properly.)

fgvbrt commented 7 years ago

Hi, I was also comparing the current gym ALE with the latest ALE while trying to implement planning in Atari games. Specifically, I found some strange behavior when calling cloneState/saveState and loadState/restoreState in gym's ALE. For example, create two identical instances of ALE with repeat_action_probability == 0. In one instance call cloneState/saveState, then loadState/restoreState, then act; in the other, just act with the same action. If you then compare the RAM and image states, they will differ. If you do the same with the latest ALE, you won't see any differences. A more extreme case is running a whole episode in both environments with the same actions, but calling cloneState and restoreState after each step in one of them. I did this for the Skiing environment with action == noop, and the environment where I call cloneState and restoreState always finishes the episode faster. So my conclusion was that you cannot do planning with gym's ALE right now.
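A minimal sketch of the comparison described above, assuming the ALEInterface bindings (cloneState/restoreState, getRAM) and a placeholder ROM path; with a correct clone/restore implementation, both runs should end up bit-identical:

```python
import numpy as np
from atari_py import ALEInterface

def make_ale(rom_path):
    ale = ALEInterface()
    ale.setInt(b'random_seed', 123)
    ale.setFloat(b'repeat_action_probability', 0.0)
    ale.loadROM(rom_path)
    return ale

rom = 'skiing.bin'  # placeholder ROM path
ale_a, ale_b = make_ale(rom), make_ale(rom)
noop = ale_a.getMinimalActionSet()[0]

for _ in range(100):
    # Environment A: clone and immediately restore before every step.
    state = ale_a.cloneState()
    ale_a.restoreState(state)
    ale_a.act(noop)
    # Environment B: just step with the same action.
    ale_b.act(noop)

print('RAM identical:', np.array_equal(ale_a.getRAM(), ale_b.getRAM()))
```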

I also found that for the latest ALE, in order to obtain the RGB screen you should provide a buffer of size width × height × 3, not × 4 as in gym's ALE. And I noticed that pixel values are slightly different in the latest ALE compared to gym's ALE when creating two environments with the same parameters and performing the same actions, even though the RAM states are the same in that case. Hope it helps.
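And a hedged sketch of the screen-buffer point, assuming getScreenDims() returns (width, height) and that getScreenRGB() accepts a preallocated uint8 array in the newer ALE:

```python
import numpy as np
from atari_py import ALEInterface

ale = ALEInterface()
ale.loadROM('skiing.bin')  # placeholder ROM path

width, height = ale.getScreenDims()
# 3 bytes per pixel for the recent ALE; the ALE bundled in gym/atari-py
# expected 4 bytes per pixel instead.
rgb = np.empty((height, width, 3), dtype=np.uint8)
ale.getScreenRGB(rgb)
```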

shelhamer commented 7 years ago

Right, sorry for the hold-up on this. The parts of ALE pulled into atari-py came in via another project, which discarded the versioning, so the fix isn't free of hassle. I'll take another stab at it soon all the same, since I have experiments that use restoreSystemState() and currently have to use ALE directly.
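For reference, a hedged sketch of the full-snapshot calls mentioned here, assuming ALE bindings that expose cloneSystemState/restoreSystemState; unlike cloneState/restoreState, these capture the complete emulator state, including the pseudo-random number generator:

```python
from atari_py import ALEInterface

ale = ALEInterface()
ale.loadROM('breakout.bin')  # placeholder ROM path

snapshot = ale.cloneSystemState()  # full snapshot, including RNG state
ale.act(0)                         # take a step (0 = noop)
ale.restoreSystemState(snapshot)   # rewind to the exact saved point
```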

sygi commented 7 years ago

Thanks for working on that :) Note that this does not introduce a sync-able submodule, but rather updates atari-py to the current (for now) version of ALE; if you're reading this comment in the future, the ALE and atari-py versions may have diverged again.

shelhamer commented 7 years ago

See openai/atari-py#15 to track the status of atari-py with respect to recent ALE development and the inclusion of full snapshot + restore.