Farama-Foundation / Arcade-Learning-Environment

The Arcade Learning Environment (ALE) -- a platform for AI research.
https://ale.farama.org/
GNU General Public License v2.0

Bug in skip_frame feature #18

Closed spragunr closed 11 years ago

spragunr commented 11 years ago

Rewards are not handled correctly when skip_frame is greater than 1. StellaEnvironment::act in stella_environment.cpp correctly returns the sum of rewards for the skipped frames, but the controllers in /src/controllers ignore that return value. Instead they directly access m_settings->getReward() which returns the reward for the latest frame only. You can observe this by running, for example:

./ale -frame_skip 5 -display_screen true roms/asterix.bin

The final rewards on the game screen will be greater than the rewards reported in the terminal.
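To make the discrepancy concrete, here is a minimal, self-contained sketch. It is a toy model and not ALE code: ToyEnvironment, lastFrameReward() and the per-frame reward values are all made up for illustration. lastFrameReward() stands in for m_settings->getReward(), and act() stands in for StellaEnvironment::act.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

typedef int reward_t;  // stand-in type; ALE defines its own reward_t

// Hypothetical toy environment (not ALE code). Each call to act() emulates
// up to frame_skip frames, taking rewards from a fixed per-frame list.
class ToyEnvironment {
 public:
  ToyEnvironment(std::vector<reward_t> per_frame_rewards, int frame_skip)
      : m_rewards(per_frame_rewards),
        m_frame_skip(frame_skip),
        m_frame(0),
        m_last(0) {}

  // Returns the sum of rewards over all frames emulated by this call.
  reward_t act() {
    reward_t sum = 0;
    for (int i = 0; i < m_frame_skip && m_frame < m_rewards.size(); ++i) {
      m_last = m_rewards[m_frame++];
      sum += m_last;
    }
    return sum;
  }

  // Reward of the most recently emulated frame only. This plays the role
  // that m_settings->getReward() plays in the report above.
  reward_t lastFrameReward() const { return m_last; }

 private:
  std::vector<reward_t> m_rewards;
  int m_frame_skip;
  std::size_t m_frame;
  reward_t m_last;
};
```

With per-frame rewards {0, 10, 0, 20, 0} and a frame skip of 5, act() returns 30 while lastFrameReward() reports 0, so a controller that reads only the latest frame's reward would miss everything earned on the skipped frames.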

mhauskn commented 11 years ago

Thanks for pointing this issue out! I think the best way of handling it would be to change the applyActions method in ale_controller to return a reward_t (probably zero for reset/save/load actions) which is the reward it got from environment.act(). The associated fifo/internal/rl_glue controllers would have to be changed as well.

It seems to me that it would be preferable to get rewards from the StellaEnvironment wherever possible, rather than going underneath it and calling m_settings->getReward() directly. What do you think, Marc?
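A rough sketch of the shape this could take, under heavy assumptions: ToyAction, ToyEnvironment and ToyController below are hypothetical stand-ins, not the real ale_controller or StellaEnvironment classes. Only the idea is taken from the proposal above, namely that applyActions returns whatever act() returned and returns zero for reset/save/load.

```cpp
#include <cassert>

typedef int reward_t;  // stand-in type; ALE defines its own reward_t

// Hypothetical action set, for illustration only.
enum ToyAction { TOY_NOOP, TOY_FIRE, TOY_RESET, TOY_SAVE_STATE, TOY_LOAD_STATE };

// Hypothetical environment: every emulated frame pays a fixed reward, and
// act() returns the sum over frame_skip frames.
struct ToyEnvironment {
  int frame_skip;
  reward_t reward_per_frame;

  reward_t act(ToyAction /*action*/) {
    reward_t sum = 0;
    for (int i = 0; i < frame_skip; ++i) sum += reward_per_frame;
    return sum;
  }

  void reset() {}
};

// Hypothetical controller: applyActions returns the reward reported by
// act(), so callers never need to read a per-frame reward directly.
class ToyController {
 public:
  explicit ToyController(ToyEnvironment env) : m_env(env), m_episode_score(0) {}

  reward_t applyActions(ToyAction action) {
    reward_t reward = 0;  // reset/save/load produce no reward
    switch (action) {
      case TOY_RESET:
        m_env.reset();
        break;
      case TOY_SAVE_STATE:
      case TOY_LOAD_STATE:
        break;
      default:
        reward = m_env.act(action);  // summed over the skipped frames
        break;
    }
    m_episode_score += reward;
    return reward;
  }

  reward_t episodeScore() const { return m_episode_score; }

 private:
  ToyEnvironment m_env;
  reward_t m_episode_score;
};
```

In this toy version, a frame skip of 5 with 2 points per frame makes applyActions(TOY_FIRE) return 10, while TOY_RESET and TOY_SAVE_STATE return 0 and leave the episode score unchanged.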

spragunr commented 11 years ago

That's exactly what I ended up doing in my local copy, and it seems to work fine. If you want, I'm happy to pass along my fixed ale_controller. My version of the rl_glue controller has the fix as well, but it is horribly hacked up in other ways, so I doubt you would want that. Nice work on ALE by the way! Thanks for distributing it.

mhauskn commented 11 years ago

I've written some code to fix the issue and submitted it as a pull request to the repo. It sounds like you've already fixed the code on your end, so no worries there. Hopefully the code in the pull request is basically the same as your fixed ale_controller. Take a look if you want and let me know if anything doesn't look right. Thanks for reporting the bug!

spragunr commented 11 years ago

The changes look correct to me. Thanks for working on a fix.

mgbellemare commented 11 years ago

Can this issue be closed?

mhauskn commented 11 years ago

Yes.

spragunr commented 11 years ago

I closed it.
