Farama-Foundation / Arcade-Learning-Environment

The Arcade Learning Environment (ALE) -- a platform for AI research.
https://ale.farama.org/
GNU General Public License v2.0

Segfault in Montezuma Revenge #11

Open mhauskn opened 11 years ago

mhauskn commented 11 years ago

Found an odd segfault in Montezuma's Revenge: a specific sequence of actions crashes the emulator. To trigger it, take noop actions until frame 49126, then take action 17. The segfault follows immediately. All of these are legal actions. The easiest way to reproduce the bug is to replace some of the code in the random agent, as shown here, and then run the random agent on Montezuma's Revenge.

Random Agent:

#include "RandomAgent.hpp"
#include "random_tools.h"

RandomAgent::RandomAgent(OSystem* _osystem, RomSettings* _settings) : 
    PlayerAgent(_osystem, _settings) {
}

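Action RandomAgent::act() {
    // Counts calls to act(); static so the count persists between calls.
    static int frameNum = 0;
    frameNum++;
    // Always send action 0 (noop) instead of a random legal action.
    Action a = Action(0);//legal_actions[rand() % legal_actions.size()];
    // On frame 49126, send action 17, which triggers the segfault.
    if (frameNum == 49126)
        a = Action(17);
    //return choice(&available_actions);
    return a;
}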

Output from run:

Arcade-Learning-Environment$ ./ale -player_agent random_agent -max_num_frames 50000 ~/projects/ale-assets/roms/montezuma_revenge.bin 
A.L.E: Arcade Learning Environment (version 0.3)
[Powered by Stella]
Use -help for help screen.
Game console created:
  ROM file:  /home/matthew/projects/ale-assets/roms/montezuma_revenge.bin
  Cart Name: Montezuma's Revenge - Starring Panama Joe (1983) (Parker Bros)
  Cart MD5:  3347a6dd59049b15a38394aa2dafa585
  Display Format:  AUTO-DETECT ==> NTSC
  ROM Size:        8192
  Bankswitch Type: AUTO-DETECT ==> E0

Running ROM file...
Random Seed: Time
Game will be controlled internally.
Segmentation fault

A backtrace shows that the crash happens deep in the emulator code, in TIA::poke:

Program received signal SIGSEGV, Segmentation fault.
0x000000000043452b in TIA::poke (this=0x7d3a00, addr=17, value=193 '\301') at src/emucore/TIA.cxx:2431
2431          Int8 when = ourPlayerPositionResetWhenTable[myNUSIZ1 & 7][myPOSP1][newx];
(gdb) bt
#0  0x000000000043452b in TIA::poke (this=0x7d3a00, addr=17, value=193 '\301') at src/emucore/TIA.cxx:2431
#1  0x000000000045ece0 in System::poke (this=0x7d2630, addr=273, value=193 '\301')
    at src/emucore/m6502/src/System.cxx:341
#2  0x000000000044cb2d in M6502Low::poke (this=0x7d35c0, address=273, value=193 '\301')
    at src/emucore/m6502/src/M6502Low.cxx:72
#3  0x000000000044b8d3 in M6502Low::execute (this=0x7d35c0, number=19580) at src/emucore/m6502/src/M6502Low.ins:4205
#4  0x000000000042f7ec in TIA::update (this=0x7d3a00) at src/emucore/TIA.cxx:516
#5  0x0000000000423941 in OSystem::mainLoop (this=0x7cea70) at src/emucore/OSystem.cxx:748
#6  0x000000000040809a in main (argc=6, argv=0x7fffffffdff8) at src/main.cpp:171

mgbellemare commented 9 years ago

Hasn't this been fixed now?

mhauskn commented 9 years ago

AFAIK it's still an issue, albeit a very hard one to trigger. It's also not clear how to fix it short of hacking Stella internals.

nczempin commented 7 years ago

The line is:

Int8 when = ourPlayerPositionResetWhenTable[myNUSIZ1 & 7][myPOSP1][newx];

A segfault can come from one of the array indices being outside the "legal" range.

To debug, you can find out the size of the array and include checks before the above line.
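A minimal sketch of such a check, in case it helps. The helper name indicesInRange is made up for illustration, and the 8 x 160 x 160 shape of the stand-in table is an assumption, not something confirmed in this thread; the real extents should be read from the table's declaration in the Stella sources. The template deduces the extents from the array type, so nothing has to be hard-coded at the call site.

```cpp
#include <cstddef>

// Hypothetical helper (not part of ALE or Stella): returns true only if
// (i, j, k) are valid indices into a 3-D array. The extents A, B and C are
// deduced from the array type passed in.
template <typename T, std::size_t A, std::size_t B, std::size_t C>
bool indicesInRange(const T (&)[A][B][C], long i, long j, long k) {
    return i >= 0 && static_cast<std::size_t>(i) < A &&
           j >= 0 && static_cast<std::size_t>(j) < B &&
           k >= 0 && static_cast<std::size_t>(k) < C;
}

// Stand-in table for illustration only. The 8 x 160 x 160 shape is an
// assumption; check the real declaration before relying on it.
static signed char exampleTable[8][160][160];
```

Assuming the real table is declared as a plain array, a call such as indicesInRange(ourPlayerPositionResetWhenTable, myNUSIZ1 & 7, myPOSP1, newx) placed just before the faulting line could print the three index values and skip the lookup when it returns false, which would show which index goes out of range.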

It is technically "hacking Stella internals" and obviously a Stella bug, not an ALE one. It may be fixed in upstream Stella, but I'm not sure if there's even a mechanism or desire to keep ALE in sync with it.

lucasb-eyer commented 5 years ago

I just had the exact same crash in one of my training runs. I have neither the action sequence logged nor a file:lineno-level stack trace, but the function-level stack trace looks exactly the same as the one reported here. I believe it's unlikely my action sequence was the same as the one reported here.
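For anyone who wants to capture the action sequence next time, here is a rough sketch of a crash-safe action log. ActionLog and readActions are hypothetical names, not part of ALE. The idea is to write each action to a stream and flush before the emulator steps, so the record survives a segfault on that step. Note that the run at the top of this issue reports "Random Seed: Time", so the seed would presumably have to be fixed or logged as well for a replay to be faithful.

```cpp
#include <istream>
#include <ostream>
#include <sstream>
#include <vector>

// Hypothetical helper (not part of ALE): writes one action per line and
// flushes immediately so the log is complete even if the next step crashes.
class ActionLog {
public:
    explicit ActionLog(std::ostream& out) : out_(out) {}

    void record(int action) {
        out_ << action << '\n';
        out_.flush();
    }

private:
    std::ostream& out_;
};

// Reads a logged sequence back so it can be replayed against the emulator.
std::vector<int> readActions(std::istream& in) {
    std::vector<int> actions;
    int a;
    while (in >> a) {
        actions.push_back(a);
    }
    return actions;
}
```

In a real run the stream would be a std::ofstream opened once at startup, with record() called just before each action is passed to the emulator.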