Rajathbharadwaj / NetHack-2021

Nethack 2021 codebase implementations

Bus Error #10

Open paulkent-um opened 1 year ago

paulkent-um commented 1 year ago

To my frustration, every now and then when I run the agent, this happens:

Terminated due to signal: BUS ERROR (10)

The error message itself helps me approximately not at all. I can't detect this happening in-code and figure out the cause that way, because it's an instadeath for the program. (I don't even think a try/except block can deal with this, since it's a signal rather than an exception.) I could probably home in on the problem if I could get the bus error to happen while my IDE (CodeRunner) is in debug mode, but my agent runs several times slower in debug mode and the problem is fairly infrequent, so that's going to be something of a pain.
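Since the crash can't be caught from inside the process, one option is to at least detect it from outside. The following is a minimal sketch, assuming the agent can be launched as a separate script; `run_agent` and `describe_exit` are hypothetical helper names, not part of the agent or of NLE. On POSIX systems, `subprocess` reports a child killed by a signal as a negative return code.

```python
import signal
import subprocess
import sys


def describe_exit(returncode):
    """Turn a subprocess return code into a readable description."""
    if returncode < 0:
        # On POSIX, a negative return code means "killed by signal -returncode".
        try:
            name = signal.Signals(-returncode).name
        except ValueError:
            # Signal number not known to the signal module on this platform.
            name = f"signal {-returncode}"
        return f"killed by {name}"
    return f"exited with code {returncode}"


def run_agent(cmd):
    """Run the agent as a child process and report how it ended.

    `cmd` is the full command line as a list; the script name is up to the caller.
    """
    result = subprocess.run(cmd)
    return describe_exit(result.returncode)
```

Calling something like `run_agent([sys.executable, "agent.py"])` (with `agent.py` standing in for the real entry point) would return `"killed by SIGBUS"` when the bus error hits, which makes it possible to log the run or restart it automatically. It does not say where the crash happened.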

I'll report more when there's more to report.

paulkent-um commented 1 year ago

Incredible. I actually managed to hit the bus error with my debugger on, and I still didn't get any useful information. The bus error killed the program outright without giving me the chance to see what went wrong, even with the debugger active.

/Users/paulkent/Library/Application Support/CodeRunner/Debuggers/pdb.crDebugger/debugger.sh: line 63: 11966 Bus error: 10 '/usr/bin/python3' "$filename" "${@:2}"
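A debugger like pdb only sees Python-level exceptions, so a fatal signal goes straight past it. Python's standard `faulthandler` module may help here: once enabled, it dumps the Python traceback of every thread when the process receives SIGSEGV, SIGFPE, SIGABRT, SIGBUS, or SIGILL. A minimal sketch, where `enable_crash_log` and the log path are assumptions chosen for illustration:

```python
import faulthandler


def enable_crash_log(log_path):
    """Write Python tracebacks to `log_path` if the process dies on a fatal signal.

    Returns the open file object; it has to stay open for as long as
    faulthandler is enabled, so the caller should hold on to it.
    """
    log_file = open(log_path, "w")
    faulthandler.enable(file=log_file, all_threads=True)
    return log_file
```

Calling this once at the top of the agent script should leave a traceback in the log showing which Python line was executing when the bus error occurred. It only shows the Python side of the stack, not the C frames inside the library. The same behaviour can be switched on without code changes by running `python3 -X faulthandler` or by setting the `PYTHONFAULTHANDLER` environment variable.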

paulkent-um commented 1 year ago

Welp. Looks like it's an error on the part of the NLE rather than anything that's going on in my code. Faaaaaaaantastic. And the cause is going down stairs, so I can't just avoid the thing that makes it die.

paulkent-um commented 1 year ago

Alright. The NLE devs put out a new version of NLE that they say fixes this problem. I'm going to update my version of the NLE a bit later and make sure the fix works.
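Before re-testing, it can be worth confirming which NLE version is actually installed in the interpreter the agent runs under. A small sketch using the standard library, assuming NLE was installed from the `nle` package; `installed_version` is a hypothetical helper name, and the thread does not say which release contains the fix, so that has to be checked against the NLE release notes:

```python
from importlib import metadata


def installed_version(package="nle"):
    """Return the installed version string of `package`, or None if it is missing."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None
```

Printing `installed_version()` before and after the upgrade shows whether the update actually took effect in that environment.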