Closed paulkent-um closed 1 year ago
Backtrace for this issue. I've added a test for this in #334.
(lldb) target create --core "/cores/core.97379"
Core file '/cores/core.97379' (arm64) was loaded.
(lldb) bt
* thread #1
* frame #0: 0x00000001b00c6d98 libsystem_kernel.dylib`__pthread_kill + 8
frame #1: 0x00000001b00fbee0 libsystem_pthread.dylib`pthread_kill + 288
frame #2: 0x00000001afffe680 libsystem_c.dylib`raise + 32
frame #3: 0x00000001b01134a4 libsystem_platform.dylib`_sigtramp + 56
frame #4: 0x000000011eb9200c tmpsgbphos6libnethack.so`walkfrom(x=35, y=15, typ='\0') at mkmaze.c:1195:15 [opt]
frame #5: 0x000000011eb91ef0 tmpsgbphos6libnethack.so`walkfrom(x=35, y=15, typ='\x18') at mkmaze.c:1199:9 [opt]
frame #6: 0x000000011eb91ef0 tmpsgbphos6libnethack.so`walkfrom(x=37, y=15, typ='\x18') at mkmaze.c:1199:9 [opt]
frame #7: 0x000000011eb91ef0 tmpsgbphos6libnethack.so`walkfrom(x=39, y=15, typ='\x18') at mkmaze.c:1199:9 [opt]
frame #8: 0x000000011eb91ef0 tmpsgbphos6libnethack.so`walkfrom(x=41, y=15, typ='\x18') at mkmaze.c:1199:9 [opt]
frame #9: 0x000000011eb91ef0 tmpsgbphos6libnethack.so`walkfrom(x=43, y=15, typ='\x18') at mkmaze.c:1199:9 [opt]
frame #10: 0x000000011eb91ef0 tmpsgbphos6libnethack.so`walkfrom(x=45, y=15, typ='\x18') at mkmaze.c:1199:9 [opt]
frame #11: 0x000000011eb91ef0 tmpsgbphos6libnethack.so`walkfrom(x=47, y=15, typ='\x18') at mkmaze.c:1199:9 [opt]
frame #12: 0x000000011eb91ef0 tmpsgbphos6libnethack.so`walkfrom(x=49, y=13, typ='\x18') at mkmaze.c:1199:9 [opt]
frame #13: 0x000000011eb91ef0 tmpsgbphos6libnethack.so`walkfrom(x=49, y=15, typ='\x18') at mkmaze.c:1199:9 [opt]
frame #14: 0x000000011eb91ef0 tmpsgbphos6libnethack.so`walkfrom(x=51, y=15, typ='\x18') at mkmaze.c:1199:9 [opt]
frame #15: 0x000000011eb91ef0 tmpsgbphos6libnethack.so`walkfrom(x=53, y=15, typ='\x18') at mkmaze.c:1199:9 [opt]
frame #16: 0x000000011eb91ef0 tmpsgbphos6libnethack.so`walkfrom(x=55, y=15, typ='\x18') at mkmaze.c:1199:9 [opt]
frame #17: 0x000000011eb91ef0 tmpsgbphos6libnethack.so`walkfrom(x=57, y=15, typ='\x18') at mkmaze.c:1199:9 [opt]
frame #18: 0x000000011eb91ef0 tmpsgbphos6libnethack.so`walkfrom(x=59, y=15, typ='\x18') at mkmaze.c:1199:9 [opt]
frame #19: 0x000000011eb91ef0 tmpsgbphos6libnethack.so`walkfrom(x=61, y=15, typ='\x18') at mkmaze.c:1199:9 [opt]
frame #20: 0x000000011eb91ef0 tmpsgbphos6libnethack.so`walkfrom(x=61, y=13, typ='\x18') at mkmaze.c:1199:9 [opt]
frame #21: 0x000000011eb91ef0 tmpsgbphos6libnethack.so`walkfrom(x=61, y=11, typ='\x18') at mkmaze.c:1199:9 [opt]
frame #22: 0x000000011eb91ef0 tmpsgbphos6libnethack.so`walkfrom(x=59, y=11, typ='\x18') at mkmaze.c:1199:9 [opt]
frame #23: 0x000000011eb91ef0 tmpsgbphos6libnethack.so`walkfrom(x=59, y=13, typ='\x18') at mkmaze.c:1199:9 [opt]
frame #24: 0x000000011eb91ef0 tmpsgbphos6libnethack.so`walkfrom(x=57, y=13, typ='\x18') at mkmaze.c:1199:9 [opt]
frame #25: 0x000000011eb91ef0 tmpsgbphos6libnethack.so`walkfrom(x=55, y=13, typ='\x18') at mkmaze.c:1199:9 [opt]
frame #26: 0x000000011eb91ef0 tmpsgbphos6libnethack.so`walkfrom(x=53, y=13, typ='\x18') at mkmaze.c:1199:9 [opt]
frame #27: 0x000000011eb91ef0 tmpsgbphos6libnethack.so`walkfrom(x=51, y=13, typ='\x18') at mkmaze.c:1199:9 [opt]
frame #28: 0x000000011eb91ef0 tmpsgbphos6libnethack.so`walkfrom(x=51, y=11, typ='\x18') at mkmaze.c:1199:9 [opt]
frame #29: 0x000000011eb91ef0 tmpsgbphos6libnethack.so`walkfrom(x=51, y=9, typ='\x18') at mkmaze.c:1199:9 [opt]
frame #30: 0x000000011eb91ef0 tmpsgbphos6libnethack.so`walkfrom(x=49, y=9, typ='\x18') at mkmaze.c:1199:9 [opt]
frame #31: 0x000000011eb91ef0 tmpsgbphos6libnethack.so`walkfrom(x=47, y=9, typ='\x18') at mkmaze.c:1199:9 [opt]
frame #32: 0x000000011eb91ef0 tmpsgbphos6libnethack.so`walkfrom(x=45, y=9, typ='\x18') at mkmaze.c:1199:9 [opt]
frame #33: 0x000000011eb91ef0 tmpsgbphos6libnethack.so`walkfrom(x=45, y=11, typ='\x18') at mkmaze.c:1199:9 [opt]
frame #34: 0x000000011eb91ef0 tmpsgbphos6libnethack.so`walkfrom(x=45, y=13, typ='\x18') at mkmaze.c:1199:9 [opt]
frame #35: 0x000000011eb91ef0 tmpsgbphos6libnethack.so`walkfrom(x=43, y=13, typ='\x18') at mkmaze.c:1199:9 [opt]
frame #36: 0x000000011eb91ef0 tmpsgbphos6libnethack.so`walkfrom(x=43, y=11, typ='\x18') at mkmaze.c:1199:9 [opt]
frame #37: 0x000000011eb91ef0 tmpsgbphos6libnethack.so`walkfrom(x=43, y=9, typ='\x18') at mkmaze.c:1199:9 [opt]
frame #38: 0x000000011eb91ef0 tmpsgbphos6libnethack.so`walkfrom(x=41, y=9, typ='\x18') at mkmaze.c:1199:9 [opt]
frame #39: 0x000000011eb91ef0 tmpsgbphos6libnethack.so`walkfrom(x=39, y=9, typ='\x18') at mkmaze.c:1199:9 [opt]
frame #40: 0x000000011eb91ef0 tmpsgbphos6libnethack.so`walkfrom(x=37, y=9, typ='\x18') at mkmaze.c:1199:9 [opt]
frame #41: 0x000000011eb91ef0 tmpsgbphos6libnethack.so`walkfrom(x=37, y=11, typ='\x18') at mkmaze.c:1199:9 [opt]
frame #42: 0x000000011eb91ef0 tmpsgbphos6libnethack.so`walkfrom(x=35, y=11, typ='\x18') at mkmaze.c:1199:9 [opt]
frame #43: 0x000000011eb91ef0 tmpsgbphos6libnethack.so`walkfrom(x=33, y=11, typ='\x18') at mkmaze.c:1199:9 [opt]
frame #44: 0x000000011eb91ef0 tmpsgbphos6libnethack.so`walkfrom(x=31, y=11, typ='\x18') at mkmaze.c:1199:9 [opt]
frame #45: 0x000000011eb91ef0 tmpsgbphos6libnethack.so`walkfrom(x=29, y=11, typ='\x18') at mkmaze.c:1199:9 [opt]
frame #46: 0x000000011eb91ef0 tmpsgbphos6libnethack.so`walkfrom(x=29, y=9, typ='\x18') at mkmaze.c:1199:9 [opt]
frame #47: 0x000000011eb91ef0 tmpsgbphos6libnethack.so`walkfrom(x=27, y=9, typ='\x18') at mkmaze.c:1199:9 [opt]
frame #48: 0x000000011eb91ef0 tmpsgbphos6libnethack.so`walkfrom(x=27, y=11, typ='\x18') at mkmaze.c:1199:9 [opt]
frame #49: 0x000000011eb91ef0 tmpsgbphos6libnethack.so`walkfrom(x=25, y=11, typ='\x18') at mkmaze.c:1199:9 [opt]
frame #50: 0x000000011eb91ef0 tmpsgbphos6libnethack.so`walkfrom(x=25, y=9, typ='\x18') at mkmaze.c:1199:9 [opt]
frame #51: 0x000000011eb91ef0 tmpsgbphos6libnethack.so`walkfrom(x=25, y=7, typ='\x18') at mkmaze.c:1199:9 [opt]
frame #52: 0x000000011eb91ef0 tmpsgbphos6libnethack.so`walkfrom(x=25, y=5, typ='\x18') at mkmaze.c:1199:9 [opt]
frame #53: 0x000000011eb91ef0 tmpsgbphos6libnethack.so`walkfrom(x=23, y=5, typ='\x18') at mkmaze.c:1199:9 [opt]
frame #54: 0x000000011eb91ef0 tmpsgbphos6libnethack.so`walkfrom(x=21, y=5, typ='\x18') at mkmaze.c:1199:9 [opt]
frame #55: 0x000000011eb91ef0 tmpsgbphos6libnethack.so`walkfrom(x=19, y=5, typ='\x18') at mkmaze.c:1199:9 [opt]
frame #56: 0x000000011eb91ef0 tmpsgbphos6libnethack.so`walkfrom(x=17, y=5, typ='\x18') at mkmaze.c:1199:9 [opt]
frame #57: 0x000000011eb91ef0 tmpsgbphos6libnethack.so`walkfrom(x=15, y=5, typ='\x18') at mkmaze.c:1199:9 [opt]
frame #58: 0x000000011eb91ef0 tmpsgbphos6libnethack.so`walkfrom(x=15, y=7, typ='\x18') at mkmaze.c:1199:9 [opt]
frame #59: 0x000000011eb91ef0 tmpsgbphos6libnethack.so`walkfrom(x=17, y=7, typ='\x18') at mkmaze.c:1199:9 [opt]
frame #60: 0x000000011eb91ef0 tmpsgbphos6libnethack.so`walkfrom(x=17, y=9, typ='\x18') at mkmaze.c:1199:9 [opt]
frame #61: 0x000000011eb91ef0 tmpsgbphos6libnethack.so`walkfrom(x=15, y=11, typ='\x18') at mkmaze.c:1199:9 [opt]
frame #62: 0x000000011eb91ef0 tmpsgbphos6libnethack.so`walkfrom(x=13, y=11, typ='\x18') at mkmaze.c:1199:9 [opt]
frame #63: 0x000000011eb91ef0 tmpsgbphos6libnethack.so`walkfrom(x=11, y=11, typ='\x18') at mkmaze.c:1199:9 [opt]
frame #64: 0x000000011eb91ef0 tmpsgbphos6libnethack.so`walkfrom(x=11, y=13, typ='\x18') at mkmaze.c:1199:9 [opt]
frame #65: 0x000000011eb91ef0 tmpsgbphos6libnethack.so`walkfrom(x=13, y=13, typ='\x18') at mkmaze.c:1199:9 [opt]
frame #66: 0x000000011eb91ef0 tmpsgbphos6libnethack.so`walkfrom(x=13, y=15, typ='\x18') at mkmaze.c:1199:9 [opt]
frame #67: 0x000000011eb91ef0 tmpsgbphos6libnethack.so`walkfrom(x=11, y=15, typ='\x18') at mkmaze.c:1199:9 [opt]
frame #68: 0x000000011eb91ef0 tmpsgbphos6libnethack.so`walkfrom(x=9, y=15, typ='\x18') at mkmaze.c:1199:9 [opt]
frame #69: 0x000000011eb91ef0 tmpsgbphos6libnethack.so`walkfrom(x=9, y=13, typ='\x18') at mkmaze.c:1199:9 [opt]
frame #70: 0x000000011eb91ef0 tmpsgbphos6libnethack.so`walkfrom(x=9, y=11, typ='\x18') at mkmaze.c:1199:9 [opt]
frame #71: 0x000000011eb91ef0 tmpsgbphos6libnethack.so`walkfrom(x=9, y=9, typ='\x18') at mkmaze.c:1199:9 [opt]
frame #72: 0x000000011eb91ef0 tmpsgbphos6libnethack.so`walkfrom(x=11, y=9, typ='\x18') at mkmaze.c:1199:9 [opt]
frame #73: 0x000000011eb91ef0 tmpsgbphos6libnethack.so`walkfrom(x=11, y=7, typ='\x18') at mkmaze.c:1199:9 [opt]
frame #74: 0x000000011eb91ef0 tmpsgbphos6libnethack.so`walkfrom(x=13, y=7, typ='\x18') at mkmaze.c:1199:9 [opt]
frame #75: 0x000000011eb91ef0 tmpsgbphos6libnethack.so`walkfrom(x=13, y=5, typ='\x18') at mkmaze.c:1199:9 [opt]
frame #76: 0x000000011eb91ef0 tmpsgbphos6libnethack.so`walkfrom(x=11, y=5, typ='\x18') at mkmaze.c:1199:9 [opt]
frame #77: 0x000000011ec357ec tmpsgbphos6libnethack.so`sp_level_coder [inlined] spo_mazewalk(coder=0x0000600003de8100) at sp_lev.c:4793:5 [opt]
frame #78: 0x000000011ec35714 tmpsgbphos6libnethack.so`sp_level_coder(lvl=<unavailable>) at sp_lev.c:5494:13 [opt]
frame #79: 0x000000011ec2d8e8 tmpsgbphos6libnethack.so`load_special(name=<unavailable>) at sp_lev.c:6055:18 [opt]
frame #80: 0x000000011eb92248 tmpsgbphos6libnethack.so`makemaz(s=<unavailable>) at mkmaze.c:1014:13 [opt]
frame #81: 0x000000011eb8aef8 tmpsgbphos6libnethack.so`mklev at mklev.c:0 [opt]
frame #82: 0x000000011eb8acd4 tmpsgbphos6libnethack.so`mklev at mklev.c:1004:5 [opt]
frame #83: 0x000000011eb0f2c0 tmpsgbphos6libnethack.so`goto_level(newlevel=0x0000000104e57748, at_stairs=<unavailable>, falling='\0', portal='\0') at do.c:1448:9 [opt]
frame #84: 0x000000011eb0fe54 tmpsgbphos6libnethack.so`deferred_goto at do.c:1756:9 [opt]
frame #85: 0x000000011ec453ec tmpsgbphos6libnethack.so`level_tele at teleport.c:1025:9 [opt]
frame #86: 0x000000011eaee318 tmpsgbphos6libnethack.so`wiz_level_tele at cmd.c:946:9 [opt]
frame #87: 0x000000011eaf17d8 tmpsgbphos6libnethack.so`rhack(cmd="\U00000016") at cmd.c:4929:23 [opt]
frame #88: 0x000000011eac4574 tmpsgbphos6libnethack.so`moveloop(resuming=<unavailable>) at allmain.c:0 [opt]
frame #89: 0x000000011ec8d53c tmpsgbphos6libnethack.so`unixmain(argc=1, argv=0x0000000104e57fd0) at unixmain.c:354:5 [opt]
frame #90: 0x000000011ebbda68 tmpsgbphos6libnethack.so`mainloop(ctx_transfer=<unavailable>) at nle.c:195:5 [opt]
frame #91: 0x000000011ec9f928 tmpsgbphos6libnethack.so`make_fcontext at make_arm64_aapcs_macho_gas.S:60
This turns out to be a literal stack overflow: the heap-allocated stack that we context switch to is too small for the deeper-than-usual call stack seen here (note the recursive walkfrom calls in the backtrace). Will update #334 with a fix.
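For anyone wondering why maze levels in particular need so much stack, here is a rough sketch. It is not NetHack's actual walkfrom and not the fix going into #334; it just runs a depth-first maze walk with an explicit stack and reports how deep it gets, which is the depth an equivalent recursive walk would reach. The grid size and the per-frame byte count are made-up numbers for illustration.

```python
import random


def max_walk_depth(cols, rows, seed=0):
    """Depth-first maze walk over a cols x rows grid of cells.

    Returns the deepest the explicit stack gets, which equals the
    recursion depth an equivalent recursive walk would reach.
    """
    rng = random.Random(seed)
    visited = {(0, 0)}
    stack = [(0, 0)]
    deepest = 1
    while stack:
        x, y = stack[-1]
        neighbours = [
            (nx, ny)
            for nx, ny in ((x + 1, y), (x - 1, y), (x, y + 1), (x, y - 1))
            if 0 <= nx < cols and 0 <= ny < rows and (nx, ny) not in visited
        ]
        if not neighbours:
            stack.pop()  # dead end: a recursive version would return here
            continue
        nxt = rng.choice(neighbours)
        visited.add(nxt)
        stack.append(nxt)  # a recursive version would call itself here
        deepest = max(deepest, len(stack))
    return deepest


# Hypothetical numbers, for illustration only.
COLS, ROWS = 38, 9   # assumed number of maze cells, not taken from NetHack
FRAME_BYTES = 128    # assumed stack bytes used per recursive call
depth = max_walk_depth(COLS, ROWS)
print(f"deepest walk: {depth} frames, ~{depth * FRAME_BYTES} bytes of stack")
```

The worst case is one frame per maze cell (a single corridor that never branches), so a fixed-size stack has to be sized for that rather than for the typical level.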
🐛 Bug
NLE intermittently hits a bus error when the agent enters a floor between 10 and 12 of the main dungeon, or floors 7-9 of dungeon 2 (which I think is the Gnomish Mines?).
To Reproduce
Steps to reproduce the behavior:
Environment
NLE version: 0.8.1
PyTorch version: 1.11.0
Is debug build: No
CUDA used to build PyTorch: None
OS: Mac OSX 12.1
GCC version: Could not collect
CMake version: Could not collect
Python version: 3.8
Is CUDA available: No
CUDA runtime version: No CUDA
GPU models and configuration: No CUDA
Nvidia driver version: No CUDA
cuDNN version: No CUDA
Versions of relevant libraries:
[pip3] msgpack-numpy==0.4.8
[pip3] numpy==1.22.4
[pip3] torch==1.11.0
[conda] Could not collect
Additional context
If there's any more information you need, I'll try my best to provide it, but my ability to troubleshoot this problem is very limited. The bus error crashes my debugger without giving me a stack trace or anything, and execution seems to get passed to _pynethack.cpython-38-darwin.so, which is not a file format my IDE supports.