team-eternity / eternity

The Eternity Engine
http://eternity.youfailit.net/wiki/Main_Page
GNU General Public License v3.0

Crash when using ENDOOM on Ubuntu #583

Closed ioan-chera closed 1 year ago

ioan-chera commented 1 year ago

A user running Ubuntu 23.04 with GNOME (the default GUI) on Xorg reported that Eternity crashes on exit. When I asked them to disable ENDOOM in the settings, they reported that the crash went away. So ENDOOM may be problematic again (it previously caused a crash on macOS). This needs a fix.

joanbm commented 1 year ago

Hmmm, I can reproduce a crash on ENDOOM on an Arch Linux VM with Sway and the latest master, with the following stack trace:

Thread 1 "eternity" received signal SIGSEGV, Segmentation fault.
llvm::PMTopLevelManager::addImmutablePass () at /usr/src/debug/llvm/llvm-15.0.7.src/lib/IR/LegacyPassManager.cpp:809
809  for (const PassInfo *ImmPI : PassInf->getInterfacesImplemented())                                                             
(gdb) bt
#0  llvm::PMTopLevelManager::addImmutablePass () at /usr/src/debug/llvm/llvm-15.0.7.src/lib/IR/LegacyPassManager.cpp:809
#1  0x00007fffed0f39ca in llvm::PMTopLevelManager::schedulePass ()
    at /usr/src/debug/llvm/llvm-15.0.7.src/lib/IR/LegacyPassManager.cpp:737
#2  0x00007fffed356ca0 in addPassesToGenerateCode () at /usr/src/debug/llvm/llvm-15.0.7.src/lib/CodeGen/LLVMTargetMachine.cpp:115
#3  0x00007fffed35a63f in llvm::LLVMTargetMachine::addPassesToEmitMC ()
    at /usr/src/debug/llvm/llvm-15.0.7.src/lib/CodeGen/LLVMTargetMachine.cpp:260
#4  0x00007fffef1becf7 in llvm::MCJIT::emitObject () at /usr/src/debug/llvm/llvm-15.0.7.src/lib/ExecutionEngine/MCJIT/MCJIT.cpp:167
#5  0x00007fffef1bf36e in llvm::MCJIT::generateCodeForModule ()
    at /usr/src/debug/llvm/llvm-15.0.7.src/lib/ExecutionEngine/MCJIT/MCJIT.cpp:210
#6  0x00007fffef1bae80 in llvm::MCJIT::finalizeObject () at /usr/src/debug/llvm/llvm-15.0.7.src/lib/ExecutionEngine/MCJIT/MCJIT.cpp:268
#7  0x00007fffef13356f in LLVMGetPointerToGlobal ()
    at /usr/src/debug/llvm/llvm-15.0.7.src/lib/ExecutionEngine/ExecutionEngineBindings.cpp:297
#8  0x00007fffd9ed8341 in gallivm_jit_function () at ../mesa-23.0.3/src/gallium/auxiliary/gallivm/lp_bld_init.c:722
#9  generate_variant () at ../mesa-23.0.3/src/gallium/drivers/llvmpipe/lp_state_fs.c:3917
#10 llvmpipe_update_fs () at ../mesa-23.0.3/src/gallium/drivers/llvmpipe/lp_state_fs.c:4671
#11 0x00007fffd9eda560 in llvmpipe_update_derived () at ../mesa-23.0.3/src/gallium/drivers/llvmpipe/lp_state_derived.c:289
#12 0x00007fffd9eb1c98 in llvmpipe_draw_vbo () at ../mesa-23.0.3/src/gallium/drivers/llvmpipe/lp_draw_arrays.c:77
#13 0x00007fffd9af4825 in _mesa_draw_arrays () at ../mesa-23.0.3/src/mesa/main/draw.c:1202
#14 0x00007ffff7e4ddaa in GL_RunCommandQueue (renderer=0x5555567ce3f0, cmd=<optimized out>, vertices=0x55555a731ae0, 
    vertsize=<optimized out>) at /usr/src/debug/sdl2/SDL2-2.26.5/src/render/opengl/SDL_render_gl.c:1393
#15 0x00007ffff7e41ba5 in FlushRenderCommands (renderer=0x5555567ce3f0) at /usr/src/debug/sdl2/SDL2-2.26.5/src/render/SDL_render.c:251
#16 0x00007ffff7e4aa95 in SDL_RenderPresent_REAL (renderer=0x5555567ce3f0)
    at /usr/src/debug/sdl2/SDL2-2.26.5/src/render/SDL_render.c:4341
#17 SDL_RenderPresent_REAL (renderer=0x5555567ce3f0) at /usr/src/debug/sdl2/SDL2-2.26.5/src/render/SDL_render.c:4335
#18 0x00005555557a8e0f in TXT_UpdateScreenArea ()
#19 0x00005555557a8e4f in TXT_UpdateScreen ()
#20 0x000055555579ba92 in I_EndDoom() ()
#21 0x000055555579b529 in I_Quit() ()
#22 0x00007ffff7765066 in __run_exit_handlers (status=0, listp=0x7ffff7904760 <__exit_funcs>, 
    run_list_atexit=run_list_atexit@entry=true, run_dtors=run_dtors@entry=true) at exit.c:111
#23 0x00007ffff77651b0 in __GI_exit (status=<optimized out>) at exit.c:141
#24 0x000055555566e0e6 in G_QuitDoom() ()
#25 0x000055555566e0ef in Handler_quit() ()
#26 0x00005555555f2c7b in C_DoRunCommand(command_t*, char const*) ()
#27 0x00005555555f4abc in C_RunBufferedCommand(bufferedcmd*) ()
#28 0x00005555555f4ba6 in C_BufferCommand(int, command_t*, char const*, int) ()
#29 0x00005555555f29f1 in C_RunCommand(command_t*, char const*) ()
#30 0x00005555555f27d4 in C_RunIndivTextCmd(char const*) ()
#31 0x00005555555f30e0 in C_RunTextCmd(char const*) ()
#32 0x00005555556af352 in MN_PopupResponder(event_t*, int) ()
#33 0x00005555556a2818 in MN_Responder(event_t*) ()
#34 0x0000555555613fd2 in D_ProcessEvents() ()
#35 0x0000555555619533 in TryRunTics() ()
#36 0x0000555555617a8f in D_DoomMain() ()
#37 0x000055555578f940 in main ()
(gdb) 

The VM is launched like this: qemu-system-x86_64 -nodefaults -enable-kvm -cpu host -smp 4 -m 8192 -vga virtio -nic user -hda ArchEternityGfxTest.qcow2

Strangely enough ENDOOM works on bare metal on the same machine, with an Intel iGPU.

I also got a crash on an AMD APU but I haven't been able to confirm yet that it's the same issue as the stack trace above (though it probably is).

Will take a look to see if I can figure it out.

joanbm commented 1 year ago

I think the problem is that some graphics drivers on Linux can't be used inside an atexit callback once exit(...) has been called, because they also clean themselves up with atexit. At least, placing a breakpoint on atexit shows various calls like this:

Thread 1 "eternity" hit Breakpoint 1.3, 0x00007fffef38c190 in atexit () from /usr/lib/dri/swrast_dri.so
(gdb) bt
#0  0x00007fffef38c190 in atexit () from /usr/lib/dri/swrast_dri.so
#1  0x00007fffee2a3090 in one_time_init () at ../mesa-23.0.3/src/mesa/main/context.c:224
#2  0x00007ffff77b55bf in __pthread_once_slow (once_control=0x7fffefdda004 <once.0.lto_priv+4>, 
    init_routine=0x7fffeeb12c40 <util_call_once_data_slow_once()>) at pthread_once.c:116
#3  0x00007fffee0c988e in call_once () at ../mesa-23.0.3/src/c11/impl/threads_posix.c:76
#4  util_call_once_data_slow () at ../mesa-23.0.3/src/util/u_call_once.c:29
#5  util_call_once_data () at ../mesa-23.0.3/src/util/u_call_once.h:62
#6  _mesa_initialize () at ../mesa-23.0.3/src/mesa/main/context.c:251
#7  _mesa_initialize () at ../mesa-23.0.3/src/mesa/main/context.c:248
#8  st_api_create_context () at ../mesa-23.0.3/src/mesa/state_tracker/st_manager.c:951
#9  dri_create_context () at ../mesa-23.0.3/src/gallium/frontends/dri/dri_context.c:177
#10 0x00007fffee0cb6fd in driCreateContextAttribs () at ../mesa-23.0.3/src/gallium/frontends/dri/dri_util.c:622
#11 0x00007ffff5a0c171 in dri2_create_context () at ../mesa-23.0.3/src/egl/drivers/dri2/egl_dri2.c:1438
#12 0x00007ffff59ff0fc in eglCreateContext () at ../mesa-23.0.3/src/egl/main/eglapi.c:908
#13 0x00007ffff7ea8bd0 in SDL_EGL_CreateContext (_this=0x555555b40eb0, egl_surface=0x5555567f7a70)
    at /usr/src/debug/sdl2/SDL2-2.26.5/src/video/SDL_egl.c:1047
#14 0x00007ffff7ef5d60 in Wayland_GLES_CreateContext (_this=0x555555b40eb0, window=<optimized out>)
    at /usr/src/debug/sdl2/SDL2-2.26.5/src/video/wayland/SDL_waylandopengles.c:56
#15 0x00007ffff7eb7a00 in SDL_GL_CreateContext_REAL (window=0x5555567f7680)
    at /usr/src/debug/sdl2/SDL2-2.26.5/src/video/SDL_video.c:4081
#16 0x0000555555795623 in SDLGL2DVideoDriver::InitGraphicsMode() ()
#17 0x000055555567be7d in I_InitGraphicsMode() ()
#18 0x000055555567c027 in I_SetMode() ()
#19 0x000055555567bff1 in I_InitGraphics() ()
#20 0x0000555555615f67 in D_SetGraphicsMode() ()
#21 0x0000555555617025 in D_DoomInit() ()
#22 0x0000555555617a8d in D_DoomMain() ()
#23 0x000055555578f97c in main ()
(gdb) 
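
To illustrate the ordering issue: atexit handlers run in reverse order of registration, so a handler the application registers before graphics initialization runs after any handler the driver registered during context creation. Below is a minimal C++ sketch, assuming (this is not confirmed in the traces) that the engine's exit hook is registered before the GL context is created. The names `app_exit_hook`, `driver_cleanup`, `install_handlers` and `g_driver_alive` are hypothetical stand-ins, not Eternity or Mesa code.

```cpp
#include <cstdio>
#include <cstdlib>

// Hypothetical stand-in for driver-owned state (not a real Mesa object).
static bool g_driver_alive = false;

// Hypothetical application exit hook, playing the role of I_Quit -> I_EndDoom.
static void app_exit_hook()
{
    std::printf("app hook: driver is %s\n",
                g_driver_alive ? "alive" : "already torn down");
}

// Hypothetical cleanup handler, playing the role of one that a GL library
// registers for itself while a context is being created.
static void driver_cleanup()
{
    g_driver_alive = false;
    std::puts("driver cleanup ran");
}

// Registers both handlers in the order assumed above.
// Returns true if both registrations succeeded.
bool install_handlers()
{
    // Registered first, so it runs LAST during exit().
    if (std::atexit(app_exit_hook) != 0)
        return false;

    // "Context creation": the driver comes up and registers its own cleanup.
    g_driver_alive = true;

    // Registered second, so it runs FIRST during exit().
    return std::atexit(driver_cleanup) == 0;
}
```

Calling `install_handlers()` from `main` and then returning prints "driver cleanup ran" followed by "app hook: driver is already torn down": by the time the application's hook runs, the driver stand-in is gone, which is the situation an exit-time ENDOOM render would be in.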

Potential fix: https://github.com/joanbm/eternity/commit/296a5e946d85339a11ada419c9c4c1e6d308b1b8

I will submit a PR if this also fixes the crash on my system with an AMD APU.

Altazimuth commented 1 year ago

That'd track. Other ports have invented an I_AtExit to deal with atexit not really being suitable (and I even invented the same thing before for an EE fork that never saw the light of day). Might be necessary here.
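
For reference, the general shape of such an I_AtExit mechanism could look like the sketch below: the engine keeps its own list of exit functions and runs them explicitly before calling exit(), so ENDOOM is drawn while the graphics stack is still fully alive. This is only an illustration of the idea. `ExitEntry`, `ExitList`, `I_RunExitFuncs`, `I_QuitSketch` and the `run_on_error` flag are hypothetical names chosen here, not Eternity's code and not a copy of any other port's implementation.

```cpp
#include <cstdlib>
#include <vector>

// Hypothetical types and names throughout; a sketch, not engine code.
using exit_func_t = void (*)();

struct ExitEntry
{
    exit_func_t func;
    bool        run_on_error; // whether to run when quitting due to an error
};

// Function-local static avoids static-initialization-order problems.
static std::vector<ExitEntry> &ExitList()
{
    static std::vector<ExitEntry> list;
    return list;
}

// Engine-level replacement for atexit(): only records the function.
void I_AtExit(exit_func_t func, bool run_on_error)
{
    ExitList().push_back({func, run_on_error});
}

// Runs the recorded functions in reverse registration order.
// Each entry is popped before it is called, so it runs at most once
// even if a handler ends up triggering another quit.
void I_RunExitFuncs(bool is_error)
{
    auto &list = ExitList();
    while (!list.empty())
    {
        ExitEntry entry = list.back();
        list.pop_back();
        if (!is_error || entry.run_on_error)
            entry.func();
    }
}

// Normal quit path: run the engine's own handlers first, then exit.
// Library atexit handlers only start running once exit() is entered.
[[noreturn]] void I_QuitSketch()
{
    I_RunExitFuncs(false);
    std::exit(0);
}
```

With this shape, the ENDOOM display would be registered through `I_AtExit` rather than `atexit`, and the quit command would call something like `I_QuitSketch()` instead of calling `exit()` directly.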

joanbm commented 1 year ago

Yeah, I looked for anything in the C standard, POSIX, SDL, etc. that says you can't, for example, create a window inside an atexit handler, but found nothing, so in principle it should be possible to show ENDOOM from a standard atexit handler.

However, at least Mesa registers various cleanup handlers of its own with atexit, so it doesn't look like you can really do anything in an atexit handler other than clean up your own stuff.