fgsfdsfgs / perfect_dark

work in progress port of n64decomp/perfect_dark to modern platforms
MIT License

[partially resolved] Crashes on startup and when clicking in the window with Wayland #136

Open ghost opened 1 year ago

ghost commented 1 year ago

Workflow builds 0942f61 and 56df5b4 are immediately crashing on startup:

Screenshot from 2023-09-02 15-56-03

I am running Fedora 38, 64-bit, GNOME (Wayland session). The following package versions were installed a few minutes prior:

Screenshot from 2023-09-02 16-00-34

ghost commented 1 year ago

Further to the above, there's some crash info recorded in the system journal and the consistent error is:

"Sep 02 16:25:13 jupiter kernel: Code: Unable to access opcode bytes at 0xffffffffffffffd6."

Screenshot from 2023-09-02 16-32-23

Screenshot from 2023-09-02 16-32-50

fgsfdsfgs commented 1 year ago

Can you run it under gdb? The package for it is probably just called gdb, though you might have to install the i686 version. Run like so:

gdb ./pd
r

then wait until it crashes and type bt and hit Enter, then paste the results here.
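The same steps can also be run non-interactively, which makes the output easier to copy into a comment. This is only a sketch: `crash_backtrace` is a hypothetical helper name, and it assumes a gdb that can debug the 32-bit `./pd` binary is installed.

```shell
# Sketch: run a program under gdb non-interactively and print a backtrace.
# crash_backtrace is a hypothetical helper name, not part of the port.
crash_backtrace() {
    # -batch   : exit once the listed commands have finished
    # -ex run  : start the program (same as typing "r")
    # -ex bt   : print the backtrace after the program stops
    # --args   : treat everything that follows as the program and its arguments
    gdb -batch -ex run -ex bt --args "$@"
}

# Usage: crash_backtrace ./pd
```

If the program exits normally instead of crashing, gdb has no stack to print, so the backtrace step produces nothing useful.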

ghost commented 1 year ago

I've tried multiple builds back to 6a749bd from 3 weeks ago and they all crash.

I will see what gdb tells me.

ghost commented 1 year ago

Here's the backtrace for 6a749bd:

0x00000000 in ?? ()
(gdb) bt
#0  0x00000000 in ?? ()
#1  0xf7ee1cf6 in Wayland_VideoInit (_this=0x570e5690)
    at /usr/src/debug/SDL2-2.26.3-1.fc38.i386/src/video/wayland/SDL_waylandvideo.c:965
#2  0xf7e944d2 in SDL_VideoInit_REAL (driver_name=)
    at /usr/src/debug/SDL2-2.26.3-1.fc38.i386/src/video/SDL_video.c:527
#3  0xf7dd90b3 in SDL_InitSubSystem_REAL (flags=)
    at /usr/src/debug/SDL2-2.26.3-1.fc38.i386/src/SDL.c:253
#4  0x567d16d7 in gfx_sdl_init (game_name=0x567fc1c9 "PD", gfx_api_name=0x56807880 "OpenGL",
    start_in_fullscreen=false, width=640, height=480, posX=100, posY=100) at port/fast3d/gfx_sdl2.cpp:66
#5  0x567e0686 in gfx_init (wapi=0x56879d60 , rapi=0x56879dc0 , game_name=0x567fc1c9 "PD",
    start_in_fullscreen=false, width=640, height=480, posX=100, posY=100) at port/fast3d/gfx_pc.cpp:2535
#6  0x5677ce31 in videoInit () at port/src/video.c:28
#7  0x56783adb in main (argc=1, argv=0xffffd1c4) at port/src/main.c:57

ghost commented 1 year ago

Backtrace for current build 0942f61:

Program received signal SIGSEGV, Segmentation fault.
0x00000000 in ?? ()
(gdb) bt
#0  0x00000000 in ?? ()
#1  0xf7ee1cf6 in Wayland_VideoInit (_this=0x56888310)
    at /usr/src/debug/SDL2-2.26.3-1.fc38.i386/src/video/wayland/SDL_waylandvideo.c:965
#2  0xf7e944d2 in SDL_VideoInit_REAL (driver_name=)
    at /usr/src/debug/SDL2-2.26.3-1.fc38.i386/src/video/SDL_video.c:527
#3  0xf7dd90b3 in SDL_InitSubSystem_REAL (flags=)
    at /usr/src/debug/SDL2-2.26.3-1.fc38.i386/src/SDL.c:253
#4  0x567166bd in gfx_sdl_init (game_name=0x56733dc2 "PD", gfx_api_name=0x5673f830 "OpenGL",
    start_in_fullscreen=false, width=640, height=480, posX=100, posY=100) at port/fast3d/gfx_sdl2.cpp:68
#5  0x5671f185 in gfx_init (wapi=0x567f8e80 , rapi=0x567f8dc0 , game_name=0x56733dc2 "PD",
    start_in_fullscreen=false, width=640, height=480, posX=100, posY=100) at port/fast3d/gfx_pc.cpp:2599
#6  0x566d2f75 in videoInit () at port/src/video.c:40
#7  0x566d6f99 in main (argc=1, argv=0xffffd1c4) at port/src/main.c:83

I have run this project on Debian, under GNOME's Wayland session, on the same hardware as recently as last week. I am wondering whether this is Fedora-specific?

fgsfdsfgs commented 1 year ago

Ah. This has happened before. For unknown reasons 32-bit SDL2 crashes on init when its video driver is set to wayland and I don't know how to fix this. You might need to install xwayland or something. Or just run it like this:

SDL_VIDEODRIVER=x11 ./pd

ghost commented 1 year ago

Partial success! Setting SDL_VIDEODRIVER=x11 produced the following error:

[rms@jupiter pd-i686-linux]$ SDL_VIDEODRIVER=x11 ./pd
version: 0942f61 (i686-linux)
startup date: 02 Sep 2023 17:09:45
ERROR: FATAL: Could not open SDL window: Failed loading libGL.so.1: libGL.so.1: cannot open shared object file: No such file or directory

Which was resolved by installing

mesa-libGL.i686

And now the game runs!

Screenshot from 2023-09-02 17-16-28

It still crashes without the SDL_VIDEODRIVER=x11 prefix, however, despite both the 64-bit and 32-bit versions of mesa-libGL being installed. I'll take it as a win, though! Thanks for everything, @fgsfdsfgs. :)
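To avoid typing the prefix on every launch, a small shell function can apply it by default. This is a minimal sketch, not something shipped with the port: `run_with_x11` is a hypothetical name, and it only changes which SDL video driver is requested.

```shell
# Sketch: default SDL_VIDEODRIVER to x11 unless the caller already set it.
# run_with_x11 is a hypothetical helper name.
run_with_x11() {
    # ${VAR:-x11} keeps an existing non-empty value and falls back to x11 otherwise.
    # env passes the variable to the child process only.
    env SDL_VIDEODRIVER="${SDL_VIDEODRIVER:-x11}" "$@"
}

# Usage: run_with_x11 ./pd
```

Because an existing value wins, running with SDL_VIDEODRIVER=wayland already exported still selects the wayland driver, which is handy for retesting later.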

fgsfdsfgs commented 1 year ago

Yeah, it will crash on Wayland regardless. Fedora apparently has a patch in its SDL2 package that makes it default to the wayland driver when you have Wayland installed. On Debian it will always default to x11. As for the libGL thing, yeah, it probably requires a 32-bit libGL since the executable is 32-bit. Your GPU driver should probably include one, and if it doesn't, then mesa-libGL (or maybe glvnd) is the only option.
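One rough way to check whether a 32-bit libGL is visible to the dynamic linker is to filter the output of `ldconfig -p`. The sketch below assumes an x86_64 glibc system, where 64-bit cache entries are tagged `x86-64` and 32-bit entries are not; `has_32bit_lib` is a hypothetical helper name.

```shell
# Sketch: read `ldconfig -p` style output on stdin and succeed if the named
# library has an entry that is not tagged x86-64 (assumed to be the 32-bit one).
# has_32bit_lib is a hypothetical helper name.
has_32bit_lib() {
    # Cache entries look like:  libGL.so.1 (libc6,x86-64) => /lib64/libGL.so.1
    # Keep entries for the requested name, drop the 64-bit ones, and succeed
    # if anything is left.
    grep -F "$1 (" | grep -v 'x86-64' | grep -q .
}

# Usage: ldconfig -p | has_32bit_lib libGL.so.1 || echo "no 32-bit libGL.so.1 found"
```

This only inspects the linker cache. It does not prove that the library loads correctly or that the GPU driver behind it works.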

ghost commented 1 year ago

I spoke too soon! If I click on the game window to play, the game crashes again:

[rms@jupiter pd-i686-linux]$ SDL_VIDEODRIVER=x11 ./pd
version: 0942f61 (i686-linux)
startup date: 02 Sep 2023 17:21:00
ERROR: SDL audio init error: dsp: No such audio device
ROM file: pd.ntsc-final.z64
loading segment fontjpnsingle from ROM (offset 00194b20 pointer 0xd7292b30)
loading segment fontjpnmulti from ROM (offset 0019fb40 pointer 0xd729db50)
loading segment animations from ROM (offset 001a15c0 pointer 0xd729f5d0)
loading segment mpconfigs from ROM (offset 007d0a40 pointer 0xd78cea50)
loading segment mpstringsE from ROM (offset 007d1c20 pointer 0xd78cfc30)
loading segment mpstringsJ from ROM (offset 007d5320 pointer 0xd78d3330)
loading segment mpstringsP from ROM (offset 007d8a20 pointer 0xd78d6a30)
loading segment mpstringsG from ROM (offset 007dc120 pointer 0xd78da130)
loading segment mpstringsF from ROM (offset 007df820 pointer 0xd78dd830)
loading segment mpstringsS from ROM (offset 007e2f20 pointer 0xd78e0f30)
loading segment mpstringsI from ROM (offset 007e6620 pointer 0xd78e4630)
loading segment firingrange from ROM (offset 007e9d20 pointer 0xd78e7d30)
loading segment fonttahoma from ROM (offset 007f7860 pointer 0xd78f5870)
loading segment fontnumeric from ROM (offset 007f8b20 pointer 0xd78f6b30)
loading segment fonthandelgothicsm from ROM (offset 007f9d30 pointer 0xd78f7d40)
loading segment fonthandelgothicxs from ROM (offset 007fbfb0 pointer 0xd78f9fc0)
loading segment fonthandelgothicmd from ROM (offset 007fdd80 pointer 0xd78fbd90)
loading segment fonthandelgothiclg from ROM (offset 008008e0 pointer 0xd78fe8f0)
loading segment sfxctl from ROM (offset 0080a250 pointer 0xd7908260)
loading segment sfxtbl from ROM (offset 00839dd0 pointer 0xd7937de0)
loading segment seqctl from ROM (offset 00cfbf30 pointer 0xd7df9f40)
loading segment seqtbl from ROM (offset 00d05f90 pointer 0xd7e03fa0)
loading segment sequences from ROM (offset 00e82000 pointer 0xd7f80010)
loading segment texturesdata from ROM (offset 01d65f40 pointer 0xd8e63f50)
loading segment textureslist from ROM (offset 01ff7ca0 pointer 0xd90f5cb0)
romdataInit: loaded rom, size = 33554432
memp heap at 0xd60fd010 - 0xd70fd010
rom file at 0xd70fe010 - 0xd90fe010
ERROR: FATAL: Crashed: PC=(nil) SIGNAL=11
ERROR: FATAL: Crash!

SIGNAL: 11 PC: ./pd(+0x17e459) [0x56759459]

BACKTRACE:

00: ./pd(+0x17e459) [0x56759459]

01: ./pd(+0x17e7f5) [0x567597f5]

02: linux-gate.so.1(__kernel_rt_sigreturn+0) [0xf7fc35b0]

ghost commented 1 year ago

I installed the 32-bit version of PipeWire, and it still crashes when the game window is clicked:

[rms@jupiter pd-i686-linux]$ SDL_VIDEODRIVER=x11 ./pd
version: 0942f61 (i686-linux)
startup date: 02 Sep 2023 17:26:48
ROM file: pd.ntsc-final.z64
loading segment fontjpnsingle from ROM (offset 00194b20 pointer 0xcee93b30)
loading segment fontjpnmulti from ROM (offset 0019fb40 pointer 0xcee9eb50)
loading segment animations from ROM (offset 001a15c0 pointer 0xceea05d0)
loading segment mpconfigs from ROM (offset 007d0a40 pointer 0xcf4cfa50)
loading segment mpstringsE from ROM (offset 007d1c20 pointer 0xcf4d0c30)
loading segment mpstringsJ from ROM (offset 007d5320 pointer 0xcf4d4330)
loading segment mpstringsP from ROM (offset 007d8a20 pointer 0xcf4d7a30)
loading segment mpstringsG from ROM (offset 007dc120 pointer 0xcf4db130)
loading segment mpstringsF from ROM (offset 007df820 pointer 0xcf4de830)
loading segment mpstringsS from ROM (offset 007e2f20 pointer 0xcf4e1f30)
loading segment mpstringsI from ROM (offset 007e6620 pointer 0xcf4e5630)
loading segment firingrange from ROM (offset 007e9d20 pointer 0xcf4e8d30)
loading segment fonttahoma from ROM (offset 007f7860 pointer 0xcf4f6870)
loading segment fontnumeric from ROM (offset 007f8b20 pointer 0xcf4f7b30)
loading segment fonthandelgothicsm from ROM (offset 007f9d30 pointer 0xcf4f8d40)
loading segment fonthandelgothicxs from ROM (offset 007fbfb0 pointer 0xcf4fafc0)
loading segment fonthandelgothicmd from ROM (offset 007fdd80 pointer 0xcf4fcd90)
loading segment fonthandelgothiclg from ROM (offset 008008e0 pointer 0xcf4ff8f0)
loading segment sfxctl from ROM (offset 0080a250 pointer 0xcf509260)
loading segment sfxtbl from ROM (offset 00839dd0 pointer 0xcf538de0)
loading segment seqctl from ROM (offset 00cfbf30 pointer 0xcf9faf40)
loading segment seqtbl from ROM (offset 00d05f90 pointer 0xcfa04fa0)
loading segment sequences from ROM (offset 00e82000 pointer 0xcfb81010)
loading segment texturesdata from ROM (offset 01d65f40 pointer 0xd0a64f50)
loading segment textureslist from ROM (offset 01ff7ca0 pointer 0xd0cf6cb0)
romdataInit: loaded rom, size = 33554432
memp heap at 0xcdcfe010 - 0xcecfe010
rom file at 0xcecff010 - 0xd0cff010
ERROR: FATAL: Crashed: PC=(nil) SIGNAL=11
ERROR: FATAL: Crash!

SIGNAL: 11 PC: ./pd(+0x17e459) [0x5679d459]

BACKTRACE:

00: ./pd(+0x17e459) [0x5679d459]

01: ./pd(+0x17e7f5) [0x5679d7f5]

02: linux-gate.so.1(__kernel_rt_sigreturn+0) [0xf7fab5b0]

Segmentation fault (core dumped)

fgsfdsfgs commented 1 year ago

On Wayland it crashes in SDL_SetRelativeMouseMode, which is called when you click into the window. I'm afraid the only way I can fix this is by disabling mouse capture on Wayland, which will break mouselook. For now you can set MouseEnabled=0 in pd.ini, which will completely disable mouse. This also happened to @RyanDwyer before. Not sure if he figured out how to fix it or not.
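For anyone who prefers to script the pd.ini change, here is a minimal sketch. It assumes the file already contains a line starting with `MouseEnabled=` and that its path is passed in, since the thread does not say where pd.ini lives. `set_mouse_enabled` is a hypothetical helper name.

```shell
# Sketch: set the MouseEnabled key in an existing pd.ini.
# set_mouse_enabled is a hypothetical helper name; $1 is the ini path, $2 the new value.
set_mouse_enabled() {
    ini_file=$1
    value=$2
    tmp_file=$(mktemp) || return 1
    # Rewrite only lines that start with "MouseEnabled="; other lines pass through unchanged.
    sed "s/^MouseEnabled=.*/MouseEnabled=$value/" "$ini_file" > "$tmp_file" &&
        cat "$tmp_file" > "$ini_file"
    status=$?
    rm -f "$tmp_file"
    return "$status"
}

# Usage: set_mouse_enabled ./pd.ini 0
```

If the key is not present in the file, nothing changes and the setting still has to be added by hand.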

Unrelated, but man, the crash handler is kind of useless on your system for some reason. The handler itself is the only thing that shows up in the backtrace. I'll need to figure out how to write a better one I guess.

ghost commented 1 year ago

I can confirm that the game also crashes when the game window is clicked, with MouseEnabled=0 in pd.ini. Never mind, maybe another day!

Very grateful for your time, assistance, and work @fgsfdsfgs :D

fgsfdsfgs commented 1 year ago

Thanks for testing. I'll see what I can do about this.

RyanDwyer commented 1 year ago

I'm on Arch with Sway (wayland). It works fine for me, including mouse capture, after I installed these packages:

Installing the above also installed a heap of lib32 dependencies, including:

Then I run it by executing pd.exe without passing any environment variables.

ghost commented 1 year ago

I hope you don't mind but I am going to engage in some armchair development for a moment.

Looking through the code, I notice that SDL is handling the window management and Fast3D is handling the OpenGL renderer. I strongly suspect that this is the crux of the problem on Fedora. My rationale is:

From the Fast3D Github readme

Implementation of a Fast3D renderer for games built originally for the Nintendo 64 platform. For rendering OpenGL, Direct3D 11 and Direct3D 12 are supported. Supported windowing systems are GLX (used on Linux), DXGI (used on Windows) and SDL (generic). (emphasis mine)

GLX is/was the OpenGL extension specifically for the Xorg display server (aka X11). Fast3D is (on Linux) programmed to interface with X11 unless told to interface with SDL.

The Wayland display protocol is not natively compatible with X11, and so the xWayland compatibility layer exists as a band aid to allow 100% X11 programs to run until they have been ported over.

The issue the report shows is that SDL is creating a Wayland-native window context, and Fast3D is trying to draw to it via X11. The two are incompatible, hence the crashing and hence why setting SDL_VIDEODRIVER=x11 worked (except for input).

The solutions seem to me to be:

I could try to hack something together to satisfy the first option, but I am not proficient in C and generally have no idea what I am doing with a project like this, so it would likely be riddled with UB and overflows.

fgsfdsfgs commented 1 year ago

> Have SDL handle both the rendering as well as the window creation and the input. This has the benefit of making this game automatically Wayland and X11 compatible (and cross-platform), and automatically OpenGL/Vulkan/Metal/DX/DX12 compatible.

SDL is also responsible for providing the GL context and GL function pointers, so this already is the case. Raw GLX is not used in this fork of Fast3D, the README is correct only for that particular repository. Unless you mean using the actual SDL_Renderer, which is a very simple thing meant mostly for 2D rendering.

The crashes happen inside SDL's Wayland driver and not in any GLX-related part of it, so this is all mostly irrelevant anyway. As far as I can tell, the cause of them is that the driver state pointer is NULL in some places where that code does not check whether it is NULL. I don't know why that happens and I lack familiarity with Wayland to investigate it in more detail, but it seems like the only option is to patch this in SDL.

Considering Ryan's post above, it might also be because said driver loads some libraries dynamically and the crashes only happen if they are not installed on your system. In that case the solution is just to install them.

ghost commented 1 year ago

Then my apologies, I came to an incorrect conclusion via a false assumption about Fast3D; that it was the same as the other repository. Thank you for being patient. :)

ghost commented 1 year ago

This SDL bug report might be relevant to this issue. If so, it looks like SDL are aiming to fix it for version 3.2.0.

https://github.com/libsdl-org/SDL/issues/7650

ghost commented 1 year ago

Updated to the newly released Fedora 39 but I am still not able to run the game. The issue is still identical to the above.

Fedora 39 has SDL2 2.26.5-2 vs Fedora 38's SDL2 2.26.3-1; only a small patch revision, so I am not surprised that the same issue persists.

On the bright side, I've at last been able to build this port from source using the attached, slightly modified makefile. Under the "assume *nix" section I have replaced the pkg-config invocation with sdl2-config. It may perhaps work better for other distros too (untested).

Attachment: Makefile.zip
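The idea of preferring sdl2-config can be sketched outside the makefile as a shell function that falls back to pkg-config when sdl2-config is missing. This is not the content of the attached makefile: `sdl2_flags` is a hypothetical name, and it assumes the SDL2 development package registers itself with pkg-config as `sdl2`.

```shell
# Sketch: print SDL2 compiler and linker flags, preferring sdl2-config
# and falling back to pkg-config. sdl2_flags is a hypothetical helper name.
sdl2_flags() {
    if command -v sdl2-config >/dev/null 2>&1; then
        sdl2-config --cflags --libs
    elif command -v pkg-config >/dev/null 2>&1; then
        pkg-config --cflags --libs sdl2
    else
        echo "neither sdl2-config nor pkg-config found" >&2
        return 1
    fi
}

# Usage: sdl2_flags
```

For a 32-bit build the tool that is found must describe the 32-bit SDL2 installation, which this sketch does not verify.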