libsdl-org / SDL

Simple Directmedia Layer
https://libsdl.org
zlib License

SDL_PollEvent segfaults on Apple Silicon (except in debugger) #5400

Closed: BryanHaley closed this issue 2 years ago

BryanHaley commented 2 years ago

Issue:

Calling SDL_PollEvent causes a segmentation fault on macOS 12.0 (Apple Silicon). When the application is run under lldb, no segfault will occur.

SDL Version:

Tested on 2.0.18, 2.0.20, and 2.0.20 at HEAD. On 2.0.16 a segfault will occur, but at a different unknown location. No segfault occurs on 2.0.16 when running the application under lldb either. All versions installed using brew. SDL2.framework distribution was unable to run due to codesigning issues.

To reproduce:

brew install pkg-config sdl2 libpng libjpeg libvorbis freetype speex speexdsp opus
git clone https://github.com/BryanHaley/fteqw-applesilicon.git
cd fteqw-applesilicon/applesilicon
make
./fteqw

As mentioned, running under lldb results in no segfault occurring:

lldb ./fteqw
run

Notes:

Issue occurs at this location: https://github.com/BryanHaley/fteqw-applesilicon/blob/2223782ae52641c63d5f4386594634db6380358e/engine/client/in_sdl.c#L972

The segfault happens inside of SDL_PollEvent itself. However, without the debugger, it is difficult to diagnose exactly where. fteqw runs successfully on other platforms with this code, and I don't see any overly obvious issues, so I'm not sure why the Apple Silicon version of SDL2 is giving me a problem here. SDL_PollEvent is called 7 times before crashing on the 8th at a later point in startup.

[Screenshot: Screen Shot 2022-03-13 at 1 46 51 AM]

Using lldb I can see the value of event.type at the point it would crash is 512. Not sure if this information is useful.
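
For reference, 512 is 0x200 in hexadecimal, which is the value SDL2's `SDL_events.h` assigns to `SDL_WINDOWEVENT`. The sketch below maps a few raw `event.type` values to their SDL2 names. The helper name `sdl2_event_type_name` and the lookup table are hypothetical and not part of SDL; the numeric values are copied by hand from the SDL2 header, and the table is deliberately incomplete.

```c
#include <stddef.h>
#include <stdint.h>

/* A few SDL2 event type values, copied by hand from SDL_events.h.
 * Illustrative and deliberately incomplete. */
struct event_name {
    uint32_t type;
    const char *name;
};

static const struct event_name known_events[] = {
    { 0x100, "SDL_QUIT" },
    { 0x200, "SDL_WINDOWEVENT" },
    { 0x201, "SDL_SYSWMEVENT" },
    { 0x300, "SDL_KEYDOWN" },
    { 0x301, "SDL_KEYUP" },
    { 0x400, "SDL_MOUSEMOTION" },
    { 0x401, "SDL_MOUSEBUTTONDOWN" },
    { 0x402, "SDL_MOUSEBUTTONUP" },
};

/* Hypothetical helper (not an SDL function): returns the SDL2 name for a
 * raw event.type value, or "unknown" if the value is not in the table. */
const char *sdl2_event_type_name(uint32_t type)
{
    size_t i;
    for (i = 0; i < sizeof(known_events) / sizeof(known_events[0]); i++) {
        if (known_events[i].type == type) {
            return known_events[i].name;
        }
    }
    return "unknown";
}
```

Calling `sdl2_event_type_name(512)` returns `"SDL_WINDOWEVENT"`, which suggests the event in hand at that point was a window event.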

[Screenshot: Screen Shot 2022-03-13 at 1 52 01 AM]

BryanHaley commented 2 years ago

Update: Calling SDL_PumpEvents immediately before SDL_PollEvent causes the segfault to happen in a different location (similar to 2.0.16?). Haven't pinpointed where exactly yet.

BryanHaley commented 2 years ago

Compiling and linking with -fsanitize=address shows the following info when attempting to run:

bryan@bryans-MacBook-Pro:~/Projects/fteqw-applesilicon/applesilicon
 $ ./fteqw
Host_Init
couldn't exec config.cfg
couldn't exec fte.cfg

Engine Version: FTE build Mar 13 2022
Setting windowed mode 640*480 OpenGL
GL_VENDOR: Apple
GL_RENDERER: Apple M1 Max
GL_VERSION: 2.1 Metal - 75.19
AddressSanitizer:DEADLYSIGNAL
=================================================================
==29600==ERROR: AddressSanitizer: SEGV on unknown address 0x000000000000 (pc 0x000000000000 bp 0x0001d0a4dd50 sp 0x00016f8addc0 T0)
==29600==Hint: pc points to the zero page.
==29600==The signal is caused by a UNKNOWN memory access.
==29600==Hint: address points to the zero page.
    #0 0x0  (<unknown module>)

==29600==Register values:
 x[0] = 0x000000010909d700   x[1] = 0x00000001d0aadef4   x[2] = 0x000000016f8adf94   x[3] = 0x000000016f8ae0ec
 x[4] = 0x000000016f8ae030   x[5] = 0x0000000000000000   x[6] = 0x000000016f0c8000   x[7] = 0x0000000000000001
 x[8] = 0x0000000000000001   x[9] = 0x00000002221bb920  x[10] = 0x0000000000000001  x[11] = 0x00000001c7d458b0
x[12] = 0x0000000000000005  x[13] = 0x000000011ae06620  x[14] = 0x00000001c7dc45e2  x[15] = 0x0000000220535478
x[16] = 0x0000000000000000  x[17] = 0x00000002221fc970  x[18] = 0x0000000111c433fc  x[19] = 0x0000000000000000
x[20] = 0x00000000ffffffff  x[21] = 0x000000010909e360  x[22] = 0x000000016f8ae0ec  x[23] = 0x000000011ac09190
x[24] = 0x000000010909d700  x[25] = 0x00000000000000a4  x[26] = 0x0000000211be1000  x[27] = 0x0000000000000001
x[28] = 0x0000000000000003     fp = 0x000000016f8adff0     lr = 0x00000001d0a4dd50     sp = 0x000000016f8addc0
AddressSanitizer can not provide additional info.
SUMMARY: AddressSanitizer: SEGV (<unknown module>)
==29600==ABORTING
Abort trap: 6

Attempted to run under Mac Instruments to get a stack trace. Crash does not occur when running under instruments.
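
One detail in the AddressSanitizer report is readable even without a stack trace: both the faulting address and `pc` are zero, and frame #0 sits at 0x0. That pattern usually means the program branched to address 0, for example by calling a NULL function pointer, rather than reading or writing through a bad data pointer. A minimal sketch of that check follows. The names `parse_asan_segv` and `pc_is_null` are hypothetical helpers, and the parser assumes only the `address 0x... (pc 0x...` layout seen in the ERROR line of this log.

```c
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Fields pulled out of an AddressSanitizer "SEGV on unknown address" line. */
struct asan_segv {
    uint64_t address; /* faulting address */
    uint64_t pc;      /* program counter at the time of the fault */
};

/* Extracts the hex value that follows `key` in `line`.
 * Returns 0 on success, -1 if the key or a number is missing. */
static int hex_after(const char *line, const char *key, uint64_t *out)
{
    const char *p = strstr(line, key);
    char *end;

    if (p == NULL) {
        return -1;
    }
    p += strlen(key);
    *out = (uint64_t)strtoull(p, &end, 16); /* base 16 accepts a 0x prefix */
    return (end == p) ? -1 : 0;
}

/* Hypothetical helper: fills `out` from an ASan SEGV line.
 * Returns 0 on success, -1 if the line does not have the assumed layout. */
int parse_asan_segv(const char *line, struct asan_segv *out)
{
    if (hex_after(line, "address ", &out->address) != 0) {
        return -1;
    }
    if (hex_after(line, "(pc ", &out->pc) != 0) {
        return -1;
    }
    return 0;
}

/* Returns 1 when pc is 0, i.e. execution jumped to a null address. */
int pc_is_null(const struct asan_segv *s)
{
    return s->pc == 0;
}
```

Fed the ERROR line from the log, this yields an address of 0 and a `pc` of 0, the branch-to-null case. On arm64 the link register normally holds the return address of the most recent call, so the `lr` value in the register dump (0x00000001d0a4dd50) may point near the call site.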

icculus commented 2 years ago

I'm not seeing this problem here with other SDL apps on an M1. Is it possible fteqw is corrupting something?

BryanHaley commented 2 years ago

Possible, but difficult to tell given I haven't even been able to get a stack trace out of it. Everything regarding SDL happens on the same thread and even if I pass -DNO_MULTITHREADING the issue still occurs. I don't see any obvious mishandling of SDL that would cause this. Are you able to reproduce the issue on your end?

arichmondphoto commented 2 years ago

I adjusted the makefile to build this on an Intel iMac running Monterey 12.1 and had no issues. I did have to install SDL2 from MacPorts because it wouldn’t link to sdl2-config from the Homebrew installation of SDL2. Not sure if that is related (probably not).

FtZPetruska commented 2 years ago

Hi, I wasn't able to reproduce your issue on macOS 12.3 running on an M1 chip using Homebrew's SDL 2.0.20 build.

I was able to launch the game and get in a level without any issue.

Here are my logs:

```
% ./fteqw -basedir ../quake
Host_Init
Playing registered version.
couldn't exec config.cfg
couldn't exec fte.cfg
couldn't exec autoexec.cfg
Engine Version: FTE build Mar 17 2022
Setting fullscreen windowed OpenGL
GL_VENDOR: Apple
GL_RENDERER: Apple M1 Max
GL_VERSION: 2.1 Metal - 76.3
Initing default SDL audio device.
OpenGL renderer initialized
------- Quake Initialized -------
client Player connected
------------------- Introduction
Player entered the game
Changing map...
------------------- Termination Central
Player entered the game
You got the shells
You got the Double-barrelled Shotgun
Server ended
Client "Player" removed
```

BryanHaley commented 2 years ago

Hmmm interesting. Guess it's just on my end? I'll investigate more and reopen if necessary.

slime73 commented 2 years ago

I haven't tested your app, but my own stuff uses SDL without issue on Apple Silicon.

BryanHaley commented 2 years ago

Yeah, I tested in a fresh virtual machine and didn't run into issues. Not sure what on earth is messed up on my local machine, but it seems to be specific to me.

tinogoehlert commented 2 years ago

Hi @BryanHaley

I'm currently running into the same problem on an M1 Air running a custom darkplaces engine. So it's not specific to you :-)