libsdl-org / sdl12-compat

An SDL-1.2 compatibility layer that uses SDL 2.0 behind the scenes.
Other
193 stars 40 forks source link

tuxmath segfaults at startup #289

Closed musuruan closed 1 year ago

musuruan commented 1 year ago

Fedora introduced sdl12-compat in F35. Since then tuxmath segfaults at startup. It might be a problem in sdl12-compat.

Relevant downstream bug report: https://bugzilla.redhat.com/show_bug.cgi?id=2031642

I am currently on F37 with sdl12-compat v1.2.60-1.fc37 installed from RPM.

I made a bit of debugging and it seems it crashes here when i=4: https://github.com/tux4kids/tuxmath/blob/upstream/2.0.3/src/fileops_media.c#L271

Backtrace:

Thread 1 "tuxmath" received signal SIGABRT, Aborted.
__pthread_kill_implementation (threadid=<optimized out>, signo=signo@entry=6, 
    no_tid=no_tid@entry=0) at pthread_kill.c:44
44            return INTERNAL_SYSCALL_ERROR_P (ret) ? INTERNAL_SYSCALL_ERRNO (ret) : 0;
(gdb) bt
#0  __pthread_kill_implementation
    (threadid=<optimized out>, signo=signo@entry=6, no_tid=no_tid@entry=0)
    at pthread_kill.c:44
#1  0x00007ffff7cc4ec3 in __pthread_kill_internal
    (signo=6, threadid=<optimized out>) at pthread_kill.c:78
#2  0x00007ffff7c74a76 in __GI_raise (sig=sig@entry=6)
    at ../sysdeps/posix/raise.c:26
#3  0x00007ffff7c5e7fc in __GI_abort () at abort.c:79
#4  0x00007ffff7cb908e in __libc_message
    (action=action@entry=do_abort, fmt=fmt@entry=0x7ffff7dd2465 "%s\n")
    at ../sysdeps/posix/libc_fatal.c:155
#5  0x00007ffff7cceb9c in malloc_printerr
    (str=str@entry=0x7ffff7dd4fe8 "free(): double free detected in tcache 2")
    at malloc.c:5660
#6  0x00007ffff7cd0ee6 in _int_free
    (av=0x7ffff7e0bc80 <main_arena>, p=0x555556099e20, have_lock=0)
    at malloc.c:4469
#7  0x00007ffff7cd3363 in __GI___libc_free (mem=mem@entry=0x555556099e30)
    at malloc.c:3385
#8  0x00007ffff7caf3e7 in _IO_deallocate_file (fp=0x555556099e30)
    at /usr/src/debug/glibc-2.36-9.fc37.x86_64/libio/libioP.h:862
#9  _IO_new_fclose (fp=0x555556099e30) at iofclose.c:74
#10 0x00007ffff5ddbbb9 in stdio_close (context=0x555556099410)
--Type <RET> for more, q to quit, c to continue without paging--
    at /usr/src/debug/SDL2-2.26.3-1.fc37.x86_64/src/file/SDL_rwops.c:439
#11 0x00007ffff7ea1697 in RWops20to12_close (rwops12=0x5555560575d0)
    at /usr/src/debug/sdl12-compat-1.2.60-1.fc37.x86_64/src/SDL12_compat.c:8088
#12 0x00007ffff7e27e9e in Mix_LoadWAV_RW
    (src=0x5555560575d0, freesrc=freesrc@entry=1)
    at /usr/src/debug/SDL_mixer-1.2.12-25.fc37.x86_64/mixer.c:565
#13 0x000055555555bd3b in load_sound_data ()
    at /usr/src/debug/tuxmath-2.0.3-13.fc37.x86_64/src/fileops_media.c:271
#14 load_data_files ()
    at /usr/src/debug/tuxmath-2.0.3-13.fc37.x86_64/src/setup.c:750
#15 setup (argv=<optimized out>, argc=<optimized out>)
    at /usr/src/debug/tuxmath-2.0.3-13.fc37.x86_64/src/setup.c:139
#16 main (argc=<optimized out>, argv=<optimized out>)
    at /usr/src/debug/tuxmath-2.0.3-13.fc37.x86_64/src/tuxmath.c:40

If you disable sound by running tuxmath with the --nosound option, it runs fine because this section is avoided.

Can you please have a look?

Thanks!

icculus commented 1 year ago

Can you do me a favor? Run tuxmath with the environment variable SDL12COMPAT_DEBUG_LOGGING=1 set.

It'll print something like this out at startup:

INFO: sdl12-compat 1.2.61, built on May 16 2023 at 14:30:00, talking to SDL2 2.27.0

Can you tell me what your computer prints there?

Because I'm not crashing (and valgrind isn't reporting anything that looks like a double-free error), but it is saying this:

Error: I couldn't load a sound file:
/usr/share/tuxmath/sounds/buzz.wav
The Simple DirectMedia error that occured was:
Invalid block alignment

And I'm wondering if this is the difference between crashing and not crashing right here.

icculus commented 1 year ago

Actually, nevermind, I can see all the version numbers in that stack trace, which is pretty nice! :)

icculus commented 1 year ago

And I'm wondering if this is the difference between crashing and not crashing right here.

Also, this is definitely the difference, since the for-loop exits if it hits this condition. Let me mess with this more on my end.

icculus commented 1 year ago

(First off, wow, I installed Fedora 38 in a virtual machine and "yum install tuxmath" and just for a goof I ran it under gdb...I really did not expect the system to download debug symbols for everything, and as I stepped through the code, download individual source files for various libraries on the fly for viewing. I don't know if this is a Fedora thing or a general newer Linux thing, but that was amazing.)

So here is where this is exploding, in SDL_mixer's mixer.c:

616     if ( !loaded ) {
617         SDL_free(chunk);
618         if ( freesrc ) {
619             SDL_RWclose(src);
620         }
621         return(NULL);
622     }
623 

Item 4 in the previously-mentioned for-loop fails to load for whatever reason (probably because it reports its block alignment as zero and SDL2 doesn't like that...we should maybe just force the alignment to 1 in these cases?). Fail or not, SDL_LoadWAV_RW, which SDL_mixer calls to load the .wav file, will call SDL_RWclose(src).

But only when failing, as item 4 does, does SDL_mixer also try to call SDL_RWclose, causing a double-free.

The problem is this: that code was fixed in SDL_mixer...12 years ago.

https://github.com/libsdl-org/SDL_mixer/commit/23b6289224edb24fcb8266fce9647822b69c4298

But since this is on the SDL-1.2 branch, it has never had a formal release, so it's not in the ancient SDL_mixer 1.2.12 that distros ship because it was the final official release before we moved to SDL2-only releases.

So this is not an sdl12-compat bug, and you may hit the same crash if you used classic SDL 1.2 in this scenario. I didn't hit it on my system originally because I built SDL_mixer from the latest source code for the SDL-1.2 branch on GitHub.

The most expedient course for Fedora would be to add that specific commit as a custom patch in the SDL_mixer 1.2 package, since it's a generally good, small fix and it specifically would fix the currently-broken tuxmath package.

As a secondary option, I'm going to look into why that .wav file is failing to load, and if we apply a fix to SDL2, then when Fedora picks up the upcoming 2.28.0 release, sdl12-compat will use it and this won't have a chance to double-free because the .wav won't fail to load. But this is not a good primary solution because all you need is one of those .wav files to get a little corrupted on the filesystem and fail to load, and you'll still crash, unless the SDL_mixer patch is applied, too.

(CC @Conan-Kudo, in case he finds this to be interesting information.)

Conan-Kudo commented 1 year ago

(First off, wow, I installed Fedora 38 in a virtual machine and "yum install tuxmath" and just for a goof I ran it under gdb...I really did not expect the system to download debug symbols for everything, and as I stepped through the code, download individual source files for various libraries on the fly for viewing. I don't know if this is a Fedora thing or a general newer Linux thing, but that was amazing.)

This is something that has been available since Fedora Linux 35. Since Fedora Linux 38, you can do performance analysis using perf of everything built in the distribution.

musuruan commented 1 year ago

So here is where this is exploding, in SDL_mixer's mixer.c:

616       if ( !loaded ) {
617           SDL_free(chunk);
618           if ( freesrc ) {
619               SDL_RWclose(src);
620           }
621           return(NULL);
622       }
623   

Item 4 in the previously-mentioned for-loop fails to load for whatever reason (probably because it reports its block alignment as zero and SDL2 doesn't like that...we should maybe just force the alignment to 1 in these cases?). Fail or not, SDL_LoadWAV_RW, which SDL_mixer calls to load the .wav file, will call SDL_RWclose(src).

But only when failing, as item 4 does, does SDL_mixer also try to call SDL_RWclose, causing a double-free.

Thank you very very much for this!

The most expedient course for Fedora would be to add that specific commit as a custom patch in the SDL_mixer 1.2 package, since it's a generally good, small fix and it specifically would fix the currently-broken tuxmath package.

Neal (thanks!) has already updated and reassigned the original downstream bug report and new packages are coming to updates-testing.

Thanks again!

icculus commented 1 year ago

(Thanks everyone; Fedora's team is on top of things, as usual, and I'm always grateful for it!)

As a quick followup, the SDL2 fix I mentioned is now in revision control and will be in the 2.28.0 release that we're currently working to finish up: https://github.com/libsdl-org/SDL/issues/7714

But again, the SDL_mixer patch that already made its way into Fedora is still the most correct fix here, so no further action is required.