libsdl-org / SDL_ttf

Support for TrueType (.ttf) font files with Simple Directmedia Layer.
zlib License
343 stars 116 forks source link

Unaligned access UB in BLIT_GLYPH routines #324

Closed matoro closed 5 months ago

matoro commented 6 months ago

Hi, I identified this bug while testing pygame: https://bugs.gentoo.org/920956

In the blit glyph routines BG_64 and BG_32, the loops invoke undefined behavior by casting some offset into a larger integer to u8. GCC UBSAN at least catches and warns about this. These problems are easy to catch on sparc, because unaligned access is completely illegal there and crashes the program with a SIGBUS, but this is not a platform-specific problem: unaligned access is undefined behavior on all platforms.

Minimized reproducer:

$ wget 'https://github.com/pygame/pygame/raw/main/src_py/freesansbold.ttf'
...
$ cat test.c
#include <SDL2/SDL_ttf.h>

int main()
{
        TTF_Init();
        TTF_RenderUTF8_Solid(
                TTF_OpenFont("./freesansbold.ttf", 24),
                "Test",
                (struct SDL_Color){ 255, 255, 255, 255 }
        );
        return 0;
}
$ gcc -ggdb3 -fsanitize=undefined test.c -o test -lSDL2_ttf
$ ./test
/var/tmp/portage/media-libs/sdl2-ttf-2.20.2/work/SDL2_ttf-2.20.2/SDL_ttf.c:919:9: runtime error: load of misaligned address 0x0001272a9817 for type 'const Uint64', which requires 8 byte alignment
0x0001272a9817: note: pointer points here
 00 00 00 00 01  01 01 01 01 01 01 01 01  01 01 01 01 01 00 00 00  00 00 00 00 01 01 01 01  01 01 01
             ^ 

And this is the complete backtrace at the point of the crash on sparc:

(gdb) frame 0
#0  0xfff80001003039c4 in BG_64 (image=0x10000209848, destination=0x1000021ab50 "", srcskip=5, dstskip=40)
    at /usr/src/debug/media-libs/sdl2-ttf-2.20.2/SDL2_ttf-2.20.2/SDL_ttf.c:919
919             DUFFS_LOOP4(
(gdb) l
914         Uint32        width  = image->width / 8;
915         Uint32        height = image->rows;
916
917         while (height--) {
918             /* *INDENT-OFF* */
919             DUFFS_LOOP4(
920                 *dst++ |= *src++;
921             , width);
922             /* *INDENT-ON* */
923             src = (const Uint64 *)((const Uint8 *)src + srcskip);
(gdb) bt full
#0  0xfff80001003039c4 in BG_64 (image=0x10000209848, destination=0x1000021ab50 "", srcskip=5, dstskip=40) at /usr/src/debug/media-libs/sdl2-ttf-2.20.2/SDL2_ttf-2.20.2/SDL_ttf.c:919
        n = 2
        src = 0x1000021980f
        dst = 0x1000021ab50
        width = 2
        height = 16
#1  0xfff80001003055a4 in Render_Line_64_Solid (font=0x100002081b0, textbuf=0x100002196d0, xstart=0, ystart=0, fg=0x0) at /usr/src/debug/media-libs/sdl2-ttf-2.20.2/SDL2_ttf-2.20.2/SDL_ttf.c:1271
        above_w = -33
        srcskip = 5
        dst = 0x1000021ab50 ""
        remainder = 0
        saved_buffer = 0x10000219800 ""
        above_h = -4
        dstskip = 40
        saved_width = 14
        idx = 55
        x = 0
        y = 3
        image = 0x10000209848
        alignment = 7
        bpp = 1
        i = 0
        fg_alpha = 0 '\000'
#2  0xfff80001003095c0 in Render_Line (render_mode=RENDER_SOLID, subpixel=0, font=0x100002081b0, textbuf=0x100002196d0, xstart=0, ystart=0, fg=...) at /usr/src/debug/media-libs/sdl2-ttf-2.20.2/SDL2_ttf-2.20.2/SDL_ttf.c:1372
        is_opaque = 1
#3  0xfff800010031033c in TTF_Render_Internal (font=0x100002081b0, text=0x10000000930 "Test", str_type=STR_UTF8, fg=..., bg=..., render_mode=RENDER_SOLID) at /usr/src/debug/media-libs/sdl2-ttf-2.20.2/SDL2_ttf-2.20.2/SDL_ttf.c:3503
        color = 1
        xstart = 0
        ystart = 0
        width = 47
        height = 24
        textbuf = 0x100002196d0
        utf8_alloc = 0x0
#4  0xfff800010031056c in TTF_RenderUTF8_Solid (font=0x100002081b0, text=0x10000000930 "Test", fg=...) at /usr/src/debug/media-libs/sdl2-ttf-2.20.2/SDL2_ttf-2.20.2/SDL_ttf.c:3537
No locals.
#5  0x00000100000008ec in main () at test.c:6
No locals.

Tested against release-2.20.2. Let me know if any further information is helpful. I also offer free shell access to the hardware I used to reproduce this issue on, if it's useful. Thank you!

1bsyl commented 6 months ago

If unaligned access are illegal on sparc arch, We should unactive the defined in SDL_ttf.c HAVE_BLIT_GLYPH_64 should be undefined. Probably also HAVE_BLIT_GLYPH_32

matoro commented 6 months ago

If unaligned access are illegal on sparc arch, We should unactive the defined in SDL_ttf.c HAVE_BLIT_GLYPH_64 should be undefined. Probably also HAVE_BLIT_GLYPH_32

No, that's wrong. Please don't do this. The behavior is undefined because it commits a strict-aliasing violation. This is wrong on all platforms, not just sparc.

1bsyl commented 6 months ago

I havent tried in detail but i dont see the strict aliasing violation here.https://github.com/libsdl-org/SDL_ttf/blob/303c2801322b03ee0ccdb652b70f237b229408b5/src/SDL_ttf.c#L880

It usually happens when you keep and use at the same time two pointers of different types but same adress. Which doesnt seem the case here.

Then your gcc error is actually misalignment:

"Load of misaligned address 0x0001272a9817 for type 'const Uint64'"

matoro commented 6 months ago

I havent tried in detail but i dont see the strict aliasing violation here.

https://github.com/libsdl-org/SDL_ttf/blob/303c2801322b03ee0ccdb652b70f237b229408b5/src/SDL_ttf.c#L880

It usually happens when you keep and use at the same time two pointers of different types but same adress. Which doesnt seem the case here.

Then your gcc error is actually misalignment:

"Load of misaligned address 0x0001272a9817 for type 'const Uint64'"

It's because you're indexing into an array of u8 and then casting it to a u64 without guaranteeing that the location you have indexed to is aligned for a u64. In this case you can see that srcskip=5. If src is aligned on a 8-byte boundary, then you turn around in https://github.com/libsdl-org/SDL_ttf/blob/303c2801322b03ee0ccdb652b70f237b229408b5/src/SDL_ttf.c#L893 and advance 5 bytes into it and reinterpret it back to an array of u64. You can't access this as a u64 at a 5-byte offset.

1bsyl commented 6 months ago

Yes so this is indeed an un aligned load. Not a strict aliasing violation. And this is allowed on most platform. And this is done on purpose to blit faster. glyph 8 by 8.

Apparently this isnt allowed on sparc, so we should desactivate it for this platform. (Only a matter of not activating this optimization with the previous defines).

matoro commented 6 months ago

Yes so this is indeed an un aligned load. Not a strict aliasing violation. And this is allowed on most platform. And this is done on purpose to blit faster. glyph 8 by 8.

Apparently this isnt allowed on sparc, so we should desactivate it for this platform. (Only a matter of not activating this optimization with the previous defines).

No, that's not right. Conversion to a misaligned pointer is itself invoking undefined behavior, technically even before it is accessed. See this answer and this answer.

A pointer to an object type may be converted to a pointer to a different object type. If the resulting pointer is not correctly aligned for the referenced type, the behavior is undefined.

That UBSAN output that I provided is from a non-sparc platform. It's telling you that it is undefined behavior, which is not allowed by the C standard, even if it is allowed by the underlying hardware platform.

slouken commented 6 months ago

@matoro is correct here, we need to fix this. We shouldn't rely on unaligned load behavior, we should split it into unaligned and aligned accesses.

1bsyl commented 6 months ago

What about the assembly unaligned load? The c ones are out of standard. But the assembly one are correct? So we just turn this c routine into assembly that use un aligned load and that would be ok?

Anyway. Those routines need un-aligned load to run. Bg_64 are actually the less used because there are between the better sse/neon and the no optimized routine. So if that a real issue we can remove them. But i ran billions of string renderings x86/arm without crash so i am sceptical

slouken commented 6 months ago

Yeah, assembly ones are fine. Maybe we can just have the 32-bit versions and assembly? I haven't looked at the code, but 32-bit accesses also need to be aligned.

matoro commented 6 months ago

You can instead just replace it with a memcpy(). Compiler is smart enough to transform that appropriately.

Where strict aliasing prohibits examining the same memory as values of two different types, memcpy may be used to convert the values.

thesamesam commented 6 months ago

I'd expect the memcpy to be efficient with modern compilers and generate identical code where it's legal. Some projects make that the default with an opt-in to raw unaligned access as the project currently does (or the opposite but I'd not prefer that).

1bsyl commented 6 months ago

Here some PR (from smartphone...) not tested...

1bsyl commented 5 months ago

I gave a try and could not trigger the error on my x86_64 machine. Compiled both SDL_ttf and test program here, and also SDL_ttf/example/testapp.c

I used the flags: -fsanitize=undefined also for SDL_ttf: -Db_lundef=false // -z undefs (to avoid some "undefined symbol: __ubsan_handle_type_mismatch_v1")

any idea ?

If I simulate a fake invalid read: int tmp; if (tmp) SDL_Log("hello"); it would crash, so I believe ubsan is somehow running here.

but I didn't see the unaligned access, any idea ?

matoro commented 5 months ago

I gave a try and could not trigger the error on my x86_64 machine. Compiled both SDL_ttf and test program here, and also SDL_ttf/example/testapp.c

I used the flags: -fsanitize=undefined also for SDL_ttf: -Db_lundef=false // -z undefs (to avoid some "undefined symbol: __ubsan_handle_type_mismatch_v1")

any idea ?

If I simulate a fake invalid read: int tmp; if (tmp) SDL_Log("hello"); it would crash, so I believe ubsan is somehow running here.

but I didn't see the unaligned access, any idea ?

Did you download the relevant sample font? See the first wget command I shared.

1bsyl commented 5 months ago

Did you download the relevant sample font? See the first wget command I shared.

yes! also tried with testapp.c with runs a lots a random string. so it would just hit invalid unaligned somehow

matoro commented 5 months ago

Did you download the relevant sample font? See the first wget command I shared.

yes! also tried with testapp.c with runs a lots a random string. so it would just hit invalid unaligned somehow

@1bsyl Here are the exact commands to reproduce the problem:

git clone --tags "https://github.com/libsdl-org/SDL_ttf"
cd SDL_ttf
git checkout release-2.20.2
mkdir build
cd build
CFLAGS="-fsanitize=undefined" cmake ..
make -j$(nproc)
wget 'https://github.com/pygame/pygame/raw/main/src_py/freesansbold.ttf'
cat << EOF > test.c
#include <SDL_ttf.h>

int main()
{
        TTF_Init();
        TTF_RenderUTF8_Solid(
                TTF_OpenFont("./freesansbold.ttf", 24),
                "Test",
                (struct SDL_Color){ 255, 255, 255, 255 }
        );
        return 0;
}
EOF
gcc -fsanitize=undefined -I .. -I /usr/include/SDL2 test.c -o test -L . -lSDL2_ttf
LD_LIBRARY_PATH=. ./test

Pasting this exact block into a terminal reproduces the problem from scratch for me. Do you also see it with this sequence of commands?

1bsyl commented 5 months ago

@matoro Thanks, I can reproduce the issue now. (Just also have to uncomment the SSE detection, other it won't happen). I am going to update the PR.

1bsyl commented 5 months ago

Though, I still cannot run this on SDL3

1bsyl commented 5 months ago

here's a PR: https://github.com/libsdl-org/SDL_ttf/pull/329

1bsyl commented 5 months ago

@matoro please give a try the pr on sparc. also, testapp.c is better for more testing, it does fuzzing and tries many combination of renderering

btw, I also have ubsan reports like that:

SDL_ttf.c:514:9: runtime error: left shift of 134 by 24 places cannot be represented in type 'int' should I fix them ??

matoro commented 5 months ago

@matoro please give a try the pr on sparc. also, testapp.c is better for more testing, it does fuzzing and tries many combination of renderering

PR works, see my comment there.

btw, I also have ubsan reports like that:

SDL_ttf.c:514:9: runtime error: left shift of 134 by 24 places cannot be represented in type 'int' should I fix them ??

Yes, those are indeed also undefined behavior!

1bsyl commented 5 months ago

here's some fix for the left shift https://github.com/libsdl-org/SDL_ttf/pull/330/