azihassan / devilutionX

Diablo build for modern operating systems
Other
4 stars 1 forks source link

Crash in palette_update code #14

Open azihassan opened 2 weeks ago

azihassan commented 2 weeks ago

Recent builds cause the game to crash after the splash screen. The culprit seems to be either palette_update or the underlying SDLC_SetSurfaceAndPaletteColors function.

In the previous release, SDLC_SetSurfaceAndPaletteColors was causing an unrelated error (failure to read an mpq from /vmu). The return SDL_SetPalette(surface, SDL_LOGPAL, colors, firstcolor, ncolors) - 1; was returning -1 and triggering the < 0 check in palette_update:

    if (SDLC_SetSurfaceAndPaletteColors(PalSurface, Palette.get(), system_palette.data(), first, ncolor) < 0) {
        ErrSdl(); //<<< here
    }

I'm assuming that ErrSdl displays the most recent error, which was caused by a normal MPQ search path anomaly (no MPQs on the VMU). But it gets triggered because SDL_SetPalette probably returned 0 and didn't change whatever error state ErrSdl pulls errors from. Removing - 1 caused the game to crash (or freeze, I don't remember) on real hardware, and skipping the SDL_SetPalette call altogether allowed it to run normally. I'm assuming it's because the SDL_SetColors call above it already updates the palette internally when 8 bpp and SDL_HWPALETTE are enabled (emphasis mine):

Palettized (8-bit) screen surfaces with the SDL_HWPALETTE flag have two palettes, a logical palette that is used for mapping blits to/from the surface and a physical palette (that determines how the hardware will map the colors to the display). SDL_SetColors modifies both palettes (if present), and is equivalent to calling SDL_SetPalette with the flags set to (SDL_LOGPAL | SDL_PHYSPAL).

However with the fork and SDL port updated to the latest version, this no longer works.

azihassan commented 2 weeks ago

Thanks @glebm. I tested it but the issue is very flaky. Sometimes it works when there are logs and sometimes not, it hangs on Flycast but reboots the console, etc.

I found a similar issue that was solved by disabling LTO. Doing it with set(DISABLE_LTO ON) was not enough, so I specifically requested it by manually setting -flto=none. I'm gonna have to recompile in verbose mode to see what's going on