SiegeLord / DAllegro5

D binding to the Allegro5 game development library
Other
42 stars 15 forks source link

al_get_pixel corrupts stack, Windows, 64-bit LDC #48

Closed SimonN closed 5 years ago

SimonN commented 5 years ago
  1. Windows 7 (but probably all Windowses)
  2. Install LDC 64-bit for Windows, version 1.13.0-beta1 or newer.
  3. Install to LDC's bin and lib dirs the NuGet Allegro 5 64-bit DLLs and LIBs.
  4. Write a test program that calls al_get_pixel. (Its source really be here in this issue, for convenience)
  5. Build test program with LDC 64-bit.
  6. Run test progam in a debugger and examine the crash: It's a corrupted stack.
[New Thread 4672.0x2e4] [New Thread 4672.0x5e0] [New Thread 4672.0x828] [New Thread 4672.0x1238] [New Thread 4672.0x13a4] 
Thread 1 received signal SIGSEGV, Segmentation fault.
0x000000013f6ee829 in ?? ()
(gdb) bt
#0  0x000000013f6ee829 in ?? ()
#1  0x3f5d172d00000000 in ?? ()
Backtrace stopped: previous frame inner to this frame (corrupt stack?)

Speculation: DAllegro5 inserts extra assembler for MinGW (checked with a version statement) to fix MinGW 4.5's bizarre calling convention. Since LDC now relies on MinGW but apparently doesn't need that extra assembler, the extra assembler will corrupt the call stack instead of fix the call stack.

LDC 1.13.0-beta1 release notes LDC readme for this change to MinGW

Workaround: Remove all ColorWrapper mixins, instead declare all functions normally as @nogc nothrow extern(C). The downside of that solution is that it removes the assembler even on systems where (the hack around MinGW 4.5's calling convention) is required.

A better solution should instead have a version statement that doesn't insert the assembler with LDC.

Bonus issue (that should really be a separate issue, but is related): DAllegro's al_get_pixel are wrappers, not bindings, even on systems where no anti-bizarrity is needed. I'd like it to be only a binding wherever possible, then we don't have to link against an unneeded library allegro_color.

SimonN commented 5 years ago

Reproduced this on Linux with the Windows version of LDC 1.13.0-beta-1 in Wine. The crash is:

00df:err:seh:call_stack_handlers invalid frame 23f1bfffffffc0 (0x142000-0x240000)
00df:err:seh:NtRaiseException Exception frame is not in stack limits => unable to dispatch exception.
Program exited with code -1073741819

We're interested in versions for the version statement to apply anti-bizarrity only where necessary. Both on Windows and on Linux/Wine, LDC 1.13.0-beta-1 has the following versions set/not set:

SimonN commented 5 years ago

Most likely, this issue is not a bug in DAllegro5, but my own failure: For the Win64 LDC build, I'm using the 64-bit Allegro DLLs from Nuget that I had not used before. Maybe these 64-bit A5 DLLs have the MSVC calling convention and my old 32-bit DLLs had the MinGW calling convention.

I've begun to pass the D version ALLEGRO_MSVC when building my game on 64-bit Windows and that works with my 64-bit DLLs: Win64 game builds and doesn't crash with DAllegro5 current master fc422dbce.

Minor unhappiness remains:

SiegeLord commented 5 years ago

Ok, that makes sense. What I'm going to do is reverse the wrapping logic to require a version for MinGW 4.5 workaround, so you won't need to pass any version to get it to work by default.

Thanks for your investigation!

SimonN commented 5 years ago

Closing the original issue (corrupt stack) is correct.


Re reversal of the wrapping logic: That would indeed match expectations; the normal behavior does nothing out of the ordinary and any odd workarounds are opt-in.

But take care with backwards-compatibility: Such a reversal can break the build of 32-bit Windows games that rely on the MinGW calling convention, but whose dub.json does not specify any version flags. And DMD 32-bit is a common choice on Windows.

Maybe 64-bit Windows builds should never get the calling convention workaround? I have no idea what's common here. On the other hand, let's keep the versioning logic simple...