Closed avaxar closed 6 months ago
I will try your example later to see what happens on my machine. I usually use static bindings so sometimes I miss these things. In the meantime, could you please provide a stack trace? (e.g. by using gdb) Even if I experience a segmentation fault there’s no guarantee it’s caused by the same thing, since it’s such a generic error.
Thread 1 (Thread 0x7ffff7c7fdc0 (LWP 471) "testd"):
#0 0x0000555555612ec0 in SDL_snprintf ()
#1 0x00007ffff70414d7 in load_music_type () at /lib/x86_64-linux-gnu/libSDL2_mixer.so
#2 0x00007ffff70418fa in Mix_Init () at /lib/x86_64-linux-gnu/libSDL2_mixer.so
#3 0x00005555555b16a0 in Mix_Init (flags=16) at ../.dub/packages/bindbc-sdl/1.4.5/bindbc-sdl/source/sdl_mixer.d-mixin-191:270
#4 0x00005555555ae764 in _Dmain () at source/app.d:6
This issue is very odd. It only occurs with SDL Mixer, not SDL TTF, SDL Net, or SDL Image. I'm able to reproduce it on a Debian machine, but it's a bit hard to tell if it's a problem with the bindings at all. As for your hypothesis that it's using the wrong SDL internally, loading different versions of SDL/SDL Mixer manually by supplying a string to loadSDL
/loadSDLMixer
didn't help in my case.
I had very similar problem very recently, so I looked into bindbc code and found cause of this issue. Basically bindbc-common overwrites all symbols in library with proto functions which then call function pointer loaded from library. All symbols dynamically linked by libraries (e.g. SDL_Image loading SDL_SetError symbol) link to this proto functions. Problem happens when function is variadic in which case bindbc-common codegen instead of making proto function just overwrites function symbol with function pointer. It work's from D code, which knows that it's pointer. But other libraries links to pointer while expecting function there. This way every library which depends on SDL would crash trying to call wrong address.
Example of code generated by bindbc-common templates:
alias _pSDL_SetError = extern(C) void function(const(char)* fmt, ...
alias _pSDL_GetError = extern(C) const(char)* function();
extern(C) _pSDL_SetError SDL_SetError; // overwritten symbol with function pointer - crash when called from SDL_xxx
package _pSDL_GetError _SDL_GetError;
extern(C) const(char)* SDL_GetError(){ return _SDL_GetError(__traits(parameters)); // proto function works fine
I've tested my suspicions by changing mangleof
to stringof
in bindbc-common and removing extern(C)
for variadic functions and everything works fine.
This problem might be Linux specific due to it's linking mechanism. On Linux all symbols are exported by default which is not the case e.g. for Windows. (I haven't worked on Windows for a very long time, so I'm not sure if it can happend on Windows in some circumstances. Theoretically it's possible, but it depends on how Windows resolves symbols exported multiple times from different libraries)
Example of code generated by bindbc-common templates:
alias _pSDL_SetError = extern(C) void function(const(char)* fmt, ... alias _pSDL_GetError = extern(C) const(char)* function(); extern(C) _pSDL_SetError SDL_SetError; // overwritten symbol with function pointer - crash when called from SDL_xxx package _pSDL_GetError _SDL_GetError; extern(C) const(char)* SDL_GetError(){ return _SDL_GetError(__traits(parameters)); // proto function works fine
Ah, thank you for finding this! It wouldn’t be the first time I’ve messed that up either.
I always forget to make sure that I’m only using extern(C)
for the function pointer type (‘function uses C’s calling convention’), and never extern(C)
for the function pointer variable (‘variable should be exported with C mangling, which obviously overwrites the one from the library were binding to’).
I've tested my suspicions by changing
mangleof
tostringof
in bindbc-common and removingextern(C)
for variadic functions and everything works fine.
I’m not sure exactly what you mean by using stringof
instead of mangleof
? Surely just removing extern(C)
from the function pointer value should do the trick?
This issue has been resolved with BindBC-Common v0.1.4
I’m not sure exactly what you mean by using stringof instead of mangleof? Surely just removing extern(C) from the function pointer value should do the trick?
Code change
- lib.bindSymbol(cast(void**)&` ~ iden ~ `, here.` ~ iden ~ `.mangleof);`;
+ lib.bindSymbol(cast(void**)&` ~ iden ~ `, here.` ~ iden ~ `.stringof);`;
From perspective of dynamic loading only function types/declarations need to be extern()
and proto functions are not required at all. What surprised me is that BindBC now aims also extern C++ and objective-C. Probably for that reason there are proto functions at all. Symbol binding use ProtoFunc.mangleof
to get symbol name. As I'm using bindbc to load only C libraries changing mangleof
to stringof
fixed issue. But other languages has different mangling.
With just removed extern(C)
code would stop working. Because now mangling for variadic functions will be D mangling, so wrong symbol name during symbol loading.
proto functions are not required at all. Probably for that reason there are proto functions at all.
No. I have wrapper functions so that the syntax for function calls and references between static and dynamic builds is the same.
Because now mangling for variadic functions will be D mangling, so wrong symbol name during symbol loading.
The issue was resolved by removing only the extern from the function pointer variable, as I suspected would be the case. I don’t think you understood me.
Interesting. Side effect of that is that external library which call SDL function would call function which then calls function pointer, so we've got additional indirection (two jumps/function calls instead of one. At least for linux where symbol from executable is taken).
Back to the topic. I've tested latest bindbc-common and as I suspected it doesn't work at all for me. Removing extern(C)
fix linking problem, but introduce new issue with symbol names. Code which I added in previous response should explain that. bindSymbol
load symbol by name. There is no proto function for variadic functions. You are getting symbol name by variadic_pointer.mangleof
.
Example of error from loadSDL:
Missing Symbol: _D3sdl5error12SDL_SetErrorPUNbNiPxaYv
If SDL_SetError is not extern(C)
then SDL_SetError.mangleof
return _D3sdl5error12SDL_SetErrorPUNbNiPxaYv
(D mangling) instead of SDL_SetError
(C mangling)
Example of error from loadSDL:
Missing Symbol: _D3sdl5error12SDL_SetErrorPUNbNiPxaYv
Ah yes... I didn't test for that properly of course. 🤦♀️ My apologies, that should also be fixed now in v0.1.5
.
Initializing SDL2_mixer using
Mix_Init
supplied with any argument (other than0
) causes a segfault with error code-11
, when aloadSDL
call is present before it. This bug is present in WSL Ubuntu, where I tested it, and it might be present on native Linux machines as well.Upon debugging, the segfault is thrown in an
SDL_snprintf
call, which's invoked fromload_music_type
(an internal SDL2_mixer function), which's invoked from the user-calledMix_Init
function. Could SDL2_mixer possibly be trying to call from a different instance of SDL2 than the one loaded by bindbc-sdl?Minimal reproduceable example