Open juj opened 2 years ago
My read of the standards is that it's undefined behavior, not unspecified behavior. For example, C17 6.5.2.2: "If the expression that denotes the called function has a type that includes a prototype, the number of arguments shall agree with the number of parameters." and "If the function is defined with a type that is not compatible with the type (of the expression) pointed to by the expression that denotes the called function, the behavior is undefined".
Thanks - I have been led astray by previous conversations with people on this matter then.
Btw, do you know why that statement would use the expression "is not compatible with" instead of straight up saying "is not equal to"? What constitutes a compatible type?
Compatible types allow for a few narrow differences; for example, an array type of unknown length is compatible with an array type of known length if the element types are compatible.
6.7.6.3 defines compatibility for function types: "For two function types to be compatible, both shall specify compatible return types. Moreover, the parameter type lists, if both are present, shall agree in the number of parameters and in use of the ellipsis terminator; corresponding parameters shall have compatible types."
Great, that makes total sense.
Back in the day, when porting libraries/projects through Emscripten to run on the web, I recall that FreeType 2, SDL, HarfBuzz, Unity3D and Unreal Engine (3 and 4) were among the projects that had issues with function pointer signatures not matching. With SDL, if I recall correctly, it was purely an oversight/typo, but in other codebases I recall that it was done on purpose.
I am certainly happy to argue to our devs that it should not be done on any platform. From prior conversations with engineers, I was told something along the lines of "yeah, it is undefined by the standard, but for x86 & ARM platforms specifically it is supported in limited circumstances".
Does anyone know whether there are actual precedents in GCC and/or Clang for intentionally supporting certain types of function pointer mismatches on specific x86 and ARM calling conventions?
Off the top of my head, one ancient "we did it by intent" usage is pthread entry points being defined with the signature `void *pthread_main() {}` instead of `void *pthread_main(void*) {}`. I recall one such example from the Open POSIX Test Suite.
Also, do any Clang Sanitizers catch these kinds of errors?
I wrote an LLVM pass for the WebAssembly backend that fixes up such casts by inserting auto-generated wrapper functions, though it doesn't handle all cases. That's all I know.
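For illustration, here is roughly what such a wrapper looks like when written by hand, using the pthread entry point case mentioned above. This is only a sketch: the pass itself works on LLVM IR and its generated code is not shown in this thread, and the names `legacy_thread_main`, `legacy_thread_main_wrapper`, `start_routine_t` and `thread_entry` are made up for the example.

```cpp
// Counter so the example can observe that the legacy function really ran.
static int g_runs = 0;

// Hypothetical legacy entry point, written without the void* parameter.
void *legacy_thread_main() {
    ++g_runs;
    return nullptr;
}

// Hand-written wrapper with the exact signature a pthread start routine
// is expected to have. It accepts the argument and drops it before
// forwarding to the legacy function.
void *legacy_thread_main_wrapper(void * /*unused_arg*/) {
    return legacy_thread_main();
}

// Pointer type of a pthread start routine.
using start_routine_t = void *(*)(void *);

// The returned pointer has the wrapper's own type, so calling through it
// involves no signature mismatch on any target, WebAssembly included.
inline start_routine_t thread_entry() {
    return &legacy_thread_main_wrapper;
}
```

The value returned by `thread_entry()` can be passed wherever a `void *(*)(void *)` is required, with no function pointer cast at the call site.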
In the C/C++ standard, it is unspecified behavior to call a function pointer through a signature that does not match the signature of the function pointed to.
In different x86/ARM ABIs, platform-specific and calling-convention-specific relaxations exist, which make certain types of casts safe for that specific platform and calling convention. E.g. calling a function of type `void (int)` through a function signature `void (int, int)` may work, with the second passed argument safely ignored, or calling a function of type `int (int)` through a function signature `void (int)` may work, with the return value from the function safely ignored.

In WebAssembly however, in comparison to x86/ARM, due to the security requirements, function pointers are treated more strictly, and neither type of relaxation for function pointer casts will work; instead, a Wasm VM will throw an exception at runtime.
That is, in Wasm, calling a function of type `int (void)` through a signature `void (void)` will not work, but will raise a function signature mismatch. Also, calling a function of type `void (int)` through a signature `void (int, int)` will not work, but will raise an exception.

However, in Wasm, calling a function `void (char)` through a signature `void (int)` does work, as does calling a function `void (int *)` through a signature `void (struct foo *)`.

None of the existing casts (`static_cast`, `dynamic_cast`, `reinterpret_cast` and C cast) accurately capture the "what is safe and works for my target platform" aspects of function pointer casting.

This raises an interesting portability and future compatibility opportunity. It seems that on all of these target platforms and calling conventions, the rules of which casts work and which don't could be codified as a static compile-time check, so maybe the Clang front-end could take advantage of this?
I.e. it would be interesting to have a Clang-specific extension, something like `__fp_cast<target_sig>(myfunc)`, which would raise a compile-time error if the target compilation platform + calling convention cannot support the specific signature conversion.

This would have the following benefits:

- [...] `__fp_cast` utilizing test code)".
- [...] `__fp_cast`s, helping developers identify portability problems in existing code that used such casts.

One of the most common runtime crashes when porting large application codebases to WebAssembly occurs with signature mismatches on function pointers (that were fine on x86). Debugging this in large codebases is painful, because one has to exercise all code paths on WebAssembly at runtime to validate that everything is safe - and typically the people doing the porting haven't written even 0.001% of the code.
It seems that this problem would be solvable by a compile-time extension to Clang (plus adopting a convention in the codebase), at least for all current and future target platforms that are Clang-based, as the compiler would be able to catch function pointer casts that the target platform does not support.
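To make the idea concrete, here is a rough library-level approximation in C++17. It is a sketch under loud assumptions, not a proposal for how Clang would implement it: `fp_cast_sketch`, `fp_cast_ok`, `lower` and `WasmType` are hypothetical names, the model assumes a wasm32-like target where integers of up to 32 bits and all pointers are passed as `i32`, and it only covers scalar types (no structs by value, no varargs, no `noexcept` function types). A real check would have to come from the target's ABI rules inside the compiler.

```cpp
#include <type_traits>

// Simplified model of the Wasm value types a C scalar lowers to.
enum class WasmType { none, i32, i64, f32, f64 };

template <typename>
inline constexpr bool always_false = false;

// Assumed lowering for a wasm32-like target (scalars only).
template <typename T>
constexpr WasmType lower() {
    if constexpr (std::is_void_v<T>) {
        return WasmType::none;
    } else if constexpr (std::is_pointer_v<T>) {
        return WasmType::i32;  // assumption: 32-bit pointers
    } else if constexpr (std::is_same_v<T, float>) {
        return WasmType::f32;
    } else if constexpr (std::is_same_v<T, double>) {
        return WasmType::f64;
    } else if constexpr (std::is_integral_v<T> || std::is_enum_v<T>) {
        return sizeof(T) <= 4 ? WasmType::i32 : WasmType::i64;
    } else {
        static_assert(always_false<T>, "type not covered by this sketch");
    }
}

// A function signature after lowering: result type plus parameter types.
template <WasmType Result, WasmType... Params>
struct LoweredSig {};

template <typename Sig>
struct lower_sig;

template <typename R, typename... A>
struct lower_sig<R(A...)> {
    using type = LoweredSig<lower<R>(), lower<A>()...>;
};

// True when both signatures lower to the same Wasm-level signature.
template <typename From, typename To>
inline constexpr bool fp_cast_ok =
    std::is_same_v<typename lower_sig<From>::type,
                   typename lower_sig<To>::type>;

// Hypothetical stand-in for the proposed cast: rejects the conversion at
// compile time when the modelled target cannot support it.
template <typename ToSig, typename FromSig>
ToSig *fp_cast_sketch(FromSig *fn) {
    static_assert(fp_cast_ok<FromSig, ToSig>,
                  "signature conversion not supported on the modelled target");
    return reinterpret_cast<ToSig *>(fn);
}

// The cases discussed in this issue, under the model above.
struct foo;
static_assert(fp_cast_ok<void(char), void(int)>);
static_assert(fp_cast_ok<void(int *), void(foo *)>);
static_assert(!fp_cast_ok<void(int), void(int, int)>);
static_assert(!fp_cast_ok<int(), void()>);

// Small function used only to demonstrate the cast.
inline void takes_char(char) {}
```

With this, `fp_cast_sketch<void(int)>(&takes_char)` compiles, while `fp_cast_sketch<void(int, int)>(&takes_char)` fails the `static_assert`. Note that calling through the converted pointer is still undefined behavior as far as the language standard is concerned; the check only encodes what the modelled target is assumed to tolerate.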
When I read the documentation page at https://clang.llvm.org/docs/LanguageExtensions.html it seems that such a feature does not exist yet? Would this kind of cast make sense to add?
If one existed, it would probably be possible to enforce a programming convention of always using `__fp_cast` in our Unity3D codebase, to help identify portability problems for Unity's future platforms.

CC @sunfishcode @kripken @tlively @sbc100