llvm / llvm-project

The LLVM Project is a collection of modular and reusable compiler and toolchain technologies.
http://llvm.org
Other
29.06k stars 11.98k forks source link

Passing C-style variadic functions through std::function results in "Illegal instruction" error at runtime in programs compiled with clang #61444

Open DD-L opened 1 year ago

DD-L commented 1 year ago

Hello, Can someone explain this?

The following code can be compiled using GCC/MSVC and the resulting program can work normally, outputting the number 1:

#include <functional>
#include <iostream>
#include <string>

typedef std::function<void(int, const std::string&, int)> Fn;

void foo(int a, ...) {
    std::cout << a << std::endl;
}

int main() {
    Fn fn(foo);
    fn(1, "str", 3);
    return 0;
}

However, in almost all versions of Clang (<= ver 16.0.0), the resulting program reports an "Illegal instruction" error. Is this considered a bug in Clang?

Details:

* thread #1, name = 'a.out', stop reason = signal SIGILL: illegal instruction operand
    frame #0: 0x00005555555557e6 a.out`void std::__invoke_impl<void, void (*&)(int, ...), int, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, int>((null)=__invoke_other @ 0x00007fffffffe038, __f=0x00007fffffffe198, __args=0x00007fffffffe134, __args=0x00007fffffffe178, __args=0x00007fffffffe124)(int, ...), int&&, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, int&&) at invoke.h:61:37
   58     template<typename _Res, typename _Fn, typename... _Args>
   59       constexpr _Res
   60       __invoke_impl(__invoke_other, _Fn&& __f, _Args&&... __args)
-> 61       { return std::forward<_Fn>(__f)(std::forward<_Args>(__args)...); }
   62
   63     template<typename _Res, typename _MemFun, typename _Tp, typename... _Args>
   64       constexpr _Res
(lldb) bt
* thread #1, name = 'a.out', stop reason = signal SIGILL: illegal instruction operand
  * frame #0: 0x00005555555557e6 a.out`void std::__invoke_impl<void, void (*&)(int, ...), int, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, int>((null)=__invoke_other @ 0x00007fffffffe038, __f=0x00007fffffffe198, __args=0x00007fffffffe134, __args=0x00007fffffffe178, __args=0x00007fffffffe124)(int, ...), int&&, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, int&&) at invoke.h:61:37
    frame #1: 0x000055555555574c a.out`std::enable_if<__and_<std::is_void<void>, std::__is_invocable<void (*&)(int, ...), int, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, int> >::value, void>::type std::__invoke_r<void, void (__fn=0x00007fffffffe198, __args=0x00007fffffffe134, __args=0x00007fffffffe178, __args=0x00007fffffffe124)(int, ...), int, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, int>(void (*&)(int, ...), int&&, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, int&&) at invoke.h:154:7
    frame #2: 0x00005555555555fc a.out`std::_Function_handler<void (int, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, int), void (*)(int, ...)>::_M_invoke(__functor=0x00007fffffffe198, __args=0x00007fffffffe134, __args=0x00007fffffffe178, __args=0x00007fffffffe124) at std_function.h:290:9
    frame #3: 0x000055555555547d a.out`std::function<void (int, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, int)>::operator(this=0x00007fffffffe198, __args=1, __args=0x00007fffffffe178, __args=3)(int, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, int) const at std_function.h:590:9
    frame #4: 0x00005555555552cc a.out`main at 1.cpp:13:5
    frame #5: 0x00007ffff7a86d90 libc.so.6`__libc_start_call_main(main=(a.out`main at 1.cpp:11), argc=1, argv=0x00007fffffffe2d8) at libc_start_call_main.h:58:16
    frame #6: 0x00007ffff7a86e40 libc.so.6`__libc_start_main_impl(main=(a.out`main at 1.cpp:11), argc=1, argv=0x00007fffffffe2d8, init=0x00007ffff7ffd040, fini=<unavailable>, rtld_fini=<unavailable>, stack_end=0x00007fffffffe2c8) at libc-start.c:392:3
    frame #7: 0x0000555555555175 a.out`_start + 37
DimitryAndric commented 1 year ago

Aren't you "lying" about the function prototype here though? Since the Fn type is declared as a function wrapper for void(int, const std::string&, int) while you are actually calling a void(int, ...) function. As far as I can see, clang inserts a 'ud2' instruction into the virtual destructor for the std::function base class, probably to catch this type of abuse.

Btw, I am unsure, but from some quick searches and reading through spec, it seems that std::function doesn't actually support any form of variadic functions.

DD-L commented 1 year ago

I know that doing this may be "risky", but it works well under GCC/MSVC (it can be compiled correctly and run well). https://godbolt.org/z/Wva5xTh9Y

What I mean is that for ordinary developers, they should be encouraged to trust C++'s "type system". If the program compiled by Clang reports an "Illegal instruction" error when running, then this "risky" code should be reported as a compilation failure during the Clang compilation phase.

shafik commented 1 year ago

I believe this is undefined behavior, I don't think there is a way for std::function to pass on the arguments correctly to a C-style variadic function.

If we are in undefined behavior arena then inconsistent behavior between compilers is one possible symptom, crashing is again another possible result.

CC @mordante who may be able to provide more details from the libc++ side.

tahonermann commented 1 year ago

I think the behavior here is conditionally supported with implementation-defined behavior per [expr.call]p11 due to the attempt to pass an object of type std::string to foo(). A conforming implementation can handle this as (essentially) undefined behavior.

mordante commented 1 year ago

I know that doing this may be "risky", but it works well under GCC/MSVC (it can be compiled correctly and run well). https://godbolt.org/z/Wva5xTh9Y

If this is undefined behaviour, GCC's and MSVC's behaviour can differ from Clang's. I didn't bother to look in the Standard to see whether it's valid. (edit I see Tom just posted a link to the relevant part of the Standard.)

Looking at Clang here https://godbolt.org/z/b6dvK3Y9e it's clear it unconditionally inserts an ud2 instruction. This happens with libstdc++ and libc++ so I don't expect this to be an issue in the library.

What I mean is that for ordinary developers, they should be encouraged to trust C++'s "type system". If the program compiled by Clang reports an "Illegal instruction" error when running, then this "risky" code should be reported as a compilation failure during the Clang compilation phase.

Using variadic functions is not what I consider trusting the "type system".

DD-L commented 1 year ago

Thank you all, especially thanks to @mordante.

Using variadic functions is not what I consider trusting the "type system".

C-style variadic functions cannot be guaranteed by C++'s type system, which is a problem of C since it allows for the occurrence of such undefined behavior. This is a cost that must be paid in order to be compatible with C.

I have temporarily compromised because I have been convinced. However, the unease in my heart still lingers, and it has changed my intuitive understanding of std::function.

DD-L commented 1 year ago

Or, is there a way to emit a compilation warning at some stage during Clang compilation, reminding developers that such code may lead to UB issues?

Could you please consider this suggestion?

shafik commented 1 year ago

Note, you will receive a diagnostic if you attempt to pass a non-trivial type to a variadic directly: https://godbolt.org/z/9c79z9j9Y

#include <iostream>
#include <string>

void foo(int a, ...) {
    std::cout << a << std::endl;
}

int main() {
    std::string s;
    foo(1, s, 3);
    return 0;
}
error: cannot pass object of non-trivial type 'std::string' (aka 'basic_string<char>') through variadic function; call will abort at runtime [-Wnon-pod-varargs]
    foo(1, s, 3);
           ^

I am not sure how difficult it would be to diagnose in your original case though.

tahonermann commented 1 year ago

I wouldn't be surprised if that diagnostic would be issued by the instantiation of std::function if it weren't suppressed due to occurrence in a system header.

I suspect it is possible to implement std::function such that it would fail a static_assert on detection of an object of class type being passed to a variadic function. I wouldn't want to require that though; I would put that in the implementation-heroics bucket.

mordante commented 1 year ago

That might be possible, but I'm not sure libc++ should. If Clang decides to add support for non-trivial destructible types in variadic functions, libc++'s static_assert is wrong. (I guess the non-trivial destructible part is the issue that cause Clang to add an ud2 instruction.)

@DD-L I really suggest to only use POD types in variadic functions. Or even better use an alternative like templated functions with parameter packs.

DD-L commented 1 year ago

Note, you will receive a diagnostic if you attempt to pass a non-trivial type to a variadic directly: https://godbolt.org/z/9c79z9j9Y

@shafik Yes, since Clang can provide a compile-time error report (with -Wnon-pod-varargs) when passing non-trivial types to variadic functions without using std::function wrapping, it should also give corresponding diagnostic information when using std::function.

DD-L commented 1 year ago

@DD-L I really suggest to only use POD types in variadic functions.

@mordante Yes, changing the reference types in the code to pointer types would avoid these troubles.

typedef std::function<void(int, const std::string*, int)> Fn;

Fn fn(foo);
std::string str;
fn(1, &str, 3);