libstdc++ `std::thread`, when compiled with clang, "pure virtual method called"

llvmbot commented 12 years ago


Bugzilla Link	12730
Version	trunk
OS	Linux
Reporter	LLVM Bugzilla Contributor
CC	@Bigcheese,@DougGregor,@ismail,@rprichard

Extended Description

$ cat thread.cc

#include <thread>

void f(){}

int main()
{
   std::thread t(f);
   t.join();
}

$ clang++ -v
clang version 3.2 (trunk)
Target: x86_64-unknown-linux-gnu
Thread model: posix
$ clang++ -std=c++11 thread.cc -o thread -pthread

$ ./thread 
pure virtual method called
terminate called without an active exception
Aborted

llvmbot commented 11 years ago

ARM part: patched in r191707

llvmbot commented 11 years ago

Should it be tied to MaxAtomicPromoteWidth ? However, the value of MaxAtomicPromoteWidth is Target specific, not CPU specific.

llvmbot commented 11 years ago

For x86, it looks "GCC_HAVE_SYNC_COMPARE_AND_SWAP_8" only apply to i586 or above. So we can't define the 4 macros for all targets. I don't know if it is safe to define GCC_HAVE_SYNC_COMPARE_ANDSWAP{1,2,4} for all targets.

And I can't find a way to compute the macros for each target either

ARMv7a has 64 bit atomics, so you can turn on __GCC_HAVE_SYNC_COMPARE_AND_SWAP_8 for that and above.

Niall

llvmbot commented 11 years ago

For x86, it looks __GCC_HAVE_SYNC_COMPARE_AND_SWAP_8 only apply to i586 or above. So we can't define the 4 macros for all targets. I don't know if it is safe to define __GCC_HAVE_SYNC_COMPARE_AND_SWAP_{1,2,4} for all targets.

And I can't find a way to compute the macros for each target either

llvmbot commented 11 years ago

please help to review it: http://lists.cs.uiuc.edu/pipermail/cfe-commits/Week-of-Mon-20130826/087642.html

llvmbot commented 11 years ago

I posted the patch for ARM to clang-commits maillist.

llvmbot commented 11 years ago

@Michael Spencer: Is there a good reason why your fix adding the __GCC_HAVE_SYNC_COMPARE_AND_SWAP_[1248] macros in SVN rev 178816 is x86 only?

If you look at http://sourceware.org/ml/libc-ports/2010-11/msg00005.html you'll see that on ARM the use of atomic builtins only happens if __GCC_HAVE_SYNC_COMPARE_AND_SWAP_4 is set.

This rather implies the need for every architecture target, not just x86, to set __GCC_HAVE_SYNC_COMPARE_AND_SWAP_[1248] macros according to whether that target provides CAS ops of varying sizes.

FYI I just got bit on this bug but with an ARM target, hence me finding this report.

Niall

Bigcheese commented 11 years ago

Fixed in r178816.

llvmbot commented 11 years ago

Ryan, you are a gentleman and a scholar. Thanks

rprichard commented 11 years ago

I think the root problem is that GCC is defining the __GCC_HAVE_SYNC_COMPARE_AND_SWAP_[1248] macros, whereas Clang is not.

In libstdc++, std::shared_ptr<T> is derived from std::__shared_ptr<T, __gnu_cxx::__default_lock_policy>. __default_lock_policy is a compile-time constant defined in the include/ext/concurrence.h header. The value varies depending upon which macros are defined:

  // Compile time constant that indicates prefered locking policy in
  // the current configuration.
  static const _Lock_policy __default_lock_policy = 
#ifdef __GTHREADS
#if (defined(__GCC_HAVE_SYNC_COMPARE_AND_SWAP_2) \
     && defined(__GCC_HAVE_SYNC_COMPARE_AND_SWAP_4))
  _S_atomic;
#else
  _S_mutex;
#endif
#else
  _S_single;
#endif

With G++ on x86 and x86_64, the __GCC_HAVE_SYNC_COMPARE_AND_SWAP_[1248] macros are defined by the compiler, in gcc/c-family/c-cppbuiltin.c, so __default_lock_policy is _S_atomic.

With Clang, the macros are not defined, so __default_lock_policy is _S_mutex.

(The built-in macros can be shown by passing -dM -E to gcc/clang. e.g.: gcc -dM -E - < /dev/null)

With the thread.c test case, large parts of libstdc++ are inlined and therefore compiled by Clang. The std::thread constructor calls std::thread::_M_make_routine, which creates an std::__shared_ptr<T, _S_mutex> object. The constructor passes this object to std::thread::_M_start_thread, which is an out-of-line function in libstdc++.so.6. std::thread::_M_start_thread expects an std::__shared_ptr<T, _S_atomic> object. I think this mismatch is what causes the premature destruction of the std::thread::_Impl object.

Here are the most relevant bits from the 4.7.2 include/std/thread header:

  class thread
  {
  public:
    ...
    struct _Impl_base;
    typedef shared_ptr<_Impl_base>  __shared_base_type;
    ...
    template<typename _Callable, typename... _Args>
      explicit 
      thread(_Callable&& __f, _Args&&... __args)
      {
        _M_start_thread(_M_make_routine(std::__bind_simple(
                std::forward<_Callable>(__f),
                std::forward<_Args>(__args)...)));
      }
    ...
  private:
    void
    _M_start_thread(__shared_base_type);

    template<typename _Callable>
      shared_ptr<_Impl<_Callable>>
      _M_make_routine(_Callable&& __f)
      {
    // Create and allocate full data structure, not base.
    return std::make_shared<_Impl<_Callable>>  
            (std::forward<_Callable>(__f));
      }
  };

If I define the __GCC_HAVE_SYNC_COMPARE_AND_SWAP_[1248] macros when I build the test case with Clang, then the test case passes, and valgrind is happy.

This ABI issue resulted in an (invalid) gcc bug report, http://gcc.gnu.org/bugzilla/show_bug.cgi?id=42734. There seems to be some acknowledgement there that the macros are expected to change the libstdc++ ABI.

I think the driver needs to define the __GCC_HAVE_SYNC_COMPARE_AND_SWAP_[1248] macros for libstdc++ compatibility. There is an issue open regarding these macros, but it does not mention ABI compatibility, http://llvm.org/bugs/show_bug.cgi?id=11174.

llvmbot commented 11 years ago

backtrace of premature destruction

llvmbot commented 11 years ago

The bug shows up as a pure virtual call, but the real error is use-after-free caused by apparent miscompilation of shared_ptr<>.

I've done deep repro on this, and what I find is that when creating a new thread, libstdc++ std:thread creates an impl object containing the virtual _M_run() function to run the thread's code. An odd detail is that this impl object contains a shared_ptr<> to itself; it's a shared_ptr<base> inside the base type, and the created impl object is always a type derived from this base. This approach, while weird, ensures that the impl object stays alive until the new thread starts running and uses it. Obviously since the impl object contains the virtual _M_run() that actually runs the thread's code, this is quite important.

But this goes wrong because Clang destroys the impl in the parent thread, before the child thread can use it.

The apparently miscompiled code is in ; I quote here from libstdc++ 4.7.2:

129     template<typename _Callable, typename... _Args>
130       explicit 
131       thread(_Callable&& __f, _Args&&... __args)
132       {
133         _M_start_thread(_M_make_routine(std::__bind_simple(
134                 std::forward<_Callable>(__f),
135                 std::forward<_Args>(__args)...)));
136       }

Clang destroys the impl with a backtrace that starts at shared_ptr::_M_destroy() ends at line 135 above. Remember, since the impl contains a shared_ptr to itself, the reference count should never fall below one until the thread exits, so this destruction is a mistake.

Running under valgrind confirms that there is a use-after-free here.

I don't know why Clang would do this.

Why does this show up as a pure virtual call? Simple: In the process of orderly destruction, the impl's vtable is reset to its base class vtable before the object is finally freed. That base vtable contains a pure virtual function, which is subsequently called. (Even though the memory was freed, the vtable pointer seems to remain unchanged.) I suppose this symptom would not occur if e.g. the base impl class had a pure virtual constructor.

Here's the backtrace, showing premature destruction. We can be sure that this destruction is premature because it's supposed to happen in the child thread, not in the parent.

(gdb) where
#&#8203;0  std::_Sp_counted_ptr_inplace<std::thread::_Impl<std::_Bind_simple<void (*())()> >, std::allocator<std::thread::_Impl<std::_Bind_simple<void (*())()> > >, (__gnu_cxx::_Lock_policy)1>::_M_destroy() (this=0x804e008)
    at /usr/bin/../lib/gcc/i486-linux-gnu/4.7/../../../../include/c++/4.7/bits/shared_ptr_base.h:418
#&#8203;1  0x08048f4f in std::_Sp_counted_base<(__gnu_cxx::_Lock_policy)1>::_M_release (this=0x804e008)
    at /usr/bin/../lib/gcc/i486-linux-gnu/4.7/../../../../include/c++/4.7/bits/shared_ptr_base.h:164
#&#8203;2  0x08048ecb in std::__shared_count<1>::~__shared_count (this=0xbffff464)
    at /usr/bin/../lib/gcc/i486-linux-gnu/4.7/../../../../include/c++/4.7/bits/shared_ptr_base.h:558
#&#8203;3  0x08048e97 in std::__shared_count<1>::~__shared_count (this=0xbffff464)
    at /usr/bin/../lib/gcc/i486-linux-gnu/4.7/../../../../include/c++/4.7/bits/shared_ptr_base.h:556
#&#8203;4  0x08048fac in std::__shared_ptr<std::thread::_Impl_base, 1>::~__shared_ptr (this=0xbffff460)
    at /usr/bin/../lib/gcc/i486-linux-gnu/4.7/../../../../include/c++/4.7/bits/shared_ptr_base.h:813
#&#8203;5  0x08048f87 in std::shared_ptr<std::thread::_Impl_base>::~shared_ptr (this=0xbffff460)
    at /usr/bin/../lib/gcc/i486-linux-gnu/4.7/../../../../include/c++/4.7/bits/shared_ptr.h:93
#&#8203;6  0x08048e07 in std::shared_ptr<std::thread::_Impl_base>::~shared_ptr (this=0xbffff460)
    at /usr/bin/../lib/gcc/i486-linux-gnu/4.7/../../../../include/c++/4.7/bits/shared_ptr.h:93
#&#8203;7  0x08048cd8 in std::thread::thread<void (&)(), >(void (&)(void)) (this=0xbffff4c0, __f=@0x8048a60: {void (void)} 0x8048a60 <f()>)
    at /usr/bin/../lib/gcc/i486-linux-gnu/4.7/../../../../include/c++/4.7/thread:133
#&#8203;8  0x08048c14 in std::thread::thread<void (&)(), >(void (&)(void)) (this=0xbffff4c0, __f=@0x8048a60: {void (void)} 0x8048a60 <f()>)
    at /usr/bin/../lib/gcc/i486-linux-gnu/4.7/../../../../include/c++/4.7/thread:135
#9  0x08048a8c in main () at test.cc:9

llvmbot commented 11 years ago

Same problem with

clang version 3.2 (branches/release_32 170508)
Target: x86_64-unknown-linux-gnu
Thread model: posix
OS: Fedora 17
gcc version:  4.7.2 20120921 (Red Hat 4.7.2-2)

the backtrace:

#0  0x0000003e51a35935 in __GI_raise (sig=sig@entry=6) at ../nptl/sysdeps/unix/sysv/linux/raise.c:64
#1  0x0000003e51a370e8 in __GI_abort () at abort.c:91
#2  0x0000003e5ce60dad in __gnu_cxx::__verbose_terminate_handler () at ../../../../libstdc++-v3/libsupc++/vterminate.cc:95
#3  0x0000003e5ce5eea6 in __cxxabiv1::__terminate (handler=<optimized out>)
    at ../../../../libstdc++-v3/libsupc++/eh_terminate.cc:40
#4  0x0000003e5ce5eed3 in std::terminate () at ../../../../libstdc++-v3/libsupc++/eh_terminate.cc:50
#5  0x0000003e5ce5f9ef in __cxxabiv1::__cxa_pure_virtual () at ../../../../libstdc++-v3/libsupc++/pure.cc:50
#6  0x0000003e5ceb29e0 in std::(anonymous namespace)::execute_native_thread_routine (__p=<optimized out>)
    at ../../../../../libstdc++-v3/src/c++11/thread.cc:73
#7  0x0000003e51e07d14 in start_thread (arg=0x7ffff7fe6700) at pthread_create.c:309
#8  0x0000003e51af168d in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:115

62        namespace
63        {
64          extern "C" void*
(gdb) l
65          execute_native_thread_routine(void* __p)
66          {
67            thread::_Impl_base* __t = static_cast<thread::_Impl_base*>(__p);
68            thread::__shared_base_type __local;
69            __local.swap(__t->_M_this_ptr);
70
71            __try
72              {
73                __t->_M_run(); //abort here
74              }
(gdb) l
75            __catch(...)
76              {
77                std::terminate();
78              }
79
80            return 0;
81          }
82        }
83

it seems that __t->_M_run is a pure virtual function, so the process have been terminated.

llvmbot commented 11 years ago

Same problem with

clang version 3.3 (trunk 168286)
Target: x86_64-unknown-linux-gnu
Thread model: posix

libstdc++ come with GCC 4.7.2

I suspect this is a problem with some projects lifetime. Some objects' (possibly from libstdc++'s implementation of std::thread) member function gets called while it is being destroyed. I had seen similar problem with G++ before. But that is a long time ago using pthread without C++11. But the error message is suspiciously similar.

However debugger does little to help. I cannot figure out which member function has being called.

llvmbot commented 12 years ago

same issue here:

clang version 3.2 
Target: x86_64-unknown-linux-gnu
Thread model: posix

llvmbot commented 12 years ago

I've cloned the svn repository and this problem persists. This is my clang version:

$ clang++ -v
clang version 3.2 (trunk 159725)
Target: x86_64-unknown-linux-gnu
Thread model: posix

It looks like the ref count of the shared_ptr that holds the internal representation of std::thread gets to 0, so the thread impl is getting destroyed. I can't figure out why this example does actually work on g++ 4.7, since both compilers use the same headers/libs. This could be a bug in libstdc++'s std::thread implementation, if this code didn't work fine on g++.

llvmbot commented 12 years ago

Valgrind's output

llvmbot commented 12 years ago

Turns out it was just a fluke, and is indeed still broken for me. Not sure how/why it worked...

llvmbot commented 12 years ago

I just updated clang to r156503, and it seems to work for me now. Closing.

llvm / llvm-project

libstdc++ `std::thread`, when compiled with clang, "pure virtual method called" #13102

Extended Description