Open llvmbot opened 12 years ago
ARM part: patched in r191707
Should it be tied to MaxAtomicPromoteWidth
?
However, the value of MaxAtomicPromoteWidth
is Target
specific, not CPU specific.
For x86, it looks "GCC_HAVE_SYNC_COMPARE_AND_SWAP_8" only apply to i586 or above. So we can't define the 4 macros for all targets. I don't know if it is safe to define GCC_HAVE_SYNC_COMPARE_ANDSWAP{1,2,4} for all targets.
And I can't find a way to compute the macros for each target either
ARMv7a has 64 bit atomics, so you can turn on __GCC_HAVE_SYNC_COMPARE_AND_SWAP_8
for that and above.
Niall
For x86, it looks __GCC_HAVE_SYNC_COMPARE_AND_SWAP_8
only apply to i586 or above. So we can't define the 4 macros for all targets.
I don't know if it is safe to define __GCC_HAVE_SYNC_COMPARE_AND_SWAP_{1,2,4}
for all targets.
And I can't find a way to compute the macros for each target either
please help to review it: http://lists.cs.uiuc.edu/pipermail/cfe-commits/Week-of-Mon-20130826/087642.html
I posted the patch for ARM to clang-commits maillist.
@Michael Spencer: Is there a good reason why your fix adding the __GCC_HAVE_SYNC_COMPARE_AND_SWAP_[1248]
macros in SVN rev 178816 is x86 only?
If you look at http://sourceware.org/ml/libc-ports/2010-11/msg00005.html you'll see that on ARM the use of atomic builtins only happens if __GCC_HAVE_SYNC_COMPARE_AND_SWAP_4
is set.
This rather implies the need for every architecture target, not just x86, to set __GCC_HAVE_SYNC_COMPARE_AND_SWAP_[1248]
macros according to whether that target provides CAS ops of varying sizes.
FYI I just got bit on this bug but with an ARM target, hence me finding this report.
Niall
Fixed in r178816.
Ryan, you are a gentleman and a scholar. Thanks
I think the root problem is that GCC is defining the __GCC_HAVE_SYNC_COMPARE_AND_SWAP_[1248]
macros, whereas Clang is not.
In libstdc++, std::shared_ptr<T>
is derived from
std::__shared_ptr<T, __gnu_cxx::__default_lock_policy>
. __default_lock_policy is a compile-time constant defined in the include/ext/concurrence.h header
. The
value varies depending upon which macros are defined:
// Compile time constant that indicates prefered locking policy in
// the current configuration.
static const _Lock_policy __default_lock_policy =
#ifdef __GTHREADS
#if (defined(__GCC_HAVE_SYNC_COMPARE_AND_SWAP_2) \
&& defined(__GCC_HAVE_SYNC_COMPARE_AND_SWAP_4))
_S_atomic;
#else
_S_mutex;
#endif
#else
_S_single;
#endif
With G++ on x86 and x86_64, the __GCC_HAVE_SYNC_COMPARE_AND_SWAP_[1248]
macros are defined by the compiler, in gcc/c-family/c-cppbuiltin.c
, so __default_lock_policy
is _S_atomic
.
With Clang, the macros are not defined, so __default_lock_policy
is _S_mutex
.
(The built-in macros can be shown by passing -dM -E
to gcc/clang. e.g.: gcc -dM -E - < /dev/null
)
With the thread.c test case, large parts of libstdc++ are inlined and therefore compiled by Clang. The std::thread
constructor calls std::thread::_M_make_routine
, which creates an std::__shared_ptr<T, _S_mutex>
object. The constructor passes this object to std::thread::_M_start_thread
, which is an out-of-line function in libstdc++.so.6
. std::thread::_M_start_thread
expects an std::__shared_ptr<T, _S_atomic>
object. I think this mismatch is what causes the premature destruction of the std::thread::_Impl
object.
Here are the most relevant bits from the 4.7.2 include/std/thread
header:
class thread
{
public:
...
struct _Impl_base;
typedef shared_ptr<_Impl_base> __shared_base_type;
...
template<typename _Callable, typename... _Args>
explicit
thread(_Callable&& __f, _Args&&... __args)
{
_M_start_thread(_M_make_routine(std::__bind_simple(
std::forward<_Callable>(__f),
std::forward<_Args>(__args)...)));
}
...
private:
void
_M_start_thread(__shared_base_type);
template<typename _Callable>
shared_ptr<_Impl<_Callable>>
_M_make_routine(_Callable&& __f)
{
// Create and allocate full data structure, not base.
return std::make_shared<_Impl<_Callable>>
(std::forward<_Callable>(__f));
}
};
If I define the __GCC_HAVE_SYNC_COMPARE_AND_SWAP_[1248]
macros when I build the test case with Clang, then the test case passes, and valgrind is happy.
This ABI issue resulted in an (invalid) gcc bug report, http://gcc.gnu.org/bugzilla/show_bug.cgi?id=42734. There seems to be some acknowledgement there that the macros are expected to change the libstdc++ ABI.
I think the driver needs to define the __GCC_HAVE_SYNC_COMPARE_AND_SWAP_[1248]
macros for libstdc++ compatibility. There is an issue open regarding these macros, but it does not mention ABI compatibility,
http://llvm.org/bugs/show_bug.cgi?id=11174.
The bug shows up as a pure virtual call, but the real error is use-after-free caused by apparent miscompilation of shared_ptr<>
.
I've done deep repro on this, and what I find is that when creating a new thread, libstdc++ std:thread creates an impl object containing the virtual _M_run()
function to run the thread's code. An odd detail is that this impl object contains a shared_ptr<>
to itself; it's a shared_ptr<base>
inside the base type, and the created impl object is always a type derived from this base. This approach, while weird, ensures that the impl object stays alive until the new thread starts running and uses it. Obviously since the impl object contains the virtual _M_run()
that actually runs the thread's code, this is quite important.
But this goes wrong because Clang destroys the impl in the parent thread, before the child thread can use it.
The apparently miscompiled code is in
129 template<typename _Callable, typename... _Args>
130 explicit
131 thread(_Callable&& __f, _Args&&... __args)
132 {
133 _M_start_thread(_M_make_routine(std::__bind_simple(
134 std::forward<_Callable>(__f),
135 std::forward<_Args>(__args)...)));
136 }
Clang destroys the impl with a backtrace that starts at shared_ptr::_M_destroy()
ends at line 135 above. Remember, since the impl contains a shared_ptr
to itself, the reference count should never fall below one until the thread exits, so this destruction is a mistake.
Running under valgrind confirms that there is a use-after-free here.
I don't know why Clang would do this.
Why does this show up as a pure virtual call? Simple: In the process of orderly destruction, the impl's vtable is reset to its base class vtable before the object is finally freed. That base vtable contains a pure virtual function, which is subsequently called. (Even though the memory was freed, the vtable pointer seems to remain unchanged.) I suppose this symptom would not occur if e.g. the base impl class had a pure virtual constructor.
Here's the backtrace, showing premature destruction. We can be sure that this destruction is premature because it's supposed to happen in the child thread, not in the parent.
(gdb) where
#​0 std::_Sp_counted_ptr_inplace<std::thread::_Impl<std::_Bind_simple<void (*())()> >, std::allocator<std::thread::_Impl<std::_Bind_simple<void (*())()> > >, (__gnu_cxx::_Lock_policy)1>::_M_destroy() (this=0x804e008)
at /usr/bin/../lib/gcc/i486-linux-gnu/4.7/../../../../include/c++/4.7/bits/shared_ptr_base.h:418
#​1 0x08048f4f in std::_Sp_counted_base<(__gnu_cxx::_Lock_policy)1>::_M_release (this=0x804e008)
at /usr/bin/../lib/gcc/i486-linux-gnu/4.7/../../../../include/c++/4.7/bits/shared_ptr_base.h:164
#​2 0x08048ecb in std::__shared_count<1>::~__shared_count (this=0xbffff464)
at /usr/bin/../lib/gcc/i486-linux-gnu/4.7/../../../../include/c++/4.7/bits/shared_ptr_base.h:558
#​3 0x08048e97 in std::__shared_count<1>::~__shared_count (this=0xbffff464)
at /usr/bin/../lib/gcc/i486-linux-gnu/4.7/../../../../include/c++/4.7/bits/shared_ptr_base.h:556
#​4 0x08048fac in std::__shared_ptr<std::thread::_Impl_base, 1>::~__shared_ptr (this=0xbffff460)
at /usr/bin/../lib/gcc/i486-linux-gnu/4.7/../../../../include/c++/4.7/bits/shared_ptr_base.h:813
#​5 0x08048f87 in std::shared_ptr<std::thread::_Impl_base>::~shared_ptr (this=0xbffff460)
at /usr/bin/../lib/gcc/i486-linux-gnu/4.7/../../../../include/c++/4.7/bits/shared_ptr.h:93
#​6 0x08048e07 in std::shared_ptr<std::thread::_Impl_base>::~shared_ptr (this=0xbffff460)
at /usr/bin/../lib/gcc/i486-linux-gnu/4.7/../../../../include/c++/4.7/bits/shared_ptr.h:93
#​7 0x08048cd8 in std::thread::thread<void (&)(), >(void (&)(void)) (this=0xbffff4c0, __f=@0x8048a60: {void (void)} 0x8048a60 <f()>)
at /usr/bin/../lib/gcc/i486-linux-gnu/4.7/../../../../include/c++/4.7/thread:133
#​8 0x08048c14 in std::thread::thread<void (&)(), >(void (&)(void)) (this=0xbffff4c0, __f=@0x8048a60: {void (void)} 0x8048a60 <f()>)
at /usr/bin/../lib/gcc/i486-linux-gnu/4.7/../../../../include/c++/4.7/thread:135
#9 0x08048a8c in main () at test.cc:9
Same problem with
clang version 3.2 (branches/release_32 170508)
Target: x86_64-unknown-linux-gnu
Thread model: posix
OS: Fedora 17
gcc version: 4.7.2 20120921 (Red Hat 4.7.2-2)
the backtrace:
#0 0x0000003e51a35935 in __GI_raise (sig=sig@entry=6) at ../nptl/sysdeps/unix/sysv/linux/raise.c:64
#1 0x0000003e51a370e8 in __GI_abort () at abort.c:91
#2 0x0000003e5ce60dad in __gnu_cxx::__verbose_terminate_handler () at ../../../../libstdc++-v3/libsupc++/vterminate.cc:95
#3 0x0000003e5ce5eea6 in __cxxabiv1::__terminate (handler=<optimized out>)
at ../../../../libstdc++-v3/libsupc++/eh_terminate.cc:40
#4 0x0000003e5ce5eed3 in std::terminate () at ../../../../libstdc++-v3/libsupc++/eh_terminate.cc:50
#5 0x0000003e5ce5f9ef in __cxxabiv1::__cxa_pure_virtual () at ../../../../libstdc++-v3/libsupc++/pure.cc:50
#6 0x0000003e5ceb29e0 in std::(anonymous namespace)::execute_native_thread_routine (__p=<optimized out>)
at ../../../../../libstdc++-v3/src/c++11/thread.cc:73
#7 0x0000003e51e07d14 in start_thread (arg=0x7ffff7fe6700) at pthread_create.c:309
#8 0x0000003e51af168d in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:115
62 namespace
63 {
64 extern "C" void*
(gdb) l
65 execute_native_thread_routine(void* __p)
66 {
67 thread::_Impl_base* __t = static_cast<thread::_Impl_base*>(__p);
68 thread::__shared_base_type __local;
69 __local.swap(__t->_M_this_ptr);
70
71 __try
72 {
73 __t->_M_run(); //abort here
74 }
(gdb) l
75 __catch(...)
76 {
77 std::terminate();
78 }
79
80 return 0;
81 }
82 }
83
it seems that __t->_M_run
is a pure virtual function, so the process have been terminated.
Same problem with
clang version 3.3 (trunk 168286)
Target: x86_64-unknown-linux-gnu
Thread model: posix
libstdc++ come with GCC 4.7.2
I suspect this is a problem with some projects lifetime. Some objects' (possibly from libstdc++'s implementation of std::thread
) member function gets called while it is being destroyed. I had seen similar problem with G++ before. But that is a long time ago using pthread without C++11. But the error message is suspiciously similar.
However debugger does little to help. I cannot figure out which member function has being called.
same issue here:
clang version 3.2
Target: x86_64-unknown-linux-gnu
Thread model: posix
I've cloned the svn repository and this problem persists. This is my clang version:
$ clang++ -v
clang version 3.2 (trunk 159725)
Target: x86_64-unknown-linux-gnu
Thread model: posix
It looks like the ref count of the shared_ptr
that holds the internal representation of std::thread
gets to 0, so the thread impl is getting destroyed. I can't figure out why this example does actually work on g++ 4.7, since both compilers use the same headers/libs. This could be a bug in libstdc++'s std::thread implementation, if this code didn't work fine on g++.
Turns out it was just a fluke, and is indeed still broken for me. Not sure how/why it worked...
I just updated clang to r156503, and it seems to work for me now. Closing.
Extended Description