Closed nefethael closed 5 years ago
It would be better to release that lock, but it's not necessary. notify_all
doesn't block on the other threads and it will release the lock when makeWriterExit
returns.
Notes from the notify_all reference:
The notifying thread does not need to hold the lock on the same mutex as the one held by the waiting thread(s); in fact doing so is a pessimization, since the notified thread would immediately block again, waiting for the notifying thread to release the lock.
But we should do something like this.
If you can reproduce consistently, can you attach a debugger to the process to see what this recorder thread is doing or add some print statements.
It might be hanging in this transport code that's uploading the span information.
Sorry for delay, indeed it seems to be a curl problem
#0 0x0000003d0660888d in __GI___pthread_timedjoin_ex (threadid=139711098750720, thread_return=0x0, abstime=0x0, block=<optimized out>) at /usr/src/debug/glibc/2.27-r0/git/nptl/pthread_join_common.c:89
#1 0x0000003d0aeba6c3 in std::thread::join() () from target:/usr/lib/libstdc++.so.6
#2 0x0000003d082201b6 in zipkin::ReporterImpl::~ReporterImpl() () from target:/usr/lib/libzipkin.so.0
#3 0x0000003d082202d9 in zipkin::ReporterImpl::~ReporterImpl() () from target:/usr/lib/libzipkin.so.0
#4 0x0000003d0821d221 in ?? () from target:/usr/lib/libzipkin.so.0
#5 0x0000003d08a29b30 in ?? () from target:/usr/lib/libzipkin_opentracing.so.0
#6 0x000000000044cc75 in std::_Sp_counted_base<(__gnu_cxx::_Lock_policy)2>::_M_release (this=0x720710) at /add/sdk/2.5.1/sysroots/corei7-64-cube-linux/usr/include/c++/7.3.0/bits/shared_ptr_base.h:154
#7 std::__shared_count<(__gnu_cxx::_Lock_policy)2>::~__shared_count (this=0x70a4b8, __in_chrg=<optimized out>) at /add/sdk/2.5.1/sysroots/corei7-64-cube-linux/usr/include/c++/7.3.0/bits/shared_ptr_base.h:684
#8 std::__shared_ptr<opentracing::v2::Tracer, (__gnu_cxx::_Lock_policy)2>::~__shared_ptr (this=0x70a4b0, __in_chrg=<optimized out>)
at /add/sdk/2.5.1/sysroots/corei7-64-cube-linux/usr/include/c++/7.3.0/bits/shared_ptr_base.h:1123
#9 std::shared_ptr<opentracing::v2::Tracer>::~shared_ptr (this=0x70a4b0, __in_chrg=<optimized out>) at /add/sdk/2.5.1/sysroots/corei7-64-cube-linux/usr/include/c++/7.3.0/bits/shared_ptr.h:93
#0 0x0000003d062e82b9 in __GI___poll (fds=0x7f1106696f30, nfds=1, timeout=1000) at /usr/src/debug/glibc/2.27-r0/git/sysdeps/unix/sysv/linux/poll.c:29
#1 0x0000003d0aa2edf9 in ?? () from target:/usr/lib/libcurl.so.4
#2 0x0000003d0aa292d2 in curl_multi_wait () from target:/usr/lib/libcurl.so.4
#3 0x0000003d0aa241f3 in curl_easy_perform () from target:/usr/lib/libcurl.so.4
#4 0x0000003d08220eb1 in zipkin::ZipkinHttpTransporter::transportSpans(zipkin::SpanBuffer&) () from target:/usr/lib/libzipkin.so.0
#5 0x0000003d08220847 in zipkin::ReporterImpl::writeReports() () from target:/usr/lib/libzipkin.so.0
#6 0x0000003d0aeba46f in ?? () from target:/usr/lib/libstdc++.so.6
#7 0x0000003d06607477 in start_thread (arg=0x7f1106698700) at /usr/src/debug/glibc/2.27-r0/git/nptl/pthread_create.c:463
#8 0x0000003d062f208f in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:95
@rnburn Do you think we could add CURLOPT_TIMEOUT curl option somewhere?
Yes, that makes sense. Could you put in a PR for it?
I'll try to create one tomorrow. Do you think adding timeout param in ZipkinHttpTransporter constructor would be acceptable?
Hello all,
When I close zipkin tracer, it hangs on writer_.join(); in zipkin_reporter_impl.cc.
I'm not sure about what happens but don't we need to unlock lock_guard in ReporterImpl::makeWriterExit before writecond.notify_all();?
Regards, Vincent