open-telemetry / opentelemetry-cpp

The OpenTelemetry C++ Client
https://opentelemetry.io/
Apache License 2.0

Crash in OTLP HTTP Export: WITH_ASYNC_EXPORT_PREVIEW=ON #3036

Open msiddhu opened 3 weeks ago

msiddhu commented 3 weeks ago

Describe your environment

Built and run on Ubuntu 24.04 ARM64 (VM inside an M1 MacBook)

cmake .. -DBUILD_SHARED_LIBS=ON -DWITH_OTLP_HTTP=ON -DWITH_OTLP_FILE=ON \
  -DCMAKE_INSTALL_RPATH_USE_LINK_PATH=ON -DBUILD_TESTING=OFF -DWITH_ASYNC_EXPORT_PREVIEW=ON

Protobuf Version: 3.21.12
OTEL Version: 1.16.1

Steps to reproduce

Attached files: main.cpp.txt, tracer_common.h.txt

Backtrace

#0  __pthread_kill_implementation (threadid=281474840629312, signo=signo@entry=6, no_tid=no_tid@entry=0) at ./nptl/pthread_kill.c:44
#1  0x0000fffff78c7690 in __pthread_kill_internal (signo=6, threadid=<optimized out>) at ./nptl/pthread_kill.c:78
#2  0x0000fffff787cb3c in __GI_raise (sig=sig@entry=6) at ../sysdeps/posix/raise.c:26
#3  0x0000fffff7867e00 in __GI_abort () at ./stdlib/abort.c:79
#4  0x0000fffff78babf0 in __libc_message_impl (fmt=fmt@entry=0xfffff79a4a60 "%s\n") at ../sysdeps/posix/libc_fatal.c:132
#5  0x0000fffff78d212c in malloc_printerr (str=str@entry=0xfffff799fe10 "double free or corruption (!prev)") at ./malloc/malloc.c:5772
#6  0x0000fffff78d4248 in _int_free_merge_chunk (av=av@entry=0xfffff79f0a50 <main_arena>, p=p@entry=0xaaaaaab1ab40, size=size@entry=496) at ./malloc/malloc.c:4679
#7  0x0000fffff78d43f8 in _int_free (av=0xfffff79f0a50 <main_arena>, p=p@entry=0xaaaaaab1ab40, have_lock=<optimized out>, have_lock@entry=0) at ./malloc/malloc.c:4646
#8  0x0000fffff78d6fa8 in __GI___libc_free (mem=<optimized out>) at ./malloc/malloc.c:3398
#9  0x0000fffff7e07320 in opentelemetry::v1::exporter::otlp::OtlpHttpExporter::~OtlpHttpExporter() () from /usr/local/lib/libopentelemetry_exporter_otlp_http.so
#10 0x0000aaaaaaabce78 in std::default_delete<opentelemetry::v1::sdk::trace::SpanExporter>::operator() (this=0xaaaaaab1bba8, __ptr=0xaaaaaab1ab50) at /usr/include/c++/13/bits/unique_ptr.h:99
#11 0x0000aaaaaaabb53c in std::unique_ptr<opentelemetry::v1::sdk::trace::SpanExporter, std::default_delete<opentelemetry::v1::sdk::trace::SpanExporter> >::~unique_ptr (this=0xaaaaaab1bba8 = {...}, __in_chrg=<optimized out>) at /usr/include/c++/13/bits/unique_ptr.h:404
#12 0x0000fffff7f6d460 in opentelemetry::v1::sdk::trace::BatchSpanProcessor::~BatchSpanProcessor() () from /usr/local/lib/libopentelemetry_trace.so
#13 0x0000fffff7f6d488 in opentelemetry::v1::sdk::trace::BatchSpanProcessor::~BatchSpanProcessor() () from /usr/local/lib/libopentelemetry_trace.so
warning: RTTI symbol not found for class 'opentelemetry::v1::sdk::trace::BatchSpanProcessor'
#14 0x0000aaaaaaabcefc in std::default_delete<opentelemetry::v1::sdk::trace::SpanProcessor>::operator() (this=0xaaaaaab2baa0, __ptr=0xaaaaaab1bba0) at /usr/include/c++/13/bits/unique_ptr.h:99
#15 0x0000aaaaaaabb5b8 in std::unique_ptr<opentelemetry::v1::sdk::trace::SpanProcessor, std::default_delete<opentelemetry::v1::sdk::trace::SpanProcessor> >::~unique_ptr (this=0xaaaaaab2baa0 = {...}, __in_chrg=<optimized out>) at /usr/include/c++/13/bits/unique_ptr.h:404
#16 0x0000fffff7f379fc in opentelemetry::v1::sdk::trace::MultiSpanProcessor::ProcessorNode::~ProcessorNode() () from /usr/local/lib/libopentelemetry_trace.so
#17 0x0000fffff7f37aa8 in opentelemetry::v1::sdk::trace::MultiSpanProcessor::Cleanup() () from /usr/local/lib/libopentelemetry_trace.so
#18 0x0000fffff7f37900 in opentelemetry::v1::sdk::trace::MultiSpanProcessor::~MultiSpanProcessor() () from /usr/local/lib/libopentelemetry_trace.so
#19 0x0000fffff7f37928 in opentelemetry::v1::sdk::trace::MultiSpanProcessor::~MultiSpanProcessor() () from /usr/local/lib/libopentelemetry_trace.so
warning: RTTI symbol not found for class 'opentelemetry::v1::sdk::trace::MultiSpanProcessor'
#20 0x0000aaaaaaabcefc in std::default_delete<opentelemetry::v1::sdk::trace::SpanProcessor>::operator() (this=0xaaaaaab152a8, __ptr=0xaaaaaab1cf90) at /usr/include/c++/13/bits/unique_ptr.h:99
#21 0x0000aaaaaaabb5b8 in std::unique_ptr<opentelemetry::v1::sdk::trace::SpanProcessor, std::default_delete<opentelemetry::v1::sdk::trace::SpanProcessor> >::~unique_ptr (this=0xaaaaaab152a8 = {...}, __in_chrg=<optimized out>) at /usr/include/c++/13/bits/unique_ptr.h:404
#22 0x0000fffff7f44d78 in opentelemetry::v1::sdk::trace::TracerContext::~TracerContext() () from /usr/local/lib/libopentelemetry_trace.so
#23 0x0000fffff7f44dbc in opentelemetry::v1::sdk::trace::TracerContext::~TracerContext() () from /usr/local/lib/libopentelemetry_trace.so
#24 0x0000aaaaaaabd118 in std::default_delete<opentelemetry::v1::sdk::trace::TracerContext>::operator() (this=0xaaaaaab27c90, __ptr=0xaaaaaab15230) at /usr/include/c++/13/bits/unique_ptr.h:99
#25 0x0000fffff7f6032c in std::_Sp_counted_deleter<opentelemetry::v1::sdk::trace::TracerContext*, std::default_delete<opentelemetry::v1::sdk::trace::TracerContext>, std::allocator<void>, (__gnu_cxx::_Lock_policy)2>::_M_dispose() () from /usr/local/lib/libopentelemetry_trace.so
warning: RTTI symbol not found for class 'std::_Sp_counted_deleter<opentelemetry::v1::sdk::trace::TracerContext*, std::default_delete<opentelemetry::v1::sdk::trace::TracerContext>, std::allocator<void>, (__gnu_cxx::_Lock_policy)2>'
#26 0x0000aaaaaaab7ca4 in std::_Sp_counted_base<(__gnu_cxx::_Lock_policy)2>::_M_release (this=0xaaaaaab27c80) at /usr/include/c++/13/bits/shared_ptr_base.h:346
#27 0x0000aaaaaaabb014 in std::__shared_count<(__gnu_cxx::_Lock_policy)2>::~__shared_count (this=0xaaaaaab2b630, __in_chrg=<optimized out>) at /usr/include/c++/13/bits/shared_ptr_base.h:1071
#28 0x0000fffff7f4843c in std::__shared_ptr<opentelemetry::v1::sdk::trace::TracerContext, (__gnu_cxx::_Lock_policy)2>::~__shared_ptr() () from /usr/local/lib/libopentelemetry_trace.so
#29 0x0000fffff7f4845c in std::shared_ptr<opentelemetry::v1::sdk::trace::TracerContext>::~shared_ptr() () from /usr/local/lib/libopentelemetry_trace.so
#30 0x0000fffff7f54084 in opentelemetry::v1::sdk::trace::Tracer::~Tracer() () from /usr/local/lib/libopentelemetry_trace.so
#31 0x0000fffff7f540c4 in opentelemetry::v1::sdk::trace::Tracer::~Tracer() () from /usr/local/lib/libopentelemetry_trace.so
#32 0x0000fffff7f6008c in std::_Sp_counted_ptr<opentelemetry::v1::sdk::trace::Tracer*, (__gnu_cxx::_Lock_policy)2>::_M_dispose() () from /usr/local/lib/libopentelemetry_trace.so
warning: RTTI symbol not found for class 'std::_Sp_counted_ptr<opentelemetry::v1::sdk::trace::Tracer*, (__gnu_cxx::_Lock_policy)2>'
#33 0x0000aaaaaaabba14 in std::_Sp_counted_base<(__gnu_cxx::_Lock_policy)2>::_M_release_last_use (this=0xaaaaaab2b810) at /usr/include/c++/13/bits/shared_ptr_base.h:175
warning: RTTI symbol not found for class 'std::_Sp_counted_ptr<opentelemetry::v1::sdk::trace::Tracer*, (__gnu_cxx::_Lock_policy)2>'
#34 0x0000aaaaaaaba070 in std::_Sp_counted_base<(__gnu_cxx::_Lock_policy)2>::_M_release_last_use_cold (this=0xaaaaaab2b810) at /usr/include/c++/13/bits/shared_ptr_base.h:199
warning: RTTI symbol not found for class 'std::_Sp_counted_ptr<opentelemetry::v1::sdk::trace::Tracer*, (__gnu_cxx::_Lock_policy)2>'
#35 0x0000aaaaaaab7d74 in std::_Sp_counted_base<(__gnu_cxx::_Lock_policy)2>::_M_release (this=0xaaaaaab2b810) at /usr/include/c++/13/bits/shared_ptr_base.h:353
#36 0x0000aaaaaaabb014 in std::__shared_count<(__gnu_cxx::_Lock_policy)2>::~__shared_count (this=0xaaaaaab2ae18, __in_chrg=<optimized out>) at /usr/include/c++/13/bits/shared_ptr_base.h:1071
#37 0x0000fffff7f485e4 in std::__shared_ptr<opentelemetry::v1::sdk::trace::Tracer, (__gnu_cxx::_Lock_policy)2>::~__shared_ptr() () from /usr/local/lib/libopentelemetry_trace.so
#38 0x0000fffff7f48604 in std::shared_ptr<opentelemetry::v1::sdk::trace::Tracer>::~shared_ptr() () from /usr/local/lib/libopentelemetry_trace.so
#39 0x0000fffff7f56d08 in void std::_Destroy<std::shared_ptr<opentelemetry::v1::sdk::trace::Tracer> >(std::shared_ptr<opentelemetry::v1::sdk::trace::Tracer>*) () from /usr/local/lib/libopentelemetry_trace.so
#40 0x0000fffff7f53914 in void std::_Destroy_aux<false>::__destroy<std::shared_ptr<opentelemetry::v1::sdk::trace::Tracer>*>(std::shared_ptr<opentelemetry::v1::sdk::trace::Tracer>*, std::shared_ptr<opentelemetry::v1::sdk::trace::Tracer>*) () from /usr/local/lib/libopentelemetry_trace.so
#41 0x0000fffff7f50d98 in void std::_Destroy<std::shared_ptr<opentelemetry::v1::sdk::trace::Tracer>*>(std::shared_ptr<opentelemetry::v1::sdk::trace::Tracer>*, std::shared_ptr<opentelemetry::v1::sdk::trace::Tracer>*) () from /usr/local/lib/libopentelemetry_trace.so
#42 0x0000fffff7f4a100 in std::vector<std::shared_ptr<opentelemetry::v1::sdk::trace::Tracer>, std::allocator<std::shared_ptr<opentelemetry::v1::sdk::trace::Tracer> > >::~vector() () from /usr/local/lib/libopentelemetry_trace.so
#43 0x0000fffff7f45d5c in opentelemetry::v1::sdk::trace::TracerProvider::~TracerProvider() () from /usr/local/lib/libopentelemetry_trace.so
#44 0x0000fffff7f45d88 in opentelemetry::v1::sdk::trace::TracerProvider::~TracerProvider() () from /usr/local/lib/libopentelemetry_trace.so
#45 0x0000aaaaaaabd1bc in std::default_delete<opentelemetry::v1::sdk::trace::TracerProvider>::operator() (this=0xaaaaaab27cd0, __ptr=0xaaaaaab296c0) at /usr/include/c++/13/bits/unique_ptr.h:99
#46 0x0000aaaaaaac0818 in std::_Sp_counted_deleter<opentelemetry::v1::sdk::trace::TracerProvider*, std::default_delete<opentelemetry::v1::sdk::trace::TracerProvider>, std::allocator<void>, (__gnu_cxx::_Lock_policy)2>::_M_dispose (this=0xaaaaaab27cc0) at /usr/include/c++/13/bits/shared_ptr_base.h:527
#47 0x0000aaaaaaab7ca4 in std::_Sp_counted_base<(__gnu_cxx::_Lock_policy)2>::_M_release (this=0xaaaaaab27cc0) at /usr/include/c++/13/bits/shared_ptr_base.h:346
#48 0x0000aaaaaaabb014 in std::__shared_count<(__gnu_cxx::_Lock_policy)2>::~__shared_count (this=0xaaaaaaae0098 <opentelemetry::v1::trace::Provider::GetProvider()::provider+16>, __in_chrg=<optimized out>) at /usr/include/c++/13/bits/shared_ptr_base.h:1071
#49 0x0000aaaaaaab9ef8 in std::__shared_ptr<opentelemetry::v1::trace::TracerProvider, (__gnu_cxx::_Lock_policy)2>::~__shared_ptr (this=0xaaaaaaae0090 <opentelemetry::v1::trace::Provider::GetProvider()::provider+8>, __in_chrg=<optimized out>) at /usr/include/c++/13/bits/shared_ptr_base.h:1524
#50 0x0000aaaaaaab9f18 in std::shared_ptr<opentelemetry::v1::trace::TracerProvider>::~shared_ptr (this=0xaaaaaaae0090 <opentelemetry::v1::trace::Provider::GetProvider()::provider+8> = {...}, __in_chrg=<optimized out>) at /usr/include/c++/13/bits/shared_ptr.h:175
#51 0x0000aaaaaaabccc4 in opentelemetry::v1::nostd::shared_ptr<opentelemetry::v1::trace::TracerProvider>::shared_ptr_wrapper::~shared_ptr_wrapper (this=0xaaaaaaae0088 <opentelemetry::v1::trace::Provider::GetProvider()::provider>, __in_chrg=<optimized out>) at /usr/local/include/opentelemetry/nostd/shared_ptr.h:50
#52 0x0000aaaaaaabb3a8 in opentelemetry::v1::nostd::shared_ptr<opentelemetry::v1::trace::TracerProvider>::~shared_ptr (this=0xaaaaaaae0088 <opentelemetry::v1::trace::Provider::GetProvider()::provider>, __in_chrg=<optimized out>) at /usr/local/include/opentelemetry/nostd/shared_ptr.h:118
#53 0x0000fffff787f228 in __run_exit_handlers (status=0, listp=0xfffff79f0670 <__exit_funcs>, run_list_atexit=run_list_atexit@entry=true, run_dtors=run_dtors@entry=true) at ./stdlib/exit.c:108
#54 0x0000fffff787f30c in __GI_exit (status=<optimized out>) at ./stdlib/exit.c:138
#55 0x0000fffff78684c8 in __libc_start_call_main (main=main@entry=0xaaaaaaab7958 <main()>, argc=argc@entry=1, argv=argv@entry=0xfffffffff9d8) at ../sysdeps/nptl/libc_start_call_main.h:74
#56 0x0000fffff7868598 in __libc_start_main_impl (main=0xaaaaaaab7958 <main()>, argc=1, argv=0xfffffffff9d8, init=<optimized out>, fini=<optimized out>, rtld_fini=<optimized out>, stack_end=<optimized out>) at ../csu/libc-start.c:360
#57 0x0000aaaaaaab71f0 in _start ()

Additional context

This happens in the OtlpHttpExporter destructor when async export is enabled. I don't understand why it is happening and need some help.
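
A note on reading the backtrace: frames #48 to #53 show the provider being released from the static inside Provider::GetProvider() while __run_exit_handlers is running, that is, after main() has returned. One way to test whether destruction order at exit plays a role is to release the global provider before main() returns. The following is a minimal, library-free sketch of that pattern. FakeExporter, FakeProvider, GlobalProvider, InitTracer, CleanupTracer and RunDemo are hypothetical stand-ins and not opentelemetry-cpp APIs, and whether this changes the crash reported here is untested.

```cpp
#include <cassert>
#include <memory>
#include <string>
#include <vector>

// Hypothetical stand-ins for illustration only; not opentelemetry-cpp types.
static std::vector<std::string> &EventLog() {
  static std::vector<std::string> log;
  return log;
}

struct FakeExporter {
  ~FakeExporter() { EventLog().push_back("exporter destroyed"); }
};

struct FakeProvider {
  std::unique_ptr<FakeExporter> exporter{new FakeExporter};
  // The destructor body runs first; members are destroyed afterwards.
  ~FakeProvider() { EventLog().push_back("provider destroyed"); }
};

// Function-local static holder, similar in shape to the static that
// appears in frames #48 to #52 of the backtrace.
std::shared_ptr<FakeProvider> &GlobalProvider() {
  static std::shared_ptr<FakeProvider> provider;
  return provider;
}

void InitTracer() { GlobalProvider() = std::make_shared<FakeProvider>(); }

// Drop the last reference while the program is still inside main(),
// so the exporter is destroyed before the exit handlers run.
void CleanupTracer() { GlobalProvider().reset(); }

// Returns the recorded events in the order they happened.
std::vector<std::string> RunDemo() {
  EventLog().clear();
  InitTracer();
  EventLog().push_back("work done");
  CleanupTracer();
  assert(GlobalProvider() == nullptr);
  EventLog().push_back("leaving main");
  return EventLog();
}
```

Calling RunDemo() from main() records the provider and exporter teardown before "leaving main", so nothing is left for the static destructor to free at exit. In a real program, the analogous step would presumably be to replace the global provider with an empty one via trace::Provider::SetTracerProvider before returning from main(); that is an assumption about what may help here, not a confirmed fix.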

owent commented 2 weeks ago

We encountered a similar issue before, but the crash no longer occurs after merging #2983. Could you please test the main branch to see if it can be reproduced?

msiddhu commented 2 weeks ago

Yes. The issue is still present on the latest main branch. I checked it just now with the example code attached above.

owent commented 1 week ago

I have no similar environment in which to run this case. Could you please run the test under valgrind so that we can analyze which object is freed twice?

valgrind --tool=memcheck --log-file=memcheck.log --leak-check=full --show-reachable=yes test-executable