google / tcmalloc

Apache License 2.0
Deadlock appeared again #198

Closed lhsoft closed 11 months ago

lhsoft commented 1 year ago

We recently started using tcmalloc from the master branch and got the following stack:

#1  0x0000000000b05f36 in AbslInternalSpinLockDelay (w=w@entry=0x22ec830 , value=value@entry=9) at external/_main~data_deps_ext~com_google_absl/absl/base/internal/spinlock_linux.inc:63
#2  0x0000000000493231 in absl::base_internal::SpinLockDelay (scheduling_mode=absl::base_internal::SCHEDULE_KERNEL_ONLY, loop=1, value=9,
    w=0x22ec830 <tcmalloc::tcmalloc_internal::pageheap_lock>) at external/_main~data_deps_ext~com_google_absl/absl/base/internal/spinlock_wait.h:91
#3  absl::base_internal::SpinLock::SlowLock (this=0x22ec830 )
    at external/_main~data_deps_ext~com_google_absl/absl/base/internal/spinlock.cc:162
#4  0x0000000000ad87cc in absl::base_internal::SpinLock::Lock (this=) at external/_main~data_deps_ext~com_google_absl/absl/base/internal/spinlock.h:75
#5  absl::base_internal::SpinLockHolder::SpinLockHolder (l=, this=)
    at external/_main~data_deps_ext~com_google_absl/absl/base/internal/spinlock.h:196
#6  tcmalloc::tcmalloc_internal::huge_page_allocator_internal::HugePageAwareAllocator::LockAndAlloc (
    from_released=0x7f6d64298f6f, objects_per_span=1, n=..., this=0x2181288 <tcmalloc::tcmalloc_internal::Static::page_allocator_+695848>)
    at external/_main~data_deps_ext~com_github_google_tcmalloc/tcmalloc/huge_page_aware_allocator.h:541
#7  tcmalloc::tcmalloc_internal::huge_page_allocator_internal::HugePageAwareAllocator::New (
    this=0x2181288 <tcmalloc::tcmalloc_internal::Static::page_allocator_+695848>, n=..., objects_per_span=1)
    at external/_main~data_deps_ext~com_github_google_tcmalloc/tcmalloc/huge_page_aware_allocator.h:526
#8  0x0000000000ab181d in tcmalloc::tcmalloc_internal::PageAllocator::New (tag=tcmalloc::tcmalloc_internal::MemoryTag::kSampled, objects_per_span=1, n=..., this=)
    at external/_main~data_deps_ext~com_github_google_tcmalloc/tcmalloc/page_allocator.h:177
#9  tcmalloc::sized_ptr_t tcmalloc::tcmalloc_internal::SampleifyAllocation<tcmalloc::tcmalloc_internal::Static, tcmalloc::tcmalloc_internal::TCMallocPolicy<tcmalloc::tcmalloc_internal::NullOomPolicy, tcmalloc::tcmalloc_internal::DefaultAlignPolicy, tcmalloc::tcmalloc_internal::AllocationAccessHotPolicy, tcmalloc::tcmalloc_internal::InvokeHooksPolicy, tcmalloc::tcmalloc_internal::NonSizeReturningPolicy, tcmalloc::tcmalloc_internal::LocalNumaPartitionPolicy> >(tcmalloc::tcmalloc_internal::Static&, tcmalloc::tcmalloc_internal::TCMallocPolicy<tcmalloc::tcmalloc_internal::NullOomPolicy, tcmalloc::tcmalloc_internal::DefaultAlignPolicy, tcmalloc::tcmalloc_internal::AllocationAccessHotPolicy, tcmalloc::tcmalloc_internal::InvokeHooksPolicy, tcmalloc::tcmalloc_internal::NonSizeReturningPolicy, tcmalloc::tcmalloc_internal::LocalNumaPartitionPolicy>, unsigned long, unsigned long, unsigned long, void, tcmalloc::tcmalloc_internal::Span) [clone .isra.0] () at external/_main~data_deps_ext~com_github_google_tcmalloc/tcmalloc/allocation_sampling.h:354
#10 0x0000000000ab2628 in tcmalloc::tcmalloc_internal::SampleSmallAllocation<tcmalloc::tcmalloc_internal::Static, tcmalloc::tcmalloc_internal::TCMallocPolicy<tcmalloc::tcmalloc_internal::CppOomPolicy, tcmalloc::tcmalloc_internal::DefaultAlignPolicy> > (state=..., size_class=4, weight=, requested_size=40, policy=..., res=...)
    at external/_main~data_deps_ext~com_github_google_tcmalloc/tcmalloc/sampler.h:197
#11 tcmalloc::tcmalloc_internal::(anonymous namespace)::AllocSmall<tcmalloc::tcmalloc_internal::TCMallocPolicy<tcmalloc::tcmalloc_internal::CppOomPolicy, tcmalloc::tcmalloc_internal::DefaultAlignPolicy> > (size=40, size_class=4, policy=...) at external/_main~data_deps_ext~com_github_google_tcmalloc/tcmalloc/tcmalloc.cc:658
#12 slow_alloc<tcmalloc::tcmalloc_internal::TCMallocPolicy<tcmalloc::tcmalloc_internal::CppOomPolicy, tcmalloc::tcmalloc_internal::DefaultAlignPolicy> > (size=40, policy=...)
    at external/_main~data_deps_ext~com_github_google_tcmalloc/tcmalloc/tcmalloc.cc:987
#13 0x00000000012881d6 in fast_alloc<tcmalloc::tcmalloc_internal::TCMallocPolicy<tcmalloc::tcmalloc_internal::CppOomPolicy, tcmalloc::tcmalloc_internal::DefaultAlignPolicy> > (
    size=size@entry=40, policy=...) at external/_main~data_deps_ext~com_github_google_tcmalloc/tcmalloc/sizemap.h:149
#14 TCMallocInternalNew (size=size@entry=40) at external/_main~data_deps_ext~com_github_google_tcmalloc/tcmalloc/tcmalloc.cc:1135
#15 0x0000000000cda6fa in google::protobuf::Arena::InternalHelper::New () at /usr/include/c++/9/new:174
#16 google::protobuf::Arena::CreateMessageInternal (arena=) at external/protobuf~3.19.6/src/google/protobuf/arena.h:559
#17 google::protobuf::Arena::CreateMaybeMessage (arena=)

Issue #111 fixed the deadlock, but it seems to have appeared again.

lhsoft commented 1 year ago

gdb.txt @ckennelly any ideas on this?

lhsoft commented 1 year ago

It looks like the deadlock happens in SampleifyAllocation.

ckennelly commented 1 year ago

@lhsoft Which thread holds the lock?

SampleifyAllocation seems to be spinning/waiting for the lock, but it doesn't hold pageheap_lock when it invokes state.page_allocator().New().

lhsoft commented 1 year ago

@ckennelly Is there any way to find the thread that holds the lock? It looks like the spinlock doesn't record its owner.
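When a lock does not record its owner, the usual fallback is to dump every thread's backtrace (for example with gdb's `thread apply all bt`) and look for a thread that is inside a function known to take the lock. For illustration only, here is a minimal sketch of what an owner-recording spinlock could look like. `OwnerTrackingSpinLock` and all of its members are hypothetical names; this is not how `absl::base_internal::SpinLock` is implemented.

```cpp
#include <atomic>
#include <cassert>
#include <thread>

// Hypothetical debugging aid; NOT absl::base_internal::SpinLock.
class OwnerTrackingSpinLock {
 public:
  void Lock() {
    // Spin until we flip locked_ from false to true.
    while (locked_.exchange(true, std::memory_order_acquire)) {
      std::this_thread::yield();  // Back off while another thread holds it.
    }
    // Record who holds the lock so a debugger (or an assertion) can see it.
    owner_.store(Self(), std::memory_order_relaxed);
  }

  void Unlock() {
    owner_.store(nullptr, std::memory_order_relaxed);
    locked_.store(false, std::memory_order_release);
  }

  // True if the calling thread is the current holder.
  bool HeldByCurrentThread() const {
    return owner_.load(std::memory_order_relaxed) == Self();
  }

 private:
  // The address of a thread_local variable is distinct for each live thread,
  // so it can serve as a cheap thread identity.
  static const void* Self() {
    static thread_local char tag;
    return &tag;
  }

  std::atomic<bool> locked_{false};
  std::atomic<const void*> owner_{nullptr};
};
```

With a field like `owner_`, a self-deadlock could be detected by checking `HeldByCurrentThread()` before spinning, at the cost of an extra store on every lock and unlock.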

lhsoft commented 1 year ago

@ckennelly The root cause is that we have a thread that periodically dumps the memory stats by calling PrintStats:

Thread 20 (Thread 0x7f01a6068700 (LWP 36400)):
#0  syscall () at ../sysdeps/unix/sysv/linux/x86_64/syscall.S:38
#1  0x0000000000ea3763 in AbslInternalSpinLockDelay (w=0x29fff80 <tcmalloc::tcmalloc_internal::pageheap_lock>, value=9)
    at external/_main~data_deps_ext~com_google_absl/absl/base/internal/spinlock_linux.inc:63
#2  0x0000000000ea34e8 in absl::base_internal::SpinLockDelay (w=0x29fff80 <tcmalloc::tcmalloc_internal::pageheap_lock>, value=9, loop=1, 
    scheduling_mode=absl::base_internal::SCHEDULE_KERNEL_ONLY) at external/_main~data_deps_ext~com_google_absl/absl/base/internal/spinlock_wait.h:92
#3  0x0000000000ea306f in absl::base_internal::SpinLock::SlowLock (this=0x29fff80 <tcmalloc::tcmalloc_internal::pageheap_lock>)
    at external/_main~data_deps_ext~com_google_absl/absl/base/internal/spinlock.cc:162
#4  0x0000000000d5f0b7 in absl::base_internal::SpinLock::Lock (this=0x29fff80 <tcmalloc::tcmalloc_internal::pageheap_lock>)
    at external/_main~data_deps_ext~com_google_absl/absl/base/internal/spinlock.h:75
#5  0x0000000000d5f1d7 in absl::base_internal::SpinLockHolder::SpinLockHolder (this=0x7f01a6060b20, l=0x29fff80 <tcmalloc::tcmalloc_internal::pageheap_lock>)
    at external/_main~data_deps_ext~com_google_absl/absl/base/internal/spinlock.h:196
#6  0x0000000000dc8fdf in tcmalloc::tcmalloc_internal::huge_page_allocator_internal::HugePageAwareAllocator<tcmalloc::tcmalloc_internal::huge_page_allocator_internal::StaticForwarder>::LockAndAlloc (this=0x2ac16e8 <tcmalloc::tcmalloc_internal::Static::page_allocator_+695848>, n=..., objects_per_span=1, 
    from_released=0x7f01a6060b70) at external/_main~data_deps_ext~com_github_google_tcmalloc/tcmalloc/huge_page_aware_allocator.h:541
#7  0x0000000000dc7a2e in tcmalloc::tcmalloc_internal::huge_page_allocator_internal::HugePageAwareAllocator<tcmalloc::tcmalloc_internal::huge_page_allocator_internal::StaticForwarder>::New (this=0x2ac16e8 <tcmalloc::tcmalloc_internal::Static::page_allocator_+695848>, n=..., objects_per_span=1)
    at external/_main~data_deps_ext~com_github_google_tcmalloc/tcmalloc/huge_page_aware_allocator.h:526
#8  0x0000000000d613aa in tcmalloc::tcmalloc_internal::PageAllocator::New (this=0x2a178c0 <tcmalloc::tcmalloc_internal::Static::page_allocator_>, n=..., 
    objects_per_span=1, tag=tcmalloc::tcmalloc_internal::MemoryTag::kSampled)
    at external/_main~data_deps_ext~com_github_google_tcmalloc/tcmalloc/page_allocator.h:177
#9  0x0000000000d4f237 in tcmalloc::tcmalloc_internal::SampleifyAllocation<tcmalloc::tcmalloc_internal::Static, tcmalloc::tcmalloc_internal::TCMallocPolicy<tcmalloc::tcmalloc_internal::CppOomPolicy, tcmalloc::tcmalloc_internal::DefaultAlignPolicy> > (state=..., policy=..., requested_size=513, weight=2097256, 
    size_class=30, obj=0x4cf3bdef680, span=0x0) at external/_main~data_deps_ext~com_github_google_tcmalloc/tcmalloc/allocation_sampling.h:354
#10 0x0000000000d4cc21 in tcmalloc::tcmalloc_internal::SampleSmallAllocation<tcmalloc::tcmalloc_internal::Static, tcmalloc::tcmalloc_internal::TCMallocPolicy<tcmalloc::tcmalloc_internal::CppOomPolicy, tcmalloc::tcmalloc_internal::DefaultAlignPolicy> > (state=..., policy=..., requested_size=513, weight=2097256, 
    size_class=30, res=...) at external/_main~data_deps_ext~com_github_google_tcmalloc/tcmalloc/allocation_sampling.h:494
#11 0x0000000000d39aa1 in tcmalloc::tcmalloc_internal::(anonymous namespace)::AllocSmall<tcmalloc::tcmalloc_internal::TCMallocPolicy<tcmalloc::tcmalloc_internal::CppOomPolicy, tcmalloc::tcmalloc_internal::DefaultAlignPolicy> > (size=513, size_class=30, policy=...)
    at external/_main~data_deps_ext~com_github_google_tcmalloc/tcmalloc/tcmalloc.cc:658
#12 slow_alloc<tcmalloc::tcmalloc_internal::TCMallocPolicy<tcmalloc::tcmalloc_internal::CppOomPolicy, tcmalloc::tcmalloc_internal::DefaultAlignPolicy> > (
    policy=..., size=513) at external/_main~data_deps_ext~com_github_google_tcmalloc/tcmalloc/tcmalloc.cc:987
#13 0x0000000001b0ce57 in fast_alloc<tcmalloc::tcmalloc_internal::TCMallocPolicy<tcmalloc::tcmalloc_internal::CppOomPolicy, tcmalloc::tcmalloc_internal::DefaultAlignPolicy> > (size=513, policy=...) at external/_main~data_deps_ext~com_github_google_tcmalloc/tcmalloc/tcmalloc.cc:1037
#14 TCMallocInternalNew (size=513) at external/_main~data_deps_ext~com_github_google_tcmalloc/tcmalloc/tcmalloc.cc:1135
#15 0x0000000001ada86c in std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >::_M_construct (this=0x7f01a6061460, 
    __n=<optimized out>, __c=<optimized out>) at /mnt/ssd01/liuhu/workspace/src/gcc-9.4.0/x86_64-redhat-linux/libstdc++-v3/include/bits/char_traits.h:300
#16 0x0000000000460736 in std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >::basic_string<std::allocator<char> > (
    this=0x7f01a6061460, __n=512, __c=0 '\000', __a=...) at /usr/local/lib/gcc/x86_64-redhat-linux/9/../../../../include/c++/9/bits/basic_string.h:546
#17 0x0000000000e8aff9 in absl::str_format_internal::(anonymous namespace)::FallbackToSnprintf<double> (v=4.8301623795663097e-05, conv=..., 
    sink=0x7f01a6061900) at external/_main~data_deps_ext~com_google_absl/absl/strings/internal/str_format/float_conversion.cc:979
#18 0x0000000000e89e0e in absl::str_format_internal::(anonymous namespace)::FloatToSink<double> (v=4.8301623795663097e-05, conv=..., sink=0x7f01a6061900)
    at external/_main~data_deps_ext~com_google_absl/absl/strings/internal/str_format/float_conversion.cc:1386
#19 0x0000000000e88b6d in absl::str_format_internal::ConvertFloatImpl (v=4.8301623795663097e-05, conv=..., sink=0x7f01a6061900)
    at external/_main~data_deps_ext~com_google_absl/absl/strings/internal/str_format/float_conversion.cc:1452
#20 0x0000000000e7e800 in absl::str_format_internal::(anonymous namespace)::ConvertFloatArg<double> (v=4.8301623795663097e-05, conv=..., sink=0x7f01a6061900)
    at external/_main~data_deps_ext~com_google_absl/absl/strings/internal/str_format/arg.cc:288
#21 0x0000000000e7e35e in absl::str_format_internal::FormatConvertImpl (v=4.8301623795663097e-05, conv=..., sink=0x7f01a6061900)
    at external/_main~data_deps_ext~com_google_absl/absl/strings/internal/str_format/arg.cc:451
#22 0x0000000000e81be3 in absl::str_format_internal::FormatArgImpl::Dispatch<double> (arg=..., spec=..., out=0x7f01a6061900)
    at external/_main~data_deps_ext~com_google_absl/absl/strings/internal/str_format/arg.h:574
#23 0x0000000000e85934 in absl::str_format_internal::FormatArgImplFriend::Convert<absl::str_format_internal::FormatArgImpl> (arg=..., conv=..., 
    out=0x7f01a6061900) at external/_main~data_deps_ext~com_google_absl/absl/strings/internal/str_format/arg.h:388
#24 0x0000000000e836b3 in absl::str_format_internal::(anonymous namespace)::DefaultConverter::ConvertOne (this=0x7f01a6061830, bound=...)
    at external/_main~data_deps_ext~com_google_absl/absl/strings/internal/str_format/bind.cc:143
#25 0x0000000000e84e6a in absl::str_format_internal::(anonymous namespace)::ConverterConsumer<absl::str_format_internal::(anonymous namespace)::DefaultConverter>::ConvertOne (this=0x7f01a6061830, conv=..., conv_string=...) at external/_main~data_deps_ext~com_google_absl/absl/strings/internal/str_format/bind.cc:116
#26 0x0000000000e84cd7 in absl::str_format_internal::ParseFormatString<absl::str_format_internal::(anonymous namespace)::ConverterConsumer<absl::str_format_internal::(anonymous namespace)::DefaultConverter> > (src=..., consumer=...)
    at external/_main~data_deps_ext~com_google_absl/absl/strings/internal/str_format/parser.h:99
#27 0x0000000000e84312 in absl::str_format_internal::(anonymous namespace)::ConvertAll<absl::str_format_internal::(anonymous namespace)::DefaultConverter> (
    format=..., args=..., converter=...) at external/_main~data_deps_ext~com_google_absl/absl/strings/internal/str_format/bind.cc:131
#28 0x0000000000e83d16 in absl::str_format_internal::FormatUntyped (raw_sink=..., format=..., args=...)
    at external/_main~data_deps_ext~com_google_absl/absl/strings/internal/str_format/bind.cc:201
#29 0x0000000000e8410e in absl::str_format_internal::SnprintF (output=0x30a4ffc2a842 "", size=87997, format=..., args=...)
    at external/_main~data_deps_ext~com_google_absl/absl/strings/internal/str_format/bind.cc:248
#30 0x0000000000de6673 in absl::SNPrintF<unsigned long, unsigned long, unsigned long, double, double, double>(char*, unsigned long, absl::str_format_internal::FormatSpecTemplate<(ArgumentToConv<unsigned long>)(), (ArgumentToConv<unsigned long>)(), (ArgumentToConv<unsigned long>)(), (ArgumentToConv<double>)(), (ArgumentToConv<double>)(), (ArgumentToConv<double>)()> const&, unsigned long const&, unsigned long const&, unsigned long const&, double const&, double const&, double const&) (output=0x30a4ffc2a842 "", size=87997, format=...) at external/_main~data_deps_ext~com_google_absl/absl/strings/str_format.h:463
#31 0x0000000000de5f33 in tcmalloc::tcmalloc_internal::Printer::printf<unsigned long, unsigned long, unsigned long, double, double, double>(absl::str_format_internal::FormatSpecTemplate<(ArgumentToConv<unsigned long>)(), (ArgumentToConv<unsigned long>)(), (ArgumentToConv<unsigned long>)(), (ArgumentToConv<double>)(), (ArgumentToConv<double>)(), (ArgumentToConv<double>)()> const&, unsigned long const&, unsigned long const&, unsigned long const&, double const&, double const&, double const&) (this=0x7f01a60673d0, format=...) at external/_main~data_deps_ext~com_github_google_tcmalloc/tcmalloc/internal/logging.h:192
#32 0x0000000000de469b in tcmalloc::tcmalloc_internal::PageAllocInfo::<lambda(const tcmalloc::tcmalloc_internal::PageAllocInfo::Counts&, tcmalloc::tcmalloc_internal::Length, tcmalloc::tcmalloc_internal::Length)>::operator()(const tcmalloc::tcmalloc_internal::PageAllocInfo::Counts &, tcmalloc::tcmalloc_internal::Length, tcmalloc::tcmalloc_internal::Length) const (__closure=0x7f01a6062010, c=..., nmin=..., nmax=...)
    at external/_main~data_deps_ext~com_github_google_tcmalloc/tcmalloc/stats.cc:364
#33 0x0000000000de48c7 in tcmalloc::tcmalloc_internal::PageAllocInfo::Print (this=0x2a178c8 <tcmalloc::tcmalloc_internal::Static::page_allocator_+8>, 
    out=0x7f01a60673d0) at external/_main~data_deps_ext~com_github_google_tcmalloc/tcmalloc/stats.cc:372
#34 0x0000000000dca95c in tcmalloc::tcmalloc_internal::huge_page_allocator_internal::HugePageAwareAllocator<tcmalloc::tcmalloc_internal::huge_page_allocator_internal::StaticForwarder>::Print (this=0x2a178c0 <tcmalloc::tcmalloc_internal::Static::page_allocator_>, out=0x7f01a60673d0, everything=true)
    at external/_main~data_deps_ext~com_github_google_tcmalloc/tcmalloc/huge_page_aware_allocator.h:877
#35 0x0000000000dc8a24 in tcmalloc::tcmalloc_internal::huge_page_allocator_internal::HugePageAwareAllocator<tcmalloc::tcmalloc_internal::huge_page_allocator_internal::StaticForwarder>::Print (this=0x2a178c0 <tcmalloc::tcmalloc_internal::Static::page_allocator_>, out=0x7f01a60673d0)
    at external/_main~data_deps_ext~com_github_google_tcmalloc/tcmalloc/huge_page_aware_allocator.h:821
#36 0x0000000000d9497c in tcmalloc::tcmalloc_internal::PageAllocator::Print (this=0x2a178c0 <tcmalloc::tcmalloc_internal::Static::page_allocator_>, 
    out=0x7f01a60673d0, tag=tcmalloc::tcmalloc_internal::MemoryTag::kNormalP0)
    at external/_main~data_deps_ext~com_github_google_tcmalloc/tcmalloc/page_allocator.h:260
#37 0x0000000000d90c67 in tcmalloc::tcmalloc_internal::DumpStats (out=0x7f01a60673d0, level=2)
    at external/_main~data_deps_ext~com_github_google_tcmalloc/tcmalloc/global_stats.cc:398
#38 0x0000000000d2956d in tcmalloc::tcmalloc_internal::TCMalloc_Internal_GetStats (
    buffer=0x30a4ffc00000 "See https://github.com/google/tcmalloc/tree/master/docs/stats.md for an explanation of this page\n", '-' <repeats 48 times>, "\nMALLOC:      599367736 (  571.6 MiB) Bytes in use by a"..., buffer_length=262143)
    at external/_main~data_deps_ext~com_github_google_tcmalloc/tcmalloc/tcmalloc.cc:211
#39 0x0000000000d294db in tcmalloc::tcmalloc_internal::MallocExtension_Internal_GetStats (ret=0x7f01a6067490)
    at external/_main~data_deps_ext~com_github_google_tcmalloc/tcmalloc/tcmalloc.cc:196
#40 0x0000000000e09b8e in tcmalloc::MallocExtension::GetStats[abi:cxx11]() ()
    at external/_main~data_deps_ext~com_github_google_tcmalloc/tcmalloc/malloc_extension.cc:206
#41 0x00000000005bc047 in taishan::node::RunStatCollector::dump_memory_stats (this=0x4cebfc94ec0) at node/src/run_stat_collector.cpp:253
#42 0x00000000005c1ad2 in std::__invoke_impl<void, void (taishan::node::RunStatCollector::*)(), taishan::node::RunStatCollector*> (
    __f=@0x4cebfcf24f0: (void (taishan::node::RunStatCollector::*)(class taishan::node::RunStatCollector * const)) 0x5bbea2 <taishan::node::RunStatCollector::dump_memory_stats()>, __t=@0x4cebfcf24e8: 0x4cebfc94ec0) at /usr/local/lib/gcc/x86_64-redhat-linux/9/../../../../include/c++/9/bits/invoke.h:73
#43 0x00000000005c1a14 in std::__invoke<void (taishan::node::RunStatCollector::*)(), taishan::node::RunStatCollector*> (
    __fn=@0x4cebfcf24f0: (void (taishan::node::RunStatCollector::*)(class taishan::node::RunStatCollector * const)) 0x5bbea2 <taishan::node::RunStatCollector::dump_memory_stats()>) at /usr/local/lib/gcc/x86_64-redhat-linux/9/../../../../include/c++/9/bits/invoke.h:95
#44 0x00000000005c1983 in std::thread::_Invoker<std::tuple<void (taishan::node::RunStatCollector::*)(), taishan::node::RunStatCollector*> >::_M_invoke<0ul, 1ul> (this=0x4cebfcf24e8) at /usr/local/lib/gcc/x86_64-redhat-linux/9/../../../../include/c++/9/thread:244
#45 0x00000000005c193e in std::thread::_Invoker<std::tuple<void (taishan::node::RunStatCollector::*)(), taishan::node::RunStatCollector*> >::operator() (
    this=0x4cebfcf24e8) at /usr/local/lib/gcc/x86_64-redhat-linux/9/../../../../include/c++/9/thread:251
#46 0x00000000005c1922 in std::thread::_State_impl<std::thread::_Invoker<std::tuple<void (taishan::node::RunStatCollector::*)(), taishan::node::RunStatCollector*> > >::_M_run (this=0x4cebfcf24e0) at /usr/local/lib/gcc/x86_64-redhat-linux/9/../../../../include/c++/9/thread:195
#47 0x0000000001add3d0 in std::execute_native_thread_routine (__p=0x4cebfcf24e0) at ../../../.././libstdc++-v3/src/c++11/thread.cc:80
#48 0x00007f01b0096fa3 in start_thread (arg=<optimized out>) at pthread_create.c:486
#49 0x00007f01afc1106f in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:95

The function tcmalloc::tcmalloc_internal::huge_page_allocator_internal::HugePageAwareAllocator<tcmalloc::tcmalloc_internal::huge_page_allocator_internal::StaticForwarder>::Print acquires pageheap_lock and holds it until it returns:

inline void HugePageAwareAllocator<Forwarder>::Print(Printer* out,
                                                       bool everything) {
    SmallSpanStats small;
    LargeSpanStats large;
    BackingStats bstats;
    PageAgeHistograms ages(absl::base_internal::CycleClock::Now());
    // pageheap_lock is taken here and held for the rest of the function.
    absl::base_internal::SpinLockHolder h(&pageheap_lock);

    if (everything) {
      regions_.Print(out);
      out->printf("\n");
      cache_.Print(out);
      alloc_.Print(out);
      out->printf("\n");

      // Use statistics
      info_.Print(out);

      // and age tracking.
      ages.Print("HugePageAware", out);
    }

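If any of these print functions allocates memory and that allocation goes through SampleifyAllocation, the same thread tries to take pageheap_lock a second time. The spinlock is not reentrant, so the thread deadlocks against itself, which is what frames #6 through #34 of the trace above show.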
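The pattern can be reproduced in isolation. The sketch below is a toy model with hypothetical names (`ToySpinLock`, `ToySampledAlloc`, `ToyPrintWhileLocked`) and shares no code with tcmalloc. It uses a try-lock so that the re-entry is reported as a `false` return instead of hanging the process; a blocking `Lock()` at the same point would spin forever.

```cpp
#include <atomic>
#include <cassert>

// Hypothetical stand-in for a non-reentrant spinlock such as pageheap_lock.
class ToySpinLock {
 public:
  // Returns true if the lock was free and is now held by the caller.
  bool TryLock() { return !locked_.exchange(true, std::memory_order_acquire); }
  void Unlock() { locked_.store(false, std::memory_order_release); }

 private:
  std::atomic<bool> locked_{false};
};

ToySpinLock toy_lock;

// Stand-in for the sampled allocation path, which needs the lock.
// Returns false where a blocking Lock() would spin forever.
bool ToySampledAlloc() {
  if (!toy_lock.TryLock()) return false;
  toy_lock.Unlock();
  return true;
}

// Stand-in for Print(): takes the lock, then calls formatting code that
// (unexpectedly) allocates and therefore re-enters the lock on this thread.
bool ToyPrintWhileLocked() {
  if (!toy_lock.TryLock()) return false;
  const bool alloc_ok = ToySampledAlloc();  // Re-entry on the same thread.
  toy_lock.Unlock();
  return alloc_ok;
}
```

Calling `ToySampledAlloc()` on its own succeeds, while `ToyPrintWhileLocked()` reports failure because the inner call finds the lock already held by its own caller.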

ckennelly commented 1 year ago

Thanks for that very helpful stack trace!

I'll ask one of the Abseil StrFormat maintainers, but this is somewhat surprising. We rely on this not allocating internally too.
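Independent of whether the formatting routine allocates, one general way to avoid this class of problem is to copy the counters while holding the lock and format them only after the lock is released. The following is a minimal sketch of that idea under stated assumptions: `ToyStats`, `ToyAllocatorStats`, and `FormatStats` are hypothetical names, and this is not how tcmalloc structures its stats code.

```cpp
#include <cassert>
#include <cstdio>
#include <mutex>
#include <string>

// Hypothetical plain-data statistics; copying it never allocates.
struct ToyStats {
  unsigned long spans = 0;
  double fraction = 0.0;
};

class ToyAllocatorStats {
 public:
  void Record(unsigned long spans, double fraction) {
    std::lock_guard<std::mutex> l(mu_);
    stats_.spans = spans;
    stats_.fraction = fraction;
  }

  // Copies the counters under the lock. No formatting happens here, so
  // nothing in the critical section can call back into the allocator.
  ToyStats Snapshot() {
    std::lock_guard<std::mutex> l(mu_);
    return stats_;
  }

 private:
  std::mutex mu_;
  ToyStats stats_;
};

// Formatting runs with no lock held, so it is free to allocate.
std::string FormatStats(const ToyStats& s) {
  char buf[64];
  std::snprintf(buf, sizeof(buf), "spans=%lu fraction=%.2f", s.spans,
                s.fraction);
  return std::string(buf);
}
```

The trade-off is that the snapshot must be a fixed-size, trivially copyable structure, which may not fit every kind of statistic an allocator reports.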