Open shoda-tibco opened 5 years ago
I'm seeing very high memory usage on rocksdb when running a process for a week. I am wondering if this is related. Any ideas? Commenting here to bump this...
I am also seeing the same "definitely lost" records reported by valgrind with RocksDB 6.11.4; triggering a compaction with db_bench is enough to see them.
I found this issue after seeing the same 2 valgrind stacks for __cxa_thread_atexit in a unit test from our build while upgrading from 5.15.10 to 6.11.4.
The 2 __cxa_thread_atexit leaks start to appear in ~5.18.0 with https://github.com/facebook/rocksdb/commit/d6ec288703c8fc53b54be9e3e3f3ffd6a7487c63 from 17 Oct 2018 for https://github.com/facebook/rocksdb/pull/4226 ; in the PR there is a discussion about how to perform cleanup.
This is on RHEL 7.8 with libstdc++-4.8.5-39 which is currently the latest.
(I need to edit util/gflags_compat.h, probably because the gflags RPM is old.)
<< #define GFLAGS_NAMESPACE google
>> #define GFLAGS_NAMESPACE gflags
> make db_bench -j48
This is for 6.11.4, but the same 2 valgrind __cxa_thread_atexit "definitely lost" stacks appear back to 5.18.0.
> valgrind --leak-check=full --show-leak-kinds=definite ./db_bench --benchmarks="fillseq,compact" --num 1
==42753== Memcheck, a memory error detector
==42753== Copyright (C) 2002-2017, and GNU GPL'd, by Julian Seward et al.
==42753== Using Valgrind-3.15.0 and LibVEX; rerun with -h for copyright info
==42753== Command: ./db_bench --benchmarks=fillseq,compact --num 1
==42753==
Initializing RocksDB Options from the specified file
Initializing RocksDB Options from command-line flags
==42753== Warning: unimplemented fcntl command: 1036
RocksDB: version 6.11
Date: Thu Aug 13 11:08:22 2020
CPU: 48 * Intel(R) Xeon(R) CPU E5-2680 v3 @ 2.50GHz
CPUCache: 30720 KB
Keys: 16 bytes each
Values: 100 bytes each (50 bytes after compression)
Entries: 1
Prefix: 0 bytes
Keys per prefix: 0
RawSize: 0.0 MB (estimated)
FileSize: 0.0 MB (estimated)
Write rate: 0 bytes/second
Read rate: 0 ops/second
Compression: Snappy
Compression sampling rate: 0
Memtablerep: skip_list
Perf Level: 1
WARNING: Assertions are enabled; benchmarks unnecessarily slow
------------------------------------------------
Initializing RocksDB Options from the specified file
Initializing RocksDB Options from command-line flags
==42753== Warning: unimplemented fcntl command: 1036
DB path: [/tmp/rocksdbtest-XXX/dbbench]
fillseq : 117194.000 micros/op 8 ops/sec; 0.0 MB/s
DB path: [/tmp/rocksdbtest-XXX/dbbench]
==42753== Warning: unimplemented fcntl command: 1036
==42753== Warning: unimplemented fcntl command: 1036
compact : 472803.000 micros/op 2 ops/sec;
==42753==
==42753== HEAP SUMMARY:
==42753== in use at exit: 80,220 bytes in 1,545 blocks
==42753== total heap usage: 22,932 allocs, 21,387 frees, 9,072,040 bytes allocated
==42753==
==42753== 24 bytes in 1 blocks are definitely lost in loss record 586 of 1,482
==42753== at 0x4C2A7E6: operator new(unsigned long, std::nothrow_t const&) (vg_replace_malloc.c:387)
==42753== by 0x61D6C5D: __cxa_thread_atexit (atexit_thread.cc:130)
==42753== by 0x64AEC8: UnknownInlinedFun (instrumented_mutex.cc:71)
==42753== by 0x64AEC8: rocksdb::InstrumentedMutex::Lock() (instrumented_mutex.cc:26)
==42753== by 0x506B56: InstrumentedMutexLock (instrumented_mutex.h:56)
==42753== by 0x506B56: rocksdb::DBImpl::BackgroundCallFlush(rocksdb::Env::Priority) (db_impl_compaction_flush.cc:2303)
==42753== by 0x5073A2: rocksdb::DBImpl::BGWorkFlush(void*) (db_impl_compaction_flush.cc:2162)
==42753== by 0x723CEB: rocksdb::ThreadPoolImpl::Impl::BGThread(unsigned long) (threadpool_imp.cc:266)
==42753== by 0x723F30: rocksdb::ThreadPoolImpl::Impl::BGThreadWrapper(void*) (threadpool_imp.cc:307)
==42753== by 0x622F06F: execute_native_thread_routine (thread.cc:84)
==42753== by 0x5042EA4: start_thread (in /usr/lib64/libpthread-2.17.so)
==42753== by 0x6A978DC: clone (in /usr/lib64/libc-2.17.so)
==42753==
==42753== 24 bytes in 1 blocks are definitely lost in loss record 587 of 1,482
==42753== at 0x4C2A7E6: operator new(unsigned long, std::nothrow_t const&) (vg_replace_malloc.c:387)
==42753== by 0x61D6C5D: __cxa_thread_atexit (atexit_thread.cc:130)
==42753== by 0x64AEC8: UnknownInlinedFun (instrumented_mutex.cc:71)
==42753== by 0x64AEC8: rocksdb::InstrumentedMutex::Lock() (instrumented_mutex.cc:26)
==42753== by 0x50787B: InstrumentedMutexLock (instrumented_mutex.h:56)
==42753== by 0x50787B: rocksdb::DBImpl::BackgroundCallCompaction(rocksdb::DBImpl::PrepickedCompaction*, rocksdb::Env::Priority) (db_impl_compaction_flush.cc:2382)
==42753== by 0x50825B: rocksdb::DBImpl::BGWorkCompaction(void*) (db_impl_compaction_flush.cc:2174)
==42753== by 0x723CEB: rocksdb::ThreadPoolImpl::Impl::BGThread(unsigned long) (threadpool_imp.cc:266)
==42753== by 0x723F30: rocksdb::ThreadPoolImpl::Impl::BGThreadWrapper(void*) (threadpool_imp.cc:307)
==42753== by 0x622F06F: execute_native_thread_routine (thread.cc:84)
==42753== by 0x5042EA4: start_thread (in /usr/lib64/libpthread-2.17.so)
==42753== by 0x6A978DC: clone (in /usr/lib64/libc-2.17.so)
==42753==
==42753== 24,576 (16,384 direct, 8,192 indirect) bytes in 1 blocks are definitely lost in loss record 1,482 of 1,482
==42753== at 0x4C2C375: memalign (vg_replace_malloc.c:908)
==42753== by 0x4C2C486: posix_memalign (vg_replace_malloc.c:1073)
==42753== by 0x67FEB6: rocksdb::port::cacheline_aligned_alloc(unsigned long) (port_posix.cc:210)
==42753== by 0x46218A: rocksdb::LRUCache::LRUCache(unsigned long, int, bool, double, std::shared_ptr<rocksdb::MemoryAllocator>, bool, rocksdb::CacheMetadataChargePolicy) (lru_cache.cc:477)
==42753== by 0x46238C: construct<rocksdb::LRUCache, long unsigned int&, int&, bool&, double&, std::shared_ptr<rocksdb::MemoryAllocator>, bool&, rocksdb::CacheMetadataChargePolicy&> (new_allocator.h:120)
==42753== by 0x46238C: _S_construct<rocksdb::LRUCache, long unsigned int&, int&, bool&, double&, std::shared_ptr<rocksdb::MemoryAllocator>, bool&, rocksdb::CacheMetadataChargePolicy&> (alloc_traits.h:254)
==42753== by 0x46238C: construct<rocksdb::LRUCache, long unsigned int&, int&, bool&, double&, std::shared_ptr<rocksdb::MemoryAllocator>, bool&, rocksdb::CacheMetadataChargePolicy&> (alloc_traits.h:393)
==42753== by 0x46238C: _Sp_counted_ptr_inplace<long unsigned int&, int&, bool&, double&, std::shared_ptr<rocksdb::MemoryAllocator>, bool&, rocksdb::CacheMetadataChargePolicy&> (shared_ptr_base.h:399)
==42753== by 0x46238C: construct<std::_Sp_counted_ptr_inplace<rocksdb::LRUCache, std::allocator<rocksdb::LRUCache>, (__gnu_cxx::_Lock_policy)2u>, const std::allocator<rocksdb::LRUCache>, long unsigned int&, int&, bool&, double&, std::shared_ptr<rocksdb::MemoryAllocator>, bool&, rocksdb::CacheMetadataChargePolicy&> (new_allocator.h:120)
==42753== by 0x46238C: _S_construct<std::_Sp_counted_ptr_inplace<rocksdb::LRUCache, std::allocator<rocksdb::LRUCache>, (__gnu_cxx::_Lock_policy)2u>, const std::allocator<rocksdb::LRUCache>, long unsigned int&, int&, bool&, double&, std::shared_ptr<rocksdb::MemoryAllocator>, bool&, rocksdb::CacheMetadataChargePolicy&> (alloc_traits.h:254)
==42753== by 0x46238C: construct<std::_Sp_counted_ptr_inplace<rocksdb::LRUCache, std::allocator<rocksdb::LRUCache>, (__gnu_cxx::_Lock_policy)2u>, const std::allocator<rocksdb::LRUCache>, long unsigned int&, int&, bool&, double&, std::shared_ptr<rocksdb::MemoryAllocator>, bool&, rocksdb::CacheMetadataChargePolicy&> (alloc_traits.h:393)
==42753== by 0x46238C: __shared_count<rocksdb::LRUCache, std::allocator<rocksdb::LRUCache>, long unsigned int&, int&, bool&, double&, std::shared_ptr<rocksdb::MemoryAllocator>, bool&, rocksdb::CacheMetadataChargePolicy&> (shared_ptr_base.h:502)
==42753== by 0x46238C: __shared_ptr<std::allocator<rocksdb::LRUCache>, long unsigned int&, int&, bool&, double&, std::shared_ptr<rocksdb::MemoryAllocator>, bool&, rocksdb::CacheMetadataChargePolicy&> (shared_ptr_base.h:957)
==42753== by 0x46238C: shared_ptr<std::allocator<rocksdb::LRUCache>, long unsigned int&, int&, bool&, double&, std::shared_ptr<rocksdb::MemoryAllocator>, bool&, rocksdb::CacheMetadataChargePolicy&> (shared_ptr.h:316)
==42753== by 0x46238C: allocate_shared<rocksdb::LRUCache, std::allocator<rocksdb::LRUCache>, long unsigned int&, int&, bool&, double&, std::shared_ptr<rocksdb::MemoryAllocator>, bool&, rocksdb::CacheMetadataChargePolicy&> (shared_ptr.h:598)
==42753== by 0x46238C: make_shared<rocksdb::LRUCache, long unsigned int&, int&, bool&, double&, std::shared_ptr<rocksdb::MemoryAllocator>, bool&, rocksdb::CacheMetadataChargePolicy&> (shared_ptr.h:614)
==42753== by 0x46238C: rocksdb::NewLRUCache(unsigned long, int, bool, double, std::shared_ptr<rocksdb::MemoryAllocator>, bool, rocksdb::CacheMetadataChargePolicy) (lru_cache.cc:572)
==42753== by 0x43A5C5: rocksdb::Benchmark::NewCache(long) (db_bench_tool.cc:2656)
==42753== by 0x43E885: rocksdb::Benchmark::Benchmark() (db_bench_tool.cc:2685)
==42753== by 0x4312EE: rocksdb::db_bench_tool(int, char**) (db_bench_tool.cc:7155)
==42753== by 0x69BB554: (below main) (in /usr/lib64/libc-2.17.so)
==42753==
==42753== LEAK SUMMARY:
==42753== definitely lost: 16,432 bytes in 3 blocks
==42753== indirectly lost: 8,192 bytes in 64 blocks
==42753== possibly lost: 0 bytes in 0 blocks
==42753== still reachable: 55,596 bytes in 1,478 blocks
==42753== of which reachable via heuristic:
==42753== stdstring : 977 bytes in 11 blocks
==42753== suppressed: 0 bytes in 0 blocks
==42753== Reachable blocks (those to which a pointer was found) are not shown.
==42753== To see them, rerun with: --leak-check=full --show-leak-kinds=all
==42753==
==42753== For lists of detected and suppressed errors, rerun with: -s
==42753== ERROR SUMMARY: 3 errors from 3 contexts (suppressed: 0 from 0)
@thatsafunnyname the LRU cache leaking is intended. This can be avoided if you remove this line: https://github.com/facebook/rocksdb/blob/master/tools/db_bench_tool.cc#L2764 . This line is to intentionally leak blocks in the block cache to speed up the program shut down process.
I can't reproduce the __cxa_thread_atexit leak in my environment in master. Will try 6.11.4
Hmm, I can't reproduce it in 6.11.4 either. Which allocator are you using? We usually use the glibc allocator when running valgrind (DISABLE_JEMALLOC=1), as jemalloc doesn't work well with valgrind.
Thanks for taking a look and trying to reproduce.
I had started with the allocator we run with, so jemalloc 5.2.0, but had also tried jemalloc 4.5.0 (valgrind support being dropped in v5) and jemalloc 3.6.0. I saw the same 2 __cxa_thread_atexit leaks in all of them. I just also tried building with jemalloc disabled to use the glibc allocator:
make DISABLE_JEMALLOC=1 db_bench -j48
I checked with ldd and strace that db_bench is not using libjemalloc.
I still see the same 2 __cxa_thread_atexit leaks.
I will try on an AWS EC2 AL2 host (no jemalloc libs installed) tomorrow, at the moment I am getting link problems to gflags when trying to build 6.11.4 on AL2.
@thatsafunnyname that's interesting. Thanks for trying it. Let us know what you find.
Update summary:
I ran into (illegal instruction) problems running valgrind with RDB built on AL2. I had to use PORTABLE=1 when building RDB to avoid this, but when built with PORTABLE=1 on an AL2 host I could not reproduce the __cxa_thread_atexit lost blocks.
I did an AL2 on-host build of the latest valgrind (valgrind-3.17.0.GIT) and it had the same valgrind problem (illegal instruction) when RDB was not built with PORTABLE=1.
While I was building valgrind I also built the latest valgrind from git back on a RHEL7 host (with no jemalloc libs present); it still reported the __cxa_thread_atexit lost blocks.
I also built RDB with PORTABLE=1 back on a RHEL7 host (with no jemalloc libs present); it still reported the __cxa_thread_atexit lost blocks.
I am going to test with some compilers other than gcc-c++-4.8.5-39 on the RHEL7 host.
Details of the valgrind error:
On a newly started:
"Amazon Linux 2" AMI with kernel 4.14.186-146.268.amzn2.x86_64 - amzn2-ami-hvm-2.0.20200722.0-x86_64-gp2 (ami-02354e95b39ca8dec)
sudo yum install gcc gcc-c++ # 7.3.1-9
mkdir gflags
cd gflags/
wget 'https://github.com/gflags/gflags/archive/v2.0.tar.gz'
gzip -d v2.0.tar.gz
tar -xvf v2.0.tar
cd gflags-2.0/
./configure
make
sudo make install
export LD_LIBRARY_PATH=/usr/local/lib
sudo yum install snappy-devel # 1.1.0-3
mkdir ~/rocksdb
cd ~/rocksdb
wget https://github.com/facebook/rocksdb/archive/v6.11.4.tar.gz
gzip -d v6.11.4.tar.gz
tar -xvf v6.11.4.tar
cd rocksdb-6.11.4
make db_bench
sudo yum install valgrind # 3.13.0-9
valgrind --leak-check=full --show-leak-kinds=definite ./db_bench --benchmarks="fillseq,compact" --num 1
==16138== Memcheck, a memory error detector
==16138== Copyright (C) 2002-2017, and GNU GPL'd, by Julian Seward et al.
==16138== Using Valgrind-3.13.0 and LibVEX; rerun with -h for copyright info
==16138== Command: ./db_bench --benchmarks=fillseq,compact --num 1
==16138==
vex amd64->IR: unhandled instruction bytes: 0x62 0xF1 0x75 0x28 0xEF 0xC9 0x48 0xC7 0x43 0x10
vex amd64->IR: REX=0 REX.W=0 REX.R=0 REX.X=0 REX.B=0
vex amd64->IR: VEX=0 VEX.L=0 VEX.nVVVV=0x0 ESC=NONE
vex amd64->IR: PFX.66=0 PFX.F2=0 PFX.F3=0
==16138== valgrind: Unrecognised instruction at address 0x74c3cc.
==16138== at 0x74C3CC: __mutex_base (std_mutex.h:68)
==16138== by 0x74C3CC: mutex (std_mutex.h:94)
==16138== by 0x74C3CC: Data (sync_point_impl.h:26)
==16138== by 0x74C3CC: SyncPoint (sync_point.cc:20)
==16138== by 0x74C3CC: rocksdb::SyncPoint::GetInstance() (sync_point.cc:16)
==16138== by 0x63603B: rocksdb::Env::Default() (env_posix.cc:507)
==16138== by 0x69870A: rocksdb::DBOptions::DBOptions() (options.h:404)
==16138== by 0x43934E: rocksdb::Options::Options() (options.h:1152)
==16138== by 0x40A966: __static_initialization_and_destruction_0(int, int) [clone .constprop.1047] (db_bench_tool.cc:317)
==16138== by 0x8658C4: __libc_csu_init (in /home/ec2-user/rocksdb/rocksdb-6.11.4/db_bench)
==16138== by 0x6183FBA: (below main) (in /usr/lib64/libc-2.26.so)
==16138== Your program just tried to execute an instruction that Valgrind
==16138== did not recognise. There are two possible reasons for this.
==16138== 1. Your program has a bug and erroneously jumped to a non-code
==16138== location. If you are running Memcheck and you just saw a
==16138== warning about a bad jump, it's probably your program's fault.
==16138== 2. The instruction is legitimate but Valgrind doesn't handle it,
==16138== i.e. it's Valgrind's fault. If you think this is the case or
==16138== you are not sure, please let us know and we'll try to fix it.
==16138== Either way, Valgrind will now raise a SIGILL signal which will
==16138== probably kill your program.
==16138==
==16138== Process terminating with default action of signal 4 (SIGILL)
==16138== Illegal opcode at address 0x74C3CC
==16138== at 0x74C3CC: __mutex_base (std_mutex.h:68)
==16138== by 0x74C3CC: mutex (std_mutex.h:94)
==16138== by 0x74C3CC: Data (sync_point_impl.h:26)
==16138== by 0x74C3CC: SyncPoint (sync_point.cc:20)
==16138== by 0x74C3CC: rocksdb::SyncPoint::GetInstance() (sync_point.cc:16)
==16138== by 0x63603B: rocksdb::Env::Default() (env_posix.cc:507)
==16138== by 0x69870A: rocksdb::DBOptions::DBOptions() (options.h:404)
==16138== by 0x43934E: rocksdb::Options::Options() (options.h:1152)
==16138== by 0x40A966: __static_initialization_and_destruction_0(int, int) [clone .constprop.1047] (db_bench_tool.cc:317)
==16138== by 0x8658C4: __libc_csu_init (in /home/ec2-user/rocksdb/rocksdb-6.11.4/db_bench)
==16138== by 0x6183FBA: (below main) (in /usr/lib64/libc-2.26.so)
==16138==
==16138== HEAP SUMMARY:
==16138== in use at exit: 13,818 bytes in 217 blocks
==16138== total heap usage: 218 allocs, 1 frees, 86,522 bytes allocated
==16138==
==16138== LEAK SUMMARY:
==16138== definitely lost: 0 bytes in 0 blocks
==16138== indirectly lost: 0 bytes in 0 blocks
==16138== possibly lost: 0 bytes in 0 blocks
==16138== still reachable: 13,818 bytes in 217 blocks
==16138== suppressed: 0 bytes in 0 blocks
==16138== Reachable blocks (those to which a pointer was found) are not shown.
==16138== To see them, rerun with: --leak-check=full --show-leak-kinds=all
==16138==
==16138== For counts of detected and suppressed errors, rerun with: -v
==16138== ERROR SUMMARY: 0 errors from 0 contexts (suppressed: 0 from 0)
Illegal instruction
I saw the same failure when using an AL2 on-host compiled valgrind-3.17.0.GIT.
When I build with PORTABLE=1 make db_bench -j48 on the AL2 host, valgrind does not fail with the illegal instruction and does not report the __cxa_thread_atexit lost blocks.
Using gcc (GCC) 5.3.1 20160406 (Red Hat 5.3.1-6) from devtoolset-4 on the RHEL7 host, valgrind reports the 2 __cxa_thread_atexit lost block records, as does clang++ 3.7.1.
On a RHEL8.2 host on AWS with kernel 4.18.0-193.el8.x86_64 , RHEL-8.2.0_HVM-20200423-x86_64-0-Hourly2-GP2 (ami-098f16afa9edf40be) I had to build RDB with PORTABLE=1 to avoid the illegal instruction error from valgrind, and could not reproduce the __cxa_thread_atexit lost blocks.
Details for the RHEL8.2 host on AWS EC2.
sudo yum install wget make gcc gcc-c++ # 8.3.1-5
mkdir gflags
cd gflags/
wget 'https://github.com/gflags/gflags/archive/v2.0.tar.gz'
gzip -d v2.0.tar.gz
tar -xvf v2.0.tar
cd gflags-2.0/
./configure
make
sudo make install
export LD_LIBRARY_PATH=/usr/local/lib
sudo yum install snappy-devel # 1.1.7-5 ( have to use an additional repo )
mkdir ~/rocksdb
cd ~/rocksdb
wget https://github.com/facebook/rocksdb/archive/v6.11.4.tar.gz
gzip -d v6.11.4.tar.gz
tar -xvf v6.11.4.tar
cd rocksdb-6.11.4
PORTABLE=1 make db_bench -j12
sudo yum install valgrind # 1:3.15.0-11
valgrind --leak-check=full --show-leak-kinds=definite ./db_bench --benchmarks="fillseq,compact" --num 1
These are the steps to reproduce the __cxa_thread_atexit lost blocks, using a new RHEL7.7 host on AWS EC2.
As I cannot reproduce it on "Amazon Linux 2" or RHEL8, and it only ever seems to be 24 bytes in each of the loss records (per __cxa_thread_atexit?), I will add a valgrind suppression for it.
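A suppression for these two records could look roughly like the sketch below. The suppression name is made up, and the first frame is the mangled form of operator new(unsigned long, std::nothrow_t const&) as it appears in the stacks, so it assumes a 64-bit libstdc++ build. Treat it as a starting point rather than a tested file.

```
# Hypothetical name; matches only "definitely lost" blocks allocated
# directly by __cxa_thread_atexit via nothrow operator new.
{
   libstdcxx_cxa_thread_atexit_24_bytes
   Memcheck:Leak
   match-leak-kinds: definite
   fun:_ZnwmRKSt9nothrow_t
   fun:__cxa_thread_atexit
}
```

Saved to a file, it would be passed to valgrind with --suppressions=<file> alongside the --leak-check=full options used above.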
On a RHEL7.7 host on AWS with kernel 3.10.0-1062.1.2.el7.x86_64 , RHEL-7.7_HVM-20190923-x86_64-0-Hourly2-GP2 (ami-029c0fbe456d58bd1) , building RocksDB with PORTABLE=1.
sudo yum install wget make gcc gcc-c++ # 4.8.5-39
mkdir gflags
cd gflags/
wget 'https://github.com/gflags/gflags/archive/v2.0.tar.gz'
gzip -d v2.0.tar.gz
tar -xvf v2.0.tar
cd gflags-2.0/
./configure
make
sudo make install
export LD_LIBRARY_PATH=/usr/local/lib
sudo yum install snappy-devel # 1.1.0-3
( may have to use an additional repo such as
"sudo rpm -i http://mirror.centos.org/centos/7/os/x86_64/Packages/snappy-devel-1.1.0-3.el7.x86_64.rpm" )
mkdir ~/rocksdb
cd ~/rocksdb
wget https://github.com/facebook/rocksdb/archive/v6.11.4.tar.gz
gzip -d v6.11.4.tar.gz
tar -xvf v6.11.4.tar
cd rocksdb-6.11.4
PORTABLE=1 make db_bench -j48
sudo yum install valgrind # 1:3.15.0-11
valgrind --leak-check=full --show-leak-kinds=definite ./db_bench --benchmarks="fillseq,compact" --num 1
==17872== Memcheck, a memory error detector
==17872== Copyright (C) 2002-2017, and GNU GPL'd, by Julian Seward et al.
==17872== Using Valgrind-3.15.0 and LibVEX; rerun with -h for copyright info
==17872== Command: ./db_bench --benchmarks=fillseq,compact --num 1
==17872==
Initializing RocksDB Options from the specified file
Initializing RocksDB Options from command-line flags
==17872== Warning: unimplemented fcntl command: 1036
RocksDB: version 6.11
Date: Fri Aug 14 13:27:27 2020
CPU: 48 * Intel(R) Xeon(R) Platinum 8275CL CPU @ 3.00GHz
CPUCache: 36608 KB
Keys: 16 bytes each
Values: 100 bytes each (50 bytes after compression)
Entries: 1
Prefix: 0 bytes
Keys per prefix: 0
RawSize: 0.0 MB (estimated)
FileSize: 0.0 MB (estimated)
Write rate: 0 bytes/second
Read rate: 0 ops/second
Compression: Snappy
Compression sampling rate: 0
Memtablerep: skip_list
Perf Level: 1
WARNING: Assertions are enabled; benchmarks unnecessarily slow
------------------------------------------------
Initializing RocksDB Options from the specified file
Initializing RocksDB Options from command-line flags
==17872== Warning: unimplemented fcntl command: 1036
DB path: [/tmp/rocksdbtest-1000/dbbench]
fillseq : 94966.000 micros/op 10 ops/sec; 0.0 MB/s
DB path: [/tmp/rocksdbtest-1000/dbbench]
==17872== Warning: unimplemented fcntl command: 1036
==17872== Warning: unimplemented fcntl command: 1036
compact : 392140.000 micros/op 2 ops/sec;
==17872==
==17872== HEAP SUMMARY:
==17872== in use at exit: 80,206 bytes in 1,545 blocks
==17872== total heap usage: 22,989 allocs, 21,444 frees, 9,105,619 bytes allocated
==17872==
==17872== 24 bytes in 1 blocks are definitely lost in loss record 586 of 1,482
==17872== at 0x4C2A7E6: operator new(unsigned long, std::nothrow_t const&) (vg_replace_malloc.c:387)
==17872== by 0x58E1C5D: __cxa_thread_atexit (in /usr/lib64/libstdc++.so.6.0.19)
==17872== by 0x643938: UnknownInlinedFun (instrumented_mutex.cc:71)
==17872== by 0x643938: rocksdb::InstrumentedMutex::Lock() (instrumented_mutex.cc:26)
==17872== by 0x502166: InstrumentedMutexLock (instrumented_mutex.h:56)
==17872== by 0x502166: rocksdb::DBImpl::BackgroundCallFlush(rocksdb::Env::Priority) (db_impl_compaction_flush.cc:2303)
==17872== by 0x502982: rocksdb::DBImpl::BGWorkFlush(void*) (db_impl_compaction_flush.cc:2162)
==17872== by 0x716E6B: rocksdb::ThreadPoolImpl::Impl::BGThread(unsigned long) (threadpool_imp.cc:266)
==17872== by 0x7170A0: rocksdb::ThreadPoolImpl::Impl::BGThreadWrapper(void*) (threadpool_imp.cc:307)
==17872== by 0x593A06F: ??? (in /usr/lib64/libstdc++.so.6.0.19)
==17872== by 0x5042EA4: start_thread (in /usr/lib64/libpthread-2.17.so)
==17872== by 0x61A28DC: clone (in /usr/lib64/libc-2.17.so)
==17872==
==17872== 24 bytes in 1 blocks are definitely lost in loss record 587 of 1,482
==17872== at 0x4C2A7E6: operator new(unsigned long, std::nothrow_t const&) (vg_replace_malloc.c:387)
==17872== by 0x58E1C5D: __cxa_thread_atexit (in /usr/lib64/libstdc++.so.6.0.19)
==17872== by 0x643938: UnknownInlinedFun (instrumented_mutex.cc:71)
==17872== by 0x643938: rocksdb::InstrumentedMutex::Lock() (instrumented_mutex.cc:26)
==17872== by 0x502E5B: InstrumentedMutexLock (instrumented_mutex.h:56)
==17872== by 0x502E5B: rocksdb::DBImpl::BackgroundCallCompaction(rocksdb::DBImpl::PrepickedCompaction*, rocksdb::Env::Priority) (db_impl_compaction_flush.cc:2382)
==17872== by 0x50380B: rocksdb::DBImpl::BGWorkCompaction(void*) (db_impl_compaction_flush.cc:2174)
==17872== by 0x716E6B: rocksdb::ThreadPoolImpl::Impl::BGThread(unsigned long) (threadpool_imp.cc:266)
==17872== by 0x7170A0: rocksdb::ThreadPoolImpl::Impl::BGThreadWrapper(void*) (threadpool_imp.cc:307)
==17872== by 0x593A06F: ??? (in /usr/lib64/libstdc++.so.6.0.19)
==17872== by 0x5042EA4: start_thread (in /usr/lib64/libpthread-2.17.so)
==17872== by 0x61A28DC: clone (in /usr/lib64/libc-2.17.so)
==17872==
==17872== 24,576 (16,384 direct, 8,192 indirect) bytes in 1 blocks are definitely lost in loss record 1,482 of 1,482
==17872== at 0x4C2C375: memalign (vg_replace_malloc.c:908)
==17872== by 0x4C2C43F: posix_memalign (vg_replace_malloc.c:1072)
==17872== by 0x678066: rocksdb::port::cacheline_aligned_alloc(unsigned long) (port_posix.cc:210)
==17872== by 0x45EEEB: rocksdb::LRUCache::LRUCache(unsigned long, int, bool, double, std::shared_ptr<rocksdb::MemoryAllocator>, bool, rocksdb::CacheMetadataChargePolicy) (lru_cache.cc:477)
==17872== by 0x45F0DC: construct<rocksdb::LRUCache, long unsigned int&, int&, bool&, double&, std::shared_ptr<rocksdb::MemoryAllocator>, bool&, rocksdb::CacheMetadataChargePolicy&> (new_allocator.h:120)
==17872== by 0x45F0DC: _S_construct<rocksdb::LRUCache, long unsigned int&, int&, bool&, double&, std::shared_ptr<rocksdb::MemoryAllocator>, bool&, rocksdb::CacheMetadataChargePolicy&> (alloc_traits.h:254)
==17872== by 0x45F0DC: construct<rocksdb::LRUCache, long unsigned int&, int&, bool&, double&, std::shared_ptr<rocksdb::MemoryAllocator>, bool&, rocksdb::CacheMetadataChargePolicy&> (alloc_traits.h:393)
==17872== by 0x45F0DC: _Sp_counted_ptr_inplace<long unsigned int&, int&, bool&, double&, std::shared_ptr<rocksdb::MemoryAllocator>, bool&, rocksdb::CacheMetadataChargePolicy&> (shared_ptr_base.h:399)
==17872== by 0x45F0DC: construct<std::_Sp_counted_ptr_inplace<rocksdb::LRUCache, std::allocator<rocksdb::LRUCache>, (__gnu_cxx::_Lock_policy)2u>, const std::allocator<rocksdb::LRUCache>, long unsigned int&, int&, bool&, double&, std::shared_ptr<rocksdb::MemoryAllocator>, bool&, rocksdb::CacheMetadataChargePolicy&> (new_allocator.h:120)
==17872== by 0x45F0DC: _S_construct<std::_Sp_counted_ptr_inplace<rocksdb::LRUCache, std::allocator<rocksdb::LRUCache>, (__gnu_cxx::_Lock_policy)2u>, const std::allocator<rocksdb::LRUCache>, long unsigned int&, int&, bool&, double&, std::shared_ptr<rocksdb::MemoryAllocator>, bool&, rocksdb::CacheMetadataChargePolicy&> (alloc_traits.h:254)
==17872== by 0x45F0DC: construct<std::_Sp_counted_ptr_inplace<rocksdb::LRUCache, std::allocator<rocksdb::LRUCache>, (__gnu_cxx::_Lock_policy)2u>, const std::allocator<rocksdb::LRUCache>, long unsigned int&, int&, bool&, double&, std::shared_ptr<rocksdb::MemoryAllocator>, bool&, rocksdb::CacheMetadataChargePolicy&> (alloc_traits.h:393)
==17872== by 0x45F0DC: __shared_count<rocksdb::LRUCache, std::allocator<rocksdb::LRUCache>, long unsigned int&, int&, bool&, double&, std::shared_ptr<rocksdb::MemoryAllocator>, bool&, rocksdb::CacheMetadataChargePolicy&> (shared_ptr_base.h:502)
==17872== by 0x45F0DC: __shared_ptr<std::allocator<rocksdb::LRUCache>, long unsigned int&, int&, bool&, double&, std::shared_ptr<rocksdb::MemoryAllocator>, bool&, rocksdb::CacheMetadataChargePolicy&> (shared_ptr_base.h:957)
==17872== by 0x45F0DC: shared_ptr<std::allocator<rocksdb::LRUCache>, long unsigned int&, int&, bool&, double&, std::shared_ptr<rocksdb::MemoryAllocator>, bool&, rocksdb::CacheMetadataChargePolicy&> (shared_ptr.h:316)
==17872== by 0x45F0DC: allocate_shared<rocksdb::LRUCache, std::allocator<rocksdb::LRUCache>, long unsigned int&, int&, bool&, double&, std::shared_ptr<rocksdb::MemoryAllocator>, bool&, rocksdb::CacheMetadataChargePolicy&> (shared_ptr.h:598)
==17872== by 0x45F0DC: make_shared<rocksdb::LRUCache, long unsigned int&, int&, bool&, double&, std::shared_ptr<rocksdb::MemoryAllocator>, bool&, rocksdb::CacheMetadataChargePolicy&> (shared_ptr.h:614)
==17872== by 0x45F0DC: rocksdb::NewLRUCache(unsigned long, int, bool, double, std::shared_ptr<rocksdb::MemoryAllocator>, bool, rocksdb::CacheMetadataChargePolicy) (lru_cache.cc:572)
==17872== by 0x4387C5: rocksdb::Benchmark::NewCache(long) (db_bench_tool.cc:2656)
==17872== by 0x43C9C5: rocksdb::Benchmark::Benchmark() (db_bench_tool.cc:2685)
==17872== by 0x42FD6E: rocksdb::db_bench_tool(int, char**) (db_bench_tool.cc:7155)
==17872== by 0x60C6554: (below main) (in /usr/lib64/libc-2.17.so)
==17872==
==17872== LEAK SUMMARY:
==17872== definitely lost: 16,432 bytes in 3 blocks
==17872== indirectly lost: 8,192 bytes in 64 blocks
==17872== possibly lost: 0 bytes in 0 blocks
==17872== still reachable: 55,582 bytes in 1,478 blocks
==17872== of which reachable via heuristic:
==17872== stdstring : 979 bytes in 11 blocks
==17872== suppressed: 0 bytes in 0 blocks
==17872== Reachable blocks (those to which a pointer was found) are not shown.
==17872== To see them, rerun with: --leak-check=full --show-leak-kinds=all
==17872==
==17872== For lists of detected and suppressed errors, rerun with: -s
==17872== ERROR SUMMARY: 3 errors from 3 contexts (suppressed: 0 from 0)
Also noticed at https://jira.mariadb.org/browse/MDEV-21788
0x58E1C5D: __cxa_thread_atexit (in /usr/lib64/libstdc++.so.6.0.19)
is
0x58E1C5D: __cxa_thread_atexit (atexit_thread.cc:130)
extern "C" int
__cxxabiv1::__cxa_thread_atexit (void (*dtor)(void *), void *obj, void */*dso_handle*/)
  _GLIBCXX_NOTHROW
{
  // Do this initialization once.
  if (__gthread_active_p ())
    {
      // When threads are active use __gthread_once.
      static __gthread_once_t once = __GTHREAD_ONCE_INIT;
      __gthread_once (&once, key_init);
    }
  else
    {
      // And when threads aren't active use a static local guard.
      static bool queued;
      if (!queued)
        {
          queued = true;
          std::atexit (run);
        }
    }

  elt *first;
  if (__gthread_active_p ())
    first = static_cast<elt*>(__gthread_getspecific (key));
  else
    first = single_thread;

  elt *new_elt = new (std::nothrow) elt;  // <-- line 130, the allocation valgrind reports
  if (!new_elt)
    return -1;
  new_elt->destructor = dtor;
  new_elt->object = obj;
  new_elt->next = first;

  if (__gthread_active_p ())
    __gthread_setspecific (key, new_elt);
  else
    single_thread = new_elt;

  return 0;
}
Similar issue reported at:
https://github.com/cameron314/concurrentqueue/issues/152
and the upstream bug report, in favour of which I think this issue can be closed as "fixed in upstream":
https://www.mail-archive.com/gcc-bugs@gcc.gnu.org/msg551653.html
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=83029
We also hit this memory leak error with an ASAN check:
#ASAN_OPTIONS=fast_unwind_on_malloc=true,detect_stack_use_after_return=1,detect_odr_violation=2,detect_container_overflow=1,log_path=stderr,new_delete_type_mismatch=1,alloc_dealloc_mismatch=1,suppressions=/v/asan_conf/asan.supp ./column_family_test --gtest_filter=FormatDef/ColumnFamilyTest.FlushTest/0
Note: Google Test filter = FormatDef/ColumnFamilyTest.FlushTest/0
[==========] Running 1 test from 1 test case.
[----------] Global test environment set-up.
[----------] 1 test from FormatDef/ColumnFamilyTest
[ RUN ] FormatDef/ColumnFamilyTest.FlushTest/0
[ OK ] FormatDef/ColumnFamilyTest.FlushTest/0 (2983 ms)
[----------] 1 test from FormatDef/ColumnFamilyTest (2984 ms total)
[----------] Global test environment tear-down
[==========] 1 test from 1 test case ran. (2986 ms total)
[ PASSED ] 1 test.
=================================================================
==24675==ERROR: LeakSanitizer: detected memory leaks
Direct leak of 24 byte(s) in 1 object(s) allocated from:
#0 0x7fb5d4d6be9f in operator new(unsigned long, std::nothrow_t const&) (/lib64/libasan.so.5+0x10fe9f)
#1 0x7fb5d1638135 in __cxa_thread_atexit (/lib64/libstdc++.so.6+0xa5135)
SUMMARY: AddressSanitizer: 24 byte(s) leaked in 1 allocation(s).
We upgraded to 6.2.4 recently and have started seeing leaks reported out of __cxa_thread_atexit on Linux. These leaks are reproducible with db_bench.
Expected behavior
No leaks reported by Valgrind or AddressSanitizer
Actual behavior
Leaks reported by Valgrind and AddressSanitizer
Steps to reproduce the behavior:
Dockerfile to build rocksdb and valgrind
db_bench command:
leaks reported:
full output