baidu / braft

An industrial-grade C++ implementation of RAFT consensus algorithm based on brpc, widely used inside Baidu to build highly-available distributed systems.
Apache License 2.0

braft memory leak #328

Open fsindustry opened 3 years ago

fsindustry commented 3 years ago

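After braft has been running for a while, memory usage grows slowly, and running it under valgrind reports memory leaks. Environment:

braft version: 1.1.0
brpc version: 0.9.7
OS: CentOS 7.6
gcc version: 4.8.5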

The valgrind log is as follows:

==58110== 2,532,770,153 bytes in 56,781 blocks are indirectly lost in loss record 2,614 of 2,615
==58110==    at 0x4C2A089: calloc (vg_replace_malloc.c:762)
==58110==    by 0xF3912B: my_raw_malloc (my_malloc.c:201)
==58110==    by 0xF3912B: my_malloc (my_malloc.c:66)
==58110==    by 0x7C826304: ???
==58110==    by 0x753B568: braft::FSMCaller::do_committed(long) (fsm_caller.cpp:292)
==58110==    by 0x753C23E: braft::FSMCaller::run(void, bthread::TaskIterator&) (fsm_caller.cpp:126)
==58110==    by 0x64D01EC: bthread::ExecutionQueueBase::_execute(bthread::TaskNode, bool, int) (execution_queue.cpp:272)
==58110==    by 0x64D1CCF: bthread::ExecutionQueueBase::_execute_tasks(void) (execution_queue.cpp:151)
==58110==    by 0x64E4E59: bthread::TaskGroup::task_runner(long) (task_group.cpp:295)
==58110==    by 0x64D8E50: bthread_make_fcontext (in /usr/local/lib/libbrpc.so)
==58110==
==58110== 2,535,048,593 (2,278,200 direct, 2,532,770,393 indirect) bytes in 56,955 blocks are definitely lost in loss record 2,615 of 2,615
==58110==    at 0x4C28593: operator new(unsigned long) (vg_replace_malloc.c:344)
==58110==    by 0x7C8262B0: ???
==58110==    by 0x753B568: braft::FSMCaller::do_committed(long) (fsm_caller.cpp:292)
==58110==    by 0x753C23E: braft::FSMCaller::run(void, bthread::TaskIterator&) (fsm_caller.cpp:126)
==58110==    by 0x64D01EC: bthread::ExecutionQueueBase::_execute(bthread::TaskNode, bool, int) (execution_queue.cpp:272)
==58110==    by 0x64D1CCF: bthread::ExecutionQueueBase::_execute_tasks(void) (execution_queue.cpp:151)
==58110==    by 0x64E4E59: bthread::TaskGroup::task_runner(long) (task_group.cpp:295)
==58110==    by 0x64D8E50: bthread_make_fcontext (in /usr/local/lib/libbrpc.so)

PFZheng commented 3 years ago

We need to confirm what the unresolved frame is doing (==58110== by 0x7C8262B0: ???). The main job of do_committed is to call back into the user-provided on_apply interface.
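
Since do_committed mostly hands control to the user's on_apply, allocations made under that unresolved frame are most likely owned by user code. One common way such a callback can leak is an early-exit path that never runs the per-entry completion callback. The sketch below illustrates that pattern only; it is not a diagnosis of this report. Every type and function in it (Closure, ReplyClosure, ClosureGuard, Entry, apply_leaky, apply_guarded, make_batch) is a hypothetical stand-in written for this example and is not braft's API.

```cpp
#include <cassert>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for a completion callback that owns itself.
struct Closure {
    virtual ~Closure() = default;
    virtual void Run() = 0;  // implementations are expected to free themselves
};

// Counts closures that are still alive, so a leak becomes observable.
static int g_live_closures = 0;

struct ReplyClosure : Closure {
    std::string payload;
    explicit ReplyClosure(std::string p) : payload(std::move(p)) { ++g_live_closures; }
    ~ReplyClosure() override { --g_live_closures; }
    void Run() override { delete this; }  // self-deleting, like many RPC closures
};

// RAII guard: runs the closure on every exit path of the enclosing scope.
class ClosureGuard {
public:
    explicit ClosureGuard(Closure* c) : c_(c) {}
    ~ClosureGuard() { if (c_) c_->Run(); }
    ClosureGuard(const ClosureGuard&) = delete;
    ClosureGuard& operator=(const ClosureGuard&) = delete;
private:
    Closure* c_;
};

// Hypothetical log entry: a payload plus an optional completion closure.
struct Entry {
    std::string data;
    Closure* done;
};

// Leaky variant: the early 'continue' skips done->Run(), so the closure
// attached to an empty entry is never released.
int apply_leaky(const std::vector<Entry>& batch) {
    int applied = 0;
    for (const Entry& e : batch) {
        if (e.data.empty()) continue;  // BUG: e.done is never run
        ++applied;
        if (e.done) e.done->Run();
    }
    return applied;
}

// Guarded variant: the guard runs the closure even when the entry is skipped.
int apply_guarded(const std::vector<Entry>& batch) {
    int applied = 0;
    for (const Entry& e : batch) {
        ClosureGuard guard(e.done);
        if (e.data.empty()) continue;  // closure still runs via the guard
        ++applied;
    }
    return applied;
}

// Builds three entries; the middle one is empty and triggers the early exit.
std::vector<Entry> make_batch() {
    return {
        {"set x=1", new ReplyClosure("a")},
        {"", new ReplyClosure("b")},
        {"set y=2", new ReplyClosure("c")},
    };
}
```

Calling apply_guarded(make_batch()) leaves g_live_closures at 0, while apply_leaky(make_batch()) leaves one closure alive per skipped entry. In a long-running apply loop that kind of per-entry residue would look like slow, steady memory growth.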

fsindustry commented 2 years ago

After fixing several illegal memory access bugs in the on_apply callback, valgrind no longer reports leaks. However, memory still grows slowly. Analyzing the heap profile with pprof gives the leak report below; every reported leak is attributed to bvar::detail::AgentGroup::_s_tls_blocks.

Leak of 222288 bytes in 2526 objects allocated from:
@ 7fdc2d4284ac unknown
@ 00007fdc2d4659d1 Perl_yyparse ??:0
@ 00007fdc2d44097e perl_parse ??:0
@ 0000000000400c7a bvar::detail::AgentGroup::_s_tls_blocks ??:0
@ 00007fdc2c0b4554 __libc_start_main ??:0
@ 0000000000400d40 bvar::detail::AgentGroup::_s_tls_blocks ??:0

Leak of 110800 bytes in 2770 objects allocated from:
@ 7fdc2d428053 unknown
@ 00007fdc2d452fec Perl_yylex ??:0
@ 00007fdc2d4638df Perl_yyparse ??:0
@ 00007fdc2d44097e perl_parse ??:0
@ 0000000000400c7a bvar::detail::AgentGroup::_s_tls_blocks ??:0
@ 00007fdc2c0b4554 __libc_start_main ??:0
@ 0000000000400d40 bvar::detail::AgentGroup::_s_tls_blocks ??:0

Leak of 108504 bytes in 1233 objects allocated from:
@ 7fdc2d4284ac unknown
@ 00007fdc2d4659d1 Perl_yyparse ??:0
@ 00007fdc2d4de499 Perl_cx_dump ??:0
@ 00007fdc2d4e9c15 Perl_pp_require ??:0
@ 00007fdc2d4a4e65 Perl_runops_standard ??:0
@ 00007fdc2d43c257 Perl_call_sv ??:0
@ 00007fdc2d43d1cb Perl_call_list ??:0
@ 00007fdc2d4248df _init ??:0
@ 00007fdc2d4368c7 Perl_newATTRSUB_flags ??:0
@ 00007fdc2d43719f Perl_newATTRSUB ??:0
@ 00007fdc2d437608 Perl_utilize ??:0
@ 00007fdc2d4657f5 Perl_yyparse ??:0
@ 00007fdc2d44097e perl_parse ??:0
@ 0000000000400c7a bvar::detail::AgentGroup::_s_tls_blocks ??:0
@ 00007fdc2c0b4554 __libc_start_main ??:0
@ 0000000000400d40 bvar::detail::AgentGroup::_s_tls_blocks ??:0

Leak of 75780 bytes in 2526 objects allocated from:
@ 7fdc2d486d73 unknown
@ 00007fdc2d4285c3 Perl_newSTATEOP ??:0
@ 00007fdc2d4659d1 Perl_yyparse ??:0
@ 00007fdc2d44097e perl_parse ??:0
@ 0000000000400c7a bvar::detail::AgentGroup::_s_tls_blocks ??:0
@ 00007fdc2c0b4554 __libc_start_main ??:0
@ 0000000000400d40 bvar::detail::AgentGroup::_s_tls_blocks ??:0

Leak of 63224 bytes in 1129 objects allocated from:
@ 7fdc2d42938d unknown
@ 00007fdc2d4640c5 Perl_yyparse ??:0
@ 00007fdc2d44097e perl_parse ??:0
@ 0000000000400c7a bvar::detail::AgentGroup::_s_tls_blocks ??:0
@ 00007fdc2c0b4554 __libc_start_main ??:0
@ 0000000000400d40 bvar::detail::AgentGroup::_s_tls_blocks ??:0

Could you please help check whether there is a memory leak in braft / brpc?