iovisor / bcc

BCC - Tools for BPF-based Linux IO analysis, networking, monitoring, and more
Apache License 2.0
20.36k stars 3.86k forks source link

tools/memleak: Fix data race on --combined-only #4869

Closed Bojun-Seo closed 8 months ago

Bojun-Seo commented 8 months ago

combined_allocs map can be modified by multiple threads and it causes data race easily. To prevent this issue, use atomic operation, like libbpf-tools/memleak.bpf.c does. This atomic implementation cannot resolve all data race cases but reduce some.

Followings are the way to generate data race issue and the result after applying this patch.

File test.cpp

  #include <iostream>
  #include <thread>
  #include <unistd.h>

  void alloc() {
    for (int i = 0; i < 100000; ++i) {
      int* a = (int*)malloc(4);
    }
  }

  int main() {
    sleep(100);
    std::thread t1 {&alloc};
    std::thread t2 {&alloc};
    t1.join();
    t2.join();
    sleep(50);
    return 0;
  }

Build the test file

  $ g++ test.cpp

Run test program and memleak.py without --combined-only

  $ ./a.out &
  [1] 65168
  $ sudo python3 memleak.py -p 65168
  Attaching to pid 65168, Ctrl+C to quit.
  [11:52:57] Top 10 stacks with outstanding allocations:
  ...snip...
  [11:54:09] Top 10 stacks with outstanding allocations:
        576 bytes in 2 allocations from stack
                0x00007ff3f6bba7da      _dl_allocate_tls+0x3a [ld-linux-x86-64.so.2]
        799992 bytes in 199998 allocations from stack
                0x000055bc2bff12c8      alloc()+0x1f [a.out]
                0x000055bc2bff1bd3      void std::__invoke_impl<void, void (*)()>(std::__invoke_other, void (*&&)())+0x21 [a.out]
                0x000055bc2bff1b91      std::__invoke_result<void (*)()>::type std::__invoke<void (*)()>(void (*&&)())+0x24 [a.out]
                0x000055bc2bff1b32      void std::thread::_Invoker<std::tuple<void (*)()> >::_M_invoke<0ul>(std::_Index_tuple<0ul>)+0x2c [a.out]
                0x000055bc2bff1b02      std::thread::_Invoker<std::tuple<void (*)()> >::operator()()+0x1c [a.out]
                0x000055bc2bff1ae2      std::thread::_State_impl<std::thread::_Invoker<std::tuple<void (*)()> > >::_M_run()+0x20 [a.out]
                0x00007ff3f68dc253      [unknown] [libstdc++.so.6.0.30]
        16785408 bytes in 2 allocations from stack
                0x00007ff3f6495707      pthread_create+0xac7 [libc.so.6]
        134217728 bytes in 1 allocations from stack
                0x00007ff3f64a1c82      alloc_new_heap+0xc2 [libc.so.6]

Run test program and memleak.py with --combined-only

  $ ./a.out &
  [1] 64183
  [11:40:56] Top 10 stacks with outstanding allocations:
  ...snip...
  [11:42:06] Top 10 stacks with outstanding allocations:
        0 bytes in 0 allocations from stack
                operator new(unsigned long)+0x1c [libstdc++.so.6.0.30]
                main+0x44 [a.out]
                __libc_start_call_main+0x80 [libc.so.6]
        0 bytes in 0 allocations from stack
                operator new(unsigned long)+0x1c [libstdc++.so.6.0.30]
                main+0x62 [a.out]
                __libc_start_call_main+0x80 [libc.so.6]
        576 bytes in 2 allocations from stack
                _dl_allocate_tls+0x3a [ld-linux-x86-64.so.2]
        664816 bytes in 166204 allocations from stack
                alloc()+0x1f [a.out]
                void std::__invoke_impl<void, void (*)()>(std::__invoke_other, void (*&&)())+0x21 [a.out]
                std::__invoke_result<void (*)()>::type std::__invoke<void (*)()>(void (*&&)())+0x24 [a.out]
                void std::thread::_Invoker<std::tuple<void (*)()> >::_M_invoke<0ul>(std::_Index_tuple<0ul>)+0x2c [a.out]
                std::thread::_Invoker<std::tuple<void (*)()> >::operator()()+0x1c [a.out]
                std::thread::_State_impl<std::thread::_Invoker<std::tuple<void (*)()> > >::_M_run()+0x20 [a.out]
                [unknown] [libstdc++.so.6.0.30]
        16785408 bytes in 2 allocations from stack
                pthread_create+0xac7 [libc.so.6]
        134217728 bytes in 1 allocations from stack
                alloc_new_heap+0xc2 [libc.so.6]

The count value difference between 199998 and 166204 shows the data race issue.

After Applying this patch, run memleak.py with --combined-only

  $ ./a.out &
  [1] 63828
  [11:36:31] Top 10 stacks with outstanding allocations:
  ...snip...
  [11:37:59] Top 10 stacks with outstanding allocations:
        576 bytes in 2 allocations from stack
                _dl_allocate_tls+0x3a [ld-linux-x86-64.so.2]
        799992 bytes in 199998 allocations from stack
                alloc()+0x1f [a.out]
                void std::__invoke_impl<void, void (*)()>(std::__invoke_other, void (*&&)())+0x21 [a.out]
                std::__invoke_result<void (*)()>::type std::__invoke<void (*)()>(void (*&&)())+0x24 [a.out]
                void std::thread::_Invoker<std::tuple<void (*)()> >::_M_invoke<0ul>(std::_Index_tuple<0ul>)+0x2c [a.out]
                std::thread::_Invoker<std::tuple<void (*)()> >::operator()()+0x1c [a.out]
                std::thread::_State_impl<std::thread::_Invoker<std::tuple<void (*)()> > >::_M_run()+0x20 [a.out]
                [unknown] [libstdc++.so.6.0.30]
        16785408 bytes in 2 allocations from stack
                pthread_create+0xac7 [libc.so.6]
        134217728 bytes in 1 allocations from stack
                alloc_new_heap+0xc2 [libc.so.6]
yonghong-song commented 8 months ago

Thanks. LGTM.