iovisor / bcc

BCC - Tools for BPF-based Linux IO analysis, networking, monitoring, and more
Apache License 2.0
bcc does not pick the right libraries #525

Open rnav opened 8 years ago

rnav commented 8 years ago

bcc does not consider the hwcap encoded in the library cache when choosing libraries. As a result, uprobes on a shared library do not work on powerpc, as seen with the uprobes test:

======================================================================
FAIL: test_simple_library (__main__.TestUprobes)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/root/bcc/tests/python/test_uprobes.py", line 34, in test_simple_library
    self.assertEqual(b["stats"][ctypes.c_int(0)].value, 2)
AssertionError: 0L != 2

----------------------------------------------------------------------

libc libraries in cache:

# ldconfig -p | grep libc.so
    libc.so.6 (libc6,64bit, hwcap: 0x0000200000000000, OS ABI: Linux 2.6.32) => /lib64/power8/libc.so.6
    libc.so.6 (libc6,64bit, OS ABI: Linux 2.6.32) => /lib64/libc.so.6

bcc always picks the first library here, which won't work on non-power8 machines.

We need to either implement stricter checks (look at hwcap and perhaps the platform) or consider probing on all libraries with the same name.
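
As a rough illustration of the "stricter checks" option, the sketch below parses `ldconfig -p` output and prefers the entry that carries no hwcap tag. The function names are hypothetical and not part of bcc, and the rule "no hwcap is the safe default" is an assumption; a real fix would compare the hwcap value against what the running machine actually supports.

```python
import re

# Matches lines such as:
#   libc.so.6 (libc6,64bit, hwcap: 0x..., OS ABI: Linux 2.6.32) => /lib64/power8/libc.so.6
_ENTRY = re.compile(r"^\s*(\S+)\s+\((.*)\)\s+=>\s+(\S+)\s*$")
_HWCAP = re.compile(r"hwcap:\s*(0x[0-9a-fA-F]+)")


def parse_ldconfig(text):
    """Return a list of (soname, hwcap, path) tuples; hwcap is 0 when absent."""
    entries = []
    for line in text.splitlines():
        m = _ENTRY.match(line)
        if not m:
            continue  # skips the "N libs found in cache" header and blank lines
        soname, flags, path = m.groups()
        hw = _HWCAP.search(flags)
        entries.append((soname, int(hw.group(1), 16) if hw else 0, path))
    return entries


def pick_baseline(entries, soname):
    """Hypothetical policy: prefer the entry that needs no special hwcap."""
    matches = [e for e in entries if e[0] == soname]
    if not matches:
        return None
    return min(matches, key=lambda e: e[1])[2]


# The two cache entries quoted in this report.
SAMPLE = """\
    libc.so.6 (libc6,64bit, hwcap: 0x0000200000000000, OS ABI: Linux 2.6.32) => /lib64/power8/libc.so.6
    libc.so.6 (libc6,64bit, OS ABI: Linux 2.6.32) => /lib64/libc.so.6
"""
print(pick_baseline(parse_ldconfig(SAMPLE), "libc.so.6"))
```

With the two entries above, this policy selects /lib64/libc.so.6 instead of the first match. It would still be wrong on a power8 machine where the dynamic linker loads the power8 variant, which is why probing all matching libraries is the other option.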

drzaeus77 commented 8 years ago

Not sure I understand the full issue or suggestion here. As it is now, we're just using standard gcc logic when compiling.

If you run make VERBOSE=1, can you capture the line where libbcc.so is linked, and make some modifications that cause it to work for you? If you could for instance identify a problematic gcc line, we could help in working that into the build definitions.

rnav commented 8 years ago

Oh, I probably should have explained better. The problem actually shows up at runtime and not while building bcc itself. In this case, it is with test_uprobes.py:

# /root/bcc/build/tests/wrapper.sh "py_uprobes" "sudo" "/root/bcc/tests/python/test_uprobes.py"
Python 2.7.5
.Arena 0:
system bytes     =   12648448
in use bytes     =    3013088
Total (incl. mmap):
system bytes     =   16449536
in use bytes     =    6814176
max mmap regions =         12
max mmap bytes   =    4456448
F
======================================================================
FAIL: test_simple_library (__main__.TestUprobes)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/root/bcc/tests/python/test_uprobes.py", line 34, in test_simple_library
    self.assertEqual(b["stats"][ctypes.c_int(0)].value, 2)
AssertionError: 0L != 2

----------------------------------------------------------------------
Ran 2 tests in 0.510s

FAILED (failures=1)
Failed

In test_uprobes.py, we use the below line to place a probe at malloc_stats() in libc:

b.attach_uprobe(name="c", sym="malloc_stats", fn_name="count")

This triggers a search for libc, which on powerpc ends up picking the wrong library to place the probe (/lib64/power8/libc.so.6 rather than /lib64/libc.so.6). As such, the probe never fires.

We pick the wrong library because we do not consider the hwcap associated with it.

drzaeus77 commented 8 years ago

Yes, thanks for explaining, that certainly makes more sense! Actually the problem seems obvious in retrospect, but it's been a busy morning so far :)

rnav commented 8 years ago

Sure. It looks like @vmg wrote much of this code. @vmg do you have ideas on how best to address this?

mprzybylski commented 7 years ago

Just ran into this issue while building / testing on Debian 8 amd64.

mikep@mv-tricolor:~/bcc/obj-x86_64-linux-gnu$ sudo /usr/bin/ctest --force-new-ctest-process -j1 -V

...

20: Test command: /home/mikep/bcc/obj-x86_64-linux-gnu/tests/wrapper.sh "py_uprobes" "sudo" "/home/mikep/bcc/tests/python/test_uprobes.py"
20: Test timeout computed to be: 9.99988e+06
20: Python 2.7.9
20: .Arena 0:
20: system bytes     =   13799424
20: in use bytes     =    2969696
20: Total (incl. mmap):
20: system bytes     =   14589952
20: in use bytes     =    3760224
20: max mmap regions =          4
20: max mmap bytes   =    1589248
20: F
20: ======================================================================
20: FAIL: test_simple_library (__main__.TestUprobes)
20: ----------------------------------------------------------------------
20: Traceback (most recent call last):
20:   File "/home/mikep/bcc/tests/python/test_uprobes.py", line 34, in test_simple_library
20:     self.assertEqual(b["stats"][ctypes.c_int(0)].value, 2)
20: AssertionError: 0L != 2
20: 
20: ----------------------------------------------------------------------
20: Ran 2 tests in 0.217s
20: 
20: FAILED (failures=1)
20: Failed
20/28 Test #20: py_uprobes .......................***Failed    0.29 sec

Looks like malloc_stats() is missing from this distro and version.

mikep@mv-tricolor:~$ python
Python 2.7.9 (default, Jun 29 2016, 13:08:31) 
[GCC 4.9.2] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> from ctypes.util import find_library
>>> find_library('malloc_stats')
>>> find_library('c')
'libc.so.6'

@rnav, would you mind doing the same check with python and find_library() on the host where you discovered this issue?

mprzybylski commented 7 years ago

Ugh. That's not it either...

mikep@mv-tricolor:~$ nm -D --defined-only /lib/x86_64-linux-gnu/libc.so.6
...
000000000007ddb0 W malloc_stats
...

It's definitely defined, and after reading the man page for malloc_stats(), it's pretty clear that it did run and this was its output:

20: Python 2.7.9
20: .Arena 0:
20: system bytes     =   13799424
20: in use bytes     =    2969696
20: Total (incl. mmap):
20: system bytes     =   14589952
20: in use bytes     =    3760224
20: max mmap regions =          4
20: max mmap bytes   =    1589248

So something about that bpf probe didn't fire correctly...

rnav commented 7 years ago

@mprzybylski you can put in a sleep() in tests/python/test_uprobes.py and check /sys/kernel/debug/tracing/uprobe_events to see which library is being picked by bcc.
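
A small helper can make that check easier to read. The sketch below assumes each line of uprobe_events has the shape `<p|r>:<group>/<event> <path>:<offset>` (the shape of the lines posted later in this thread) and groups the registered offsets by file path. The function name is hypothetical, and reading the real file requires root.

```python
def probed_binaries(uprobe_events_text):
    """Map each probed file path to the set of offsets registered on it."""
    result = {}
    for line in uprobe_events_text.splitlines():
        fields = line.split()
        # Expected shape: "<p|r>:<group>/<event> <path>:<offset> [args...]"
        if len(fields) < 2 or ":" not in fields[1]:
            continue
        path, offset = fields[1].rsplit(":", 1)
        try:
            result.setdefault(path, set()).add(int(offset, 16))
        except ValueError:
            # Offset in a form this sketch does not understand; skip the line.
            continue
    return result


# Sample input: the two lines reported further down in this thread.
SAMPLE = """\
p:uprobes/p__libx32_libc_so_6_0x76f30 /libx32/libc.so.6:0x0000000000076f30
r:uprobes/r__libx32_libc_so_6_0x76f30 /libx32/libc.so.6:0x0000000000076f30
"""
for path, offsets in probed_binaries(SAMPLE).items():
    print(path, sorted(hex(o) for o in offsets))
```

On a live system, pass the contents of /sys/kernel/debug/tracing/uprobe_events while the test is sleeping, then compare the reported path with the libc the test process actually loaded.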

rnav commented 7 years ago

I thought of using the aux vector to figure out the hardware capabilities before picking up the right library, but with 32-bit and 64-bit libraries, that may not be enough. Perhaps we should just put a probe on all matching libraries?

pchaigno commented 7 years ago

We discussed the same bug in #853. Sorry I didn't drop a note here before.

#875 is a first pull request to improve this situation (I plan to update it this weekend). It tries to attach to the library that's currently loaded into the target process, if one is given.
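
To illustrate the idea (this is not the code from the pull request), the sketch below scans the text of /proc/<pid>/maps for a mapped file whose basename looks like the requested library. The function name, the basename pattern, and the sample addresses are all made up for the example.

```python
import os
import re


def loaded_library_path(maps_text, libname):
    """Return the first mapped file that looks like lib<libname>.so, or None."""
    # Accepts e.g. "libc.so.6" and "libc-2.19.so", but not "libcrypto.so.1.0.0".
    pattern = re.compile(r"^lib%s(-[\d.]+)?\.so(\.\d+)*$" % re.escape(libname))
    for line in maps_text.splitlines():
        # Fields: address perms offset dev inode pathname
        fields = line.split(None, 5)
        if len(fields) < 6:
            continue  # anonymous mapping, no backing file
        path = fields[5].strip()
        if pattern.match(os.path.basename(path)):
            return path
    return None


# Illustrative input only; addresses and inode numbers are invented.
SAMPLE_MAPS = """\
00400000-00401000 r-xp 00000000 08:01 1001 /usr/bin/python2.7
7f0000000000-7f00001a0000 r-xp 00000000 08:01 2002 /lib/x86_64-linux-gnu/libc-2.19.so
7f0000400000-7f0000600000 rw-p 00000000 00:00 0
"""
print(loaded_library_path(SAMPLE_MAPS, "c"))
```

For a real process, read `/proc/<pid>/maps` and pass its contents in. Because the path comes from what the dynamic linker actually loaded, this sidesteps both the hwcap and the multiarch ambiguity, at the cost of needing a target pid.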

I also started working on the second strategy discussed in #853 (using the running architecture of the bcc process to help select the appropriate library), but I don't have much time, so it might take longer. If anyone else wants to take care of that one, please go ahead :smiley:
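
A minimal sketch of what that second strategy could look like, assuming it is enough to compare the ELF class, byte order, and machine type of a candidate library with those of the running binary. The function names are hypothetical and this is not the planned implementation.

```python
import struct


def elf_identity(header):
    """Return (ei_class, ei_data, e_machine) from the first 20 bytes of an ELF file."""
    if len(header) < 20 or header[:4] != b"\x7fELF":
        return None
    ei_class = header[4]  # 1 = ELFCLASS32, 2 = ELFCLASS64
    ei_data = header[5]   # 1 = little-endian, 2 = big-endian
    fmt = "<H" if ei_data == 1 else ">H"
    # e_machine follows the 16-byte e_ident and the 2-byte e_type.
    (e_machine,) = struct.unpack_from(fmt, header, 18)
    return ei_class, ei_data, e_machine


def same_abi(path_a, path_b):
    """True when both files are ELF and agree on class, byte order and machine."""
    def ident(path):
        with open(path, "rb") as f:
            return elf_identity(f.read(20))

    a, b = ident(path_a), ident(path_b)
    return a is not None and a == b
```

A caller could filter candidates with something like `same_abi(sys.executable, candidate_path)`. This kind of check can separate 32-bit from 64-bit libraries, but it cannot tell hwcap variants such as /lib64/power8/libc.so.6 and /lib64/libc.so.6 apart, since both target the same machine type.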

mprzybylski commented 7 years ago

@rnav, sure enough, this is a general problem for multiarch platforms. Thanks for pointing me in the right direction.

root@mv-tricolor:/sys/kernel/debug/tracing# cat uprobe_events 
p:uprobes/p__libx32_libc_so_6_0x76f30 /libx32/libc.so.6:0x0000000000076f30
r:uprobes/r__libx32_libc_so_6_0x76f30 /libx32/libc.so.6:0x0000000000076f30

mprzybylski commented 7 years ago

@pchaigno, Nice work on #875.

I just patched tests/python/test_uprobes.py to take advantage of it and got those tests to pass. I'll submit a pull request with that, and a few other things soon...

finelli commented 7 years ago

Hi @pchaigno, which patches should be applied to make the tests pass? Right now on Debian Jessie the build hangs on the py_uprobes test.

pchaigno commented 7 years ago

@finelli You shouldn't need any patch for the tests to pass. What makes you think it's related to this issue?
