saBPF-project / provbpf

Memory leak #5

Open tfjmp opened 3 years ago

tfjmp commented 3 years ago

There is a memory leak on https://github.com/tfjmp/camflow-bpf/commit/baf99db4a52ad450c9b679cb71ea6e0c16604220

This is Bogdan's branch head as of January 7th.

How to reproduce?

Build and install. Reboot. The service should be running.

Run the following command several times, spaced over time: sudo systemctl status provbpfd.service

You should notice that the reported memory usage increases monotonically.
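
To make the growth easier to track than by eyeballing systemctl output, here is a minimal sketch that samples a process's resident set size from /proc/<pid>/status. It is not part of provbpf: the function names are hypothetical, and VmRSS is an assumption on my part, since it is a different measure from the cgroup-based Memory figure that systemctl prints.

```c
#include <stdio.h>
#include <string.h>

/* Extract the VmRSS value (in kB) from the text of /proc/<pid>/status.
 * Returns -1 if the field is missing or malformed. */
long parse_vmrss_kb(const char *status_text)
{
    const char *p = status_text;

    while (p != NULL && *p != '\0') {
        if (strncmp(p, "VmRSS:", 6) == 0) {
            long kb = -1;
            if (sscanf(p + 6, "%ld", &kb) == 1)
                return kb;
            return -1;
        }
        p = strchr(p, '\n');
        if (p != NULL)
            p++;                /* step to the start of the next line */
    }
    return -1;
}

/* Read /proc/<pid>/status and return the resident set size in kB,
 * or -1 on error. */
long read_vmrss_kb(int pid)
{
    char path[64];
    char buf[8192];
    FILE *f;
    size_t n;

    snprintf(path, sizeof(path), "/proc/%d/status", pid);
    f = fopen(path, "r");
    if (f == NULL)
        return -1;
    n = fread(buf, 1, sizeof(buf) - 1, f);
    fclose(f);
    buf[n] = '\0';
    return parse_vmrss_kb(buf);
}

/* Return 1 if the samples never decrease and the last one is larger
 * than the first, 0 otherwise. */
int grows_monotonically(const long *samples, size_t count)
{
    size_t i;

    if (count < 2)
        return 0;
    for (i = 1; i < count; i++)
        if (samples[i] < samples[i - 1])
            return 0;
    return samples[count - 1] > samples[0];
}
```

Calling read_vmrss_kb() with the daemon's PID every minute or so and passing the collected values to grows_monotonically() gives a yes/no answer that can be compared across output formats.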

What to look for?

I looked through the code and found no obvious alloc that is not followed by a free.

Test with the different output formats, SPADE and W3C: is it caused by a bug in one of the serializations? Test with the null output: is it caused by a bug in how disk writing is handled? Etc.

tfjmp commented 3 years ago

I modified the makefile to run valgrind easily (https://github.com/tfjmp/camflow-bpf/commit/f66662f8d0e41f347e43186a0bbfd79f57a13dab).

Assuming provbpf has been installed following the README instructions, run:

make stop
make run_valgrind

This should run valgrind's dynamic memory checks. There seem to be some issues with:

Though I did not see anything relating to a memory leak on a quick check.

If @s00y33 or @bstelea want to do some fixing + some further investigation do feel free to do so.

tfjmp commented 3 years ago

Some examples, for record-keeping's sake.

BPF syscall:

==24571==    at 0x49D730D: syscall (in /usr/lib64/libc-2.32.so)
==24571==    by 0x485BA6D: sys_bpf (bpf.c:65)
==24571==    by 0x485BA6D: bpf_raw_tracepoint_open (bpf.c:898)
==24571==    by 0x4869124: bpf_program__attach_btf_id (libbpf.c:9884)
==24571==    by 0x48741A4: bpf_object__attach_skeleton (libbpf.c:11142)
==24571==    by 0x403247: bpf_camflow_kern__attach (bpf_camflow.skel.h:195)
==24571==    by 0x402D69: main (bpf_camflow_usr.c:181)
==24571==  Address 0x0 is not stack'd, malloc'd or (recently) free'd

addr_to_json:

==24571== Conditional jump or move depends on uninitialised value(s)
==24571==    at 0x483CA75: strncat (vg_replace_strmem.c:349)
==24571==    by 0x489C33B: __add_json_attribute (provenanceW3CJSON.c:321)
==24571==    by 0x489C33B: addr_to_json (provenanceW3CJSON.c:676)
==24571==    by 0x404B94: w3c_address (camflow_bpf_record.c:122)
==24571==    by 0x405763: long_prov_record (camflow_bpf_record.c:375)
==24571==    by 0x405D35: bpf_prov_record (camflow_bpf_record.c:496)
==24571==    by 0x403274: buf_process_entry (bpf_camflow_usr.c:34)
==24571==    by 0x487959D: ringbuf_process_ring (ringbuf.c:229)
==24571==    by 0x4879BF3: ring_buffer__poll (ringbuf.c:278)
==24571==    by 0x403113: main (bpf_camflow_usr.c:251)
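
For reference, a common way to get this warning from strncat is appending to a heap buffer that was never NUL-terminated. The sketch below shows that pattern and its fix. It is not the libprovenance code: new_json_buffer and add_attribute are hypothetical names, and whether __add_json_attribute has this exact problem is an assumption.

```c
#include <stdlib.h>
#include <string.h>

/* Allocate a buffer that is safe to pass to strncat. */
char *new_json_buffer(size_t cap)
{
    char *buf = malloc(cap);

    if (buf != NULL && cap > 0)
        buf[0] = '\0';  /* without this, strncat scans uninitialised bytes */
    return buf;
}

/* Append s to buf without ever writing past cap bytes (NUL included). */
static void append(char *buf, size_t cap, const char *s)
{
    size_t used = strlen(buf);

    if (used + 1 < cap)
        strncat(buf, s, cap - used - 1);
}

/* Append "key":"value" to a JSON buffer created by new_json_buffer.
 * Output is silently truncated if the buffer is too small. */
void add_attribute(char *buf, size_t cap, const char *key, const char *value)
{
    append(buf, cap, "\"");
    append(buf, cap, key);
    append(buf, cap, "\":\"");
    append(buf, cap, value);
    append(buf, cap, "\"");
}
```

Replacing new_json_buffer with a bare malloc reproduces the "Conditional jump or move depends on uninitialised value(s)" report, because strncat has to find the end of a string that was never started.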

flush_json:

==24571== Syscall param write(buf) points to uninitialised byte(s)
==24571==    at 0x48C7027: write (in /usr/lib64/libpthread-2.32.so)
==24571==    by 0x405BE4: log_to_file (camflow_bpf_record.c:421)
==24571==    by 0x48978B0: flush_json (provenanceW3CJSON.c:221)
==24571==    by 0x48978B0: flush_json (provenanceW3CJSON.c:206)
==24571==    by 0x4897E0C: json_append (provenanceW3CJSON.c:234)
==24571==    by 0x4897E0C: append_activity (provenanceW3CJSON.c:243)
==24571==    by 0x404A1C: w3c_task (camflow_bpf_record.c:90)
==24571==    by 0x4054D2: node_record (camflow_bpf_record.c:328)
==24571==    by 0x405D46: bpf_prov_record (camflow_bpf_record.c:498)
==24571==    by 0x403274: buf_process_entry (bpf_camflow_usr.c:34)
==24571==    by 0x487959D: ringbuf_process_ring (ringbuf.c:229)
==24571==    by 0x4879BF3: ring_buffer__poll (ringbuf.c:278)
==24571==    by 0x403113: main (bpf_camflow_usr.c:251)
==24571==  Address 0x64870f4 is 12,900 bytes inside a block of size 21,993 alloc'd
==24571==    at 0x4839809: malloc (vg_replace_malloc.c:307)
==24571==    by 0x4897496: ready_to_print (provenanceW3CJSON.c:180)
==24571==    by 0x4897496: flush_json (provenanceW3CJSON.c:219)
==24571==    by 0x4897496: flush_json (provenanceW3CJSON.c:206)
==24571==    by 0x4897E0C: json_append (provenanceW3CJSON.c:234)
==24571==    by 0x4897E0C: append_activity (provenanceW3CJSON.c:243)
==24571==    by 0x404A1C: w3c_task (camflow_bpf_record.c:90)
==24571==    by 0x4054D2: node_record (camflow_bpf_record.c:328)
==24571==    by 0x405D46: bpf_prov_record (camflow_bpf_record.c:498)
==24571==    by 0x403274: buf_process_entry (bpf_camflow_usr.c:34)
==24571==    by 0x487959D: ringbuf_process_ring (ringbuf.c:229)
==24571==    by 0x4879BF3: ring_buffer__poll (ringbuf.c:278)
==24571==    by 0x403113: main (bpf_camflow_usr.c:251)
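
One common cause of "write(buf) points to uninitialised byte(s)" is passing the allocated size of a malloc'd buffer to write() when only part of it has been filled. A minimal sketch of the safer pattern follows. The helper name write_string is hypothetical, and I am assuming the buffer holds a NUL-terminated string; I have not checked that this is what log_to_file receives.

```c
#include <errno.h>
#include <string.h>
#include <unistd.h>

/* Write only the initialised part of buf (its string length) to fd,
 * retrying on partial writes and EINTR.
 * Returns the number of bytes written, or -1 on error. */
ssize_t write_string(int fd, const char *buf)
{
    size_t len = strlen(buf);
    size_t done = 0;

    while (done < len) {
        ssize_t n = write(fd, buf + done, len - done);

        if (n < 0) {
            if (errno == EINTR)
                continue;   /* interrupted before writing anything: retry */
            return -1;
        }
        done += (size_t)n;
    }
    return (ssize_t)done;
}
```

With this shape, the bytes handed to the kernel are exactly those that were written into the buffer, regardless of how large the allocation was.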

leak summary (apparently nothing):

==24714==
==24714== LEAK SUMMARY:
==24714==    definitely lost: 0 bytes in 0 blocks
==24714==    indirectly lost: 0 bytes in 0 blocks
==24714==      possibly lost: 0 bytes in 0 blocks
==24714==    still reachable: 102,406 bytes in 282 blocks
==24714==         suppressed: 0 bytes in 0 blocks
==24714==
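
Note that a clean "lost" count does not rule out growth: memory that is still referenced at exit is counted as "still reachable", not as a leak. The sketch below illustrates that case with a list that keeps every record it is given. It is purely illustrative, and none of these names exist in provbpf.

```c
#include <stdlib.h>

/* One retained record; the payload size is arbitrary. */
struct record {
    struct record *next;
    char payload[256];
};

static struct record *head;     /* keeps every record reachable */
static size_t live;

/* Allocate a record and keep it on the global list.
 * Returns 0 on success, -1 if malloc fails. */
int remember(void)
{
    struct record *r = malloc(sizeof(*r));

    if (r == NULL)
        return -1;
    r->next = head;
    head = r;
    live++;
    return 0;
}

/* Number of records currently held. */
size_t live_records(void)
{
    return live;
}

/* Release everything; never calling this gives unbounded growth that
 * valgrind reports as "still reachable" rather than "lost". */
void forget_all(void)
{
    while (head != NULL) {
        struct record *next = head->next;

        free(head);
        head = next;
    }
    live = 0;
}
```

If something like this is suspected, running valgrind with --leak-check=full --show-leak-kinds=all lists the allocation sites of the reachable blocks, and valgrind --tool=massif shows heap usage over time.
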

tfjmp commented 3 years ago

Fixed the issue around json_append in libprovenance; see: https://github.com/CamFlow/libprovenance/commit/7d2715e0f912c8fe63ddd7c553c3a20dce4b7731

tfjmp commented 3 years ago

OK, the memory leak does not seem to come from user space (or at least not obviously).

@bstelea @s00y33 : could you check that the prov structures "allocated" on a map for kernel objects get removed again? I.e., do the free_x and alloc_x hooks actually do something?
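
To make the check concrete, here is a user-space simulation of the invariant: every alloc hook adds an entry keyed by the object, every free hook removes it, and the live count stays bounded. Everything here is hypothetical (alloc_prov, free_prov, the slot table); it models the bookkeeping only and is not BPF code.

```c
#include <stddef.h>
#include <stdint.h>

#define MAX_ENTRIES 1024

/* Stand-in for a map entry keyed by a kernel object identifier. */
struct prov_slot {
    uint64_t key;
    int used;
};

static struct prov_slot slots[MAX_ENTRIES];

/* Called from a (hypothetical) alloc hook.
 * Returns 0 on success or if the key already exists, -1 if full. */
int alloc_prov(uint64_t key)
{
    struct prov_slot *empty = NULL;
    size_t i;

    for (i = 0; i < MAX_ENTRIES; i++) {
        if (slots[i].used && slots[i].key == key)
            return 0;
        if (!slots[i].used && empty == NULL)
            empty = &slots[i];
    }
    if (empty == NULL)
        return -1;
    empty->key = key;
    empty->used = 1;
    return 0;
}

/* Called from the matching free hook.
 * Returns 0 if the entry was removed, -1 if it was not found. */
int free_prov(uint64_t key)
{
    size_t i;

    for (i = 0; i < MAX_ENTRIES; i++) {
        if (slots[i].used && slots[i].key == key) {
            slots[i].used = 0;
            return 0;
        }
    }
    return -1;
}

/* Number of entries currently held. */
size_t live_entries(void)
{
    size_t i, n = 0;

    for (i = 0; i < MAX_ENTRIES; i++)
        n += (size_t)slots[i].used;
    return n;
}
```

If a free hook is missing or a no-op, live_entries() only ever goes up. For the real maps, the equivalent check from user space would be counting entries over time, for example by iterating keys with libbpf's bpf_map_get_next_key.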