Here's a recent use case: a process leaks slowly, eating 1 Gbyte in 24 hours. It's likely a single bug in a single code path. If we randomly tracked only 10% of the allocations, it shouldn't take long for a trend to become noticeable: that single code path. Maybe even 1% would be sufficient. Of course, "sampling" in this way is only worth it if it actually reduces overhead.
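As a rough illustration, a gate like the following at the top of the allocation probe is what I have in mind. This is only a sketch in bcc-style BPF C; `SAMPLE_EVERY` is a hypothetical compile-time parameter the Python front end would substitute in (e.g. 10 for ~10% sampling):

```c
#include <uapi/linux/ptrace.h>

// Sketch only: track roughly 1 in SAMPLE_EVERY allocations.
// SAMPLE_EVERY is a hypothetical constant substituted by the front end.
int alloc_enter(struct pt_regs *ctx, size_t size) {
    // Cheap gate on the timestamp: skip most calls before doing any
    // map updates or stack walks. Note the low nanosecond bits aren't
    // truly random, so this could bias against periodic allocators.
    if (bpf_ktime_get_ns() % SAMPLE_EVERY != 0)
        return 0;
    // ... normal tracking logic (record size, stack id, etc.) ...
    return 0;
}
```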
100x is pretty high, but I guess not too surprising if this was a user-level program, given the implementation of uprobes (which, I assume, involves mode switching to the kernel). I wonder if there's an easy way to do a kernel stress test. ... (Now, in the distant future, we could explore a user-level eBPF engine, which could massively reduce user-level tracing overheads. LTTng does this. In the meantime, we'll have to make do with what uprobes provides.)
Yep, not too surprising -- I'd assume `malloc(16)` to be pretty much an instantaneous dequeue from a thread-local list, so the breakpoint + mode switch + probe execution + mode switch dominates the running time.
To measure kernel alloc overhead, we could build a simple .ko that calls `kmalloc`/`kfree` repeatedly and prints the allocation rate periodically, and then attach `memleak.py`. I'll try doing that after running some user-level benchmarks (e.g. compiling with gcc with malloc probes attached, etc.).
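For the record, here's roughly what I mean by the stress module -- a minimal sketch (the module name and the 16-byte size are arbitrary), with error handling kept to a minimum:

```c
// kalloc_stress.c -- sketch of a kmalloc/kfree stress module for
// measuring tracing overhead.
#include <linux/module.h>
#include <linux/kthread.h>
#include <linux/slab.h>
#include <linux/jiffies.h>
#include <linux/sched.h>
#include <linux/err.h>

static struct task_struct *worker;

static int stress_fn(void *data)
{
    unsigned long count = 0, mark = jiffies;

    while (!kthread_should_stop()) {
        kfree(kmalloc(16, GFP_KERNEL));
        count++;
        // Print the allocation rate roughly once per second.
        if (time_after(jiffies, mark + HZ)) {
            pr_info("kalloc_stress: ~%lu allocs/sec\n", count);
            count = 0;
            mark = jiffies;
        }
        cond_resched();  // don't monopolize the CPU on non-preempt kernels
    }
    return 0;
}

static int __init stress_init(void)
{
    worker = kthread_run(stress_fn, NULL, "kalloc_stress");
    return IS_ERR(worker) ? PTR_ERR(worker) : 0;
}

static void __exit stress_exit(void)
{
    kthread_stop(worker);
}

module_init(stress_init);
module_exit(stress_exit);
MODULE_LICENSE("GPL");
```

Comparing the printed rate with and without `memleak.py` attached should give a clean measure of the kernel-side probe cost.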
As far as I can tell, `BPF_FUNC_get_prandom_u32` is not supported by bcc yet. I might be missing something because I'm not very familiar with the code base, so I'd appreciate a pointer. I can of course do a modulus on the timestamp in the meantime.
Alexei just added a missing bcc bit, so we now get `BPF_FUNC_get_prandom_u32`, although timestamps may really be sufficient.
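With that in place, the sampling gate could use the helper directly instead of the timestamp, e.g. (same hypothetical `SAMPLE_EVERY` as above):

```c
// Sketch: nonzero when this event should be tracked. Using the
// pseudo-random helper avoids any periodicity bias a raw timestamp
// modulus might have.
static inline int should_sample(void)
{
    return bpf_get_prandom_u32() % SAMPLE_EVERY == 0;
}
```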
Add a tool that shows stack traces and sizes of "old" allocations that haven't been freed after a threshold.
Continuation of #328 discussion
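For reference, a minimal sketch of the BPF side such a tool could use, assuming bcc's C frontend; `alloc_enter`/`alloc_exit`/`free_enter` are hypothetical probe names for malloc's entry and return and free's entry:

```c
#include <uapi/linux/ptrace.h>

struct alloc_info_t {
    u64 size;
    u64 timestamp_ns;   // when the allocation happened
    int stack_id;       // index into the stack-trace table
};

BPF_HASH(sizes, u64);                        // pid/tid -> in-flight size
BPF_HASH(allocs, u64, struct alloc_info_t);  // address -> allocation info
BPF_STACK_TRACE(stack_traces, 10240);

// malloc entry: the size is only visible here, so stash it per-thread.
int alloc_enter(struct pt_regs *ctx, size_t size) {
    u64 pid = bpf_get_current_pid_tgid();
    u64 size64 = size;
    sizes.update(&pid, &size64);
    return 0;
}

// malloc return: the address is only visible here; pair it with the
// stashed size and timestamp it.
int alloc_exit(struct pt_regs *ctx) {
    u64 address = PT_REGS_RC(ctx);
    u64 pid = bpf_get_current_pid_tgid();
    u64 *size64 = sizes.lookup(&pid);
    struct alloc_info_t info = {};

    if (size64 == 0)
        return 0;
    info.size = *size64;
    info.timestamp_ns = bpf_ktime_get_ns();
    info.stack_id = stack_traces.get_stackid(ctx, BPF_F_USER_STACK);
    sizes.delete(&pid);
    allocs.update(&address, &info);
    return 0;
}

// free entry: the allocation is no longer outstanding.
int free_enter(struct pt_regs *ctx, void *address) {
    u64 addr = (u64)address;
    allocs.delete(&addr);
    return 0;
}
```

User space would then periodically scan `allocs` and report the stacks and sizes of entries whose `timestamp_ns` is older than the threshold.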