Closed danielocfb closed 3 weeks ago
cc @jfernandez
Very interested in this work, happy to review and test any PRs! :)
This is now out for review https://github.com/libbpf/blazesym/pull/854
@javierhonduco feel free to try it out and report back. Also, let me know if you have any questions.
Great stuff @danielocfb! Thanks for the heads up! This week I won't have too much time to take a look at this, but will make sure to do it early next week.
As part of the effort of improving our kernel symbolization logic, we would like to support symbolization of addresses mapping to BPF programs. Here is a brain dump roughly outlining what (I think) is necessary to support such symbolization. Everything and anything could be wrong ...
In `/proc/<pid>/maps`/`PROCMAP_QUERY`, BPF programs would be represented with a "name" of `bpf_prog_<some-hex-number>`; `some-hex-number` seems to be the program's "tag" and can be used for finding more information:
- `bpf_prog_get_next_id` to iterate over loaded programs and find the one matching the tag
- `bpf_prog_get_fd_by_id` to retrieve the program's file descriptor
- `bpf_obj_get_info_by_fd` to retrieve program information using said file descriptor
- `bpf_prog_info.{nr_jited_line_info, jited_line_info, line_info_rec_size}` and similar, in conjunction with the kernel's BTF information, to retrieve function name and source code path

In terms of integration into `blazesym`, a good starting point to look at would probably be https://github.com/libbpf/blazesym/blob/40a46a48fd83be25b6b32b3401837a8907d23301/src/symbolize/symbolizer.rs#L896

What data to cache (and at what level) is somewhat of an open question. At the very least I'd say we should be remembering the result for a given address and reuse that on repeated symbolization. But perhaps a more coarse-grained approach (e.g., caching at the function level, if there is such a thing, or remembering what BPF program maps to what tag) may be useful as well. I have no idea about the performance characteristics of any of the APIs we need to interface with.
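
To make the first lookup step and the "what BPF program maps to what tag" caching idea a bit more concrete, here is a rough sketch using only the standard library. It assumes the tag is the 8-byte (`BPF_TAG_SIZE`) value rendered as 16 hex digits, and that an optional `_<name>` suffix may follow it. `parse_prog_tag` and `TagCache` are hypothetical names, not existing `blazesym` APIs, and the `lookup` closure merely stands in for the `bpf_prog_get_next_id` iteration.

```rust
use std::collections::HashMap;

/// Size of a BPF program tag in bytes (`BPF_TAG_SIZE` in the kernel UAPI).
const BPF_TAG_SIZE: usize = 8;

type Tag = [u8; BPF_TAG_SIZE];

/// Extract the tag from a name of the form `bpf_prog_<hex>`.
/// Assumption: exactly 16 hex digits, optionally followed by `_<name>`.
fn parse_prog_tag(name: &str) -> Option<Tag> {
    let rest = name.strip_prefix("bpf_prog_")?;
    let hex = rest.get(..BPF_TAG_SIZE * 2)?;
    // Reject anything that is not a plain hex digit (e.g. a leading '+').
    if !hex.bytes().all(|b| b.is_ascii_hexdigit()) {
        return None;
    }
    // Only end-of-string or a `_` separator may follow the tag.
    match rest.as_bytes().get(BPF_TAG_SIZE * 2) {
        None | Some(b'_') => {}
        Some(_) => return None,
    }
    let mut tag = [0u8; BPF_TAG_SIZE];
    for (i, byte) in tag.iter_mut().enumerate() {
        *byte = u8::from_str_radix(&hex[i * 2..i * 2 + 2], 16).ok()?;
    }
    Some(tag)
}

/// Hypothetical tag-level cache remembering which program ID a tag resolved to.
struct TagCache {
    ids: HashMap<Tag, Option<u32>>,
}

impl TagCache {
    fn new() -> Self {
        Self { ids: HashMap::new() }
    }

    /// Return the program ID for `tag`, invoking `lookup` only on a cache miss.
    /// Note that misses (`None`) are cached as well, which may be undesirable
    /// if programs can be loaded between symbolization requests.
    fn resolve(&mut self, tag: Tag, lookup: impl FnOnce(&Tag) -> Option<u32>) -> Option<u32> {
        *self.ids.entry(tag).or_insert_with(|| lookup(&tag))
    }
}
```

A caller would parse the tag once per mapping name and hand `resolve` a closure that walks the loaded programs. The sketch also assumes that a tag identifies a single program, which may not hold in practice.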
As I mentioned above, I think we may need some basic BTF support (mostly for string lookup?) as well as BPF syscall bindings. Usage of `libbpf-rs` is a possibility (should contain both), though I don't know if we really want to add a dependency on `libbpf-rs` and `libbpf` longer term. But we can think about that once a POC is working.

We would also require some prerequisite work introducing proper kernel testing infrastructure to be able to test this symbolization on injected programs as well. At this point I think it mostly comes down to loading BPF programs, as we already support testing on arbitrary kernels using `vmtest`. Again, this should be provided by `libbpf-rs`, which I think is a no-brainer to use in a testing context.
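
For a sense of how small the "basic BTF support (mostly for string lookup?)" part could be, here is a minimal sketch of a string lookup. The BTF string section is a sequence of NUL-terminated strings addressed by byte offset, with the empty string at offset 0. The sketch assumes the raw string section bytes have already been extracted from the BTF blob; `btf_str` is a hypothetical helper name.

```rust
/// Look up the NUL-terminated string starting at `offset` in a BTF string
/// section. Returns `None` if the offset is out of bounds, the string is not
/// terminated, or it is not valid UTF-8.
fn btf_str(strs: &[u8], offset: u32) -> Option<&str> {
    // Everything from `offset` to the end of the section.
    let tail = strs.get(offset as usize..)?;
    // The string ends at the first NUL byte.
    let end = tail.iter().position(|&b| b == 0)?;
    std::str::from_utf8(&tail[..end]).ok()
}
```

Offsets such as `bpf_line_info.file_name_off` would be resolved through a helper along these lines. Whether to hand-roll this or pull it in via `libbpf-rs` is the dependency question raised above.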