iovisor / bcc

BCC - Tools for BPF-based Linux IO analysis, networking, monitoring, and more
Apache License 2.0
20.58k stars 3.88k forks source link

BPF pre-processor for command line options #2129

Open brendangregg opened 5 years ago

brendangregg commented 5 years ago

Tools currently use string substitution to implement different BPF code for different command line options. We're pushing it a bit too far in some tools, eg, opensnoop:

    PID_TID_FILTER
    UID_FILTER
    FLAGS_FILTER
[...]
if args.tid:  # TID trumps PID
    bpf_text = bpf_text.replace('PID_TID_FILTER',
        'if (tid != %s) { return 0; }' % args.tid)
elif args.pid:
    bpf_text = bpf_text.replace('PID_TID_FILTER',
        'if (pid != %s) { return 0; }' % args.pid)
else:
    bpf_text = bpf_text.replace('PID_TID_FILTER', '')
if args.uid:
    bpf_text = bpf_text.replace('UID_FILTER',
        'if (uid != %s) { return 0; }' % args.uid)
else:
    bpf_text = bpf_text.replace('UID_FILTER', '')
if args.flag_filter:
    bpf_text = bpf_text.replace('FLAGS_FILTER',
        'if (!(flags & %d)) { return 0; }' % flag_filter_mask)
else:
    bpf_text = bpf_text.replace('FLAGS_FILTER', '')
if not (args.extended_fields or args.flag_filter):
    bpf_text = '\n'.join(x for x in bpf_text.split('\n')
        if 'EXTENDED_STRUCT_MEMBER' not in x)

This wasn't too bad for simple PID filters, but is getting messier, especially if we start having more -e extended field modes, where we want to have a perf output struct that also varies based on command line options. This ticket is to discuss a better way.

@vincentbernat once suggested (3 years ago) that I look at jinja2. I don't know it, but it looks interesting: https://github.com/vincentbernat/systemtap-cookbook/blob/116500e65c42fe2f9b5cea4f42961db9e26685b9/io#L239-L315

Resulting code has things like:

{%- if options.milliseconds %}
    delta /= 1000;
{%- endif %}

Any other such libraries?

I think someone needs to pick a tool (say, opensnoop), and convert it to use these libraries so we can see if it's an improvement or not.

palmtenor commented 5 years ago

I agree with your observation. Even if we use new library, at least these could use CFlags to make things nicer.

oriolarcas commented 5 years ago

In one of my projects I implemented a pseudo-macro preprocessor/Python helpers. I use it to generate C code from macros like MEMCPY(dst,src,len) in the BPF program. Some support for it would be great.

mscastanho commented 5 years ago

Hi everyone, I saw this issue and thought I could help.

As suggested by @brendangregg, I modified the opensnoop.py tool to use jinja2 templates instead of replace(). Now the bpf_text string has some jinja2 templating directives, which are evaluated when the render() method is called.

To test the code I used the --ebpf flag and compared the outputs from the tool before and after the changes. The test cases were the examples given on the file and some other cases I thought might break the code. In all cases the eBPF output code is the same. I can share the test script if you want.

However, using jinja2 adds an external dependency, which needs to be installed via pip, not sure how to package this.

brendangregg commented 5 years ago

However, using jinja2 adds an external dependency

Ah, that would hurt distribution. So either jinja2 becomes a default python library, or we use something else that is shipped with python by default.

vincentbernat commented 5 years ago

Python's stdlib doesn't come with any useful templating module. However, Jinja2 is available in all distributions as it is quite universal (a bit like "requests"), even very old ones (it's in RHEL 6 for example) as it's a dependency of ansible, salt and cloud-init.