retis-org / retis

Tracing packets in the Linux networking stack & friends
https://retis.readthedocs.io/en/stable/
100 stars 14 forks source link

Move skb tracking into the core & use it for filtering & reworked the probe manager #118

Closed atenart closed 1 year ago

atenart commented 1 year ago

Move the skb tracking logic into the core and use it for filtering. Not a lot of additions, except for the filtering part. Otherwise it's mostly code moving, which is intended for now. We can add Rust helpers later if needed by some modules.

Using the skb tracking logic in the filtering part has two main advantages:

[...]
# ./retis collect -f 'udp port 53 and src host 10.0.42.2' -c skb,skb-drop --skb-sections all \
    -p kprobe:udp_rcv -p kprobe:ip_rcv -p kprobe:ip_finish_output2 -p kprobe:napi_gro_receive \
    -p kprobe:inet_gro_receive -p kprobe:udp_gro_receive &
# ip netns exec ns1 dig @1.1.1.1 redhat.com
{"common":{"timestamp":610794283209},"kernel":{"probe_type":"kprobe","symbol":"ip_rcv"},"skb":{"cloned":true,"daddr":"1.1.1.1","dataref":1,"dev_name":"veth01","dport":53,"dst":"1e:9c:f0:29:be:69","etype":2048,"fclone":false,"ifindex":19,"ip_version":4,"l3_len":79,"netns":4026531840,"protocol":17,"rx_ifindex":19,"saddr":"10.0.42.2","sport":45070,"src":"62:13:b1:87:e1:b0","udp_len":59,"users":1},"skb-tracking":{"orig_head":18446625427471709696,"skb":18446625427490994944,"timestamp":610794283209}}
{"common":{"timestamp":610794304627},"kernel":{"probe_type":"kprobe","symbol":"pskb_expand_head"},"skb":{"cloned":true,"daddr":"1.1.1.1","dataref":1,"dev_name":"veth01","dport":53,"dst":"1e:9c:f0:29:be:69","etype":2048,"fclone":false,"ifindex":19,"ip_version":4,"l3_len":79,"netns":4026531840,"protocol":17,"rx_ifindex":19,"saddr":"10.0.42.2","sport":45070,"src":"62:13:b1:87:e1:b0","udp_len":59,"users":1},"skb-tracking":{"orig_head":18446625427471709696,"skb":18446625427490994944,"timestamp":610794283209}}
{"common":{"timestamp":610794309230},"kernel":{"probe_type":"kprobe","symbol":"skb_free_head"},"skb":{"cloned":true,"daddr":"1.1.1.1","dataref":1,"dev_name":"veth01","dport":53,"dst":"1e:9c:f0:29:be:69","etype":2048,"fclone":false,"ifindex":19,"ip_version":4,"l3_len":79,"netns":4026531840,"protocol":17,"rx_ifindex":19,"saddr":"10.0.42.2","sport":45070,"src":"62:13:b1:87:e1:b0","udp_len":59,"users":1},"skb-tracking":{"orig_head":18446625427471709696,"skb":18446625427490994944,"timestamp":610794283209}}
{"common":{"timestamp":610794328206},"kernel":{"probe_type":"kprobe","symbol":"ip_finish_output2"},"skb":{"cloned":false,"daddr":"1.1.1.1","dataref":1,"dev_name":"eth0","dport":53,"dst":"1e:9c:f0:29:be:69","etype":2048,"fclone":false,"ifindex":2,"ip_version":4,"l3_len":79,"netns":4026531840,"protocol":17,"rx_ifindex":19,"saddr":"172.16.42.90","sport":45070,"src":"62:13:b1:87:e1:b0","udp_len":59,"users":1},"skb-tracking":{"orig_head":18446625427471709696,"skb":18446625427490994944,"timestamp":610794283209}}
{"common":{"timestamp":610794731861},"kernel":{"probe_type":"kprobe","symbol":"skb_free_head"},"skb":{"cloned":false,"daddr":"1.1.1.1","dataref":1,"dev_name":"eth0","dport":53,"dst":"52:54:00:bc:4e:4a","etype":2048,"fclone":false,"ifindex":2,"ip_version":4,"l3_len":79,"netns":4026531840,"protocol":17,"rx_ifindex":19,"saddr":"172.16.42.90","sport":45070,"src":"52:54:00:0b:99:9e","udp_len":59,"users":1},"skb-tracking":{"orig_head":18446625427471709696,"skb":18446625427490994944,"timestamp":610794283209}}

saddr does not match the filter in the last two events, because the packet was NATed.

Based on #102.

atenart commented 1 year ago

Maybe skb_tracking.rs can be now renamed to skb.rs?

I kept this name because of our discussions on having a clear name for our text editors (skb.rs could be confusing with the skb module). At least for me, this doesn't change much so I can rename this, but you might want to keep it this way? :)

amorenoz commented 1 year ago

While testing this in combination with OVS module I found a slightly weird behavior:

Use case: I want to debug some OVS issue so I prepare my test and before running it I type: retis collect -c ovs -f "tcp" what should I expect?

What I expect: Only ovs module enabled so I expect only ovs-related events. Also, I only want TCP packets.

What I see: Hundreds of events from all the TCP packets being freed and cloned.

So my thoughts here: While moving skb tracking to the core is the right thing to do, I would make skb-tracking events an opt-in thing. One approach is to have a global flag that edits some config or rodata. Another approach is to split the probe in two, move the tracking logic to the core and keep a module called "skb-tracking" that just sends the events.

What do you think?

atenart commented 1 year ago

While testing this in combination with OVS module I found a slightly weird behavior:

Use case: I want to debug some OVS issue so I prepare my test and before running it I type: retis collect -c ovs -f "tcp" what should I expect?

What I expect: Only ovs module enabled so I expect only ovs-related events. Also, I only want TCP packets.

What I see: Hundreds of events from all the TCP packets being freed and cloned.

So my thoughts here: While moving skb tracking to the core is the right thing to do, I would make skb-tracking events an opt-in thing. One approach is to have a global flag that edits some config or rodata. Another approach is to split the probe in two, move the tracking logic to the core and keep a module called "skb-tracking" that just sends the events.

What do you think?

IMHO the proper way to solve this is to embed a switch in Probe (probably in KernelProbe::config) to silent a probe. This could be set by some modules/core probes when a probe is not meant to report events. Then in the manager we could override the setting as soon as a probe on the same function do ask for an event (default value).

I was planning to do this at some point, but it seems this is required now as the tracking move into the core requires it.

WDYT?

atenart commented 1 year ago

Except for the C headers needed rework, pushed a new version which includes:

amorenoz commented 1 year ago

I have rebased some work I have on top of this series and OVS tracking broke. While bisecting, I found that: 80877416d39eae537418890bd3e23d50935d1840: core: probe: kernel: allow not to report events

does not build:

cargo build
   Compiling retis v0.1.0 (/home/amorenoz/src/retis)
error[E0609]: no field `global_probes_options` on type `&mut ProbeManager`
  --> src/core/probe/manager.rs:88:14
   |
88 |             .global_probes_options
   |              ^^^^^^^^^^^^^^^^^^^^^ unknown field
   |
   = note: available fields are: `dynamic_probes`, `dynamic_hooks`, `filters`, `config_map`, `options` ... and 2 others

error[E0609]: no field `global_probes_options` on type `&mut ProbeManager`
  --> src/core/probe/manager.rs:95:14
   |
95 |         self.global_probes_options.push(opt);
   |              ^^^^^^^^^^^^^^^^^^^^^ unknown field
   |
   = note: available fields are: `dynamic_probes`, `dynamic_hooks`, `filters`, `config_map`, `options` ... and 2 others

For more information about this error, try `rustc --explain E0609`.
error: could not compile `retis` due to 2 previous errors
atenart commented 1 year ago

I have rebased some work I have on top of this series and OVS tracking broke. While bisecting, I found that: 8087741: core: probe: kernel: allow not to report events

does not build:

Thanks for the heads up, will fix in next version.

atenart commented 1 year ago

Fixed a few tracking issues and split the headers to avoid including the tracking one in the middle of the common header.

atenart commented 1 year ago

Rebased on latest main.