retis-org / retis

Tracing packets in the Linux networking stack & friends
https://retis.readthedocs.io/en/stable/
100 stars 14 forks source link

Auto-probe mode #309

Closed atenart closed 9 months ago

atenart commented 11 months ago

This enables collection to use a new "auto-probe" mode, working as follow: stack traces are requested and functions found in those stack traces are evaluated and additional probes are added if they fit the requirements. Using this mode can help building a list of probes to use for following a given flow. More details about the implementation in the commit messages.

Part of this series is another rework of the probe manager. But fear not, there is no big functional changes and the code is very close to what it was. But the logic is better defined and that allows additional things, such as runtime registration of probes.

This is based on a few patches that are in other PRs (btf-rs update patches mainly). This is fine as they're all targeted at an earlier milestone.

One future possibility is the generation of a profile at the end of the collection, so it could be reused once where to probe is known.

$ retis collect -a -f 'icmp and host 1.1.1.1' --cmd 'ping 1.1.1.1'

Fixes #88.

amorenoz commented 10 months ago

I haven't had the time to look deeply in to the code, but I have a question about user-friendliness.

Is it just me or does it feel strange to force users to write a packet filter? I guess the reason is that functions that are hit by the "cmd" are very likely to be used by pretty much all traffic in the system and it's a way to avoid flooding the user with tons of events. But still, that seems to be pushing to the user some internal implementation detail.

Maybe a possibility would be to support filtering by PID (which IMHO, has value in it's own). Combined with --cmd, it would allow the user to add many probes and no packet-filter without fearing event flooding or even run commands with still-unknown network behavior.

Also, not sure if the wording of the options conveys the fact that probes are added incrementally but after they are hit at least once. If the user's "reproducer" is on the first packet (e.g: TCP SYN), they might loose some important events. In that case, the user would have to add probes manually so they are pre-loaded. I like the idea of automatically adding probes, I'm just concerned if the users will understand the implications.

atenart commented 10 months ago

Is it just me or does it feel strange to force users to write a packet filter? I guess the reason is that functions that are hit by the "cmd" are very likely to be used by pretty much all traffic in the system and it's a way to avoid flooding the user with tons of events. But still, that seems to be pushing to the user some internal implementation detail.

Right, the reason to require a filter is if no filter is used there are high chances the result would be similar to retis collect --probe * which we do not support (too many probes).

Maybe a possibility would be to support filtering by PID (which IMHO, has value in it's own). Combined with --cmd, it would allow the user to add many probes and no packet-filter without fearing event flooding or even run commands with still-unknown network behavior.

That is a good idea, and it falls into the "a filter is required" IMHO. The intention is to allow all available filter methods for auto-probe, the packet one being the only one available when this was firstly written. Before merging this PR we should allow meta-filtering too. For PID filtering, it's a good follow-up IMHO but is just doesn't exist yet. Or maybe you had in mind using meta filtering for this?

Also, not sure if the wording of the options conveys the fact that probes are added incrementally but after they are hit at least once. If the user's "reproducer" is on the first packet (e.g: TCP SYN), they might loose some important events. In that case, the user would have to add probes manually so they are pre-loaded. I like the idea of automatically adding probes, I'm just concerned if the users will understand the implications.

You're right, we should make this more explicit. What about an additional sentence in the help, like "Note that additional probes are added only after events including them in their stack trace are reported first; this means the first packets hitting a probe won't be reported."

In addition, I'd like one day to be able to generate a profile based on the auto-probe state at the end of a capture; so we can generate a profile based on a scenario and then use it to see all the packets in a subsequent collection.

atenart commented 9 months ago