Closed Bigdelle closed 3 months ago
cc @cilium/sig-hubble
cc @cilium/sig-datapath
On a very high-level - you're proposing that the datapath matches some specific IP option in the packet (containing some trace_id
), and transfers the trace_id
value to userspace via the trace_notify
struct ?
As an alternative thought - afaik the datapath is already punting some(?) packet headers to userspace. If you increase the number of punted bytes (so that it includes the IP options), would that allow you to move all the needed IP option-parsing into hubble?
On a very high-level - you're proposing that the datapath matches some specific IP option in the packet (containing some
trace_id
), and transfers thetrace_id
value to userspace via thetrace_notify
struct ?As an alternative thought - afaik the datapath is already punting some(?) packet headers to userspace. If you increase the number of punted bytes (so that it includes the IP options), would that allow you to move all the needed IP option-parsing into hubble?
We want the ip_trace_id
available at the Kernel level. This PR sets up the infrastructure to expand on this feature as we continue developing it; the next milestone (linked in the Future Milestones
section of the CFP) involves restoring the ip_trace_id
for reply packets using conntrack.
## Future Milestones
### Deferred Milestone 1: Use conntrack to restore trace ID for reply packets
- When a packet includes a trace ID, the conntrack entry should record the ID along with the connection information.
- For subsequent packets – if a trace ID is not present in the IP Options – use the track ID from conntrack to restore the internal state.
- Note this will not modify the IP option header.
This will ensure that subsequent packets in the same connection maintain their traceability even if they do not carry the ip_trace_id
explicitly in their IP options. If we push the IP option parsing to Hubble, we lose this ability.
Also, by pushing this to Hubble, we lose integration with the cilium monitor for events. Considering the future customer use-cases of this feature, we want the ability to see the ip_trace_id
in the monitor.
Is there a specific concern you have about including this in the trace_notify and drop_notify structs? (these fields would only be populated if the feature is enabled and there is an embedded ip_trace_id
).
CC: @sypakine
I think this CFP is in a solid state for implementing. There's a couple of details that could maybe be hashed out a bit more, but I don't think they ultimately detract that much from the overall proposal, and may be impacted by the experience of implementing it. The proposal has been out plenty long enough for @cilium/sig-hubble and @cilium/sig-datapath to provide feedback, and there's now committers endorsing the proposal, so I think this is in a good enough state to merge.
As and when the implementation matures and impacts the design, we can always iterate further on this CFP, but we don't have to hold it as a PR during that period. Let's get this merged :rocket:
Add PR for CFP: Packet Tracing.