inspektor-gadget / inspektor-gadget

Inspektor Gadget is a set of tools and framework for data collection and system inspection on Kubernetes clusters and Linux hosts using eBPF
https://www.inspektor-gadget.io
Apache License 2.0

[RFE] New gadget for qdisc latency #3118

Open alban opened 4 months ago

alban commented 4 months ago

Current situation

We don't have a gadget that reports how long a network packet waits in the network scheduler (qdisc).

Impact

This is a missing piece for investigating network problems.

Ideal future situation

A histogram showing latency in the network scheduler, similar to the top-block-io gadget.

Implementation options

We can use the following tracepoints:

/sys/kernel/tracing/events/qdisc/qdisc_enqueue/format
/sys/kernel/tracing/events/qdisc/qdisc_dequeue/format
/sys/kernel/tracing/events/skb/consume_skb/format
/sys/kernel/tracing/events/skb/kfree_skb/format

We can use a hash map keyed by the skb address and measure the time between the enqueue and the dequeue. Note that a single dequeue call can dequeue several skbs, so some entries never get a matching dequeue event; we therefore also need consume_skb and kfree_skb to clean up the map.
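
To make the bookkeeping concrete, here is a minimal userspace sketch of that logic in Python. It is not eBPF code and not the gadget's implementation: the class name, the method names, and the log2 bucketing are assumptions made for illustration, and timestamps are passed in explicitly rather than read from a clock. Each handler stands in for one of the tracepoints listed above.

```python
from collections import defaultdict


def log2_bucket(delta_ns):
    """Return the histogram bucket for a latency in nanoseconds.

    Bucket 0 holds a latency of 0; bucket b (b >= 1) holds latencies
    in the range [2**(b-1), 2**b).
    """
    return max(delta_ns, 0).bit_length()


class QdiscLatencyTracker:
    """Hypothetical model of the map-based approach (illustration only)."""

    def __init__(self):
        # skb address -> enqueue timestamp in ns (the "hash map keyed by skb")
        self.enqueue_ts = {}
        # log2 bucket -> number of packets observed in that bucket
        self.histogram = defaultdict(int)

    def on_enqueue(self, skb, now_ns):
        # Stands in for qdisc_enqueue: remember when the skb entered the qdisc.
        self.enqueue_ts[skb] = now_ns

    def on_dequeue(self, skb, now_ns):
        # Stands in for qdisc_dequeue: record the latency of the reported skb.
        start = self.enqueue_ts.pop(skb, None)
        if start is None:
            return  # enqueue was not observed; nothing to measure
        self.histogram[log2_bucket(now_ns - start)] += 1

    def on_free(self, skb):
        # Stands in for consume_skb / kfree_skb: drop entries that never got
        # a matching dequeue event, so the map does not grow without bound.
        self.enqueue_ts.pop(skb, None)
```

In this model, an skb that is enqueued but never reported by a dequeue event stays in `enqueue_ts` until `on_free` removes it, which is the cleanup role described for consume_skb and kfree_skb.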

Additional information

cc @blanquicet

github-actions[bot] commented 2 months ago

This issue has been automatically marked as stale because it has not had recent activity. It will be closed in 14 days if no further activity occurs.

patrickpichler commented 2 months ago

Hey :) Could you assign me this ticket? I would like to try implementing this gadget.

patrickpichler commented 1 month ago

@alban I've implemented a basic version of this, is it ok if I open a PR?

alban commented 1 month ago

> @alban I've implemented a basic version of this, is it ok if I open a PR?

Yes, please do :)

patrickpichler commented 1 month ago

One thing though, I checked the kernel sources and the qdisc:qdisc_enqueue tracepoint appears to have been added in 5.14. Is this OK, or should I search for an alternative function to hook?
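
For reference, a minimal sketch of how the tracepoint's availability could be checked on a given host, using the tracefs paths listed in the issue description. The function name and the `tracefs_root` parameter are hypothetical, and the check assumes tracefs is mounted at /sys/kernel/tracing and readable by the caller.

```python
import os


def has_tracepoint(category, name, tracefs_root="/sys/kernel/tracing"):
    """Return True if the tracepoint's format file exists under tracefs.

    Hypothetical helper: it only checks for the file and does not parse it.
    """
    fmt = os.path.join(tracefs_root, "events", category, name, "format")
    return os.path.isfile(fmt)


if __name__ == "__main__":
    # On kernels without the tracepoint, this is expected to print False.
    print(has_tracepoint("qdisc", "qdisc_enqueue"))
```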

alban commented 3 weeks ago

> One thing though, I checked the kernel sources and the qdisc:qdisc_enqueue tracepoint appears to have been added in 5.14. Is this OK, or should I search for an alternative function to hook?

For me this is fine.