inspektor-gadget / inspektor-gadget

Inspektor Gadget is a set of tools and a framework for data collection and system inspection on Kubernetes clusters and Linux hosts using eBPF
https://www.inspektor-gadget.io
Apache License 2.0

[RFE] New gadget for qdisc latency #3118

Open alban opened 4 months ago

alban commented 4 months ago

Current situation

We don't have a gadget that reports how long a network packet waits in the network scheduler (qdisc).

Impact

This is a missing piece for investigating network problems.

Ideal future situation

A histogram showing latency in the network scheduler, similar to the top-block-io gadget.

Implementation options

We can use the following tracepoints:

/sys/kernel/tracing/events/qdisc/qdisc_enqueue/format
/sys/kernel/tracing/events/qdisc/qdisc_dequeue/format
/sys/kernel/tracing/events/skb/consume_skb/format
/sys/kernel/tracing/events/skb/kfree_skb/format

We can use a hash table keyed by the skb and measure the time between the enqueue and the dequeue. Note that a single dequeue call could dequeue several skbs, so we also need consume_skb and kfree_skb to clean up entries left in the map.
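
To make the bookkeeping concrete, here is a minimal userspace sketch of that idea in Python. It is only a model of the logic, not eBPF code and not the gadget's implementation: the class name, method names, microsecond unit and log2 bucketing are all assumptions chosen for illustration, and the timestamps are passed in by hand instead of coming from the tracepoints.

```python
from collections import defaultdict


class QdiscLatencyTracker:
    """Hypothetical model of the proposed logic; every name here is illustrative."""

    def __init__(self):
        self.start_ns = {}                 # skb address -> enqueue timestamp (ns)
        self.histogram = defaultdict(int)  # log2 slot -> number of packets

    def on_enqueue(self, skb, now_ns):
        # Would run on qdisc:qdisc_enqueue: remember when the skb was queued.
        self.start_ns[skb] = now_ns

    def on_dequeue(self, skb, now_ns):
        # Would run on qdisc:qdisc_dequeue: record latency if the enqueue was seen.
        start = self.start_ns.pop(skb, None)
        if start is None or now_ns < start:
            return
        delta_us = (now_ns - start) // 1000
        # Slot 0 holds 0 us; slot s > 0 holds [2**(s-1), 2**s) us.
        self.histogram[delta_us.bit_length()] += 1

    def on_free(self, skb):
        # Would run on skb:consume_skb and skb:kfree_skb: drop entries that
        # were never matched by a traced dequeue, so the map does not grow.
        self.start_ns.pop(skb, None)


tracker = QdiscLatencyTracker()
tracker.on_enqueue(0xFFFF0001, 1_000)
tracker.on_enqueue(0xFFFF0002, 2_000)
tracker.on_dequeue(0xFFFF0001, 6_000)  # 5 us of queueing, lands in slot 3
tracker.on_free(0xFFFF0002)            # freed without a traced dequeue
```

In a real gadget the dictionary would presumably be a BPF hash map and the timestamps would come from the kernel clock; the sketch only shows why the two skb tracepoints are needed for cleanup.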

Additional information

cc @blanquicet

github-actions[bot] commented 2 months ago

This issue has been automatically marked as stale because it has not had recent activity. It will be closed in 14 days if no further activity occurs.

patrickpichler commented 1 month ago

Hey :) Could you assign me this ticket? I would like to try implementing this gadget.

patrickpichler commented 3 weeks ago

@alban I've implemented a basic version of this, is it ok if I open a PR?

alban commented 3 weeks ago

> @alban I've implemented a basic version of this, is it ok if I open a PR?

Yes, please do :)

patrickpichler commented 3 weeks ago

One thing though, I checked the kernel sources and the qdisc:qdisc_enqueue tracepoint appears to have been added in 5.14. Is this OK, or should I search for an alternative function to hook?

alban commented 5 days ago

> One thing though, I checked the kernel sources and the qdisc:qdisc_enqueue tracepoint appears to have been added in 5.14. Is this OK, or should I search for an alternative function to hook?

For me this is fine.