OpenCloudOS / nettrace

nettrace is an eBPF-based tool for tracing network packets and diagnosing network problems.

Why does it print no packet information? #36

Closed xuinsd45sd closed 1 year ago

xuinsd45sd commented 1 year ago

Hi, I want to trace all incoming UDP packets in the function "__netif_receive_skb_core", but there seems to be no valid output. However, when I enable debug mode, I can see that nettrace has captured something, for example "DEBUG: tp found for __netif_receive_skb_core(ffffff8001f95300), ctx:33636760:1". So why doesn't it print anything in normal mode (without --debug)?

The specific commands are as follows:

# nettrace -t __netif_receive_skb_core -p udp
begin trace...

^C
/ # 
/ # 
/ # nettrace -t __netif_receive_skb_core -p udp --debug
DEBUG: command: cat /sys/kernel/debug/tracing/events/skb/kfree_skb/format | grep NOT_SPECIFIED
DEBUG: nft high version: 0
DEBUG: command: uname -r | awk -F '.' '$1*100+$2<504{exit 1}'
DEBUG: command: uname -r | awk -F '.' '$1*100+$2>=504{exit 1}'
DEBUG: attaching __trace___netif_receive_skb_core_pskb to __netif_receive_skb_core
attach __netif_receive_skb_core success
DEBUG: attaching __trace___kfree_skb to __kfree_skb
attach __kfree_skb success
DEBUG: attaching __trace_skb_clone to skb_clone
attach skb_clone success
attach kretprobe ret__trace_skb_clone to skb_clone success
DEBUG: attaching __trace_consume_skb
DEBUG: attaching __trace_kfree_skb
begin trace...
DEBUG: create entry: 33621220, ffffff8001f95300
DEBUG: fake ctx alloc: 33636870, ffffff8001f95300
DEBUG: tp found for __netif_receive_skb_core(ffffff8001f95300), ctx:33636760:1
DEBUG: create entry: 33620af0, ffffff8001f95300
DEBUG: tp found for __netif_receive_skb_core(ffffff8001f95300), ctx:33636760:1
DEBUG: create entry: 33620b60, ffffff8001f95300
DEBUG: tp found for __netif_receive_skb_core(ffffff8001f95300), ctx:33636760:1
DEBUG: create entry: 33621300, ffffff8001f95300
menglongdong commented 1 year ago

By default, nettrace runs in "life" mode: it does not print a packet until the packet is freed (kfree_skb or consume_skb). In your case, it seems the free of the packet is not captured, and I am not sure why.

Can other packets be captured normally? For example, with "nettrace -t icmp"?

You can add the --basic argument to use "basic" mode, which prints the trace log immediately.
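
The difference between the two modes described above can be illustrated with a toy model. This is only a sketch of the buffering idea, not nettrace's actual implementation; the `replay` function, the `FREE_EVENTS` set, and the event stream are all hypothetical names made up for illustration.

```python
from collections import defaultdict

# Events that end an skb's life in this toy model (assumed set, for illustration).
FREE_EVENTS = {"kfree_skb", "consume_skb", "__kfree_skb"}

def replay(events, basic=False):
    """Return the lines a tracer would print for a list of (skb, func) events."""
    printed = []
    pending = defaultdict(list)  # skb address -> buffered function names
    for skb, func in events:
        if basic:
            # "basic" mode: print every event as soon as it is seen.
            printed.append(f"{skb} {func}")
            continue
        # "life" mode: hold events until the skb is seen being freed.
        pending[skb].append(func)
        if func in FREE_EVENTS:
            printed.extend(f"{skb} {f}" for f in pending.pop(skb))
    return printed

# Hypothetical event stream in which the free of the skb is never observed.
events = [("ffffff8001f95300", "__netif_receive_skb_core")] * 3
print(replay(events))              # life mode: nothing is flushed
print(replay(events, basic=True))  # basic mode: three lines
```

In this model, a missed free event leaves everything in the pending buffer, which matches the symptom in the report: debug messages show events arriving, but nothing is printed.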

xuinsd45sd commented 1 year ago

> By default, nettrace is in "life" mode, as it won't print the packet until it is freed (kfree_skb or consume_skb). In your case, it seems the free of the packet is not captured, and I am not sure why.

Yes, the free of the packet is not captured.

> Can other packets be captured normally? For example, with "nettrace -t icmp"?

No, no packet can be captured in "life" mode. I think I know why the free of the packet is not captured on my machine: "__kfree_skb()" was optimized away inside "kfree_skb()" by the compiler, so it does not show up as a separate call. See the function call graph below:

0)               |  kfree_skb() {
0)               |    skb_release_all() {
0) + 18.576 us   |      skb_release_head_state();
0)               |      skb_release_data() {
0)               |        skb_free_head() {
0) + 17.744 us   |          page_frag_free();
0) + 55.872 us   |        }

Does nettrace track only __kfree_skb, and not kfree_skb, to capture the freeing of an skb?

> You can add the --basic argument to use "basic" mode, which will print the trace log immediately.

Thanks for your patient reply.
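
One way to check a function call graph like the one in this comment is to list the functions that appear in it and see whether "__kfree_skb" is among them. The following is a minimal sketch; the `traced_functions` helper is hypothetical, and the column spacing of the embedded graph is an assumption, since the original layout was flattened.

```python
import re

# Call graph from the comment above; the column spacing is reconstructed.
GRAPH = """\
0)               |  kfree_skb() {
0)               |    skb_release_all() {
0) + 18.576 us   |      skb_release_head_state();
0)               |      skb_release_data() {
0)               |        skb_free_head() {
0) + 17.744 us   |          page_frag_free();
0) + 55.872 us   |        }
"""

def traced_functions(graph_text):
    """Collect function names that appear as calls in function_graph output."""
    names = []
    for line in graph_text.splitlines():
        # The call itself is the part after the '|' column separator.
        _, _, call = line.partition("|")
        match = re.match(r"\s*([A-Za-z_]\w*)\(\)", call)
        if match:
            names.append(match.group(1))
    return names

funcs = traced_functions(GRAPH)
print(funcs)
print("__kfree_skb" in funcs)  # False for the graph above
```

The absence of "__kfree_skb" from the list only shows that it was not recorded as a separate call in this capture; it does not by itself explain why the tracepoints did not fire.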

menglongdong commented 1 year ago

Not really. In fact, nettrace traces the kernel function __kfree_skb() together with the tracepoints skb/kfree_skb and skb/consume_skb. You can confirm that they are attached with the command 'bpftool perf show'.

I don't know why, but it seems skb/kfree_skb and skb/consume_skb are not triggered.

Can you give me your kernel config (zcat /proc/config.gz)? You can send it to my email, imagedong@tencent.com; I think that will make it easier to figure out.

Thanks!

menglongdong commented 1 year ago

I think this issue is solved in the latest code. The max entries that we defined for the perf event array map was 64, so a trace event is lost if it happens on a CPU numbered 64 or above. If your computer has more than 64 cores, this can happen.
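
The effect described above can be modeled in a few lines. This is a simplified sketch that assumes the array is indexed by CPU number and that events for an index outside the array are dropped; the `deliver` helper and the 96-CPU machine are hypothetical.

```python
MAX_ENTRIES = 64  # size of the perf event array mentioned in the comment above

def deliver(cpu, event, rings):
    """Append event to the ring for `cpu`; return False if that CPU has no slot."""
    if cpu >= len(rings):
        return False  # no slot for this CPU index, so the event is lost
    rings[cpu].append(event)
    return True

rings = [[] for _ in range(MAX_ENTRIES)]
num_cpus = 96  # hypothetical machine with more than 64 CPUs

# Emit one free event from every CPU and record which CPUs lose it.
lost = [cpu for cpu in range(num_cpus) if not deliver(cpu, "kfree_skb", rings)]
print(len(lost), lost[0], lost[-1])  # 32 64 95
```

On a machine like this, any skb freed on CPUs 64 to 95 would never report its free event, so in "life" mode its trace would never be printed.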