xdp-project / xdp-tutorial

XDP tutorial
2.43k stars 574 forks source link

advanced03-AF_XDP: How to measure packet loss due to unavailable UMEM frames? #115

Open simonhf opened 4 years ago

simonhf commented 4 years ago

Below I reproduce the issue using a PCAP file I had laying around. I modify the advanced03-AP-XDP tutorial code to not free each UMEM frame after it is processed in user land. There are 4,096 frames in the tutorial example. By sending 5,000 packets via tcpreplay then I see what happens after all 4,096 frames have been allocated.

What happens is that tcpreplay says it sent all 5,000 packets just fine. ifconfig reports that 5,000 packets were received and no packets dropped. The af_xdp_user tutorial example code reports receiving 4,096 packets as expected. So how to / is there a way to report the packets 'dropped' due to no frames being available?

$ git clone --recurse-submodules https://github.com/xdp-project/xdp-tutorial
$ cd xdp-tutorial/advanced03-AF_XDP/

$ cp af_xdp_kern.c af_xdp_kern.c.orig
$ vim af_xdp_kern.c # hack to pass all packets via AF_XDP instead of alternate packets
$ diff -u af_xdp_kern.c.orig af_xdp_kern.c
--- af_xdp_kern.c.orig  2020-03-26 17:00:57.675564970 -0700
+++ af_xdp_kern.c       2020-03-26 17:06:43.738565255 -0700
@@ -22,6 +22,7 @@
 int xdp_sock_prog(struct xdp_md *ctx)
 {
     int index = ctx->rx_queue_index;
+#if 0
     __u32 *pkt_count;

     pkt_count = bpf_map_lookup_elem(&xdp_stats_map, &index);
@@ -32,6 +33,7 @@
             return XDP_PASS;
     }

+#endif
     /* A set entry here means that the correspnding queue_id
      * has an active AF_XDP socket bound to it. */
     if (bpf_map_lookup_elem(&xsks_map, &index))

$ cp af_xdp_user.c af_xdp_user.c.orig
$ vim af_xdp_user.c # hack to show only received packets every second
$ diff -u af_xdp_user.c.orig af_xdp_user.c
--- af_xdp_user.c.orig  2020-03-26 17:16:54.556724110 -0700
+++ af_xdp_user.c       2020-03-26 17:18:28.431878556 -0700
@@ -457,6 +457,7 @@
               stats_rec->rx_bytes / 1000 , bps,
               period);

+#if 0
        packets = stats_rec->tx_packets - stats_prev->tx_packets;
        pps     = packets / period;

@@ -468,11 +469,12 @@
               period);

        printf("\n");
+#endif
 }

 static void *stats_poll(void *arg)
 {
-       unsigned int interval = 2;
+       unsigned int interval = 1;
        struct xsk_socket_info *xsk = arg;
        static struct stats_record previous_stats = { 0 };

$ make && time sudo sh -s << 'EOF' 2>&1
sudo ../testenv/testenv.sh teardown --name veth-advanced03 ; sudo ../testenv/testenv.sh setup --legacy-ip --name veth-advanced03
timeout --signal=SIGINT 4 ./af_xdp_user --dev veth-advanced03 --filename af_xdp_kern.o --skb-mode --force & PIDS=$!
ip netns exec veth-advanced03 sh -c "sleep 1 ; tcpreplay -i veth0 --topspeed --stats=1 --loop 0 --limit 5000 /home/simon/web-browser-toolbars.trace"
wait $PIDS ; echo "- af_xdp_user process ended"
ifconfig veth-advanced03
EOF
Tearing down environment 'veth-advanced03'
Setting up new environment 'veth-advanced03'
...
AF_XDP RX:             1 pkts (         1 pps)           0 Kbytes (     0 Mbits/s) period:1.000681
Test start: 2020-03-27 09:42:31.362663 ...
Test complete: 2020-03-27 09:42:31.372767
Actual: 5000 packets (4730445 bytes) sent in 0.010104 seconds
Rated: 468175475.0 Bps, 3745.40 Mbps, 494853.52 pps
Statistics for network device: veth0
        Successful packets:        5000
        Failed packets:            0
        Truncated packets:         0
        Retried packets (ENOBUFS): 0
        Retried packets (EAGAIN):  0
AF_XDP RX:          4096 pkts (      4091 pps)        3778 Kbytes (    30 Mbits/s) period:1.001013
AF_XDP RX:          4096 pkts (         0 pps)        3778 Kbytes (     0 Mbits/s) period:1.001233
AF_XDP RX:          4096 pkts (         0 pps)        3778 Kbytes (     0 Mbits/s) period:1.000157
INFO: xdp_link_detach() removed XDP prog ID:251 on ifindex:121
- af_xdp_user process ended
veth-advanced03: flags=4163<UP,BROADCAST,RUNNING,MULTICAST>  mtu 1500
        inet 10.11.1.1  netmask 255.255.255.0  broadcast 0.0.0.0
        inet6 fc00:dead:cafe:1::1  prefixlen 64  scopeid 0x0<global>
        inet6 fe80::6cee:85ff:fe03:b4d3  prefixlen 64  scopeid 0x20<link>
        ether 6e:ee:85:03:b4:d3  txqueuelen 1000  (Ethernet)
        RX packets 5007  bytes 4731179 (4.7 MB)
        RX errors 0  dropped 0  overruns 0  frame 0
        TX packets 27  bytes 4246 (4.2 KB)
        TX errors 0  dropped 0 overruns 0  carrier 0  collisions 0
real    0m4.290s
tohojo commented 4 years ago

Simon Hardy-Francis notifications@github.com writes:

Below I reproduce the issue using a PCAP file I had laying around. I modify the advanced03-AP-XDP tutorial code to not free each UMEM frame after it is processed in user land. There are 4,096 frames in the tutorial example. By sending 5,000 packets via tcpreplay then I see what happens after all 4,096 frames have been allocated.

What happens is that tcpreplay says it sent all 5,000 packets just fine. ifconfig reports that 5,000 packets were received and no packets dropped. The af_xdp_user tutorial example code reports receiving 4,096 packets as expected. So how to / is there a way to report the packets 'dropped' due to no frames being available?

I think those should show up as xdp_redirect exceptions. There's a tracepoint you can attach to that will show those; see the xdp_monitor example in samples/bpf in the kernel tree...

simonhf commented 4 years ago

@tohojo Thanks for the reply! I tried running with perf stat -- please see below -- but it's not showing any non-zero XDP tracepoints... although it 'sees' a bunch of them.

Do I need to use perf in a different way? Or how else can I access the tracepoints?

$ make && time sudo sh -s << 'EOF' 2>&1
> sudo ../testenv/testenv.sh teardown --name veth-advanced03 ; sudo ../testenv/testenv.sh setup --legacy-ip --name veth-advanced03
> timeout --signal=SIGINT 4 perf stat -e 'xdp:*' ./af_xdp_user --dev veth-advanced03 --filename af_xdp_kern.o --skb-mode --force & PIDS=$!
> ip netns exec veth-advanced03 sh -c "sleep 1 ; tcpreplay -i veth0 --topspeed --stats=1 --loop 0 --limit 5000 /home/simon/web-browser-toolbars.trace"
> wait $PIDS ; echo "- af_xdp_user process ended"
> ifconfig veth-advanced03
> EOF
cc -Wall -I../libbpf/src//build/usr/include/ -g -I../headers/ -L../libbpf/src/ -o af_xdp_user ../common//common_params.o ../common//common_user_bpf_xdp.o \
 af_xdp_user.c -l:libbpf.a -lelf  -lpthread
Tearing down environment 'veth-advanced03'
Setting up new environment 'veth-advanced03'
...
Test start: 2020-03-27 13:08:49.078255 ...
AF_XDP RX:          1049 pkts (      1048 pps)         704 Kbytes (     6 Mbits/s) period:1.001089
Test complete: 2020-03-27 13:08:49.087541
Actual: 5000 packets (4730445 bytes) sent in 0.009286 seconds
Rated: 509416864.0 Bps, 4075.33 Mbps, 538444.97 pps
Statistics for network device: veth0
        Successful packets:        5000
        Failed packets:            0
        Truncated packets:         0
        Retried packets (ENOBUFS): 0
        Retried packets (EAGAIN):  0
AF_XDP RX:          4096 pkts (      3028 pps)        3778 Kbytes (    24 Mbits/s) period:1.000779
AF_XDP RX:          4096 pkts (         0 pps)        3778 Kbytes (     0 Mbits/s) period:1.000409
INFO: xdp_link_detach() removed XDP prog ID:367 on ifindex:179

 Performance counter stats for './af_xdp_user --dev veth-advanced03 --filename af_xdp_kern.o --skb-mode --force':

                 0      xdp:xdp_exception                                           
                 0      xdp:xdp_bulk_tx                                             
                 0      xdp:xdp_redirect                                            
                 0      xdp:xdp_redirect_err                                        
                 0      xdp:xdp_redirect_map                                        
                 0      xdp:xdp_redirect_map_err                                    
                 0      xdp:xdp_cpumap_kthread                                      
                 0      xdp:xdp_cpumap_enqueue                                      
                 0      xdp:xdp_devmap_xmit                                         
                 0      xdp:mem_disconnect                                          
                 0      xdp:mem_connect                                             
                 0      xdp:mem_return_failed                                       

       3.917353258 seconds time elapsed

       3.888387000 seconds user
       0.015985000 seconds sys

- af_xdp_user process ended
veth-advanced03: flags=4163<UP,BROADCAST,RUNNING,MULTICAST>  mtu 1500
        inet 10.11.1.1  netmask 255.255.255.0  broadcast 0.0.0.0
        inet6 fc00:dead:cafe:1::1  prefixlen 64  scopeid 0x0<global>
        inet6 fe80::34bb:e4ff:fe85:f61e  prefixlen 64  scopeid 0x20<link>
        ether 36:bb:e4:85:f6:1e  txqueuelen 1000  (Ethernet)
        RX packets 5006  bytes 4731089 (4.7 MB)
        RX errors 0  dropped 0  overruns 0  frame 0
        TX packets 29  bytes 4506 (4.5 KB)
        TX errors 0  dropped 0 overruns 0  carrier 0  collisions 0
real    0m4.612s
simonhf commented 4 years ago

Update: I got it work by modifying the perf command line slightly:

$ make && time sudo sh -s << 'EOF' 2>&1
sudo ../testenv/testenv.sh teardown --name veth-advanced03 ; sudo ../testenv/testenv.sh setup --legacy-ip --name veth-advanced03
timeout --signal=SIGINT 4 perf stat  --all-cpus --event 'xdp:*' ./af_xdp_user --dev veth-advanced03 --filename af_xdp_kern.o --skb-mode --force & PIDS=$!
ip netns exec veth-advanced03 sh -c "sleep 1 ; tcpreplay -i veth0 --topspeed --stats=1 --loop 0 --limit 5000 /home/simon/web-browser-toolbars.trace"
wait $PIDS ; echo "- af_xdp_user process ended"
ifconfig veth-advanced03
EOF
Tearing down environment 'veth-advanced03'
Setting up new environment 'veth-advanced03'
...
AF_XDP RX:             1 pkts (         1 pps)           0 Kbytes (     0 Mbits/s) period:1.001207
Test start: 2020-03-27 14:33:42.110309 ...
Test complete: 2020-03-27 14:33:42.120326
Actual: 5000 packets (4730445 bytes) sent in 0.010017 seconds
Rated: 472241689.1 Bps, 3777.93 Mbps, 499151.44 pps
Statistics for network device: veth0
        Successful packets:        5000
        Failed packets:            0
        Truncated packets:         0
        Retried packets (ENOBUFS): 0
        Retried packets (EAGAIN):  0
AF_XDP RX:          4096 pkts (      4094 pps)        3778 Kbytes (    30 Mbits/s) period:1.000185
AF_XDP RX:          4096 pkts (         0 pps)        3778 Kbytes (     0 Mbits/s) period:1.000344
INFO: xdp_link_detach() removed XDP prog ID:387 on ifindex:189

 Performance counter stats for 'system wide':

                 0      xdp:xdp_exception                                           
                 0      xdp:xdp_bulk_tx                                             
                 0      xdp:xdp_redirect                                            
                 0      xdp:xdp_redirect_err                                        
             4,096      xdp:xdp_redirect_map                                        
               907      xdp:xdp_redirect_map_err                                    
                 0      xdp:xdp_cpumap_kthread                                      
                 0      xdp:xdp_cpumap_enqueue                                      
                 0      xdp:xdp_devmap_xmit                                         
                 0      xdp:mem_disconnect                                          
                 0      xdp:mem_connect                                             
                 0      xdp:mem_return_failed                                       

       3.962917812 seconds time elapsed

- af_xdp_user process ended
veth-advanced03: flags=4163<UP,BROADCAST,RUNNING,MULTICAST>  mtu 1500
        inet 10.11.1.1  netmask 255.255.255.0  broadcast 0.0.0.0
        inet6 fc00:dead:cafe:1::1  prefixlen 64  scopeid 0x0<global>
        inet6 fe80::cc44:cdff:fecf:9619  prefixlen 64  scopeid 0x20<link>
        ether ce:44:cd:cf:96:19  txqueuelen 1000  (Ethernet)
        RX packets 5006  bytes 4731089 (4.7 MB)
        RX errors 0  dropped 0  overruns 0  frame 0
        TX packets 29  bytes 4506 (4.5 KB)
        TX errors 0  dropped 0 overruns 0  carrier 0  collisions 0
real    0m4.673s