pktgen / Pktgen-DPDK

DPDK based packet generator

PCAP replay #285

Closed dariusgrassi closed 1 week ago

dariusgrassi commented 1 month ago

Hello,

I am having issues enabling PCAP replay mode in Pktgen-DPDK. After supplying Pktgen with a PCAP file through the -s option, I can neither enable PCAP mode nor inspect the PCAP, even though Pktgen correctly detects the file.

Environment:

  • Pktgen version: 24.07.1
  • DPDK version: 23.11 LTS
  • OS distribution: Ubuntu 22.04 LTS
  • Arch: x86-64
  • Kernel version: 5.15.0-122-generic
  • NIC: Mellanox ConnectX-4 25 GbE NIC

Here is the pcap I am trying to replay; it contains only 178 packets:

$ file /tmp/rdma_read.pcap
/tmp/rdma_read.pcap: pcap capture file, microsecond ts (little-endian) - version 2.4 (Ethernet, capture length 262144)

To start Pktgen on a single port, I run the following command:

sudo -E ./usr/local/bin/pktgen -l 2,3-4,5-6,18-19,10-11 -n 4 --proc-type auto --log-level 7 --file-prefix pg -b 03:00.0 -b 07:00.0 -b 07:00.1 -- -v -T -P -m [3-4:5-6].0 -s 0:/tmp/rdma_read.pcap -f themes/black-yellow.theme

Here is the full output when starting Pktgen with this command:

*** Copyright(c) <2010-2024>, Intel Corporation. All rights reserved.
*** Pktgen  created by: Keith Wiles -- >>> Powered by <<<

EAL: Detected CPU lcores: 20
EAL: Detected NUMA nodes: 1
EAL: Auto-detected process type: PRIMARY
EAL: Detected shared linkage of DPDK
EAL: Multi-process socket /var/run/dpdk/pg/mp_socket
EAL: Selected IOVA mode 'PA'
EAL: VFIO support initialized
EAL: Probe PCI driver: mlx5_pci (15b3:1015) device: 0000:03:00.1 (socket 0)
TELEMETRY: No legacy callbacks, legacy socket not created
PCAP: MAGIC_NUMBER 0xa1b2c3d4 == 0xa1b2c3d4, Convert: No
PCAP: Max Packet Size: 1216
  Create: 'RX-L3/P0/S0     ' - Memory used (MBUFs 16,384 x size  2,176) =   34,817 KB @ 0x1006cc740
  Create: 'TX-L3/P0/S0     ' - Memory used (MBUFs 16,384 x size  2,176) =   34,817 KB @ 0x1004e8900
  Create: 'SP-L3/P0/S0     ' - Memory used (MBUFs  1,024 x size  2,176) =    2,177 KB @ 0x100206100
                                                      Total memory used =   71,811 KB
>>> Packet Max Burst 128/128, RX Desc 1024, TX Desc 2048, mbufs/port 24576, mbuf cache 128
Initialize Port 0 ...
                                                         Port memory used =  71811 KB
** Device Info (0000:03:00.1, if_index:5, flags 00000075) **
   min_rx_bufsize :   32  max_rx_pktlen     :65536  hash_key_size :   40
   max_rx_queues  : 1024  max_tx_queues     : 1024  max_vfs       :    0
   max_mac_addrs  :  128  max_hash_mac_addrs:    0  max_vmdq_pools:    0
   vmdq_queue_base:    0  vmdq_queue_num    :    0  vmdq_pool_base:    0
   nb_rx_queues   :    2  nb_tx_queues      :    2  speed_capa    : 00000520

   flow_type_rss_offloads:f00000000803afbc  reta_size             :  512
   rx_offload_capa       :VLAN_STRIP IPV4_CKSUM UDP_CKSUM TCP_CKSUM VLAN_FILTER SCATTER TIMESTAMP KEEP_CRC RSS_HASH BUFFER_SPLIT
   tx_offload_capa       :VLAN_INSERT IPV4_CKSUM UDP_CKSUM TCP_CKSUM TCP_TSO OUTER_IPV4_CKSUM VXLAN_TNL_TSO GRE_TNL_TSO GENEVE_TNL_TSO MULTI_SEGS MBUF_FAST_FREE UDP_TNL_TSO IP_TNL_TSO
   rx_queue_offload_capa :000000000019600f  tx_queue_offload_capa :0000000000000000
   dev_capa              :0000000000000010

  RX Conf:
     pthresh        :    0 hthresh          :    0 wthresh        :    0
     Free Thresh    :    0 Drop Enable      :    0 Deferred Start :    0
     offloads       :0000000000000000
  TX Conf:
     pthresh        :    0 hthresh          :    0 wthresh        :    0
     Free Thresh    :    0 RS Thresh        :    0 Deferred Start :    0
     offloads       :0000000000000000
  Rx: descriptor Limits
     nb_max         :65535  nb_min          :    0  nb_align      :    1
     nb_seg_max     :65535  nb_mtu_seg_max  :65535
  Tx: descriptor Limits
     nb_max         :65535  nb_min          :    0  nb_align      :    1
     nb_seg_max     :   40  nb_mtu_seg_max  :   40
  Rx: Port Config
     burst_size     :   64  ring_size       :  256  nb_queues     :    8
  Tx: Port Config
     burst_size     :   64  ring_size       :  256  nb_queues     :    8
  Switch Info: 0000:03:00.1
     domain_id      :    0  port_id         :65535

Port DevName          Index NUMA PCI Information   Src MAC           Promiscuous
  0  mlx5_pci             5    0 15b3:1015/0000:03:00.1 9c:dc:71:4b:83:71 <Enabled>

=== Display processing on lcore 2
RX lid   3, pid  0, qid  0, Mempool RX-L3/P0/S0      @ 0x1006cc740
RX lid   4, pid  0, qid  1, Mempool RX-L3/P0/S0      @ 0x1006cc740
TX lid   5, pid  0, qid  0, Mempool TX-L3/P0/S0      @ 0x1004e8900
TX lid   6, pid  0, qid  1, Mempool TX-L3/P0/S0      @ 0x1004e8900
*** Logical core  10 has no work, skipping launch
*** Logical core  11 has no work, skipping launch
- <Main Page> Ports 0-0 of 1  Copyright(c) <2010-2024>, Intel Corporation
Port:Flags          :   0:P------        Unkn
Link State          :           <UP-25000-FD>        ---Total Rate---
Pkts/s Rx           :                       0                       0
       Tx           :                       0                       0
MBits/s Rx/Tx       :                     0/0                     0/0
Total Rx Pkts       :                       0                       0
      Tx Pkts       :                       0                       0
      Rx/Tx MBs     :                     0/0
Pkts/s Rx Max       :                       0
       Tx Max       :                       0
Errors Rx/Tx        :                     0/0
Broadcast           :                       0
Multicast           :                       0
Sizes 64            :                       0
      65-127        :                       0
      128-255       :                       0
      256-511       :                       0
      512-1023      :                       0
      1024-1518     :                       0
Runts/Jumbos        :                     0/0
ARP/ICMP Pkts       :                     0/0
Tx Count/% Rate     :           Forever /100%
Pkt Size/Rx:Tx Burst:             64 / 64: 32
Port Src/Dest       :              1234/ 5678
Pkt Type:VLAN ID    :         IPv4 / TCP:0001
IP  Destination     :             192.168.1.1
    Source          :          192.168.0.1/24
MAC Destination     :       9c:dc:71:4b:83:71
    Source          :       9c:dc:71:4b:83:71
NUMA/Vend:ID/PCI    :0/15b3:1015/0000:03:00.1
-- Pktgen 24.07.1  Powered by DPDK 23.11.1 (pid:20899) ------------------------

The output verifies that the PCAP path was successfully detected and parsed:

PCAP: MAGIC_NUMBER 0xa1b2c3d4 == 0xa1b2c3d4, Convert: No
PCAP: Max Packet Size: 1216
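
For readers unfamiliar with the MAGIC_NUMBER line: a classic pcap file starts with a 32-bit magic value, 0xa1b2c3d4, which reads as 0xd4c3b2a1 when the file was written with the opposite byte order. The sketch below shows that check in isolation. It is not Pktgen's code; the names pcap_needs_convert and swap32 are hypothetical, and "Convert: No" is presumably the log's way of saying no byte swap is required.

```c
#include <stdint.h>

#define PCAP_MAGIC_NATIVE  0xa1b2c3d4u /* microsecond pcap, same byte order as the reader */
#define PCAP_MAGIC_SWAPPED 0xd4c3b2a1u /* same format, written with the opposite byte order */

/* Hypothetical helper: reverse the byte order of a 32-bit value. */
static uint32_t swap32(uint32_t v)
{
    return ((v & 0x000000ffu) << 24) | ((v & 0x0000ff00u) << 8) |
           ((v >> 8) & 0x0000ff00u) | (v >> 24);
}

/*
 * Hypothetical helper: classify the magic value read from a pcap header.
 * Returns 0 when no conversion is needed, 1 when header fields must be
 * byte-swapped, and -1 when the value is not a microsecond pcap magic.
 */
static int pcap_needs_convert(uint32_t magic)
{
    if (magic == PCAP_MAGIC_NATIVE)
        return 0;
    if (magic == PCAP_MAGIC_SWAPPED)
        return 1;
    return -1;
}
```

A result of 0 here corresponds to the case in the log above, where the magic value matches as read.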

However, inside Pktgen, the PCAP file is neither loaded nor detected:

Pktgen:/> enable 0 pcap
Pktgen:/> pcap show
!ERROR!:  ** PCAP file is not loaded on port 0

After seeing this output, I inspected the source. The error appears to be raised in app/cli-functions.c when l2p_get_pcap(pinfo->pid) returns NULL. However, it is unclear to me how that can happen unless the pcap_info_t struct is not being set properly.

Thus, I am unsure whether this behavior is due to a mistake on my end, intended behavior, a bug, or whether support has been deprecated. I would appreciate assistance with this as soon as possible. Thank you!

KeithWiles commented 1 month ago

It will be difficult for me to fully investigate this problem right now because my workload is too high, but I have had a number of problems reported with these types of NICs. Try typing page pcap; it may show some more information.

If you can try an Intel NIC and see whether it works, that may help narrow down the problem.

dariusgrassi commented 1 month ago

The PCAP page is also empty:

/ <PCAP Page> Ports 0-0 of 1  Copyright(c) <2010-2024>, Intel Corporation

Pktgen:/>

Thanks for the quick response. I'll post again to this thread if I can reproduce/fix the problem on Intel NICs.

pchaseh commented 1 month ago

I'm also able to reproduce this on a ConnectX-5. Unfortunately, I have no access to any other type of NIC at this time to rule out Mellanox as the problem. In my case, l2p_set_pcap_info in app/pktgen-main.c silently fails.

EDIT: If I duplicate the call to l2p_parse_mapping before l2p_set_pcap_info pktgen recognizes that there's a PCAP file loaded on that port, so it seems the problem is that the port-to-PID mapping never has the chance to be initialized (see https://github.com/pktgen/Pktgen-DPDK/blob/main/app/l2p.c#L212). I believe that 4ddef5abb9c2531580a4421bbd21489b503c2ec6 thus broke the PCAP replay feature.
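
The ordering problem described above can be illustrated with a small standalone sketch. It is not Pktgen's implementation: the demo_ functions are hypothetical stand-ins for l2p_parse_mapping, l2p_set_pcap_info, and l2p_get_pcap, and the real signatures and data structures may differ. The only point is that attaching PCAP info to a port that has not been mapped yet fails, so a later lookup returns NULL.

```c
#include <assert.h>
#include <stddef.h>

#define DEMO_MAX_PORTS 4

/* Hypothetical, simplified stand-in for pcap_info_t. */
typedef struct {
    const char *filename;
} demo_pcap_t;

typedef struct {
    int          mapped; /* set once the -m mapping for this port is parsed */
    demo_pcap_t *pcap;   /* NULL until PCAP info is attached */
} demo_port_t;

static demo_port_t demo_ports[DEMO_MAX_PORTS];

/* Stand-in for parsing a -m mapping: marks the port as known. */
static void demo_parse_mapping(unsigned pid)
{
    if (pid < DEMO_MAX_PORTS)
        demo_ports[pid].mapped = 1;
}

/* Stand-in for attaching PCAP info: fails if the port is not mapped yet. */
static int demo_set_pcap(unsigned pid, demo_pcap_t *pcap)
{
    assert(pcap != NULL);
    if (pid >= DEMO_MAX_PORTS || !demo_ports[pid].mapped)
        return -1; /* a caller that ignores this sees a "silent" failure */
    demo_ports[pid].pcap = pcap;
    return 0;
}

/* Stand-in for the lookup used by the CLI: NULL means "not loaded". */
static demo_pcap_t *demo_get_pcap(unsigned pid)
{
    return (pid < DEMO_MAX_PORTS) ? demo_ports[pid].pcap : NULL;
}
```

In this sketch, calling demo_set_pcap(0, &info) before demo_parse_mapping(0) returns -1 and leaves demo_get_pcap(0) at NULL, which matches the symptom reported by the CLI; reversing the two calls makes the lookup succeed.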

Note that even with this change, after running start all, the Pktgen stats report 0 packets transmitted, and I get a crash shortly thereafter.

pchaseh commented 1 month ago

@dariusgrassi I've confirmed that 24.03.1 doesn't have this problem, so I would advise using a release no later than that if PCAP replay functionality is an immediate must.

dariusgrassi commented 1 month ago

@pchaseh You've been immensely helpful. I've confirmed that PCAP replays are still functional for MLX NICs in v24.03.1.

KeithWiles commented 1 month ago

I worked on the PCAP crash today and the fix appears to work. Please give the branch fix-pcap-crash a try and let me know.

dariusgrassi commented 1 month ago

Thanks for taking a look. When I try to load the pcap on my end, I'm seeing a new error. Here are my logs:

Entering Pktgen-DPDK...
>>> sdk '/users/dwg/dpdk-stable-23.11.1', target 'x86_64-native-linux-gcc'
<module 'cfg' from 'cfg/default.cfg'>
Setup DPDK to run 'pktgen' application from cfg/default.cfg file
Notice: 0000:41:00.0 already bound to driver mlx5_core, skipping
>>> sdk '/users/dwg/dpdk-stable-23.11.1', target 'x86_64-native-linux-gcc'
<module 'cfg' from 'cfg/default.cfg'>
   Trying ./usr/local/bin/pktgen
sudo -E LD_LIBRARY_PATH=.:/users/dwg/dpdk-stable-23.11.1/x86_64-native-linux-gcc/lib/x86_64-linux-gnu ./usr/local/bin/pktgen -l 1,2-4,5-7,8-10,11-13 -n 4 --proc-type auto --log-level 7 --file-prefix pg -a 41:00.0 -- -v -T -P -G -m [5:6-7].0 -s 0:/tmp/mawilab.pcap -f themes/black-yellow.theme

*** Copyright(c) <2010-2024>, Intel Corporation. All rights reserved.
*** Pktgen  created by: Keith Wiles -- >>> Powered by <<<

EAL: Detected CPU lcores: 32
EAL: Detected NUMA nodes: 1
EAL: Auto-detected process type: PRIMARY
EAL: Detected shared linkage of DPDK
EAL: Multi-process socket /var/run/dpdk/pg/mp_socket
EAL: Selected IOVA mode 'VA'
EAL: VFIO support initialized
EAL: Probe PCI driver: mlx5_pci (15b3:1017) device: 0000:41:00.0 (socket -1)
TELEMETRY: No legacy callbacks, legacy socket not created
EAL: Error - exiting with code: 1
  Cause: pktgen_pcap_add: rte_zmalloc_socket() failed for pcap_info_t structure

My pcap file is only 9 KB, so I don't believe it's an issue with the file size. The error output seems to be triggered here.

KeithWiles commented 1 month ago

This looks like the pcap crash on a system that has only a single NUMA region, where rte_eth_dev_socket_id(pid) returns -1. I have been working on a patch that puts a wrapper function around rte_zmalloc_socket_id() to make sure -1 is not returned.
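
A minimal sketch of the kind of wrapper described above, assuming the intent is to turn an "unknown socket" answer into a concrete socket before allocating. The name demo_safe_socket_id and the fallback to socket 0 are assumptions for illustration, not the actual patch; the sketch takes the reported socket ID as a plain int so it compiles without DPDK.

```c
/* DPDK defines SOCKET_ID_ANY as -1; mirrored here so the sketch needs no DPDK headers. */
#define DEMO_SOCKET_ID_ANY (-1)

/*
 * Hypothetical wrapper: map a negative ("any/unknown") socket ID to socket 0,
 * and pass valid socket IDs through unchanged.
 *
 * With DPDK available, the intended use would look roughly like:
 *   int sid = demo_safe_socket_id(rte_eth_dev_socket_id(pid));
 *   void *p = rte_zmalloc_socket("pcap_info", size, 0, sid);
 */
static int demo_safe_socket_id(int reported)
{
    return (reported <= DEMO_SOCKET_ID_ANY) ? 0 : reported;
}
```

Falling back to socket 0 is a reasonable choice on a single-NUMA machine such as the one in the log, where the device is reported on socket -1; on multi-socket systems a different fallback may be more appropriate.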

Please give branch fixes-for-release a try.

KeithWiles commented 1 week ago

Please try the new Pktgen release 24.10.0 with the latest DPDK 24.11.0-rc1, as DPDK changed and caused compile problems.