bear110 closed this issue 6 months ago
What version of Pktgen and DPDK are you using? What NIC are you using? What application are you targeting the traffic at? How can I set up an environment to test this problem?
I set up two Pktgen instances, one on each of two machines, with i40e NICs connected back to back.
I can set the TX count on one Pktgen to 100000 packets @ 64 bytes and do a start 0 multiple times. All of the packets arrive on the other machine.
Hi Keith, thanks for your response. I am testing on a Netronome CX 2x25Gb, which is connected to a Mellanox CX-6 on the other machine. Pktgen runs on the Mellanox CX-6 side, and testpmd runs on the Netronome side in macswap loopback mode. The DPDK version is 21.11, with the corresponding Pktgen version. We also tried several of the latest DPDK/Pktgen versions; the problem does not seem to be related to the version. Set the TX count on Pktgen to more than 400000 @ 64 bytes and do start 0. We did some analysis on this: it might be caused by the burst-handling capability of the RX side (Netronome) -- that NIC's performance is not as good as Intel's or Mellanox's -- but the packet loss only happens at Pktgen startup; after running for a while (less than 5s), there is no loss. We also tried with XENA/Sprint, and there is no such problem when using a hardware tester.
I cannot explain why the Netronome is behaving this way, as it appears from your statement above that the RX side of the Netronome is not working correctly. The machine failing to receive the packets is the testpmd/Netronome machine and not the machine running Pktgen, which means I do not believe it is a Pktgen problem.
As for why the RX side seems to start working after some period of time, I cannot help you, as I have never used that NIC. It appears to be an anecdotal data point with nothing to do with Pktgen, but I am happy to be proven wrong.
I would suggest you add another NIC, such as an Intel NIC, to the testpmd machine and test to see if it fails in the same way. If it does fail in the same way, then change testpmd to Pktgen on both machines, and if that still fails we can start investigating the problem.
Remember that Pktgen is nothing special when sending or receiving packets, as it uses DPDK for all RX/TX packets. If Pktgen works on the two machines with an Intel NIC, then it is most likely the Netronome NIC or the DPDK PMD for that NIC.
In fact, I have done some further tests which confirm the burst of traffic at Pktgen startup:
(1) I added a switch between the send and receive sides. The switch can show the traffic throughput, and it shows a sudden burst of TX traffic at startup which then drops down to a steady value.
(2) I tried adjusting the burst parameter. The default is 64, and when it is lowered to a specific value, no discarding happens (see the back-of-the-envelope sketch after this reply).
(3) I reset the Pktgen statistics 10s after startup, and the discarding does not happen after that.
Yes, I think there is indeed some problem with testpmd/Netronome; its performance might be lower than other cards', and it cannot handle the burst traffic well. Maybe this is not a big problem, but I am just curious about the startup burst (which also makes the real TX rate higher than what I set). Why can Pktgen not send as evenly as a hardware tester at startup time? Is there any special design behind this? Anyway, thanks for your patience and kind response.
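A back-of-the-envelope note on point (2), purely for illustration and not anything Pktgen itself computes: assuming the 25GbE link described above and the standard 20 bytes of per-frame Ethernet overhead (preamble, start delimiter and inter-frame gap), the small program below estimates how long one default 64-packet burst of 64-byte frames occupies the wire back to back. The program and its variable names are made up for this example.

/* Hypothetical estimate: wire time of one full TX burst of minimum-size
 * frames on a 25GbE link. Within a burst the packets leave at line rate,
 * regardless of the average rate configured in the generator. */
#include <stdio.h>

int main(void)
{
    const double link_bps  = 25e9; /* 25GbE link from the setup above */
    const int    frame_len = 64;   /* packet size set in pktgen */
    const int    overhead  = 20;   /* preamble + start delimiter + IFG, bytes */
    const int    burst     = 64;   /* pktgen's default burst size */

    double bits_per_frame = (frame_len + overhead) * 8.0;        /* 672 bits */
    double burst_wire_us  = burst * bits_per_frame / link_bps * 1e6;

    printf("one %d-packet burst occupies the wire for about %.2f us\n",
           burst, burst_wire_us);                                 /* ~1.72 us */
    return 0;
}

Lowering the burst value shortens these back-to-back runs, which is consistent with the observation in (2) that the RX NIC stops discarding once the burst is small enough.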
What was the burst value you set Pktgen to to make number (2) above work? I did not follow number (3): does it mean you started Pktgen, then waited 10 seconds, reset the statistics, and it worked fine?
Pktgen uses the CPU timestamp counter (rdtsc) instruction to get the current clock ticks in CPU cycles. This means Pktgen will never be as good as a hardware traffic generator, as software is managing the traffic.
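As a rough illustration only -- this is a generic sketch of how TSC-based pacing works in a software generator, not Pktgen's actual code, and the function paced_tx_loop and its parameters are invented for the example -- the idea is to compute a cycle budget per burst from the requested rate and busy-wait on the TSC before each transmit:

#include <rte_cycles.h>   /* rte_rdtsc(), rte_get_timer_hz() */
#include <rte_pause.h>    /* rte_pause() */
#include <rte_mbuf.h>
#include <rte_ethdev.h>   /* rte_eth_tx_burst() */

/* Generic TSC-paced transmit loop (illustrative sketch, not Pktgen code). */
static void
paced_tx_loop(uint16_t port, uint16_t queue, struct rte_mbuf **pkts,
              uint16_t burst, uint64_t pkts_per_sec)
{
    uint64_t hz = rte_get_timer_hz();                 /* TSC ticks per second */
    uint64_t cycles_per_burst = (hz * burst) / pkts_per_sec;
    uint64_t next = rte_rdtsc() + cycles_per_burst;

    for (;;) {
        while (rte_rdtsc() < next)                    /* busy-wait to the deadline */
            rte_pause();
        next += cycles_per_burst;

        /* A real sender must check how many mbufs were actually queued and
         * allocate/refcount new ones for each burst; omitted for brevity. */
        rte_eth_tx_burst(port, queue, pkts, burst);
    }
}

One property of this kind of deadline-accumulation pacing is that if the loop falls behind its deadlines, it sends several bursts back to back until it catches up, which looks like a short spike; whether anything like that is what is happening here is only a guess.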
With that stated, Pktgen should not send a burst of traffic at the start and then drop down in speed. I will look out for this issue and see if I can spot the problem.
Can you attach or post a screenshot of Pktgen just after startup? If you can copy/paste the text of the main screen, that can work too.
What was the burst value you set Pktgen to to make number (2) above work?
The value differs according to packet size; for example, on my server, at 64 bytes, the burst value needs to be lower than 3.5.
I did not follow number (3): does it mean you started Pktgen, then waited 10 seconds, reset the statistics, and it worked fine?
Yes, correct.
~/dpdk-testpmd -l 0,63 -n 4 -a 0000:84:00.0 --socket-mem 0,2048 --proc-type auto -- --portmask 0x1 --rss-ip --nb-cores=1 --rxq=1 --txq=1 --rxd=1024 --txd=1024 --burst=64 --forward-mode=macswap -i
./build/app/pktgen -l 0-63 -n 4 --proc-type auto --socket-mem 2048 -- -P -m "[62:63].[0:0]"
Pktgen settings:
set 0 src mac 02:00:00:00:00:00
set 0 dst mac 88:3C:C5:A0:01:78
set 0 count 4000000
set 0 rate 5
set 0 size 64
testpmd> show port stats 0
######################## NIC statistics for port 0 ########################
RX-packets: 4000000    RX-missed: 0    RX-bytes: 240000000
RX-errors: 0
RX-nombuf: 0
TX-packets: 3998821    TX-errors: 0    TX-bytes: 239929260

Throughput (since last show)
Rx-pps: 0    Rx-bps: 0
Tx-pps: 0    Tx-bps: 0
############################################################################
The Pktgen command line has too many cores; it may not change the problem, but use the following.
./build/app/pktgen -l 0,62-63 -n 4 --proc-type auto --socket-mem 2048 -- -P -m "[62:63].0"
The -l 0-63 option allocated all 64 cores to Pktgen when you only needed 3 cores.
Regarding the statement above:
The value differs according to packet size; for example, on my server, at 64 bytes, the burst value needs to be lower than 3.5.
Is the 3.5 a typo? You cannot have 3.5 for the burst size; you can have 3 or 5.
Can you please send me a screenshot of the Pktgen screen just after you start Pktgen, and then another taken 10 seconds later, after the reset? A copy/paste of the screen should work fine.
The 3.5 is not a typo; I can set decimal values.
Can you please send me a screenshot of the Pktgen screen just after you start Pktgen, and then another taken 10 seconds later, after the reset? A copy/paste of the screen should work fine.
Before reset statistics: (screenshot attached)
After reset statistics: (screenshot attached)
There is an issue when testing a NIC with Pktgen: it always discards about 400000 packets during startup, and after that the traffic is handled well. This is related to the performance of the NIC under test, but I think it is also related to the mechanism of Pktgen. After I set a lower burst parameter, the discarding can be suppressed, but too low a burst value has other side effects. Because of this startup burst, it is hard to get a throughput result when testing such a NIC; we must set a very high packet count (more than 400,000,000, at which point the roughly 400,000 startup drops amount to only about 0.1% of the total) to get a more accurate result. Is it possible to fix this and make Pktgen send traffic more evenly at startup?