Open mdr78 opened 5 years ago
You are probably running into a hardware limit here; I don't have an X710 but my best guess for its architecture is that it's just an XL710 configured as 4x10 on one of the 40G ports or both ports in 2x10 mode.
Based on this I'd guess you should be able to achieve around 30-40 Mpps in total. Can you try a few different configurations?
For two ports: is there a difference between using ports (0 and 1) and (0 and 3)?
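For reference, the per-port ceiling can be worked out from Ethernet framing. This is a minimal sketch, assuming 64-byte frames (including CRC) and the standard 20 bytes of preamble, start-of-frame delimiter and inter-frame gap per packet. The function name is made up for illustration.

```python
# Ethernet adds 20 bytes per frame on the wire:
# 7 bytes preamble + 1 byte start-of-frame delimiter + 12 bytes inter-frame gap.
FRAMING_OVERHEAD_BYTES = 20


def line_rate_mpps(link_gbps: float, frame_bytes: int = 64) -> float:
    """Theoretical maximum packet rate in Mpps for one link.

    frame_bytes is the frame size including the CRC (assumed 64 here).
    """
    bits_on_wire = (frame_bytes + FRAMING_OVERHEAD_BYTES) * 8
    return link_gbps * 1e9 / bits_on_wire / 1e6


per_port = line_rate_mpps(10)  # one 10GbE port
print(f"per port: {per_port:.2f} Mpps")
print(f"4 ports:  {4 * per_port:.2f} Mpps")
```

Under these assumptions one 10GbE port tops out at about 14.88 Mpps, so four ports at line rate would need roughly 59.5 Mpps from the card in total.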
2 ports, 1 core & 1 queue each
[root@silpixa00396680 MoonGen]# ./build/MoonGen examples/pktgen.lua -t 1 -s 10 --dpdk-config=/ro[33/1865$
XL710/dpdk-conf.lua 0 1
[INFO] Initializing DPDK. This will take a few seconds...
EAL: Detected 112 lcore(s)
EAL: Probing VFIO support...
EAL: PCI device 0000:d8:00.0 on NUMA socket 1
EAL: probe driver: 8086:1572 net_i40e
EAL: PCI device 0000:d8:00.1 on NUMA socket 1
EAL: probe driver: 8086:1572 net_i40e
EAL: PCI device 0000:d8:00.2 on NUMA socket 1
EAL: probe driver: 8086:1572 net_i40e
EAL: PCI device 0000:d8:00.3 on NUMA socket 1
EAL: probe driver: 8086:1572 net_i40e
[INFO] Found 4 usable devices:
Device 0: 3C:FD:FE:9D:88:F8 (Intel Corporation Ethernet Controller X710 for 10GbE SFP+)
Device 1: 3C:FD:FE:9D:88:F9 (Intel Corporation Ethernet Controller X710 for 10GbE SFP+)
Device 2: 3C:FD:FE:9D:88:FA (Intel Corporation Ethernet Controller X710 for 10GbE SFP+)
Device 3: 3C:FD:FE:9D:88:FB (Intel Corporation Ethernet Controller X710 for 10GbE SFP+)
[INFO] Check out MoonGen (built on lm) if you are looking for a fully featured packet generator
[INFO] https://github.com/emmericp/MoonGen
[INFO] Waiting for devices to come up...
[INFO] Device 1 (3C:FD:FE:9D:88:F9) is up: 10000 MBit/s
[INFO] Device 0 (3C:FD:FE:9D:88:F8) is up: 10000 MBit/s
[INFO] 2 devices are up.
[INFO] Starting Thread 1 on [TxQueue: id=0, qid=0] sending to peer 3c:fd:fe:9d:68:b8
[INFO] Starting Thread 2 on [TxQueue: id=1, qid=0] sending to peer 3c:fd:fe:9d:68:b9
...
[Device: id=0] RX: 0.00 Mpps, 0 Mbit/s (0 Mbit/s with framing)
[Device: id=1] RX: 0.00 Mpps, 0 Mbit/s (0 Mbit/s with framing)
[Device: id=0] TX: 6.58 Mpps, 3369 Mbit/s (4422 Mbit/s with framing)
[Device: id=1] TX: 6.60 Mpps, 3378 Mbit/s (4433 Mbit/s with framing)
[Device: id=0] RX: 0.00 (StdDev 0.00) Mpps, 0 (StdDev 0) Mbit/s (0 Mbit/s with framing), total 0 packets with 0 bytes (incl. CRC)
[Device: id=1] RX: 0.00 (StdDev 0.00) Mpps, 0 (StdDev 0) Mbit/s (0 Mbit/s with framing), total 0 packets with 0 bytes (incl. CRC)
[Device: id=0] TX: 7.36 (StdDev 0.31) Mpps, 3770 (StdDev 159) Mbit/s (4948 Mbit/s with framing), total 73917774 packets with 4730737536 bytes (incl. CRC)
[Device: id=1] TX: 7.39 (StdDev 0.31) Mpps, 3783 (StdDev 160) Mbit/s (4965 Mbit/s with framing), total 74096568 packets with 4742180352 bytes (incl. CRC)
2 ports, 2 cores and 2 queues each
[root@silpixa00396680 MoonGen]# ./build/MoonGen examples/pktgen.lua -t 2 -s 10 --dpdk-config=/ro[51/1807]
XL710/dpdk-conf.lua 0 1
[INFO] Initializing DPDK. This will take a few seconds...
EAL: Detected 112 lcore(s)
EAL: Probing VFIO support...
EAL: PCI device 0000:d8:00.0 on NUMA socket 1
EAL: probe driver: 8086:1572 net_i40e
EAL: PCI device 0000:d8:00.1 on NUMA socket 1
EAL: probe driver: 8086:1572 net_i40e
EAL: PCI device 0000:d8:00.2 on NUMA socket 1
EAL: probe driver: 8086:1572 net_i40e
EAL: PCI device 0000:d8:00.3 on NUMA socket 1
EAL: probe driver: 8086:1572 net_i40e
[INFO] Found 4 usable devices:
Device 0: 3C:FD:FE:9D:88:F8 (Intel Corporation Ethernet Controller X710 for 10GbE SFP+)
Device 1: 3C:FD:FE:9D:88:F9 (Intel Corporation Ethernet Controller X710 for 10GbE SFP+)
Device 2: 3C:FD:FE:9D:88:FA (Intel Corporation Ethernet Controller X710 for 10GbE SFP+)
Device 3: 3C:FD:FE:9D:88:FB (Intel Corporation Ethernet Controller X710 for 10GbE SFP+)
[INFO] Check out MoonGen (built on lm) if you are looking for a fully featured packet generator
[INFO] https://github.com/emmericp/MoonGen
[INFO] Waiting for devices to come up...
[INFO] Device 1 (3C:FD:FE:9D:88:F9) is up: 10000 MBit/s
[INFO] Device 0 (3C:FD:FE:9D:88:F8) is up: 10000 MBit/s
[INFO] 2 devices are up.
[INFO] Starting Thread 1 on [TxQueue: id=0, qid=0] sending to peer 3c:fd:fe:9d:68:b8
[INFO] Starting Thread 2 on [TxQueue: id=0, qid=1] sending to peer 3c:fd:fe:9d:68:b8
[INFO] Starting Thread 3 on [TxQueue: id=1, qid=0] sending to peer 3c:fd:fe:9d:68:b9
[INFO] Starting Thread 4 on [TxQueue: id=1, qid=1] sending to peer 3c:fd:fe:9d:68:b9
...
[Device: id=0] RX: 0.00 Mpps, 0 Mbit/s (0 Mbit/s with framing)
[Device: id=1] RX: 0.00 Mpps, 0 Mbit/s (0 Mbit/s with framing)
[Device: id=0] TX: 5.00 Mpps, 2558 Mbit/s (3357 Mbit/s with framing)
[Device: id=1] TX: 5.02 Mpps, 2570 Mbit/s (3373 Mbit/s with framing)
[Device: id=0] RX: 0.00 (StdDev 0.00) Mpps, 0 (StdDev 0) Mbit/s (0 Mbit/s with framing), total 0 packets with 0 bytes (incl. CRC)
[Device: id=1] RX: 0.00 (StdDev 0.00) Mpps, 0 (StdDev 0) Mbit/s (0 Mbit/s with framing), total 0 packets with 0 bytes (incl. CRC)
[Device: id=0] TX: 5.65 (StdDev 0.25) Mpps, 2892 (StdDev 126) Mbit/s (3796 Mbit/s with framing), total 56593404 packets with 3621977856 bytes (incl. CRC)
[Device: id=1] TX: 5.67 (StdDev 0.24) Mpps, 2904 (StdDev 125) Mbit/s (3811 Mbit/s with framing), total 56993454 packets with 3647581056 bytes (incl. CRC)
2 ports, 4 cores and 4 queues each
[root@silpixa00396680 MoonGen]# ./build/MoonGen examples/pktgen.lua -t 4 -s 10 --dpdk-config=/ro[55/1963$
XL710/dpdk-conf.lua 0 1
[INFO] Initializing DPDK. This will take a few seconds...
EAL: Detected 112 lcore(s)
EAL: Probing VFIO support...
EAL: PCI device 0000:d8:00.0 on NUMA socket 1
EAL: probe driver: 8086:1572 net_i40e
EAL: PCI device 0000:d8:00.1 on NUMA socket 1
EAL: probe driver: 8086:1572 net_i40e
EAL: PCI device 0000:d8:00.2 on NUMA socket 1
EAL: probe driver: 8086:1572 net_i40e
EAL: PCI device 0000:d8:00.3 on NUMA socket 1
EAL: probe driver: 8086:1572 net_i40e
[INFO] Found 4 usable devices:
Device 0: 3C:FD:FE:9D:88:F8 (Intel Corporation Ethernet Controller X710 for 10GbE SFP+)
Device 1: 3C:FD:FE:9D:88:F9 (Intel Corporation Ethernet Controller X710 for 10GbE SFP+)
Device 2: 3C:FD:FE:9D:88:FA (Intel Corporation Ethernet Controller X710 for 10GbE SFP+)
Device 3: 3C:FD:FE:9D:88:FB (Intel Corporation Ethernet Controller X710 for 10GbE SFP+)
[INFO] Check out MoonGen (built on lm) if you are looking for a fully featured packet generator
[INFO] https://github.com/emmericp/MoonGen
[INFO] Waiting for devices to come up...
[INFO] Device 1 (3C:FD:FE:9D:88:F9) is up: 10000 MBit/s
[INFO] Device 0 (3C:FD:FE:9D:88:F8) is up: 10000 MBit/s
[INFO] 2 devices are up.
[INFO] Starting Thread 1 on [TxQueue: id=0, qid=0] sending to peer 3c:fd:fe:9d:68:b8
[INFO] Starting Thread 2 on [TxQueue: id=0, qid=1] sending to peer 3c:fd:fe:9d:68:b8
[INFO] Starting Thread 3 on [TxQueue: id=0, qid=2] sending to peer 3c:fd:fe:9d:68:b8
[INFO] Starting Thread 4 on [TxQueue: id=0, qid=3] sending to peer 3c:fd:fe:9d:68:b8
[INFO] Starting Thread 5 on [TxQueue: id=1, qid=0] sending to peer 3c:fd:fe:9d:68:b9
[INFO] Starting Thread 6 on [TxQueue: id=1, qid=1] sending to peer 3c:fd:fe:9d:68:b9
[INFO] Starting Thread 7 on [TxQueue: id=1, qid=2] sending to peer 3c:fd:fe:9d:68:b9
[INFO] Starting Thread 8 on [TxQueue: id=1, qid=3] sending to peer 3c:fd:fe:9d:68:b9
...
[Device: id=0] RX: 0.00 Mpps, 0 Mbit/s (0 Mbit/s with framing)
[Device: id=1] RX: 0.00 Mpps, 0 Mbit/s (0 Mbit/s with framing)
[Device: id=0] TX: 4.72 Mpps, 2419 Mbit/s (3175 Mbit/s with framing)
[Device: id=1] TX: 4.73 Mpps, 2420 Mbit/s (3176 Mbit/s with framing)
[Device: id=0] RX: 0.00 (StdDev 0.00) Mpps, 0 (StdDev 0) Mbit/s (0 Mbit/s with framing), total 0 packets with 0 bytes (incl. CRC)
[Device: id=1] RX: 0.00 (StdDev 0.00) Mpps, 0 (StdDev 0) Mbit/s (0 Mbit/s with framing), total 0 packets with 0 bytes (incl. CRC)
[Device: id=0] TX: 5.51 (StdDev 0.29) Mpps, 2821 (StdDev 151) Mbit/s (3703 Mbit/s with framing), total 55276074 packets with 3537668736 bytes (incl. CRC)
[Device: id=1] TX: 5.51 (StdDev 0.29) Mpps, 2822 (StdDev 151) Mbit/s (3704 Mbit/s with framing), total 55352682 packets with 3542571648 bytes (incl. CRC)
4 ports, 1 core and 1 queue each
[root@silpixa00396680 MoonGen]# ./build/MoonGen examples/pktgen.lua -t 1 -s 10 --dpdk-config=/ro[77/1865$
XL710/dpdk-conf.lua 0 1 2 3
[INFO] Initializing DPDK. This will take a few seconds...
EAL: Detected 112 lcore(s)
EAL: Probing VFIO support...
EAL: PCI device 0000:d8:00.0 on NUMA socket 1
EAL: probe driver: 8086:1572 net_i40e
EAL: PCI device 0000:d8:00.1 on NUMA socket 1
EAL: probe driver: 8086:1572 net_i40e
EAL: PCI device 0000:d8:00.2 on NUMA socket 1
EAL: probe driver: 8086:1572 net_i40e
EAL: PCI device 0000:d8:00.3 on NUMA socket 1
EAL: probe driver: 8086:1572 net_i40e
[INFO] Found 4 usable devices:
Device 0: 3C:FD:FE:9D:88:F8 (Intel Corporation Ethernet Controller X710 for 10GbE SFP+)
Device 1: 3C:FD:FE:9D:88:F9 (Intel Corporation Ethernet Controller X710 for 10GbE SFP+)
Device 2: 3C:FD:FE:9D:88:FA (Intel Corporation Ethernet Controller X710 for 10GbE SFP+)
Device 3: 3C:FD:FE:9D:88:FB (Intel Corporation Ethernet Controller X710 for 10GbE SFP+)
[INFO] Check out MoonGen (built on lm) if you are looking for a fully featured packet generator
[INFO] https://github.com/emmericp/MoonGen
[INFO] Waiting for devices to come up...
[INFO] Device 3 (3C:FD:FE:9D:88:FB) is up: 10000 MBit/s
[INFO] Device 2 (3C:FD:FE:9D:88:FA) is up: 10000 MBit/s
[INFO] Device 1 (3C:FD:FE:9D:88:F9) is up: 10000 MBit/s
[INFO] Device 0 (3C:FD:FE:9D:88:F8) is up: 10000 MBit/s
[INFO] 4 devices are up.
...
[Device: id=0] TX: 2.46 Mpps, 1259 Mbit/s (1652 Mbit/s with framing)
[Device: id=1] TX: 2.44 Mpps, 1249 Mbit/s (1640 Mbit/s with framing)
[Device: id=2] TX: 2.44 Mpps, 1248 Mbit/s (1637 Mbit/s with framing)
[Device: id=3] TX: 2.48 Mpps, 1269 Mbit/s (1666 Mbit/s with framing)
[Device: id=0] RX: 0.00 (StdDev 0.00) Mpps, 0 (StdDev 0) Mbit/s (0 Mbit/s with framing), total 0 packets with 0 bytes (incl. CRC)
[Device: id=1] RX: 0.00 (StdDev 0.00) Mpps, 0 (StdDev 0) Mbit/s (0 Mbit/s with framing), total 0 packets with 0 bytes (incl. CRC)
[Device: id=2] RX: 0.00 (StdDev 0.00) Mpps, 0 (StdDev 0) Mbit/s (0 Mbit/s with framing), total 0 packets with 0 bytes (incl. CRC)
[Device: id=3] RX: 11.09 (StdDev 0.48) Mpps, 5678 (StdDev 245) Mbit/s (7452 Mbit/s with framing), total 111287647 packets with 7122409408 bytes (incl. CRC)
[Device: id=0] TX: 2.78 (StdDev 0.12) Mpps, 1423 (StdDev 61) Mbit/s (1867 Mbit/s with framing), total 27923805 packets with 1787123520 bytes (incl. CRC)
[Device: id=1] TX: 2.75 (StdDev 0.12) Mpps, 1410 (StdDev 60) Mbit/s (1851 Mbit/s with framing), total 27631674 packets with 1768427136 bytes (incl. CRC)
[Device: id=2] TX: 2.76 (StdDev 0.12) Mpps, 1411 (StdDev 62) Mbit/s (1853 Mbit/s with framing), total 27613404 packets with 1767257856 bytes (incl. CRC)
[Device: id=3] TX: 2.80 (StdDev 0.12) Mpps, 1434 (StdDev 62) Mbit/s (1882 Mbit/s with framing), total 28124019 packets with 1799937216 bytes (incl. CRC)
2 ports (0 and 3), 1 core & 1 queue each
[root@silpixa00396680 MoonGen]# ./build/MoonGen examples/pktgen.lua -t 1 -s 10 --dpdk-config=/ro[49/1963]
XL710/dpdk-conf.lua 0 3
[INFO] Initializing DPDK. This will take a few seconds...
EAL: Detected 112 lcore(s)
EAL: Probing VFIO support...
EAL: PCI device 0000:d8:00.0 on NUMA socket 1
EAL: probe driver: 8086:1572 net_i40e
EAL: PCI device 0000:d8:00.1 on NUMA socket 1
EAL: probe driver: 8086:1572 net_i40e
EAL: PCI device 0000:d8:00.2 on NUMA socket 1
EAL: probe driver: 8086:1572 net_i40e
EAL: PCI device 0000:d8:00.3 on NUMA socket 1
EAL: probe driver: 8086:1572 net_i40e
[INFO] Found 4 usable devices:
Device 0: 3C:FD:FE:9D:88:F8 (Intel Corporation Ethernet Controller X710 for 10GbE SFP+)
Device 1: 3C:FD:FE:9D:88:F9 (Intel Corporation Ethernet Controller X710 for 10GbE SFP+)
Device 2: 3C:FD:FE:9D:88:FA (Intel Corporation Ethernet Controller X710 for 10GbE SFP+)
Device 3: 3C:FD:FE:9D:88:FB (Intel Corporation Ethernet Controller X710 for 10GbE SFP+)
[INFO] Check out MoonGen (built on lm) if you are looking for a fully featured packet generator
[INFO] https://github.com/emmericp/MoonGen
[INFO] Waiting for devices to come up...
[INFO] Device 3 (3C:FD:FE:9D:88:FB) is up: 10000 MBit/s
[INFO] Device 0 (3C:FD:FE:9D:88:F8) is up: 10000 MBit/s
[INFO] 2 devices are up.
[INFO] Starting Thread 1 on [TxQueue: id=0, qid=0] sending to peer 3c:fd:fe:9d:68:b8
[INFO] Starting Thread 2 on [TxQueue: id=3, qid=0] sending to peer 3c:fd:fe:9d:68:b9
...
[Device: id=0] RX: 0.00 Mpps, 0 Mbit/s (0 Mbit/s with framing)
[Device: id=3] RX: 6.53 Mpps, 3346 Mbit/s (4391 Mbit/s with framing)
[Device: id=0] TX: 6.53 Mpps, 3344 Mbit/s (4390 Mbit/s with framing)
[Device: id=3] TX: 5.49 Mpps, 2810 Mbit/s (3688 Mbit/s with framing)
[Device: id=0] RX: 0.00 (StdDev 0.00) Mpps, 0 (StdDev 0) Mbit/s (0 Mbit/s with framing), total 0 packets with 0 bytes (incl. CRC)
[Device: id=3] RX: 7.31 (StdDev 0.29) Mpps, 3745 (StdDev 150) Mbit/s (4916 Mbit/s with framing), total 73540278 packets with 4706577792 bytes (incl. CRC)
[Device: id=0] TX: 7.31 (StdDev 0.29) Mpps, 3745 (StdDev 151) Mbit/s (4915 Mbit/s with framing), total 73540278 packets with 4706577792 bytes (incl. CRC)
[Device: id=3] TX: 6.12 (StdDev 0.24) Mpps, 3135 (StdDev 122) Mbit/s (4115 Mbit/s with framing), total 61248663 packets with 3919914432 bytes (incl. CRC)
Can you post the output of lspci -vvv -s 0000:d8:00.0?
Another thing to test would be using multiple processes; there's an example in dpdk-conf.lua, just assign different cores and a different whitelist to each process.
Note that this should show the exact same behavior: tasks in MoonGen are more independent than your usual thread as they run in a completely different LuaJIT VM. It's still worth testing.
lspci -vvv -s 0000:d8:00.0
[root@xxx ~]# lspci -vvv -s 0000:d8:00.0
d8:00.0 Ethernet controller: Intel Corporation Ethernet Controller X710 for 10GbE SFP+ (rev 01)
Subsystem: Intel Corporation Ethernet Converged Network Adapter X710-4
Control: I/O- Mem+ BusMaster- SpecCycle- MemWINV- VGASnoop- ParErr+ Stepping- SERR+ FastB2B- DisINTx-
Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort- >SERR- <PERR- INTx-
Interrupt: pin A routed to IRQ 1024
NUMA node: 1
Region 0: Memory at f0000000 (64-bit, prefetchable) [size=8M]
Region 3: Memory at f1018000 (64-bit, prefetchable) [size=32K]
Expansion ROM at f1400000 [disabled] [size=512K]
Capabilities: [40] Power Management version 3
Flags: PMEClk- DSI+ D1- D2- AuxCurrent=0mA PME(D0+,D1-,D2-,D3hot+,D3cold+)
Status: D0 NoSoftRst+ PME-Enable- DSel=0 DScale=1 PME-
Capabilities: [50] MSI: Enable- Count=1/1 Maskable+ 64bit+
Address: 0000000000000000 Data: 0000
Masking: 00000000 Pending: 00000000
Capabilities: [70] MSI-X: Enable- Count=129 Masked-
Vector table: BAR=3 offset=00000000
PBA: BAR=3 offset=00001000
Capabilities: [a0] Express (v2) Endpoint, MSI 00
DevCap: MaxPayload 2048 bytes, PhantFunc 0, Latency L0s <512ns, L1 <64us
ExtTag+ AttnBtn- AttnInd- PwrInd- RBE+ FLReset+ SlotPowerLimit 0.000W
DevCtl: Report errors: Correctable- Non-Fatal- Fatal- Unsupported-
RlxdOrd+ ExtTag+ PhantFunc- AuxPwr- NoSnoop- FLReset-
MaxPayload 256 bytes, MaxReadReq 512 bytes
DevSta: CorrErr+ UncorrErr- FatalErr- UnsuppReq+ AuxPwr- TransPend-
LnkCap: Port #0, Speed 8GT/s, Width x8, ASPM L1, Exit Latency L1 <16us
ClockPM- Surprise- LLActRep- BwNot- ASPMOptComp+
LnkCtl: ASPM L1 Enabled; RCB 64 bytes Disabled- CommClk+
ExtSynch- ClockPM- AutWidDis- BWInt- AutBWInt-
LnkSta: Speed 8GT/s, Width x8, TrErr- Train- SlotClk+ DLActive- BWMgmt- ABWMgmt-
DevCap2: Completion Timeout: Range ABCD, TimeoutDis+, LTR-, OBFF Not Supported
AtomicOpsCap: 32bit- 64bit- 128bitCAS-
DevCtl2: Completion Timeout: 50us to 50ms, TimeoutDis-, LTR-, OBFF Disabled
AtomicOpsCtl: ReqEn-
LnkCtl2: Target Link Speed: 2.5GT/s, EnterCompliance- SpeedDis-
Transmit Margin: Normal Operating Range, EnterModifiedCompliance- ComplianceSOS-
Compliance De-emphasis: -6dB
LnkSta2: Current De-emphasis Level: -3.5dB, EqualizationComplete+, EqualizationPhase1+
EqualizationPhase2+, EqualizationPhase3+, LinkEqualizationRequest-
Capabilities: [e0] Vital Product Data
Product Name: XL710 40GbE Controller
Read-only fields:
[PN] Part number:
[EC] Engineering changes:
[FG] Unknown:
[LC] Unknown:
[MN] Manufacture ID:
[PG] Unknown:
[SN] Serial number:
[V0] Vendor specific:
[RV] Reserved: checksum good, 0 byte(s) reserved
Read/write fields:
[V1] Vendor specific:
End
Capabilities: [100 v2] Advanced Error Reporting
UESta: DLP- SDES- TLP- FCP- CmpltTO- CmpltAbrt- UnxCmplt- RxOF- MalfTLP- ECRC- UnsupReq- ACSViol-
UEMsk: DLP- SDES- TLP- FCP- CmpltTO- CmpltAbrt- UnxCmplt- RxOF- MalfTLP- ECRC- UnsupReq+ ACSViol-
UESvrt: DLP+ SDES+ TLP- FCP+ CmpltTO- CmpltAbrt- UnxCmplt- RxOF+ MalfTLP+ ECRC- UnsupReq- ACSViol-
CESta: RxErr- BadTLP- BadDLLP- Rollover- Timeout- NonFatalErr+
CEMsk: RxErr- BadTLP- BadDLLP- Rollover- Timeout- NonFatalErr+
AERCap: First Error Pointer: 00, ECRCGenCap+ ECRCGenEn- ECRCChkCap+ ECRCChkEn-
MultHdrRecCap- MultHdrRecEn- TLPPfxPres- HdrLogCap-
HeaderLog: 00000000 00000000 00000000 00000000
Capabilities: [140 v1] Device Serial Number f8-88-9d-ff-ff-fe-fd-3c
Capabilities: [150 v1] Alternative Routing-ID Interpretation (ARI)
ARICap: MFVC- ACS-, Next Function: 1
ARICtl: MFVC- ACS-, Function Group: 0
Capabilities: [160 v1] Single Root I/O Virtualization (SR-IOV)
IOVCap: Migration-, Interrupt Message Number: 000
IOVCtl: Enable- Migration- Interrupt- MSE- ARIHierarchy+
IOVSta: Migration-
Initial VFs: 32, Total VFs: 32, Number of VFs: 0, Function Dependency Link: 00
VF offset: 16, stride: 1, Device ID: 154c
Supported Page Size: 00000553, System Page Size: 00000001
Region 0: Memory at 00000000f0e00000 (64-bit, prefetchable)
Region 3: Memory at 00000000f11a0000 (64-bit, prefetchable)
VF Migration: offset: 00000000, BIR: 0
Capabilities: [1a0 v1] Transaction Processing Hints
Device specific mode supported
No steering table available
Capabilities: [1b0 v1] Access Control Services
ACSCap: SrcValid- TransBlk- ReqRedir- CmpltRedir- UpstreamFwd- EgressCtrl- DirectTrans-
ACSCtl: SrcValid- TransBlk- ReqRedir- CmpltRedir- UpstreamFwd- EgressCtrl- DirectTrans-
Capabilities: [1d0 v1] #19
Kernel driver in use: igb_uio
Kernel modules: i40e
[root@xxx ~]#
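The link lines in this output are the part worth checking: LnkSta should match LnkCap (8GT/s, x8 here), otherwise the slot has negotiated a slower link. Below is a rough sketch of that check. The helper names parse_link and raw_gbps are hypothetical, and the bandwidth figure covers line encoding only, ignoring PCIe packet overhead.

```python
import re


def parse_link(lspci_text: str) -> dict:
    """Return {"LnkCap": (GT/s, lanes), "LnkSta": (GT/s, lanes)} from lspci -vvv text."""
    links = {}
    for name in ("LnkCap", "LnkSta"):
        # The trailing colon keeps LnkCap2/LnkSta2 lines from matching.
        m = re.search(rf"{name}:.*?Speed ([\d.]+)GT/s.*?Width x(\d+)", lspci_text)
        if m:
            links[name] = (float(m.group(1)), int(m.group(2)))
    return links


def raw_gbps(speed_gts: float, width: int) -> float:
    """Raw link bandwidth in Gbit/s after line encoding, before TLP overhead."""
    # 8 GT/s and faster use 128b/130b encoding; 2.5 and 5 GT/s use 8b/10b.
    encoding = 128 / 130 if speed_gts >= 8 else 8 / 10
    return speed_gts * width * encoding


# Two lines copied from the lspci output above.
sample = """
LnkCap: Port #0, Speed 8GT/s, Width x8, ASPM L1, Exit Latency L1 <16us
LnkSta: Speed 8GT/s, Width x8, TrErr- Train- SlotClk+ DLActive- BWMgmt- ABWMgmt-
"""
links = parse_link(sample)
print("downgraded" if links["LnkSta"] != links["LnkCap"] else "link at full capability")
print(f"raw bandwidth: {raw_gbps(*links['LnkSta']):.2f} Gbit/s")
```

With the values reported here the link runs at its full capability, and its raw bandwidth is above the 40 Gbit/s the four ports can carry.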
Running in separate processes is better.
When I run a separate instance of MoonGen for each pair of ports (four ports total, in two processes), each process achieves the same performance as a single process running on only two ports. So some sort of contention is being eliminated.
See below: I run Configuration 1 and Configuration 2 simultaneously, which shows that the Ethernet controller itself is not the problem.
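A minimal sketch of starting both instances at the same moment, assuming it is run from the MoonGen directory and that the two config files are the foo/dpdk-conf1.lua and foo/dpdk-conf2.lua used in this comment. The helper names are made up, and passing ports "0 1" to both instances is an assumption.

```python
import subprocess

MOONGEN = "./build/MoonGen"
SCRIPT = "examples/pktgen.lua"


def build_command(config, ports=(0, 1), threads=1, seconds=10):
    """Build the MoonGen command line for one process (same flags as the runs in this thread)."""
    return [MOONGEN, SCRIPT, "-t", str(threads), "-s", str(seconds),
            f"--dpdk-config={config}", *map(str, ports)]


def run_concurrently(configs):
    """Start one MoonGen process per config, then wait for all of them to exit."""
    procs = [subprocess.Popen(build_command(c)) for c in configs]
    return [p.wait() for p in procs]


# Print the commands only; call run_concurrently(...) to actually start them.
for cfg in ("foo/dpdk-conf1.lua", "foo/dpdk-conf2.lua"):
    print(" ".join(build_command(cfg)))
```

Each config must whitelist a different set of PCI devices and use a disjoint set of cores, otherwise the two processes will compete for the same resources.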
4 ports, 2 processes, 2 ports per process and 1 core per port
[root@xxxx MoonGen]# egrep -H 'cores|Whitelist' foo/*
foo/dpdk-conf1.lua: cores = {29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39},
foo/dpdk-conf1.lua: pciWhitelist = {"0000:d8:00.0","0000:d8:00.1"},
foo/dpdk-conf2.lua: cores = {40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 51, 52, 53},
foo/dpdk-conf2.lua: pciWhitelist = {"0000:d8:00.2","0000:d8:00.3"},
Configuration 1
[root@xxxx MoonGen]# ./build/MoonGen examples/pktgen.lua -t 1 -s 10 --dpdk-config=foo/dpdk-conf1.lua
[INFO] Initializing DPDK. This will take a few seconds...
EAL: Detected 112 lcore(s)
EAL: Probing VFIO support...
EAL: PCI device 0000:d8:00.0 on NUMA socket 1
EAL: probe driver: 8086:1572 net_i40e
EAL: PCI device 0000:d8:00.1 on NUMA socket 1
EAL: probe driver: 8086:1572 net_i40e
[INFO] Found 2 usable devices:
Device 0: 3C:FD:FE:9D:88:F8 (Intel Corporation Ethernet Controller X710 for 10GbE SFP+)
Device 1: 3C:FD:FE:9D:88:F9 (Intel Corporation Ethernet Controller X710 for 10GbE SFP+)
[INFO] Check out MoonGen (built on lm) if you are looking for a fully featured packet generator
[INFO] https://github.com/emmericp/MoonGen
[INFO] Waiting for devices to come up...
[INFO] Device 1 (3C:FD:FE:9D:88:F9) is up: 10000 MBit/s
[INFO] Device 0 (3C:FD:FE:9D:88:F8) is up: 10000 MBit/s
[INFO] 2 devices are up.
[INFO] Starting Thread 1 on [TxQueue: id=0, qid=0] sending to peer 3c:fd:fe:9d:68:b8
[INFO] Starting Thread 2 on [TxQueue: id=1, qid=0] sending to peer 3c:fd:fe:9d:68:b9
[Device: id=0] RX: 0.00 (StdDev 0.00) Mpps, 0 (StdDev 0) Mbit/s (0 Mbit/s with framing), total 1 packets with 114 bytes (incl. CRC)
[Device: id=1] RX: 0.00 (StdDev 0.00) Mpps, 0 (StdDev 0) Mbit/s (0 Mbit/s with framing), total 1 packets with 114 bytes (incl. CRC)
[Device: id=0] TX: 8.12 (StdDev 0.47) Mpps, 4155 (StdDev 238) Mbit/s (5454 Mbit/s with framing), total 81119367 packets with 5191639488 bytes (incl. CRC)
[Device: id=1] TX: 8.15 (StdDev 0.44) Mpps, 4173 (StdDev 224) Mbit/s (5477 Mbit/s with framing), total 81521811 packets with 5217395904 bytes (incl. CRC)
Configuration 2
[root@xxxx MoonGen]# ./build/MoonGen examples/pktgen.lua -t 1 -s 10 --dpdk-config=foo/dpdk-conf2.lua 0 1
[INFO] Initializing DPDK. This will take a few seconds...
EAL: Detected 112 lcore(s)
EAL: Probing VFIO support...
EAL: PCI device 0000:d8:00.2 on NUMA socket 1
EAL: probe driver: 8086:1572 net_i40e
EAL: PCI device 0000:d8:00.3 on NUMA socket 1
EAL: probe driver: 8086:1572 net_i40e
[INFO] Found 2 usable devices:
Device 0: 3C:FD:FE:9D:88:FA (Intel Corporation Ethernet Controller X710 for 10GbE SFP+)
Device 1: 3C:FD:FE:9D:88:FB (Intel Corporation Ethernet Controller X710 for 10GbE SFP+)
[INFO] Check out MoonGen (built on lm) if you are looking for a fully featured packet generator
[INFO] https://github.com/emmericp/MoonGen
[INFO] Waiting for devices to come up...
[INFO] Device 1 (3C:FD:FE:9D:88:FB) is up: 10000 MBit/s
[INFO] Device 0 (3C:FD:FE:9D:88:FA) is up: 10000 MBit/s
[INFO] 2 devices are up.
[INFO] Starting Thread 1 on [TxQueue: id=0, qid=0] sending to peer 3c:fd:fe:9d:68:b8
[INFO] Starting Thread 2 on [TxQueue: id=1, qid=0] sending to peer 3c:fd:fe:9d:68:b9
[Device: id=0] RX: 0.00 (StdDev 0.00) Mpps, 0 (StdDev 0) Mbit/s (0 Mbit/s with framing), total 1 packets with 114 bytes (incl. CRC)
[Device: id=1] RX: 8.45 (StdDev 6.97) Mpps, 4325 (StdDev 3569) Mbit/s (5676 Mbit/s with framing), total 94306605 packets with 6035595058 bytes (incl. CRC)
[Device: id=0] TX: 8.10 (StdDev 0.48) Mpps, 4148 (StdDev 247) Mbit/s (5444 Mbit/s with framing), total 80952165 packets with 5180938560 bytes (incl. CRC)
[Device: id=1] TX: 8.12 (StdDev 0.50) Mpps, 4156 (StdDev 255) Mbit/s (5455 Mbit/s with framing), total 81016803 packets with 5185075392 bytes (incl. CRC)
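To compare the runs, the per-device averages can be added up into one aggregate TX rate per run. This is a small sketch that parses the final summary lines MoonGen prints (the ones containing "StdDev"). The regular expression and function name are my own, and the sample lines are abbreviated copies of the summaries in this thread.

```python
import re

# Matches the final per-device summary lines, e.g.
# "[Device: id=0] TX: 8.12 (StdDev 0.47) Mpps, ..."
SUMMARY = re.compile(r"\[Device: id=(\d+)\] (RX|TX): ([\d.]+) \(StdDev")


def total_mpps(log: str, direction: str = "TX") -> float:
    """Sum the average Mpps over all devices for one direction."""
    return sum(float(rate) for _dev, d, rate in SUMMARY.findall(log) if d == direction)


# Abbreviated summary lines: one process driving four ports.
single_process = """
[Device: id=0] TX: 2.78 (StdDev 0.12) Mpps
[Device: id=1] TX: 2.75 (StdDev 0.12) Mpps
[Device: id=2] TX: 2.76 (StdDev 0.12) Mpps
[Device: id=3] TX: 2.80 (StdDev 0.12) Mpps
"""

# Abbreviated summary lines: Configuration 1 followed by Configuration 2.
two_processes = """
[Device: id=0] TX: 8.12 (StdDev 0.47) Mpps
[Device: id=1] TX: 8.15 (StdDev 0.44) Mpps
[Device: id=0] TX: 8.10 (StdDev 0.48) Mpps
[Device: id=1] TX: 8.12 (StdDev 0.50) Mpps
"""

print(f"one process, 4 ports:   {total_mpps(single_process):.2f} Mpps")
print(f"two processes, 4 ports: {total_mpps(two_processes):.2f} Mpps")
```

On these numbers the card transmits about 11.09 Mpps in total from one process and about 32.49 Mpps from two processes.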
It doesn't appear to be related to the statistics thread. When I run with four ports but restrict the statistics to only one port, performance is much the same.
[INFO] Initializing DPDK. This will take a few seconds...
EAL: Detected 112 lcore(s)
EAL: Probing VFIO support...
EAL: PCI device 0000:d8:00.0 on NUMA socket 1
EAL: probe driver: 8086:1572 net_i40e
EAL: PCI device 0000:d8:00.1 on NUMA socket 1
EAL: probe driver: 8086:1572 net_i40e
EAL: PCI device 0000:d8:00.2 on NUMA socket 1
EAL: probe driver: 8086:1572 net_i40e
EAL: PCI device 0000:d8:00.3 on NUMA socket 1
EAL: probe driver: 8086:1572 net_i40e
[INFO] Found 4 usable devices:
Device 0: 3C:FD:FE:9D:88:F8 (Intel Corporation Ethernet Controller X710 for 10GbE SFP+)
Device 1: 3C:FD:FE:9D:88:F9 (Intel Corporation Ethernet Controller X710 for 10GbE SFP+)
Device 2: 3C:FD:FE:9D:88:FA (Intel Corporation Ethernet Controller X710 for 10GbE SFP+)
Device 3: 3C:FD:FE:9D:88:FB (Intel Corporation Ethernet Controller X710 for 10GbE SFP+)
[INFO] Check out MoonGen (built on lm) if you are looking for a fully featured packet generator
[INFO] https://github.com/emmericp/MoonGen
[INFO] Waiting for devices to come up...
[INFO] Device 3 (3C:FD:FE:9D:88:FB) is up: 10000 MBit/s
[INFO] Device 2 (3C:FD:FE:9D:88:FA) is up: 10000 MBit/s
[INFO] Device 1 (3C:FD:FE:9D:88:F9) is up: 10000 MBit/s
[INFO] Device 0 (3C:FD:FE:9D:88:F8) is up: 10000 MBit/s
[INFO] 4 devices are up.
[INFO] Starting Thread 1 on [TxQueue: id=0, qid=0] sending to peer 3c:fd:fe:9d:68:b8
[INFO] Starting Thread 2 on [TxQueue: id=1, qid=0] sending to peer 3c:fd:fe:9d:68:b9
[INFO] Starting Thread 3 on [TxQueue: id=2, qid=0] sending to peer 3c:fd:fe:9d:68:ba
[INFO] Starting Thread 4 on [TxQueue: id=3, qid=0] sending to peer 3c:fd:fe:9d:68:bb
[Device: id=0] RX: 0.00 Mpps, 0 Mbit/s (0 Mbit/s with framing)
[Device: id=0] TX: 2.71 Mpps, 1390 Mbit/s (1824 Mbit/s with framing)
[Device: id=0] RX: 0.00 Mpps, 0 Mbit/s (0 Mbit/s with framing)
[Device: id=0] TX: 2.86 Mpps, 1466 Mbit/s (1924 Mbit/s with framing)
[Device: id=0] RX: 0.00 Mpps, 0 Mbit/s (0 Mbit/s with framing)
[Device: id=0] TX: 2.87 Mpps, 1467 Mbit/s (1926 Mbit/s with framing)
[Device: id=0] RX: 0.00 Mpps, 0 Mbit/s (0 Mbit/s with framing)
[Device: id=0] TX: 2.86 Mpps, 1464 Mbit/s (1921 Mbit/s with framing)
[Device: id=0] RX: 0.00 Mpps, 0 Mbit/s (0 Mbit/s with framing)
[Device: id=0] TX: 2.86 Mpps, 1464 Mbit/s (1922 Mbit/s with framing)
[Device: id=0] RX: 0.00 Mpps, 0 Mbit/s (0 Mbit/s with framing)
[Device: id=0] TX: 2.86 Mpps, 1463 Mbit/s (1921 Mbit/s with framing)
[Device: id=0] RX: 0.00 Mpps, 0 Mbit/s (0 Mbit/s with framing)
[Device: id=0] TX: 2.87 Mpps, 1469 Mbit/s (1928 Mbit/s with framing)
[Device: id=0] RX: 0.00 Mpps, 0 Mbit/s (0 Mbit/s with framing)
[Device: id=0] TX: 2.86 Mpps, 1464 Mbit/s (1921 Mbit/s with framing)
[Device: id=0] RX: 0.00 Mpps, 0 Mbit/s (0 Mbit/s with framing)
[Device: id=0] TX: 2.86 Mpps, 1464 Mbit/s (1922 Mbit/s with framing)
[Device: id=0] RX: 0.00 Mpps, 0 Mbit/s (0 Mbit/s with framing)
[Device: id=0] TX: 2.50 Mpps, 1279 Mbit/s (1679 Mbit/s with framing)
[Device: id=0] RX: 0.00 (StdDev 0.00) Mpps, 0 (StdDev 0) Mbit/s (0 Mbit/s with framing), total 0 packets with 0 bytes (incl. CRC)
[Device: id=0] TX: 2.82 (StdDev 0.12) Mpps, 1445 (StdDev 62) Mbit/s (1896 Mbit/s with framing), total 28288134 packets with 1810440576 bytes (incl. CRC)
@mdr78 Can you tell me why the RX stats are 0.0? I have trouble getting RX stats. Any idea? This is an old issue. Is that expected behavior, though?
This is unlikely to be related. The main reason for not getting stats is using a virtual NIC where the hypervisor drops the packets earlier; in that case you'll have to actually receive and drop the packets for the stats to work.
Hi folks,
I am having a problem scaling MoonGen with multiple interfaces.
I am using the pktgen.lua code with a single modification: it uses a pre-defined MAC address depending on the port. I am using an Intel X710, a 4x10Gb Ethernet interface, with the 6.0.1 firmware.
When I use a single 10G port on the card, I get 14 Mpps no problem. When I add a second port, performance degrades significantly, and it continues to degrade as I add interfaces. This is not due to a lack of cores or queues; I am throwing plenty of those at MoonGen.
The interesting bit I have modified in pktgen.lua is as follows.
The DPDK config makes lots of cores and memory on the same socket available to MoonGen, as follows.
When I run MoonGen with 2 threads on 1 port, I get 14 Mpps no problem at all. Using htop I see that 2 cores are loaded at 100%.
When I run MoonGen with 2 threads each on 4 ports, performance degrades to about 3 Mpps per port. Using htop I see that 8 cores are loaded at 100%.
BTW: love MoonGen, it rocks.
Ray K