ntop / PF_RING

High-speed packet processing framework
http://www.ntop.org
GNU Lesser General Public License v2.1

ZC: reporting Tbit traffic on 10Gbe adapter #955

Closed (DerRealKeyser closed 3 weeks ago)

DerRealKeyser commented 3 weeks ago

I just updated NtopNG and PF_RING today to the current stable build, and now PF_RING ZC has gone completely bonkers: it reports Tbit throughput on 10GbE NICs, which sends the NtopNG interface threads to 100% CPU.

Doing a pfcount on an Intel 82599 interface running ZC reports:

sudo pfcount -i zc:ens1f0
Using PF_RING v.8.8.0.240805 kernel module v.8.8.0
Dumping statistics on /proc/net/pf_ring/stats/12741-ens1f0.1
Capturing from zc:ens1f0 [mac: 48:DF:37:1E:47:EC][if_index: 8][speed: 10000Mb/s]

Device RX channels: 4

Polling threads: 1

=========================
Absolute Stats: [4'584'998 pkts total][0 pkts dropped][0.0% dropped]
[4'584'998 pkts rcvd][300'569'543'890 bytes rcvd]

=========================
Absolute Stats: [9'973'904 pkts total][2 pkts dropped][0.0% dropped]
[9'973'902 pkts rcvd][653'839'145'610 bytes rcvd][9'973'363.43 pkt/sec][5'230'430.72 Mbit/sec]

Actual Stats: [5'388'904 pkts rcvd][1'000.05 ms][5'388'613.01 pps][2'826.00 Gbps]

=========================
Absolute Stats: [15'342'948 pkts total][3 pkts dropped][0.0% dropped]
[15'342'945 pkts rcvd][5'806'759'475 bytes rcvd][7'671'119.62 pkt/sec][4'023'041.97 Mbit/sec]

Actual Stats: [5'369'043 pkts rcvd][1'000.03 ms][5'368'838.98 pps][2'815.63 Gbps]

=========================
Absolute Stats: [20'718'353 pkts total][3 pkts dropped][0.0% dropped]
[20'718'350 pkts rcvd][358'191'434'250 bytes rcvd][6'905'794.39 pkt/sec][3'621'674.81 Mbit/sec]

Actual Stats: [5'375'405 pkts rcvd][1'000.04 ms][5'375'146.99 pps][2'818.94 Gbps]

=========================
Absolute Stats: [21'791'736 pkts total][3 pkts dropped][0.0% dropped]
[21'791'733 pkts rcvd][428'557'056'815 bytes rcvd][6'810'540.15 pkt/sec][3'571'719.67 Mbit/sec]

Actual Stats: [1'073'383 pkts rcvd][199.57 ms][5'378'559.58 pps][2'820.73 Gbps]
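A quick arithmetic check on the counters above shows why these figures cannot be real traffic. The sketch below uses only numbers copied from the pfcount output. The 20 bytes of preamble and inter-frame gap used for the line-rate ceiling are standard Ethernet values, and the helper names are made up for illustration.

```python
# Sanity-check the pfcount counters quoted above (values copied from the output).
LINK_BPS = 10_000_000_000   # 10 GbE link speed, as reported by pfcount
WIRE_OVERHEAD = 20          # preamble (8) + inter-frame gap (12), in bytes


def bytes_per_packet(pkts: int, nbytes: int) -> float:
    """Average packet size implied by a (packets, bytes) counter pair."""
    return nbytes / pkts


def implied_gbps(pps: float, pkt_bytes: float) -> float:
    """Throughput in Gbps for a given packet rate and average packet size."""
    return pps * pkt_bytes * 8 / 1e9


def max_pps(frame_bytes: int) -> float:
    """Highest packet rate the link can carry for frames of this size."""
    return LINK_BPS / ((frame_bytes + WIRE_OVERHEAD) * 8)


# First two "Absolute Stats" samples: (pkts rcvd, bytes rcvd)
samples = [(4_584_998, 300_569_543_890), (9_973_902, 653_839_145_610)]
for pkts, nbytes in samples:
    print(f"{pkts} pkts, {nbytes} bytes -> {bytes_per_packet(pkts, nbytes):.1f} bytes/pkt")

avg = bytes_per_packet(*samples[1])
print(f"implied rate at 5'388'613.01 pps: {implied_gbps(5_388_613.01, avg):.2f} Gbps")
print(f"10 GbE ceiling with 64-byte frames: {max_pps(64):,.0f} pps")
```

Both samples average exactly 65,555 bytes per packet, far larger than any Ethernet frame, which suggests the reported packet lengths are corrupt rather than that the link is carrying this traffic. The packet rate itself (about 5.4 Mpps) is below the 10 GbE ceiling. Note that 65,555 is 65,535 + 20; whether that reflects a length field stuck at 0xFFFF is a guess, not something the thread confirms.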

When running this pfcount the Linux console is splashed with errors like these:

Aug 20 11:56:27 ntopng kernel: [ 2788.506319] DMAR: DRHD: handling fault status reg 502
Aug 20 11:56:27 ntopng kernel: [ 2788.506924] DMAR: [DMA Write NO_PASID] Request device [08:00.0] fault addr 0x7821f87400000 [fault reason 0x04] Access beyond MGAW
Aug 20 11:56:27 ntopng kernel: [ 2788.508043] DMAR: [DMA Write NO_PASID] Request device [08:00.0] fault addr 0x7821f87400000 [fault reason 0x04] Access beyond MGAW
Aug 20 11:56:27 ntopng kernel: [ 2788.509209] DMAR: [DMA Write NO_PASID] Request device [08:00.0] fault addr 0x1140187400000 [fault reason 0x04] Access beyond MGAW
Aug 20 11:56:27 ntopng kernel: [ 2788.510394] DMAR: [DMA Write NO_PASID] Request device [08:00.0] fault addr 0x1140187400000 [fault reason 0x04] Access beyond MGAW
Aug 20 11:56:27 ntopng kernel: [ 2788.511576] DMAR: [DMA Write NO_PASID] Request device [08:00.0] fault addr 0x7821f87400000 [fault reason 0x04] Access beyond MGAW
Aug 20 11:56:27 ntopng kernel: [ 2788.512802] DMAR: [DMA Write NO_PASID] Request device [08:00.0] fault addr 0x1140187400000 [fault reason 0x04] Access beyond MGAW
Aug 20 11:56:27 ntopng kernel: [ 2788.514026] DMAR: [DMA Write NO_PASID] Request device [08:00.0] fault addr 0x25e8288c00000 [fault reason 0x04] Access beyond MGAW
Aug 20 11:56:27 ntopng kernel: [ 2788.515327] DMAR: [DMA Write NO_PASID] Request device [08:00.0] fault addr 0x5bb8585c00000 [fault reason 0x04] Access beyond MGAW
Aug 20 11:56:27 ntopng kernel: [ 2788.516613] DMAR: [DMA Write NO_PASID] Request device [08:00.0] fault addr 0x7821f87400000 [fault reason 0x04] Access beyond MGAW
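To see what the kernel is complaining about, the fault lines above can be broken into fields. This is a minimal sketch: the regular expression is inferred only from the lines quoted in this thread, the function name is hypothetical, and the 48-bit address width is an assumption, since the thread never reports the platform's actual MGAW (maximum guest address width).

```python
import re

# Pattern inferred from the DMAR fault lines quoted above, not from a kernel spec.
FAULT_RE = re.compile(
    r"DMAR: \[DMA (?P<op>Read|Write)[^\]]*\] Request device \[(?P<dev>[0-9a-f:.]+)\] "
    r"fault addr (?P<addr>0x[0-9a-f]+) \[fault reason (?P<reason>0x[0-9a-f]+)\]"
)

ASSUMED_MGAW_BITS = 48  # assumption: the real value is not given in the thread


def parse_fault(line):
    """Return the device, operation and fault address of one DMAR fault line."""
    m = FAULT_RE.search(line)
    if m is None:
        return None  # e.g. the "handling fault status reg" line
    addr = int(m.group("addr"), 16)
    return {
        "device": m.group("dev"),
        "op": m.group("op"),
        "addr": addr,
        "addr_bits": addr.bit_length(),
        "reason": int(m.group("reason"), 16),
    }


log = """\
[ 2788.506319] DMAR: DRHD: handling fault status reg 502
[ 2788.506924] DMAR: [DMA Write NO_PASID] Request device [08:00.0] fault addr 0x7821f87400000 [fault reason 0x04] Access beyond MGAW
[ 2788.509209] DMAR: [DMA Write NO_PASID] Request device [08:00.0] fault addr 0x1140187400000 [fault reason 0x04] Access beyond MGAW
[ 2788.514026] DMAR: [DMA Write NO_PASID] Request device [08:00.0] fault addr 0x25e8288c00000 [fault reason 0x04] Access beyond MGAW
"""

for line in log.splitlines():
    fault = parse_fault(line)
    if fault:
        over = fault["addr_bits"] > ASSUMED_MGAW_BITS
        print(f"{fault['device']} {fault['op']} {fault['addr']:#x} "
              f"needs {fault['addr_bits']} bits, beyond assumed limit: {over}")
```

Every fault comes from the same device, 08:00.0, performing DMA writes, and each address needs 49 to 51 bits. That matches the "Access beyond MGAW" text: the device is writing to addresses the IOMMU is not prepared to translate.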

Any ideas?

cardigliano commented 3 weeks ago

@DerRealKeyser did you change anything in the BIOS or GRUB configuration? What is the CPU model?

DerRealKeyser commented 3 weeks ago

No - all I did was an "apt update" and an "apt upgrade". The upgrade did include a kernel upgrade as well, to 6.8.0-40-generic.

DerRealKeyser commented 3 weeks ago

I tried removing NtopNG and PF_RING completely and reinstalling them - no change, still the same issue.

cardigliano commented 3 weeks ago

CPU model?

DerRealKeyser commented 3 weeks ago

Intel(R) Xeon(R) CPU E5-2640 v4 @ 2.40GHz (Single CPU server). Server: HPE DL380 G9

cardigliano commented 3 weeks ago

Please try disabling iommu as described at https://www.ntop.org/guides/pf_ring/intel.html

DerRealKeyser commented 3 weeks ago

That did the trick. I disabled virtualization and VT-d in the BIOS, and now pfcount shows "real" numbers and NtopNG works again when started. Thanks :-) Do you know what actually goes wrong there?

cardigliano commented 3 weeks ago

The IOMMU creates issues with ZC: ZC requires kernel-to-userspace memory mapping, which conflicts with IOMMU support.
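After a change like the one reported above, it can be useful to confirm that the IOMMU really is off before starting ZC applications. The sketch below is one way to check on Linux, under stated assumptions: it looks for the `intel_iommu=off` or `iommu=off` kernel parameters in `/proc/cmdline` and counts entries under `/sys/kernel/iommu_groups`. The function names are hypothetical, and the thread does not say which method the linked guide recommends (the reporter disabled VT-d in the BIOS instead).

```python
from pathlib import Path


def read_cmdline(path: Path = Path("/proc/cmdline")) -> str:
    """Return the kernel command line, or an empty string if it cannot be read."""
    try:
        return path.read_text()
    except OSError:
        return ""


def iommu_disabled_on_cmdline(cmdline: str) -> bool:
    """True if the kernel was booted with a parameter that turns the IOMMU off."""
    params = cmdline.split()
    return "intel_iommu=off" in params or "iommu=off" in params


def iommu_group_count(root: Path = Path("/sys/kernel/iommu_groups")) -> int:
    """Number of IOMMU groups the kernel has created (0 when no IOMMU is in use)."""
    if not root.is_dir():
        return 0
    return sum(1 for entry in root.iterdir() if entry.is_dir())


cmdline = read_cmdline()
print("IOMMU disabled on kernel command line:", iommu_disabled_on_cmdline(cmdline))
print("IOMMU groups present:", iommu_group_count())
```

With VT-d disabled in the BIOS, as in this thread, the group count would be expected to be 0 even though the kernel command line is unchanged.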