tiglabs / jupiter

Jupiter is a high-performance 4-layer network load balance service based on DPDK.
MIT License

RX queue error on startup #22

Closed dearblen closed 5 years ago

dearblen commented 5 years ago

Hello, after finishing the configuration I get the following error:
Jun 4 16:45:11 sz-pg-oam-docker-test-002 jupiter-service[15329]: EAL: Multi-process socket /var/run/.rte_unix
Jun 4 16:45:11 sz-pg-oam-docker-test-002 jupiter-service[15329]: EAL: Probing VFIO support...
Jun 4 16:45:12 sz-pg-oam-docker-test-002 jupiter-service[15329]: EAL: PCI device 0000:00:1f.6 on NUMA socket -1
Jun 4 16:45:12 sz-pg-oam-docker-test-002 jupiter-service[15329]: EAL: Invalid NUMA socket, default to 0
Jun 4 16:45:12 sz-pg-oam-docker-test-002 jupiter-service[15329]: EAL: probe driver: 8086:15e3 net_e1000_em
Jun 4 16:45:12 sz-pg-oam-docker-test-002 jupiter-service[15329]: EAL: PCI device 0000:02:00.0 on NUMA socket -1
Jun 4 16:45:12 sz-pg-oam-docker-test-002 jupiter-service[15329]: EAL: Invalid NUMA socket, default to 0
Jun 4 16:45:12 sz-pg-oam-docker-test-002 jupiter-service[15329]: EAL: probe driver: 8086:100e net_e1000_em
Jun 4 16:45:12 sz-pg-oam-docker-test-002 kernel: igb_uio 0000:02:00.0: uio device registered with irq 129
Jun 4 16:45:12 sz-pg-oam-docker-test-002 jupiter-service[15329]: USER1: init_laddr_list(): The number of local IPv4 is less than the number of RX queue of jupiter0.
Jun 4 16:45:12 sz-pg-oam-docker-test-002 jupiter-service[15329]: USER1: lb_device_init(): init laddr list failed.
Jun 4 16:45:12 sz-pg-oam-docker-test-002 jupiter-service[15329]: USER1: main(): lb_device_init failed.

Question: what is an RX queue?

My NIC: 02:00.0 Ethernet controller: Intel Corporation 82540EM Gigabit Ethernet Controller (rev 02)

dearblen commented 5 years ago

Another error:
Jun 4 16:52:59 sz-pg-oam-docker-test-002 jupiter-service[15402]: USER1: dpdk_dev_config_and_set_ipfilter(): config port0 failed, Invalid argument.
Jun 4 16:52:59 sz-pg-oam-docker-test-002 jupiter-service[15402]: USER1: lb_device_init(): config dpdk dev or add ip filter failed, port_id=0.
Jun 4 16:52:59 sz-pg-oam-docker-test-002 jupiter-service[15402]: USER1: main(): lb_device_init failed.
Jun 4 16:52:59 sz-pg-oam-docker-test-002 NetworkManager[4568]: [1559638379.8040] manager: (jupiter0): new Ethernet device (/org/freedesktop/NetworkManager/Devices/9)

muziding commented 5 years ago

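@dearblen The number of configured local IPs must be greater than or equal to the number of configured CPU cores (each core corresponds to one NIC receive queue, rxq).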
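The rule above can be checked before starting the service. The following is a minimal sketch, not Jupiter's own logic: the helper names `count_local_ipv4` and `enough_local_ips` are hypothetical, and it assumes every address in a `local-ipv4` CIDR block counts as one local IP, which the thread does not confirm.

```python
import ipaddress


def count_local_ipv4(cidr: str) -> int:
    """Count the addresses in a CIDR block such as a local-ipv4 value.

    Assumption: every address in the block (network and broadcast
    included) counts as one local IP. Jupiter may count differently.
    """
    return ipaddress.ip_network(cidr, strict=False).num_addresses


def enough_local_ips(cidr: str, num_cores: int) -> bool:
    """Return True if the block has at least one local IP per core/rxq."""
    return count_local_ipv4(cidr) >= num_cores


if __name__ == "__main__":
    # A single /32 address cannot cover four cores; a /24 block can.
    print(enough_local_ips("172.16.2.2/32", 4))
    print(enough_local_ips("172.16.2.0/24", 4))
```

If the check fails, either widen the `local-ipv4` range or reduce the number of cores in the coremask.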

muziding commented 5 years ago

@dearblen The second error needs more information to diagnose.

dearblen commented 5 years ago

Here is my startup script:

#!/bin/bash

mkdir -p /mnt/huge
mount -t hugetlbfs nodev /mnt/huge
echo 4096 > /sys/kernel/mm/hugepages/hugepages-2048kB/nr_hugepages
modprobe uio
insmod /usr/share/jupiter/kmod/igb_uio.ko
/usr/share/jupiter/tools/dpdk-devbind.py --bind=igb_uio p3p1
insmod /usr/share/jupiter/kmod/rte_kni.ko
jupiter-service --daemon
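The script asks the kernel for 4096 hugepages of 2048 kB each, which is 8 GiB in total, and the kernel may grant fewer if memory is short or fragmented. A minimal sketch for confirming what was actually reserved is shown below. It parses the standard `HugePages_Total`, `HugePages_Free` and `Hugepagesize` fields of `/proc/meminfo`; the function name `hugepage_summary` and the sample text are illustrative only.

```python
import re


def hugepage_summary(meminfo_text: str) -> dict:
    """Extract hugepage counters from the text of /proc/meminfo."""
    fields = {}
    for key in ("HugePages_Total", "HugePages_Free", "Hugepagesize"):
        match = re.search(rf"^{key}:\s+(\d+)", meminfo_text, re.MULTILINE)
        if match:
            fields[key] = int(match.group(1))
    # Total reserved memory in kB, when both values are present.
    if "HugePages_Total" in fields and "Hugepagesize" in fields:
        fields["reserved_kB"] = fields["HugePages_Total"] * fields["Hugepagesize"]
    return fields


if __name__ == "__main__":
    # Illustrative sample; on a real host pass open("/proc/meminfo").read().
    sample = (
        "MemTotal:       16268032 kB\n"
        "HugePages_Total:    4096\n"
        "HugePages_Free:     4096\n"
        "Hugepagesize:       2048 kB\n"
    )
    print(hugepage_summary(sample))
```

A `HugePages_Total` lower than the value written to `nr_hugepages` means the reservation only partly succeeded.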

Log output:
Jun 5 15:30:54 sz-pg-oam-docker-test-002 dhclient[6331]: DHCPDISCOVER on p3p1 to 255.255.255.255 port 67 interval 19 (xid=0x4f5f22d2)
Jun 5 15:30:58 sz-pg-oam-docker-test-002 kernel: igb_uio: loading out-of-tree module taints kernel.
Jun 5 15:30:58 sz-pg-oam-docker-test-002 kernel: igb_uio: module verification failed: signature and/or required key missing - tainting kernel
Jun 5 15:30:58 sz-pg-oam-docker-test-002 kernel: igb_uio: Use MSIX interrupt by default
Jun 5 15:30:59 sz-pg-oam-docker-test-002 dhclient[6331]: receive_packet failed on p3p1: Network is down
Jun 5 15:30:59 sz-pg-oam-docker-test-002 NetworkManager[4575]: [1559719859.1617] device (p3p1): state change: ip-config -> unmanaged (reason 'removed', sys-iface-state: 'removed')
Jun 5 15:30:59 sz-pg-oam-docker-test-002 NetworkManager[4575]: [1559719859.1700] dhcp4 (p3p1): canceled DHCP transaction, DHCP client pid 6331
Jun 5 15:30:59 sz-pg-oam-docker-test-002 NetworkManager[4575]: [1559719859.1700] dhcp4 (p3p1): state changed unknown -> done
Jun 5 15:30:59 sz-pg-oam-docker-test-002 kernel: igb_uio 0000:02:00.0: mapping 1K dma=0x21de7a000 host=ffff982f5de7a000
Jun 5 15:30:59 sz-pg-oam-docker-test-002 kernel: igb_uio 0000:02:00.0: unmapping 1K dma=0x21de7a000 host=ffff982f5de7a000
Jun 5 15:30:59 sz-pg-oam-docker-test-002 jupiter-service[6371]: EAL: Multi-process socket /var/run/.rte_unix
Jun 5 15:30:59 sz-pg-oam-docker-test-002 kernel: Bits 55-60 of /proc/PID/pagemap entries are about to stop being page-shift some time soon. See the linux/Documentation/vm/pagemap.txt for details.
Jun 5 15:30:59 sz-pg-oam-docker-test-002 jupiter-service[6371]: EAL: Probing VFIO support...
Jun 5 15:30:59 sz-pg-oam-docker-test-002 jupiter-service[6371]: EAL: PCI device 0000:00:1f.6 on NUMA socket -1
Jun 5 15:30:59 sz-pg-oam-docker-test-002 jupiter-service[6371]: EAL: Invalid NUMA socket, default to 0
Jun 5 15:30:59 sz-pg-oam-docker-test-002 jupiter-service[6371]: EAL: probe driver: 8086:15e3 net_e1000_em
Jun 5 15:30:59 sz-pg-oam-docker-test-002 jupiter-service[6371]: EAL: PCI device 0000:02:00.0 on NUMA socket -1
Jun 5 15:30:59 sz-pg-oam-docker-test-002 kernel: igb_uio 0000:02:00.0: uio device registered with irq 129
Jun 5 15:30:59 sz-pg-oam-docker-test-002 jupiter-service[6371]: EAL: Invalid NUMA socket, default to 0
Jun 5 15:30:59 sz-pg-oam-docker-test-002 jupiter-service[6371]: EAL: probe driver: 8086:100e net_e1000_em
Jun 5 15:31:00 sz-pg-oam-docker-test-002 jupiter-service[6371]: KNI: pci: 02:00:00 #011 8086:100e
Jun 5 15:31:00 sz-pg-oam-docker-test-002 kernel: rte_kni: Creating kni...
Jun 5 15:31:00 sz-pg-oam-docker-test-002 jupiter-service[6371]: USER1: dpdk_dev_config_and_set_ipfilter(): config port0 failed, Invalid argument.
Jun 5 15:31:00 sz-pg-oam-docker-test-002 jupiter-service[6371]: USER1: lb_device_init(): config dpdk dev or add ip filter failed, port_id=0.
Jun 5 15:31:00 sz-pg-oam-docker-test-002 jupiter-service[6371]: USER1: main(): lb_device_init failed.
Jun 5 15:31:00 sz-pg-oam-docker-test-002 NetworkManager[4575]: [1559719860.0216] manager: (jupiter0): new Ethernet device (/org/freedesktop/NetworkManager/Devices/4)

My configuration:

[DPDK]
argv = -c 0xf -n 4

[DEVICE0]
name = jupiter0
ipv4 = 172.16.2.2
netmask = 255.255.0.0
gw = 172.16.1.1
rxqsize = 256
txqsize = 512
mtu = 1500
rxoffload = 0
txoffload = 0
local-ipv4 = 172.16.2.0/24
pci = 0000:02:00.0

My NIC info:
[root@sz-pg-oam-docker-test-002 ~]# lspci | grep Ethernet
00:1f.6 Ethernet controller: Intel Corporation Ethernet Connection (5) I219-LM
02:00.0 Ethernet controller: Intel Corporation 82540EM Gigabit Ethernet Controller (rev 02)

[root@sz-pg-oam-docker-test-002 ~]# ifconfig
enp0s31f6: flags=4163<UP,BROADCAST,RUNNING,MULTICAST> mtu 1500
        inet 10.100.6.212 netmask 255.255.255.0 broadcast 10.100.6.255
        inet6 fe80::a8a3:a4a3:5a79:e3b7 prefixlen 64 scopeid 0x20
        ether 4c:76:25:fd:4c:03 txqueuelen 1000 (Ethernet)
        RX packets 20901 bytes 1701142 (1.6 MiB)
        RX errors 0 dropped 190 overruns 0 frame 0
        TX packets 5333 bytes 578285 (564.7 KiB)
        TX errors 0 dropped 0 overruns 0 carrier 0 collisions 0
        device interrupt 20 memory 0xef100000-ef120000

lo: flags=73<UP,LOOPBACK,RUNNING> mtu 65536
        inet 127.0.0.1 netmask 255.0.0.0
        inet6 ::1 prefixlen 128 scopeid 0x10
        loop txqueuelen 1000 (Local Loopback)
        RX packets 64 bytes 5632 (5.5 KiB)
        RX errors 0 dropped 0 overruns 0 frame 0
        TX packets 64 bytes 5632 (5.5 KiB)
        TX errors 0 dropped 0 overruns 0 carrier 0 collisions 0

My system info: Linux sz-pg-oam-docker-test-002.tendcloud.com 3.10.0-957.1.3.el7.x86_64 #1 SMP Thu Nov 29 14:49:43 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux

muziding commented 5 years ago

@dearblen Please check whether the 82540EM NIC supports multiple queues. If it does not, configuring just two CPU cores is enough.
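One rough way to check this is to count the `rx-N` entries the Linux kernel exposes under `/sys/class/net/<iface>/queues/` while the NIC is still bound to its kernel driver, that is, before `dpdk-devbind.py` hands it to `igb_uio`. The sketch below makes that assumption. The function name `count_rx_queues` is hypothetical, and the count reflects what the kernel driver set up, which may differ from what the DPDK poll-mode driver reports for the same device.

```python
import os
import re


def count_rx_queues(iface: str, sysfs_root: str = "/sys/class/net") -> int:
    """Count the rx-N queue directories the kernel exposes for an interface.

    sysfs_root is a parameter only so the function can be pointed at a
    test directory; on a real host the default is the right value.
    """
    queues_dir = os.path.join(sysfs_root, iface, "queues")
    return sum(
        1 for name in os.listdir(queues_dir) if re.fullmatch(r"rx-\d+", name)
    )


# Usage on the host from this thread, before binding the NIC to igb_uio:
#   count_rx_queues("p3p1")
# A result of 1 suggests the driver exposes a single receive queue.
```

`ethtool -l <iface>` gives similar information on drivers that implement channel reporting.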

dearblen commented 5 years ago

@muziding Thanks for your help. I changed my configuration to the following, but the problem persists:

[DPDK]
argv = -c 0xf -n 2

[DEVICE0]
name = jupiter0
ipv4 = 172.16.2.2
netmask = 255.255.0.0
gw = 172.16.1.1
rxqsize = 256
txqsize = 512
mtu = 1500
rxoffload = 0
txoffload = 0
local-ipv4 = 172.16.2.0/24
pci = 0000:02:00.0

Checking the NIC:
02:00.0 Ethernet controller: Intel Corporation 82540EM Gigabit Ethernet Controller (rev 02)
        Subsystem: Intel Corporation PRO/1000 MT Desktop Adapter
        Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR+ FastB2B- DisINTx-
        Status: Cap+ 66MHz+ UDF- FastB2B- ParErr- DEVSEL=medium >TAbort- <TAbort- SERR- <PERR- INTx-
        Latency: 32 (63750ns min)
        Interrupt: pin A routed to IRQ 19
        Region 0: Memory at ef040000 (32-bit, non-prefetchable) [size=128K]
        Region 1: Memory at ef020000 (32-bit, non-prefetchable) [size=128K]
        Region 2: I/O ports at e000 [size=64]
        Expansion ROM at ef000000 [disabled] [size=128K]
        Capabilities: [dc] Power Management version 2
                Flags: PMEClk- DSI+ D1- D2- AuxCurrent=0mA PME(D0+,D1-,D2-,D3hot+,D3cold+)
                Status: D0 NoSoftRst- PME-Enable- DSel=0 DScale=1 PME-
        Capabilities: [e4] PCI-X non-bridge device
                Command: DPERE- ERO+ RBC=512 OST=1
                Status: Dev=00:00.0 64bit- 133MHz- SCD- USC- DC=simple DMMRBC=2048 DMOST=1 DMCRS=16 RSCEM- 266MHz- 533MHz-
        Capabilities: [f0] MSI: Enable- Count=1/1 Maskable- 64bit+
                Address: 0000000000000000 Data: 0000
        Kernel driver in use: e1000
        Kernel modules: e1000

[root@sz-pg-oam-docker-test-002 ~]# ethtool -i p3p1
driver: e1000
version: 7.3.21-k8-NAPI
firmware-version:
expansion-rom-version:
bus-info: 0000:02:00.0
supports-statistics: yes
supports-test: yes
supports-eeprom-access: yes
supports-register-dump: yes
supports-priv-flags: no

muziding commented 5 years ago

@dearblen To configure two CPU cores, see the DPDK documentation:

[DPDK]
argv = -c 0x3 -n 4

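Your NIC probably does not support multiple queues. You can use the ipv6 branch, comment out the MULTI_TXQ macro, then rebuild and install. If you stay on the current version, you may need to modify the code yourself, following the fix in the ipv6 branch.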
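For context on the `argv` values in this thread: in the DPDK EAL options, `-c` is a hexadecimal bitmask of the cores to run on, while `-n` is the number of memory channels. Changing `-n 4` to `-n 2` therefore leaves the core count untouched, and `-c 0xf` still selects four cores. A minimal sketch for decoding a coremask follows; the helper name `cores_from_mask` is illustrative.

```python
from typing import List


def cores_from_mask(coremask: str) -> List[int]:
    """Return the core IDs selected by a hex coremask such as '0x3'."""
    mask = int(coremask, 16)
    # Bit N set in the mask means core N is selected.
    return [bit for bit in range(mask.bit_length()) if (mask >> bit) & 1]


if __name__ == "__main__":
    print(cores_from_mask("0xf"))  # [0, 1, 2, 3]
    print(cores_from_mask("0x3"))  # [0, 1]
```

Counting the returned list gives the number of cores to compare against the number of local IPs and the queues the NIC supports.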