geerlingguy / raspberry-pi-pcie-devices

Raspberry Pi PCI Express device compatibility database
http://pipci.jeffgeerling.com
GNU General Public License v3.0

Test Mellanox ConnectX-3 EN 10 Gigabit Ethernet (CX311A) card #143

Closed geerlingguy closed 3 years ago

geerlingguy commented 3 years ago

I was sent two copies of this card by @albydnc — it's the Mellanox ConnectX-3 EN 10 GbE (CX311A) card, and it's a generation newer than the ConnectX-2 card I was previously testing (with no luck - https://github.com/geerlingguy/raspberry-pi-pcie-devices/issues/21).

[Photo: the Mellanox ConnectX-3 EN (CX311A) card]

Related: https://github.com/geerlingguy/raspberry-pi-pcie-devices/issues/139

Here's a detailed user guide: https://www.mellanox.com/related-docs/user_manuals/ConnectX-3_Ethernet_Dual_SFP+_Port_Adapter_Card_User_Manual.pdf

geerlingguy commented 3 years ago
pi@raspberrypi:~ $ sudo lspci -vvvv

01:00.0 Ethernet controller: Mellanox Technologies MT27500 Family [ConnectX-3]
    Subsystem: Mellanox Technologies MT27500 Family [ConnectX-3]
    Control: I/O- Mem- BusMaster- SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B- DisINTx-
    Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort- >SERR- <PERR- INTx-
    Interrupt: pin A routed to IRQ 0
    Region 0: Memory at 600800000 (64-bit, non-prefetchable) [disabled] [size=1M]
    Region 2: Memory at 600000000 (64-bit, prefetchable) [disabled] [size=8M]
    [virtual] Expansion ROM at 600900000 [disabled] [size=1M]
    Capabilities: [40] Power Management version 3
        Flags: PMEClk- DSI- D1- D2- AuxCurrent=0mA PME(D0-,D1-,D2-,D3hot-,D3cold-)
        Status: D0 NoSoftRst+ PME-Enable- DSel=0 DScale=0 PME-
    Capabilities: [48] Vital Product Data
        Product Name: CX311A - ConnectX-3 SFP+
        Read-only fields:
            [PN] Part number: MCX311A-XCAT_A       
            [EC] Engineering changes: A6
            [SN] Serial number: MT1521K08870            
            [V0] Vendor specific: PCIe Gen3 x4
            [RV] Reserved: checksum good, 0 byte(s) reserved
        Read/write fields:
            [V1] Vendor specific: N/A   
            [YA] Asset tag: N/A                     
            [RW] Read-write area: 109 byte(s) free
            [RW] Read-write area: 253 byte(s) free
            [RW] Read-write area: 253 byte(s) free
            [RW] Read-write area: 253 byte(s) free
            [RW] Read-write area: 253 byte(s) free
            [RW] Read-write area: 253 byte(s) free
            [RW] Read-write area: 253 byte(s) free
            [RW] Read-write area: 253 byte(s) free
            [RW] Read-write area: 253 byte(s) free
            [RW] Read-write area: 253 byte(s) free
            [RW] Read-write area: 253 byte(s) free
            [RW] Read-write area: 253 byte(s) free
            [RW] Read-write area: 253 byte(s) free
            [RW] Read-write area: 253 byte(s) free
            [RW] Read-write area: 253 byte(s) free
            [RW] Read-write area: 252 byte(s) free
        End
    Capabilities: [9c] MSI-X: Enable- Count=128 Masked-
        Vector table: BAR=0 offset=0007c000
        PBA: BAR=0 offset=0007d000
    Capabilities: [60] Express (v2) Endpoint, MSI 00
        DevCap: MaxPayload 256 bytes, PhantFunc 0, Latency L0s <64ns, L1 unlimited
            ExtTag- AttnBtn- AttnInd- PwrInd- RBE+ FLReset- SlotPowerLimit 0.000W
        DevCtl: Report errors: Correctable- Non-Fatal- Fatal- Unsupported-
            RlxdOrd- ExtTag- PhantFunc- AuxPwr- NoSnoop-
            MaxPayload 128 bytes, MaxReadReq 512 bytes
        DevSta: CorrErr- UncorrErr- FatalErr- UnsuppReq- AuxPwr- TransPend-
        LnkCap: Port #8, Speed 8GT/s, Width x4, ASPM L0s, Exit Latency L0s unlimited, L1 unlimited
            ClockPM- Surprise- LLActRep- BwNot- ASPMOptComp+
        LnkCtl: ASPM Disabled; RCB 64 bytes Disabled- CommClk+
            ExtSynch- ClockPM- AutWidDis- BWInt- AutBWInt-
        LnkSta: Speed 5GT/s, Width x1, TrErr- Train- SlotClk+ DLActive- BWMgmt- ABWMgmt-
        DevCap2: Completion Timeout: Range ABCD, TimeoutDis+, LTR-, OBFF Not Supported
        DevCtl2: Completion Timeout: 50us to 50ms, TimeoutDis-, LTR-, OBFF Disabled
        LnkCtl2: Target Link Speed: 8GT/s, EnterCompliance- SpeedDis-
             Transmit Margin: Normal Operating Range, EnterModifiedCompliance- ComplianceSOS-
             Compliance De-emphasis: -6dB
        LnkSta2: Current De-emphasis Level: -3.5dB, EqualizationComplete-, EqualizationPhase1-
             EqualizationPhase2-, EqualizationPhase3-, LinkEqualizationRequest-
    Capabilities: [100 v1] Alternative Routing-ID Interpretation (ARI)
        ARICap: MFVC- ACS-, Next Function: 0
        ARICtl: MFVC- ACS-, Function Group: 0
    Capabilities: [148 v1] Device Serial Number e4-1d-2d-03-00-7f-0a-c0
    Capabilities: [154 v2] Advanced Error Reporting
        UESta:  DLP- SDES- TLP- FCP- CmpltTO- CmpltAbrt- UnxCmplt- RxOF- MalfTLP- ECRC- UnsupReq- ACSViol-
        UEMsk:  DLP- SDES- TLP- FCP- CmpltTO- CmpltAbrt- UnxCmplt- RxOF- MalfTLP- ECRC- UnsupReq- ACSViol-
        UESvrt: DLP+ SDES- TLP- FCP+ CmpltTO- CmpltAbrt- UnxCmplt- RxOF+ MalfTLP+ ECRC- UnsupReq- ACSViol-
        CESta:  RxErr- BadTLP- BadDLLP- Rollover- Timeout- NonFatalErr-
        CEMsk:  RxErr- BadTLP- BadDLLP- Rollover- Timeout- NonFatalErr+
        AERCap: First Error Pointer: 00, GenCap+ CGenEn- ChkCap+ ChkEn-
    Capabilities: [18c v1] #19
geerlingguy commented 3 years ago

I should note that I hacksawed the port on the CM4 IO Board. I have a spare now, so I didn't feel shame in doing it to another board :)

Though I did damage one of the contacts... oops.

geerlingguy commented 3 years ago

BARs assigned:

[    1.206244] brcm-pcie fd500000.pcie: host bridge /scb/pcie@7d500000 ranges:
[    1.206271] brcm-pcie fd500000.pcie:   No bus range found for /scb/pcie@7d500000, using [bus 00-ff]
[    1.206340] brcm-pcie fd500000.pcie:      MEM 0x0600000000..0x063fffffff -> 0x00c0000000
[    1.206417] brcm-pcie fd500000.pcie:   IB MEM 0x0000000000..0x00ffffffff -> 0x0400000000
[    1.303179] brcm-pcie fd500000.pcie: link up, 5.0 GT/s PCIe x1 (SSC)
[    1.303559] brcm-pcie fd500000.pcie: PCI host bridge to bus 0000:00
[    1.303578] pci_bus 0000:00: root bus resource [bus 00-ff]
[    1.303595] pci_bus 0000:00: root bus resource [mem 0x600000000-0x63fffffff] (bus address [0xc0000000-0xffffffff])
[    1.303681] pci 0000:00:00.0: [14e4:2711] type 01 class 0x060400
[    1.303921] pci 0000:00:00.0: PME# supported from D0 D3hot
[    1.307593] pci 0000:00:00.0: bridge configuration invalid ([bus ff-ff]), reconfiguring
[    1.329566] pci 0000:01:00.0: [15b3:1003] type 00 class 0x020000
[    1.329883] pci 0000:01:00.0: reg 0x10: [mem 0x00000000-0x000fffff 64bit]
[    1.330078] pci 0000:01:00.0: reg 0x18: [mem 0x00000000-0x007fffff 64bit pref]
[    1.330378] pci 0000:01:00.0: reg 0x30: [mem 0x00000000-0x000fffff pref]
[    1.331760] pci 0000:01:00.0: 4.000 Gb/s available PCIe bandwidth, limited by 5.0 GT/s PCIe x1 link at 0000:00:00.0 (capable of 31.504 Gb/s with 8.0 GT/s PCIe x4 link)
[    1.344383] pci_bus 0000:01: busn_res: [bus 01-ff] end is updated to 01
[    1.344429] pci 0000:00:00.0: BAR 9: assigned [mem 0x600000000-0x6007fffff 64bit pref]
[    1.344445] pci 0000:00:00.0: BAR 8: assigned [mem 0x600800000-0x6009fffff]
[    1.344466] pci 0000:01:00.0: BAR 2: assigned [mem 0x600000000-0x6007fffff 64bit pref]
[    1.344656] pci 0000:01:00.0: BAR 0: assigned [mem 0x600800000-0x6008fffff 64bit]
[    1.344844] pci 0000:01:00.0: BAR 6: assigned [mem 0x600900000-0x6009fffff pref]
[    1.344859] pci 0000:00:00.0: PCI bridge to [bus 01]
[    1.344880] pci 0000:00:00.0:   bridge window [mem 0x600800000-0x6009fffff]
[    1.344897] pci 0000:00:00.0:   bridge window [mem 0x600000000-0x6007fffff 64bit pref]
geerlingguy commented 3 years ago

Having trouble with Vagrant and VirtualBox today... seeing as I'm receiving an M1 Mac today, that gives me an excuse to translate the build environment into Docker :)

(Build with -j12 took 12m 24s... not bad but 3 minutes slower than the same in VirtualBox. Might have to play around with settings.)

geerlingguy commented 3 years ago

Using my cross-compile environment, I compiled a kernel with the following option via menuconfig:

Device Drivers
  -> Network device support
    -> Ethernet driver support
      -> Mellanox Devices
        -> Mellanox Technologies 1/10/40Gbit Ethernet support
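
(For reference, the equivalent non-interactive change would be something along these lines, run from the kernel source tree before building; the symbols are the upstream mlx4 Kconfig options, not something specific to this thread.)

./scripts/config --module CONFIG_MLX4_CORE   # core ConnectX-3 PCI driver
./scripts/config --module CONFIG_MLX4_EN     # Ethernet (mlx4_en) support
make olddefconfig                            # resolve any remaining dependencies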
albydnc commented 3 years ago

You should try not using the Linux kernel drivers and use Mellanox OFED instead. This is the LTS version and should be able to work on the Pi.
Use the --add-kernel-support flag with mlnxofedinstall in order to compile for your specific kernel version.

geerlingguy commented 3 years ago

Interface is visible:

pi@raspberrypi:~ $ ip a
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN group default qlen 1000
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
    inet 127.0.0.1/8 scope host lo
       valid_lft forever preferred_lft forever
    inet6 ::1/128 scope host 
       valid_lft forever preferred_lft forever
2: eth0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc mq state UP group default qlen 1000
    link/ether b8:27:eb:5c:89:43 brd ff:ff:ff:ff:ff:ff
    inet 10.0.100.119/24 brd 10.0.100.255 scope global dynamic noprefixroute eth0
       valid_lft 86314sec preferred_lft 75514sec
    inet6 fe80::8d92:8149:927d:2ecf/64 scope link 
       valid_lft forever preferred_lft forever
3: wlan0: <BROADCAST,MULTICAST> mtu 1500 qdisc noop state DOWN group default qlen 1000
    link/ether b8:27:eb:74:f2:6c brd ff:ff:ff:ff:ff:ff
4: eth1: <NO-CARRIER,BROADCAST,MULTICAST,UP> mtu 1500 qdisc mq state DOWN group default qlen 1000
    link/ether e4:1d:2d:7f:0a:c0 brd ff:ff:ff:ff:ff:ff

Dmesg logs:

pi@raspberrypi:~ $ dmesg | grep mlx4_en
[   11.980361] mlx4_en: Mellanox ConnectX HCA Ethernet driver v4.0-0
[   11.980652] mlx4_en 0000:01:00.0: Activating port:1
[   11.985781] mlx4_en: 0000:01:00.0: Port 1: Using 4 TX rings
[   11.985792] mlx4_en: 0000:01:00.0: Port 1: Using 4 RX rings
[   11.986130] mlx4_en: 0000:01:00.0: Port 1: Initializing port
[   12.054608] mlx4_en: eth1: Steering Mode 1
geerlingguy commented 3 years ago

Plugged into 1G switch using FLYPRO Fiber 10G SFP+ RJ-45 transceiver:

[Photo: the card connected to a 1G switch via the SFP+ RJ-45 transceiver]

Got a link:

4: eth1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc mq state UP group default qlen 1000
    link/ether e4:1d:2d:7f:0a:c0 brd ff:ff:ff:ff:ff:ff
    inet 169.254.211.135/16 brd 169.254.255.255 scope global noprefixroute eth1
       valid_lft forever preferred_lft forever
    inet6 fe80::37de:ce96:6850:73b1/64 scope link 
       valid_lft forever preferred_lft forever

But it's not picking up an IP address via DHCP. dmesg output:

[  219.416058] mlx4_en: eth1: Link Up
[  219.416222] IPv6: ADDRCONF(NETDEV_CHANGE): eth1: link becomes ready
[  234.975258] ------------[ cut here ]------------
[  234.975305] NETDEV WATCHDOG: eth1 (mlx4_core): transmit queue 1 timed out
[  234.975394] WARNING: CPU: 0 PID: 0 at net/sched/sch_generic.c:443 dev_watchdog+0x3a0/0x3a8
[  234.975402] Modules linked in: cmac aes_arm64 bnep hci_uart btbcm bluetooth ecdh_generic ecc mlx4_en 8021q garp stp llc brcmfmac brcmutil sha256_generic cfg80211 vc4 rfkill raspberrypi_hwmon v3d bcm2835_codec(C) cec bcm2835_isp(C) bcm2835_v4l2(C) bcm2835_mmal_vchiq(C) v4l2_mem2mem gpu_sched drm_kms_helper videobuf2_vmalloc videobuf2_dma_contig videobuf2_memops videobuf2_v4l2 drm videobuf2_common drm_panel_orientation_quirks mlx4_core videodev snd_soc_core mc snd_compress snd_bcm2835(C) snd_pcm_dmaengine vc_sm_cma(C) snd_pcm snd_timer snd syscopyarea sysfillrect sysimgblt fb_sys_fops backlight rpivid_mem uio_pdrv_genirq uio nvmem_rmem i2c_dev ip_tables x_tables ipv6
[  234.975737] CPU: 0 PID: 0 Comm: swapper/0 Tainted: G         C        5.10.39-v8+ #1
[  234.975744] Hardware name: Raspberry Pi Compute Module 4 Rev 1.0 (DT)
[  234.975755] pstate: 80000005 (Nzcv daif -PAN -UAO -TCO BTYPE=--)
[  234.975765] pc : dev_watchdog+0x3a0/0x3a8
[  234.975773] lr : dev_watchdog+0x3a0/0x3a8
[  234.975780] sp : ffffffc0115abd10
[  234.975787] x29: ffffffc0115abd10 x28: ffffff804f773f40 
[  234.975803] x27: 0000000000000004 x26: 0000000000000140 
[  234.975819] x25: 00000000ffffffff x24: 0000000000000000 
[  234.975834] x23: ffffffc011286000 x22: ffffff804f7403dc 
[  234.975849] x21: ffffff804f740000 x20: ffffff804f740480 
[  234.975864] x19: 0000000000000001 x18: 0000000000000000 
[  234.975878] x17: 0000000000000000 x16: 0000000000000000 
[  234.975892] x15: ffffffffffffffff x14: ffffffc011288948 
[  234.975907] x13: ffffffc01146ebd0 x12: ffffffc011315430 
[  234.975921] x11: 0000000000000003 x10: ffffffc0112fd3f0 
[  234.975937] x9 : ffffffc0100e5358 x8 : 0000000000017fe8 
[  234.975952] x7 : c0000000ffffefff x6 : 0000000000000003 
[  234.975966] x5 : 0000000000000000 x4 : 0000000000000000 
[  234.975988] x3 : 0000000000000103 x2 : 0000000000000102 
[  234.976003] x1 : 0ffd7fb07c3b0200 x0 : 0000000000000000 
[  234.976018] Call trace:
[  234.976028]  dev_watchdog+0x3a0/0x3a8
[  234.976044]  call_timer_fn+0x38/0x200
[  234.976054]  run_timer_softirq+0x298/0x548
[  234.976064]  __do_softirq+0x1a8/0x510
[  234.976074]  irq_exit+0xe8/0x108
[  234.976084]  __handle_domain_irq+0xa0/0x110
[  234.976092]  gic_handle_irq+0xb0/0xf0
[  234.976100]  el1_irq+0xc8/0x180
[  234.976113]  arch_cpu_idle+0x18/0x28
[  234.976122]  default_idle_call+0x58/0x1d4
[  234.976133]  do_idle+0x25c/0x270
[  234.976143]  cpu_startup_entry+0x2c/0x70
[  234.976153]  rest_init+0xd8/0xe8
[  234.976163]  arch_call_rest_init+0x18/0x24
[  234.976171]  start_kernel+0x544/0x578
[  234.976178] ---[ end trace 5eeedfc333596788 ]---
[  234.976220] mlx4_en: eth1: TX timeout on queue: 1, QP: 0x209, CQ: 0x85, Cons: 0xffffffff, Prod: 0x2
[  235.040982] mlx4_en: eth1: Steering Mode 1
[  235.050613] mlx4_en: eth1: Link Down
[  235.082513] mlx4_en: eth1: Link Up
[  250.847346] mlx4_en: eth1: TX timeout on queue: 2, QP: 0x20a, CQ: 0x86, Cons: 0xffffffff, Prod: 0x1
[  250.908951] mlx4_en: eth1: Steering Mode 1
[  250.918453] mlx4_en: eth1: Link Down
[  250.955984] mlx4_en: eth1: Link Up
[  280.799556] mlx4_en: eth1: TX timeout on queue: 1, QP: 0x209, CQ: 0x85, Cons: 0xffffffff, Prod: 0x2
[  280.861423] mlx4_en: eth1: Steering Mode 1
[  280.871053] mlx4_en: eth1: Link Down
[  280.908436] mlx4_en: eth1: Link Up
[  311.007773] mlx4_en: eth1: TX timeout on queue: 1, QP: 0x209, CQ: 0x85, Cons: 0xffffffff, Prod: 0x1
[  311.069564] mlx4_en: eth1: Steering Mode 1
[  311.079797] mlx4_en: eth1: Link Down
[  311.116368] mlx4_en: eth1: Link Up
[  326.879904] mlx4_en: eth1: TX timeout on queue: 1, QP: 0x209, CQ: 0x85, Cons: 0xffffffff, Prod: 0x1
[  326.941518] mlx4_en: eth1: Steering Mode 1
[  326.951201] mlx4_en: eth1: Link Down
[  326.988620] mlx4_en: eth1: Link Up
pi@raspberrypi:~ $ uname -a
Linux raspberrypi 5.10.39-v8+ #1 SMP PREEMPT Tue May 25 15:15:52 UTC 2021 aarch64 GNU/Linux
albydnc commented 3 years ago

@geerlingguy this issue is probably because these cards do not support link speeds lower than 10GbE.
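
(A quick way to sanity-check what link speeds the port itself advertises is ethtool; a sketch, assuming the interface name from earlier in this thread and whatever the mlx4_en driver reports.)

sudo ethtool eth1 | grep -A 3 'Supported link modes'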

Doridian commented 3 years ago

> @geerlingguy this issue is probably because these cards do not support link speeds lower than 10GbE.

I have the same issue on my dual-port card, which is actually plugged into a 10GbE link. With tcpdump I can see received packets (broadcasts etc.), but transmitting any packet times out. Tried that with SR fiber and passive DACs; same symptoms on both.

geerlingguy commented 3 years ago

@albydnc - I should note that @Doridian seems to be having a similar issue with a dual-port card in https://github.com/geerlingguy/raspberry-pi-pcie-devices/issues/139 — but not to say 1G shouldn't work. I know the transceiver at least supports 1 Gbps since I used it that way in my MikroTik router for a time. And in the spec sheet for the ConnectX-3, it shows:

Protocol support
  - Data Rate: 1/10Gb/s – Ethernet
geerlingguy commented 3 years ago

/me wonders: what are the chances Nvidia would care to help out with the driver support? Heh... answer so far with all other things involving Nvidia / open source seems to be "zero" :(

Doridian commented 3 years ago

> /me wonders: what are the chances Nvidia would care to help out with the driver support? Heh... answer so far with all other things involving Nvidia / open source seems to be "zero" :(

Well, this time you have one advantage over GPUs: the open-source in-tree drivers are actually pretty good. So maybe ask one of the maintainers of those?

albydnc commented 3 years ago

@Doridian @geerlingguy the in-tree Linux driver is not so reliable... I had problems with ConnectX-5 and -6 cards. In the field, when this happens, the answer is always OFED, since it also has tons of debugging features.

Doridian commented 3 years ago

> @Doridian @geerlingguy the in-tree Linux driver is not so reliable... I had problems with ConnectX-5 and -6 cards. In the field, when this happens, the answer is always OFED, since it also has tons of debugging features.

For 5 and 6 that might be the case. I have a 3 in my server at home and run purely the in-tree driver on it (stock Ubuntu 21.04 kernel). Rock solid for weeks and months, easily reaching full speed on both ports in iperf3 (and SR-IOV also working beautifully).

geerlingguy commented 3 years ago

Note that there is an I2C interface on the card... I'm considering attaching my USB to UART adapter to it and seeing what it outputs.

geerlingguy commented 3 years ago

@albydnc - I'll try with https://www.mellanox.com/products/infiniband-drivers/linux/mlnx_ofed in a bit. I blacklisted the mlx4_core driver in /etc/modprobe.d/.
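
(For reference, a minimal sketch of what that blacklist file contains; the exact filename is an assumption.)

# /etc/modprobe.d/blacklist-mlx4.conf
blacklist mlx4_core
blacklist mlx4_en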

geerlingguy commented 3 years ago
pi@raspberrypi:~/MLNX_OFED_LINUX-4.9-3.1.5.0-debian10.0-aarch64 $ sudo ./mlnxofedinstall --add-kernel-support
Current operation system is not supported (raspbian10)!
Doridian commented 3 years ago
pi@raspberrypi:~/MLNX_OFED_LINUX-4.9-3.1.5.0-debian10.0-aarch64 $ sudo ./mlnxofedinstall --add-kernel-support
Current operation system is not supported (raspbian10)!

I managed to bypass that one with --skip-distro-check, I believe? But then I got hit with weird compile errors on the module itself. When I googled the errors from the log, the answer was just "yeah, your kernel version isn't supported", which I'm not sure is the right answer, but I couldn't find any other.

geerlingguy commented 3 years ago
pi@raspberrypi:~/MLNX_OFED_LINUX-4.9-3.1.5.0-debian10.0-aarch64 $ sudo ./mlnxofedinstall --add-kernel-support --distro debian10.0
/home/pi/MLNX_OFED_LINUX-4.9-3.1.5.0-debian10.0-aarch64/mlnx_add_kernel_support.sh: line 225: rpm: command not found
Note: This program will create MLNX_OFED_LINUX TGZ for debian10.0 under /tmp/MLNX_OFED_LINUX-4.9-3.1.5.0-5.10.39-v8+ directory.
See log file /tmp/MLNX_OFED_LINUX-4.9-3.1.5.0-5.10.39-v8+/mlnx_iso.772_logs/mlnx_ofed_iso.772.log

Checking if all needed packages are installed...
ERROR: 'createrepo' is not installed!
'createrepo' package is needed for creating a repository from MLNX_OFED_LINUX RPMs.
Use '--skip-repo' flag if you are not going to set MLNX_OFED_LINUX as repository for
installation using yum/zypper tools.

Failed to build MLNX_OFED_LINUX for 5.10.39-v8+
geerlingguy commented 3 years ago

Installed createrepo with sudo apt-get install -y createrepo and tried again:

pi@raspberrypi:~/MLNX_OFED_LINUX-4.9-3.1.5.0-debian10.0-aarch64 $ sudo ./mlnxofedinstall --add-kernel-support --distro debian10.0
Note: This program will create MLNX_OFED_LINUX TGZ for debian10.0 under /tmp/MLNX_OFED_LINUX-4.9-3.1.5.0-5.10.39-v8+ directory.
See log file /tmp/MLNX_OFED_LINUX-4.9-3.1.5.0-5.10.39-v8+/mlnx_iso.1566_logs/mlnx_ofed_iso.1566.log

Checking if all needed packages are installed...
Building MLNX_OFED_LINUX RPMS . Please wait...

ERROR: Failed executing "MLNX_OFED_SRC-4.9-3.1.5.0/install.pl --tmpdir /tmp/MLNX_OFED_LINUX-4.9-3.1.5.0-5.10.39-v8+/mlnx_iso.1566_logs --kernel-only --kernel 5.10.39-v8+ --kernel-sources /lib/modules/5.10.39-v8+/build --builddir /tmp/MLNX_OFED_LINUX-4.9-3.1.5.0-5.10.39-v8+/mlnx_iso.1566 --disable-kmp --without-debug-symbols --build-only --distro debian10.0"
ERROR: See /tmp/MLNX_OFED_LINUX-4.9-3.1.5.0-5.10.39-v8+/mlnx_iso.1566_logs/mlnx_ofed_iso.1566.log
Failed to build MLNX_OFED_LINUX for 5.10.39-v8+

That log:

Checking SW Requirements...
One or more required packages for installing OFED-internal are missing.
/lib/modules/5.10.39-v8+/build/scripts is required for the Installation.
Attempting to install the following missing packages:
automake dpatch dh-autoreconf libltdl-dev autoconf bzip2 pkg-config gcc quilt debhelper make chrpath linux-headers-5.10.39-v8+ swig autotools-dev build-essential graphviz m4 dkms
Failed command: apt-get install -y automake dpatch dh-autoreconf libltdl-dev autoconf bzip2 pkg-config gcc quilt debhelper make chrpath linux-headers-5.10.39-v8+ swig autotools-dev build-essential graphviz m4 dkms

Going to re-flash 64-bit Pi OS, update everything, install the pi kernel headers directly, and try again.

geerlingguy commented 3 years ago

Reflashed and ran:

sudo apt-get update
sudo apt-get dist-upgrade -y
sudo apt-get install -y raspberrypi-kernel-headers
sudo reboot

Then after reboot:

sudo apt-get install -y createrepo
sudo ./mlnxofedinstall --add-kernel-support --distro debian10.0

Waiting...

pi@raspberrypi:~/MLNX_OFED_LINUX-4.9-3.1.5.0-debian10.0-aarch64 $ sudo ./mlnxofedinstall --add-kernel-support --distro debian10.0
Note: This program will create MLNX_OFED_LINUX TGZ for debian10.0 under /tmp/MLNX_OFED_LINUX-4.9-3.1.5.0-5.10.17-v8+ directory.
See log file /tmp/MLNX_OFED_LINUX-4.9-3.1.5.0-5.10.17-v8+/mlnx_iso.1164_logs/mlnx_ofed_iso.1164.log

Checking if all needed packages are installed...
Building MLNX_OFED_LINUX DEBS . Please wait...

ERROR: Failed executing "MLNX_OFED_SRC-4.9-3.1.5.0/install.pl --tmpdir /tmp/MLNX_OFED_LINUX-4.9-3.1.5.0-5.10.17-v8+/mlnx_iso.1164_logs --kernel-only --kernel 5.10.17-v8+ --kernel-sources /lib/modules/5.10.17-v8+/build --builddir /tmp/MLNX_OFED_LINUX-4.9-3.1.5.0-5.10.17-v8+/mlnx_iso.1164 --without-dkms --without-debug-symbols --build-only --distro debian10.0"
ERROR: See /tmp/MLNX_OFED_LINUX-4.9-3.1.5.0-5.10.17-v8+/mlnx_iso.1164_logs/mlnx_ofed_iso.1164.log
Failed to build MLNX_OFED_LINUX for 5.10.17-v8+
geerlingguy commented 3 years ago

Log contents:

pi@raspberrypi:~/MLNX_OFED_LINUX-4.9-3.1.5.0-debian10.0-aarch64 $ cat /tmp/MLNX_OFED_LINUX-4.9-3.1.5.0-5.10.17-v8+/mlnx_iso.1164_logs/mlnx_ofed_iso.1164.log
Logs dir: /tmp/MLNX_OFED_LINUX-4.9-3.1.5.0-5.10.17-v8+/mlnx_iso.1164_logs/OFED.1377.logs
General log file: /tmp/MLNX_OFED_LINUX-4.9-3.1.5.0-5.10.17-v8+/mlnx_iso.1164_logs/OFED.1377.logs/general.log

Below is the list of OFED packages that you have chosen
(some may have been added by the installer due to package dependencies):

ofed-scripts
mlnx-ofed-kernel-utils
mlnx-ofed-kernel-modules
rshim-modules
iser-modules
isert-modules
srp-modules
mlnx-nvme-modules
kernel-mft-modules
knem-modules

Checking SW Requirements...
One or more required packages for installing OFED-internal are missing.
Attempting to install the following missing packages:
pkg-config swig dh-autoreconf libltdl-dev chrpath autotools-dev build-essential dpatch bzip2 automake make debhelper autoconf gcc lsof m4 graphviz quilt
This program will install the OFED package on your machine.
Note that all other Mellanox, OEM, OFED, RDMA or Distribution IB packages will be removed.
Those packages are removed due to conflicts with OFED, do not reinstall them.

Installing new packages
Building DEB for ofed-scripts-4.9 (ofed-scripts)...
Running  /usr/bin/dpkg-buildpackage -us -uc 
Building DEB for mlnx-ofed-kernel-utils-4.9 (mlnx-ofed-kernel)...
Running  /usr/bin/dpkg-buildpackage -us -uc 
Failed to build mlnx-ofed-kernel DEB
Collecting debug info...
See /tmp/MLNX_OFED_LINUX-4.9-3.1.5.0-5.10.17-v8+/mlnx_iso.1164_logs/OFED.1377.logs/mlnx-ofed-kernel.debbuild.log

And that log has in it:

/usr/src/linux-headers-5.10.17-v8+
checking for Linux sources... /usr/src/linux-headers-5.10.17-v8+
checking for /usr/src/linux-headers-5.10.17-v8+... yes
checking for Linux objects dir... /usr/src/linux-headers-5.10.17-v8+
checking for /boot/kernel.h... no
checking for /var/adm/running-kernel.h... no
checking for /usr/src/linux-headers-5.10.17-v8+/.config... yes
checking for /usr/src/linux-headers-5.10.17-v8+/include/generated/autoconf.h... yes
checking for /usr/src/linux-headers-5.10.17-v8+/include/linux/kconfig.h... yes
checking for build ARCH... ARCH=, SRCARCH=arm64
checking for cross compilation... no
checking for external module build target... configure: error: kernel module make failed; check config.log for details

Failed executing ./configure
make[1]: *** [debian/rules:62: override_dh_auto_configure] Error 1
make[1]: Leaving directory '/tmp/MLNX_OFED_LINUX-4.9-3.1.5.0-5.10.17-v8+/mlnx_iso.1164/mlnx-ofed-kernel/mlnx-ofed-kernel-4.9'
make: *** [debian/rules:50: build] Error 2
dpkg-buildpackage: error: debian/rules build subprocess returned exit status 2
geerlingguy commented 3 years ago

As of MLNX_OFED version v5.1-0.6.6.0, the following are no longer supported.

  • ConnectX-3
  • ConnectX-3 Pro
  • Connect-IB
  • RDMA experimental verbs libraries (mlnx_lib)

Users who wish to utilize the above devices/libraries are advised to refer to MLNX_OFED 4.9 long-term support (LTS) version.

I think you mentioned that... but does the 4.9 LTS version only work with Linux kernels < 5.x?

It looks like 5.1-0.6.6.0 is the first version to drop ConnectX-3 support, whereas 5.0-2.1.8.0 still has it. Going to try that version instead, downloading the tgz from https://www.mellanox.com/products/infiniband-drivers/linux/mlnx_ofed.
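
(The fetch-and-extract step looks roughly like this; the tarball name is inferred from the directory shown in the next comment, and the actual download URL sits behind Mellanox's site so it isn't reproduced here.)

tar xzf MLNX_OFED_LINUX-5.0-2.1.8.0-debian10.0-aarch64.tgz
cd MLNX_OFED_LINUX-5.0-2.1.8.0-debian10.0-aarch64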

geerlingguy commented 3 years ago

Next attempt with newer version 5.0-2.1.8.0:

pi@raspberrypi:~/MLNX_OFED_LINUX-5.0-2.1.8.0-debian10.0-aarch64 $ sudo ./mlnxofedinstall --add-kernel-support --distro debian10.0
Note: This program will create MLNX_OFED_LINUX TGZ for debian10.0 under /tmp/MLNX_OFED_LINUX-5.0-2.1.8.0-5.10.17-v8+ directory.
See log file /tmp/MLNX_OFED_LINUX-5.0-2.1.8.0-5.10.17-v8+/mlnx_iso.7527_logs/mlnx_ofed_iso.7527.log

Checking if all needed packages are installed...
Building MLNX_OFED_LINUX DEBS . Please wait...

ERROR: Failed executing "MLNX_OFED_SRC-5.0-2.1.8.0/install.pl --tmpdir /tmp/MLNX_OFED_LINUX-5.0-2.1.8.0-5.10.17-v8+/mlnx_iso.7527_logs --kernel-only --kernel 5.10.17-v8+ --kernel-sources /lib/modules/5.10.17-v8+/build --builddir /tmp/MLNX_OFED_LINUX-5.0-2.1.8.0-5.10.17-v8+/mlnx_iso.7527 --without-dkms --without-debug-symbols --build-only --distro debian10.0"
ERROR: See /tmp/MLNX_OFED_LINUX-5.0-2.1.8.0-5.10.17-v8+/mlnx_iso.7527_logs/mlnx_ofed_iso.7527.log
Failed to build MLNX_OFED_LINUX for 5.10.17-v8+

Same issue in that log:

checking for build ARCH... ARCH=, SRCARCH=arm64
checking for cross compilation... no
checking for external module build target... configure: error: kernel module make failed; check config.log for details

Failed executing ./configure
make[1]: *** [debian/rules:62: override_dh_auto_configure] Error 1
make[1]: Leaving directory '/tmp/MLNX_OFED_LINUX-5.0-2.1.8.0-5.10.17-v8+/mlnx_iso.7527/mlnx-ofed-kernel/mlnx-ofed-kernel-5.0'
make: *** [debian/rules:50: build] Error 2
dpkg-buildpackage: error: debian/rules build subprocess returned exit status 2
elmeyer commented 3 years ago

checking for external module build target... configure: error: kernel module make failed; check config.log for details

What does config.log say?

geerlingguy commented 3 years ago

@elmeyer - I couldn't find a config.log anywhere :(

elmeyer commented 3 years ago

Ah, shame. Perhaps it's bothered by the unset ARCH, since that's what it prints right before it fails? What happens if you set it to the same as SRCARCH, i.e. arm64?
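
(A sketch of that suggestion in practice; whether mlnxofedinstall and its build scripts honour an exported ARCH is an assumption.)

export ARCH=arm64
sudo -E ./mlnxofedinstall --add-kernel-support --distro debian10.0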

albydnc commented 3 years ago

@geerlingguy According to https://docs.mellanox.com/display/MLNXOFEDv492260/General+Support+in+MLNX_OFED, 4.9 supports kernel 5.4 on Ubuntu 20.04; you can also install all the prerequisites listed on that page.

albydnc commented 3 years ago

I will try to compile it on CentOS 8 Stream for Raspberry Pi; maybe it's more compatible.

albydnc commented 3 years ago

A little report on what I was able to do. I have tried the compilation on many OSes; here are the results:

  • CentOS 7: no OFED version available
  • CentOS 8: issue with missing kernel-headers
  • Debian 10: mlnx-ofed-kernel-dkms failed to compile, similar issue to Raspbian
  • Ubuntu Server 20.04.2 LTS 64-bit: success!!!

I was able to run the OFED 4.9 LTS installation script with a simple sudo ./mlnxofedinstall, and it took ages to compile the kernel modules via DKMS, like 3 hours on the Pi 4 4 GB. Since I don't have a CM4 board with Mellanox cards, it is now up to you to test it. If you like, I can share an .img file of my SD card.

Doridian commented 3 years ago

> A little report on what I was able to do. I have tried the compilation on many OSes; here are the results:
>
>   • CentOS 7: no OFED version available
>   • CentOS 8: issue with missing kernel-headers
>   • Debian 10: mlnx-ofed-kernel-dkms failed to compile, similar issue to Raspbian
>   • Ubuntu Server 20.04.2 LTS 64-bit: success!!!
>
> I was able to run the OFED 4.9 LTS installation script with a simple sudo ./mlnxofedinstall, and it took ages to compile the kernel modules via DKMS, like 3 hours on the Pi 4 4 GB. Since I don't have a CM4 board with Mellanox cards, it is now up to you to test it. If you like, I can share an .img file of my SD card.

I will gladly take the .img file and see if it works. I doubt it will, since I assume the driver issue is "something the driver does that the RPi's PCIe doesn't like", but it's just a single flash, so it shouldn't take long.

albydnc commented 3 years ago

I will post it tomorrow, since right now I don't have a good internet connection and it would take the whole week to upload.

albydnc commented 3 years ago

@Doridian here you can find it. I shrunk it with PiShrink, so it should automatically resize the root partition. https://drive.google.com/file/d/1phLx6rIjUNvNPQbLEu0ZRHznssP-8pi1/view?usp=sharing

Doridian commented 3 years ago

> @Doridian here you can find it. I shrunk it with PiShrink, so it should automatically resize the root partition. https://drive.google.com/file/d/1phLx6rIjUNvNPQbLEu0ZRHznssP-8pi1/view?usp=sharing

I got it booted, but I can't log in with either pi/raspberry or ubuntu/ubuntu. What is the username/password for this image to SSH in? (I can probably change it in a hacky way, but that is a bit annoying to do.)
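
(One hedged way to do the hacky workaround: loop-mount the image's root partition and drop in an SSH key. The device name, partition number, and image filename are all assumptions here.)

sudo losetup -fP --show ofed-pi.img            # prints e.g. /dev/loop0
sudo mount /dev/loop0p2 /mnt                   # root partition (assumed to be p2)
sudo mkdir -p /mnt/home/ubuntu/.ssh
sudo cp ~/.ssh/id_rsa.pub /mnt/home/ubuntu/.ssh/authorized_keys
sudo umount /mnt && sudo losetup -d /dev/loop0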

Doridian commented 3 years ago

Well, it turns out that isn't even necessary. With the image you provided, my RPi CM4 won't even boot with the ConnectX-3 installed. All I get is a flood of errors (when it tries to load the driver, I assume; see the attached photo). Verified with my own image again, and I am back to the old behaviour (links detected, RX packets work, TX packets do not).

geerlingguy commented 3 years ago

Marking this as done... can't find any way to get the thing working, unfortunately.

justinclift commented 1 year ago

Just stumbled over this issue report, which is interesting given my work (years ago) doing a bunch of sysadmin stuff with Mellanox cards on Linux.

Looking at the lspci output near the top of this issue, it has:

  LnkCap:  Port #8, Speed 8GT/s, Width x4, ...

and:

  LnkSta:  Speed 5GT/s, Width x1, ...

The LnkCap output means "Link Capabilities", i.e. what this device is theoretically capable of doing over a PCIe connection. 8GT/s means PCIe v3, and Width x4 means four PCIe lanes at that speed.

The LnkSta output means Link Status (currently), i.e. what the PCIe connection actually managed to negotiate. 5GT/s means PCIe v2, and Width x1 means a single PCIe lane.

From memory, Mellanox cards require a link width of 4 or above (i.e. Width x4) to function, regardless of speed. More than 4 would increase the available bandwidth too (e.g. x8, x16).

It was fairly common to see a Mellanox card plugged into a PCIe slot that was running at (say) x4 instead of x8/x16, and the card wouldn't have the expected throughput. So, rearranging things to put the card in an x8 (or x16) slot would fix the problem. Or adjusting PCIe lane splitting in the BIOS to allocate more lanes to the slot (when possible).

Anyway, if there's a way to get 4 lanes allocated to the PCIe slot in question, you'd probably be good with these cards. Without that though, yeah... no joy.
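
(For anyone following along, the negotiated link can be checked with something like the following; the bus address is the one from the lspci output near the top of this issue.)

sudo lspci -s 01:00.0 -vv | grep -E 'LnkCap:|LnkSta:'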

darshankowlaser commented 8 months ago

Has anyone tried to get this working with a Raspberry Pi 5?

justinclift commented 8 months ago

Doesn't the RPi 5 only provide a single PCIe lane, i.e. x1?

Mellanox cards require (or at least used to require) a minimum of 4 lanes to function, so it's super unlikely to work. Would be happy to be proven wrong. :smile:

geerlingguy commented 8 months ago

@justinclift - A number of ConnectX cards 'work'... though to various degrees. I've only tested them on CM4 so far, not yet on Pi 5.

justinclift commented 8 months ago

@geerlingguy When you say 'work' like that, are you meaning they pass traffic? Or are you meaning they just show up on the PCIe bus?

geerlingguy commented 8 months ago

@justinclift - The driver loads, the card seems to be functional (and ports light up), but when you try sending traffic you get errors like:

NETDEV WATCHDOG: eth1 (mlx4_core): transmit queue 1 timed out

The problem is that these Mellanox cards are ancient history and have been unsupported for so long... I don't have the funds to buy newer cards, so I'm waiting for prices to come down a bit before I re-test.

I currently use AQtion AQC107-based cards for Pis, as they seem to work nicely out of the box with the kernel driver and don't care about the x1 link width.

justinclift commented 8 months ago

> The problem is that these Mellanox cards are ancient history ...

Are you meaning the ConnectX-3 series that eBay has been awash with for years? There are heaps of those around because they're what's been used in industry for a very long time (HPC, storage networks, etc).

Prior to 10GbE (finally) becoming widespread and cheap enough for home labs in recent years, they were the only game in town.


The ConnectX-4 and above (5, 6, 7 series) are all still officially supported:

https://docs.nvidia.com/networking/display/mlnxenv23070500/release+notes

Looks like some of those series are now available on eBay fairly cheaply too.

Note: I don't know any of those sellers at all; that was just from some quick searching on eBay.


Searching for "AQtion AQC107" on eBay shows a few 10GbE cards at US$75-100. I'm not familiar with them though, so I don't know of better search terms to find better-priced options. They probably exist. :smile:

Those cards seem to have 10GBASE-T connectors too, so cabling would be much simpler to figure out than with the Mellanox cards. :grin:

disablewong commented 5 months ago

em......

> Just stumbled over this issue report, which is interesting given my work (years ago) doing a bunch of sysadmin stuff with Mellanox cards on Linux.
>
> Looking at the lspci output near the top of this issue, it has LnkCap: Speed 8GT/s, Width x4 and LnkSta: Speed 5GT/s, Width x1. [...]
>
> From memory, Mellanox cards require a link width of 4 or above (i.e. Width x4) to function, regardless of speed. [...]
>
> Anyway, if there's a way to get 4 lanes allocated to the PCIe slot in question, you'd probably be good with these cards. Without that though, yeah... no joy.

Not quite Raspberry Pi, but it's funny that I've got my CX341A working with a Radxa Rock 5B with only 2 PCIe lanes... however, my current problem is that it only works at 5GT/s maximum...

justinclift commented 5 months ago

Oh, that's super interesting. When you say it's working though, are you meaning it's just showing up on the PCIe bus, or is it passing data over the network like it should (albeit slowly)?

disablewong commented 5 months ago

> Oh, that's super interesting. When you say it's working though, are you meaning it's just showing up on the PCIe bus, or is it passing data over the network like it should (albeit slowly)?

Um... actually, I've been using this card for around 1 year with a NanoPi NEO4 (RK3399 platform) and recently with a Rock 5B (RK3588 platform).

On the NEO4 I was able to get around 300-500 MB/s read and write as a Samba server. On the Rock 5B it is around 700 MB/s (limited to PCIe Gen2 x2 lanes after PCIe bifurcation). With all x4 lanes I can get iperf3 to 9.7 Gbps, but then I'm limited by the SATA SSD at 500 MB/s...

You can find the discussion of my current setup and problem here: https://forum.radxa.com/t/pcie-link-downgraded-with-10g-nic/19680/30

disablewong commented 5 months ago

In addition, I got two of these CX341A NICs and both of them are working. Other working cards include AQtion, HP (be2net), and Intel 82599 & X520.

justinclift commented 5 months ago

Cool, that's really good info @disablewong. :smile:

disablewong commented 5 months ago

> Cool, that's really good info @disablewong. 😄

Just a quick update: I am now able to use the full 10 Gbps with the CX4411A and RK3588! :)

[ ID] Interval           Transfer     Bitrate         Retr
[  5]   0.00-10.00  sec  1.19 GBytes  1.02 Gbits/sec    0   sender
[  5]   0.00-10.00  sec  1.19 GBytes  1.02 Gbits/sec        receiver
[  7]   0.00-10.00  sec   560 MBytes   470 Mbits/sec    0   sender
[  7]   0.00-10.00  sec   559 MBytes   469 Mbits/sec        receiver
[  9]   0.00-10.00  sec   864 MBytes   725 Mbits/sec    0   sender
[  9]   0.00-10.00  sec   864 MBytes   725 Mbits/sec        receiver
[ 11]   0.00-10.00  sec  1.30 GBytes  1.12 Gbits/sec    0   sender
[ 11]   0.00-10.00  sec  1.30 GBytes  1.12 Gbits/sec        receiver
[ 13]   0.00-10.00  sec   884 MBytes   741 Mbits/sec    0   sender
[ 13]   0.00-10.00  sec   883 MBytes   741 Mbits/sec        receiver
[ 15]   0.00-10.00  sec   708 MBytes   594 Mbits/sec    0   sender
[ 15]   0.00-10.00  sec   707 MBytes   593 Mbits/sec        receiver
[ 17]   0.00-10.00  sec   960 MBytes   805 Mbits/sec    0   sender
[ 17]   0.00-10.00  sec   959 MBytes   805 Mbits/sec        receiver
[ 19]   0.00-10.00  sec   497 MBytes   417 Mbits/sec    0   sender
[ 19]   0.00-10.00  sec   497 MBytes   417 Mbits/sec        receiver
[ 21]   0.00-10.00  sec   707 MBytes   593 Mbits/sec    0   sender
[ 21]   0.00-10.00  sec   706 MBytes   593 Mbits/sec        receiver
[ 23]   0.00-10.00  sec   746 MBytes   626 Mbits/sec    0   sender
[ 23]   0.00-10.00  sec   745 MBytes   625 Mbits/sec        receiver
[ 25]   0.00-10.00  sec   539 MBytes   452 Mbits/sec    0   sender
[ 25]   0.00-10.00  sec   538 MBytes   452 Mbits/sec        receiver
[ 27]   0.00-10.00  sec   877 MBytes   735 Mbits/sec    0   sender
[ 27]   0.00-10.00  sec   876 MBytes   735 Mbits/sec        receiver
[ 29]   0.00-10.00  sec  1.11 GBytes   954 Mbits/sec    0   sender
[ 29]   0.00-10.00  sec  1.11 GBytes   953 Mbits/sec        receiver
[ 31]   0.00-10.00  sec   596 MBytes   500 Mbits/sec    0   sender
[ 31]   0.00-10.00  sec   595 MBytes   499 Mbits/sec        receiver
[SUM]   0.00-10.00  sec  11.4 GBytes  9.75 Gbits/sec    0   sender
[SUM]   0.00-10.00  sec  11.3 GBytes  9.75 Gbits/sec        receiver
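
(For context, a result with that many stream IDs usually comes from a parallel iperf3 run along these lines; the server address here is an assumption, not taken from the post above.)

iperf3 -c 192.168.1.100 -P 14 -t 10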