geerlingguy / arm-nas

Arm NAS configuration with ZFS.
GNU General Public License v3.0

Upgrade to 25 Gbps Ethernet #16

Closed geerlingguy closed 3 weeks ago

geerlingguy commented 3 weeks ago

I purchased a PCIe Gen 4 SFP28 NIC based on the Intel E810-XXVAM2 on Amazon, and would like to install it in the server to get dual 25 Gbps Ethernet on the NAS.

Some of my other gear is starting to come online at 25G, and it would be nice to have a storage target capable of saturating the network!

Intel has a driver download page here: Intel® Network Adapter Driver for E810 Series Devices under Linux*

geerlingguy commented 3 weeks ago

Interestingly, the SOL console in the BMC is spitting errors like:

[   11.232990] Unable to handle kernel paging request at virtual address 0021a817ce8721ad
[   11.240897] Mem abort info:
[   11.243680]   ESR = 0x96000004
[   11.246733]   EC = 0x25: DABT (current EL), IL = 32 bits
[   11.252039]   SET = 0, FnV = 0
[   11.255082]   EA = 0, S1PTW = 0
[   11.258212] Data abort info:
[   11.261080]   ISV = 0, ISS = 0x00000004
[   11.264905]   CM = 0, WnR = 0
[   11.267862] [0021a817ce8721ad] address between user and kernel address ranges
[   11.274986] Internal error: Oops: 96000004 [#1] SMP
[   11.279852] Modules linked in: ast(+) drm_vram_helper ttm drm_kms_helper crct10dif_ce syscopyarea ghash_ce sysfillrect sysimgblt sha2_ce fb_sys_fops sha256_arm64 sha1_ce mpt3sas(+) drm nvme(+) ixgbe(+) raid_class igb(+) ice(+) nvme_core xfrm_algo scsi_transport_sas mdio i2c_algo_bit aes_neon_bs aes_neon_blk aes_ce_blk crypto_simd cryptd aes_ce_cipher
[   11.310860] CPU: 0 PID: 205 Comm: kworker/0:2 Not tainted 5.4.0-198-generic #218-Ubuntu
[   11.318850] Hardware name: To Be Filled By O.E.M. ALTRAD8UD-1L2T/ALTRAD8UD-1L2T, BIOS 1.21 11/15/2023
...
[   11.427088] Call trace:
[   11.429522]  __kmalloc+0xac/0x2d0
[   11.432825]  rh_call_control+0x210/0x938
[   11.436735]  usb_hcd_submit_urb+0x14c/0x3e8
[   11.440906]  usb_submit_urb+0x198/0x590
[   11.444730]  usb_start_wait_urb+0x70/0x160
[   11.448814]  usb_control_msg+0xc4/0x140

This seems to happen after the ASPEED USB port tries to initialize?

Another trace:

[   11.979019] ice 0004:01:00.0: The DDP package was successfully loaded: ICE OS Default Package version 1.3.4.0
[   11.989174] Unable to handle kernel paging request at virtual address 0021a817ce8721ad
[   11.997080] Mem abort info:
[   11.999862]   ESR = 0x96000004
[   12.002906]   EC = 0x25: DABT (current EL), IL = 32 bits
[   12.008206]   SET = 0, FnV = 0
[   12.011250]   EA = 0, S1PTW = 0
[   12.014379] Data abort info:
[   12.017247]   ISV = 0, ISS = 0x00000004
[   12.021072]   CM = 0, WnR = 0
[   12.024028] [0021a817ce8721ad] address between user and kernel address ranges
[   12.031152] Internal error: Oops: 96000004 [#2] SMP
[   12.036016] Modules linked in: hid_generic usbhid hid ast(+) drm_vram_helper ttm drm_kms_helper crct10dif_ce syscopyarea ghash_ce sysfillrect sysimgblt sha2_ce fb_sys_fops sha256_arm64 sha1_ce mpt3sas(+) drm nvme(+) ixgbe(+) raid_class igb(+) ice(+) nvme_core xfrm_algo scsi_transport_sas mdio i2c_algo_bit aes_neon_bs aes_neon_blk aes_ce_blk crypto_simd cryptd aes_ce_cipher
[   12.069019] CPU: 0 PID: 13 Comm: kworker/0:1 Tainted: G      D           5.4.0-198-generic #218-Ubuntu
[   12.078311] Hardware name: To Be Filled By O.E.M. ALTRAD8UD-1L2T/ALTRAD8UD-1L2T, BIOS 1.21 11/15/2023
[   12.087518] Workqueue: events work_for_cpu_fn
[   12.091862] pstate: a0c00009 (NzCv daif +PAN +UAO)
[   12.096641] pc : kmem_cache_alloc_trace+0x94/0x278
[   12.101418] lr : kmem_cache_alloc_trace+0x6c/0x278
[   12.106195] sp : ffff800010253b20
[   12.109497] x29: ffff800010253b20 x28: 0000000000000000 
[   12.114796] x27: ffffaebc2bed51cc x26: 0000000000000068 
[   12.120094] x25: ffff680e28007c00 x24: ffffaebc2bed51cc 
[   12.125393] x23: 0000000000028a97 x22: 0000000000000dc0 
[   12.130691] x21: 0000000000000000 x20: ae21a817ce8721ad 
[   12.135990] x19: ffff680e28007c00 x18: ffffaebc2d108538 
[   12.141288] x17: 0000000088e09f7b x16: ffffaebc2c49baf0 
[   12.146586] x15: ffff680e28688530 x14: ffff800010c9f000 
[   12.151885] x13: ffff680e28a0fe00 x12: ffff800010bb5000 
[   12.157183] x11: ffffaebc2d8f43a0 x10: ffff800010bb0000 
[   12.162482] x9 : 0000000000000041 x8 : 0000000000004000 
[   12.167780] x7 : ffffaebc2ddf2818 x6 : ffff680e2815b428 
[   12.173079] x5 : ffffaebc2c460670 x4 : ffff680e2f9f91e0 
[   12.178377] x3 : 0000000000100070 x2 : ae21a817ce8721ad 
[   12.183676] x1 : 0000000000000000 x0 : 5197a916cf8535d2 
[   12.188974] Call trace:
[   12.191408]  kmem_cache_alloc_trace+0x94/0x278
[   12.195840]  alloc_msi_entry+0x3c/0x98
[   12.199578]  __pci_enable_msix_range.part.0+0x3a4/0x5b0
[   12.204790]  __pci_enable_msix_range+0x64/0x90
[   12.209221]  pci_enable_msix_range+0x48/0x58
[   12.213487]  ice_probe+0x6a4/0xc68 [ice]
[   12.217398]  local_pci_probe+0x48/0xa0
[   12.221135]  work_for_cpu_fn+0x24/0x38
[   12.224871]  process_one_work+0x1d0/0x498
[   12.228868]  worker_thread+0x238/0x528
[   12.232604]  kthread+0xf0/0x118
[   12.235733]  ret_from_fork+0x10/0x18
[   12.239296] Code: 54000e20 b9402261 f940ba60 8b010282 (f8616a81) 
[   12.245377] ---[ end trace 4029d97195803760 ]---

And then the system won't continue booting.
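
A quick way to see which loadable module an oops points at is to pull the `[module]` tags off the call-trace frames. This is only a rough sketch (the `modules_in_trace` name is made up, not part of any tool), and the result is a hint at best: both oopses above fault at the same address inside the slab allocator, so the driver named in a trace may just be whoever allocated memory next.

```shell
# List the loadable modules named in kernel call-trace frames read from stdin.
# Frames that belong to a module end in "[module]"; other frames carry no tag.
# modules_in_trace is a hypothetical helper name used for illustration.
modules_in_trace() {
  sed -nE 's#.*\+0x[0-9a-f]+/0x[0-9a-f]+ \[([A-Za-z0-9_]+)\].*#\1#p' | sort -u
}
```

Usage would be something like `dmesg | modules_in_trace`. Fed only the second trace above, it prints `ice`; the frames in the first trace carry no module tag, so it prints nothing for that one.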

geerlingguy commented 3 weeks ago

I think I'm running Ubuntu 20.04 on the HL15... it might be worth attempting an upgrade to 24.04 :O

Otherwise, maybe I can manually install a newer Intel driver?

geerlingguy commented 3 weeks ago

On Ampere's recommendation, I'm going to try a ConnectX-5 Mellanox card, the MCX512A-ACAT, instead.

Now I have a spare E810, ready to go into one of my Windows PCs :)

geerlingguy commented 3 weeks ago

I have the ConnectX-5 installed, and it seems to enumerate correctly:

jgeerling@nas01:~$ dmesg | grep mlx5
[   10.642917] mlx5_core 0004:01:00.0: Adding to iommu group 29
[   10.643196] mlx5_core 0004:01:00.0: enabling device (0100 -> 0102)
[   10.643316] mlx5_core 0004:01:00.0: firmware version: 16.27.2048
[   10.643346] mlx5_core 0004:01:00.0: 63.008 Gb/s available PCIe bandwidth (8.0 GT/s PCIe x8 link)
[   11.071754] mlx5_core 0004:01:00.0: Rate limit: 127 rates are supported, range: 0Mbps to 24414Mbps
[   11.085231] mlx5_core 0004:01:00.0: E-Switch: Total vports 10, per vport: max uc(1024) max mc(16384)
[   11.097374] mlx5_core 0004:01:00.0: Port module event: module 0, Cable plugged
[   11.097626] mlx5_core 0004:01:00.0: mlx5_pcie_event:294:(pid 542): PCIe slot advertised sufficient power (75W).
[   11.108877] mlx5_core 0004:01:00.1: Adding to iommu group 31
[   11.121246] mlx5_core 0004:01:00.1: enabling device (0100 -> 0102)
[   11.138453] mlx5_core 0004:01:00.1: firmware version: 16.27.2048
[   11.144495] mlx5_core 0004:01:00.1: 63.008 Gb/s available PCIe bandwidth (8.0 GT/s PCIe x8 link)
[   11.451304] mlx5_core 0004:01:00.1: Rate limit: 127 rates are supported, range: 0Mbps to 24414Mbps
[   11.460281] mlx5_core 0004:01:00.1: E-Switch: Total vports 10, per vport: max uc(1024) max mc(16384)
[   11.484789] mlx5_core 0004:01:00.1: Port module event: module 1, Cable unplugged
[   11.492484] mlx5_core 0004:01:00.1: mlx5_pcie_event:294:(pid 545): PCIe slot advertised sufficient power (75W).
[   11.516633] mlx5_core 0004:01:00.0: MLX5E: StrdRq(1) RqSz(8) StrdSz(2048) RxCqeCmprss(0)
[   11.787574] mlx5_core 0004:01:00.1: MLX5E: StrdRq(1) RqSz(8) StrdSz(2048) RxCqeCmprss(0)
[   54.125946] mlx5_ib: Mellanox Connect-IB Infiniband driver v5.0-0
[   54.145352] mlx5_core 0004:01:00.0 enP4p1s0f0: renamed from eth0
[   54.234030] mlx5_core 0004:01:00.1 enP4p1s0f1: renamed from eth1

It's not getting an IP address automatically, though. Not sure why.
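
Side note on the `63.008 Gb/s available PCIe bandwidth` line in that dmesg output: the number is consistent with PCIe Gen 3 math, i.e. 8.0 GT/s per lane, 128b/130b encoding, 8 lanes. The sketch below reproduces it; `pcie_mbps` is a hypothetical helper, and the integer truncation per lane is my assumption about how the kernel arrives at exactly 63008.

```shell
# Usable PCIe bandwidth in Mb/s for links using 128b/130b encoding (Gen 3 and up).
# $1 = transfer rate in GT/s (integer), $2 = lane count.
# pcie_mbps is a hypothetical helper; shell arithmetic truncates per lane.
pcie_mbps() {
  gts=$1
  lanes=$2
  per_lane=$(( gts * 1000 * 128 / 130 ))   # 8 GT/s gives 7876 Mb/s per lane
  echo $(( per_lane * lanes ))
}

pcie_mbps 8 8   # prints 63008
```

On paper that leaves some headroom over the 50 Gb/s that two 25G ports could push at once.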

geerlingguy commented 3 weeks ago

Not detecting a link...

jgeerling@nas01:~$ ethtool enP4p1s0f1
Settings for enP4p1s0f1:
    Supported ports: [ Backplane ]
    Supported link modes:   1000baseKX/Full 
                            10000baseKR/Full 
                            25000baseCR/Full 
                            25000baseKR/Full 
                            25000baseSR/Full 
    Supported pause frame use: Symmetric
    Supports auto-negotiation: Yes
    Supported FEC modes: None BaseR RS
    Advertised link modes:  1000baseKX/Full 
                            10000baseKR/Full 
                            25000baseCR/Full 
                            25000baseKR/Full 
                            25000baseSR/Full 
    Advertised pause frame use: Symmetric
    Advertised auto-negotiation: Yes
    Advertised FEC modes: None
    Speed: Unknown!
    Duplex: Unknown! (255)
    Port: Direct Attach Copper
    PHYAD: 0
    Transceiver: internal
    Auto-negotiation: on
Cannot get wake-on-lan settings: Operation not permitted
    Current message level: 0x00000004 (4)
                   link
    Link detected: no

I'm using a 10Gtek 25G SFP28 DAC - 3m, 30AWG, Passive... I wonder if this DAC isn't able to work with the card? Weird.

When I plug in the DAC, I see these changes in the ethtool output:

Supported FEC modes: None BaseR RS  # was 'Not Reported'
Advertised FEC modes: None  # was 'Not Reported'
Port: Direct Attach Copper  # was 'Other'

But it still says Link detected: no.
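
To make before/after comparisons like this less of an eyeball exercise, the handful of fields that matter for link bring-up can be filtered out of the `ethtool` output. A minimal sketch, assuming the output format shown above; `link_summary` is a made-up helper name.

```shell
# Keep only the ethtool fields relevant to link bring-up, with indentation removed.
# Reads `ethtool <iface>` output on stdin. link_summary is a hypothetical helper.
link_summary() {
  awk '
    /^[ \t]*(Speed|Port|Advertised FEC modes|Link detected):/ {
      sub(/^[ \t]+/, "")   # drop leading indentation
      sub(/[ \t]+$/, "")   # drop trailing whitespace
      print
    }'
}
```

Usage would be `ethtool enP4p1s0f1 | link_summary`, run once with the DAC unplugged and once plugged in. With 25G DACs it might also be worth comparing the FEC setting on both ends (`ethtool --show-fec enP4p1s0f1` on the NAS side); that's a guess on my part, nothing in the output above proves FEC is the problem.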

On the switch (Mikrotik 25G), I'm seeing the link as negotiated at 25G:

[Image: Screenshot 2024-11-01 at 9 52 54 AM]

geerlingguy commented 3 weeks ago

Strangely, at some point this morning, it looks like one of the Intel ixgbe interfaces was logging a bunch of corrected PCIe errors:

[49658.032378] pcieport 0003:00:03.0: AER: Corrected error message received from 0003:03:00.0
[49658.032388] ixgbe 0003:03:00.0: AER: PCIe Bus Error: severity=Corrected, type=Data Link Layer, (Transmitter ID)
[49658.042484] ixgbe 0003:03:00.0: AER:   device [8086:1563] error status/mask=00001000/00002000
[49658.051003] ixgbe 0003:03:00.0: AER:    [12] Timeout 

And the Mellanox driver is detecting cable hotplugs:

[58773.924177] mlx5_core 0004:01:00.0: Port module event: module 0, Cable unplugged
[58783.083281] mlx5_core 0004:01:00.1: Port module event: module 1, Cable plugged

Since this is 20.04, and I don't have NetworkManager present (so no nmcli), I ran:

sudo ip link set enP4p1s0f1 down
sudo ip link set enP4p1s0f1 up

And dmesg shows:

[59410.830394] mlx5_core 0004:01:00.1 enP4p1s0f1: Link up
[59410.833804] IPv6: ADDRCONF(NETDEV_CHANGE): enP4p1s0f1: link becomes ready

While ip a shows:

4: enP4p1s0f1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc mq state UP group default qlen 1000
    link/ether 6c:b3:11:29:4d:43 brd ff:ff:ff:ff:ff:ff
    inet6 fe80::6eb3:11ff:fe29:4d43/64 scope link 
       valid_lft forever preferred_lft forever

So now it has an IPv6 link-local address, but still no IPv4 address...
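
Since bouncing the interface is what finally brought the link up, a script could do the same thing and then wait for carrier before moving on. Here's a minimal sketch that polls the `carrier` file in sysfs; `wait_for_carrier` and the `NET_SYSFS` override are names I made up (the override only exists so the function can be tried without real hardware).

```shell
# Poll the kernel's carrier flag until the link is up or the timeout expires.
# $1 = interface name, $2 = timeout in seconds (default 10).
# wait_for_carrier and NET_SYSFS are hypothetical names for this sketch.
wait_for_carrier() {
  iface=$1
  timeout=${2:-10}
  base=${NET_SYSFS:-/sys/class/net}
  i=0
  while [ "$i" -lt "$timeout" ]; do
    # carrier reads "1" once link is detected; the read fails while the
    # interface is administratively down, which counts as "not up yet" here
    if [ "$(cat "$base/$iface/carrier" 2>/dev/null)" = "1" ]; then
      return 0
    fi
    sleep 1
    i=$((i + 1))
  done
  return 1
}
```

After the `ip link set ... up` above, something like `wait_for_carrier enP4p1s0f1 15 && echo "link is up"` would confirm the link without grepping dmesg.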

geerlingguy commented 3 weeks ago

Also grabbing hardware details with sudo lshw -C network:

  *-network:0 DISABLED
       description: Ethernet interface
       product: MT27800 Family [ConnectX-5]
       vendor: Mellanox Technologies
       physical id: 0
       bus info: pci@0004:01:00.0
       logical name: enP4p1s0f0
       version: 00
       serial: 6c:b3:11:29:4d:42
       capacity: 25Gbit/s
       width: 64 bits
       clock: 33MHz
       capabilities: pciexpress vpd msix pm bus_master cap_list ethernet physical 1000bt-fd 10000bt-fd 25000bt-fd autonegotiation
       configuration: autonegotiation=on broadcast=yes driver=mlx5_core firmware=16.27.2048 (MT_0000000080) latency=0 link=no multicast=yes
       resources: iomemory:28000-27fff irq:89 memory:280000000000-280001ffffff memory:280004000000-2800047fffff
  *-network:1
       description: Ethernet interface
       product: MT27800 Family [ConnectX-5]
       vendor: Mellanox Technologies
       physical id: 0.1
       bus info: pci@0004:01:00.1
       logical name: enP4p1s0f1
       version: 00
       serial: 6c:b3:11:29:4d:43
       capacity: 25Gbit/s
       width: 64 bits
       clock: 33MHz
       capabilities: pciexpress vpd msix pm bus_master cap_list ethernet physical 1000bt-fd 10000bt-fd 25000bt-fd autonegotiation
       configuration: autonegotiation=on broadcast=yes driver=mlx5_core duplex=full firmware=16.27.2048 (MT_0000000080) latency=0 link=yes multicast=yes
       resources: iomemory:28000-27fff irq:260 memory:280002000000-280003ffffff memory:280004800000-280004ffffff

geerlingguy commented 3 weeks ago

Huh. Forcing a release/renew grabbed an IP for the interface:

sudo dhclient -r enP4p1s0f1
sudo dhclient enP4p1s0f1

jgeerling@nas01:~$ ip a
...
4: enP4p1s0f1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc mq state UP group default qlen 1000
    link/ether 6c:b3:11:29:4d:43 brd ff:ff:ff:ff:ff:ff
    inet 10.0.2.236/24 brd 10.0.2.255 scope global dynamic enP4p1s0f1
       valid_lft 7199sec preferred_lft 7199sec
    inet6 fe80::6eb3:11ff:fe29:4d43/64 scope link 
       valid_lft forever preferred_lft forever

Now the question is, will the configuration persist across a reboot?

geerlingguy commented 3 weeks ago

Nope. But following this Stack Exchange answer, I did the following to make the new card's Ethernet interfaces persist with IPv4 DHCP across reboots:

$ sudo nano /etc/netplan/00-installer-config.yaml

# Add the new interfaces alongside the existing ones under ethernets:, then save:
    enP4p1s0f0:
      dhcp4: true
    enP4p1s0f1:
      dhcp4: true

$ sudo netplan apply
$ sudo dhclient -r enP4p1s0f1
$ sudo dhclient enP4p1s0f1

And now, even after a reboot, the link comes up at the full 25 Gbps, yay!

jgeerling@nas01:~$ sudo ethtool enP4p1s0f1
Settings for enP4p1s0f1:
    Supported ports: [ Backplane ]
    Supported link modes:   1000baseKX/Full 
                            10000baseKR/Full 
                            25000baseCR/Full 
                            25000baseKR/Full 
                            25000baseSR/Full 
    Supported pause frame use: Symmetric
    Supports auto-negotiation: Yes
    Supported FEC modes: None BaseR RS
    Advertised link modes:  1000baseKX/Full 
                            10000baseKR/Full 
                            25000baseCR/Full 
                            25000baseKR/Full 
                            25000baseSR/Full 
    Advertised pause frame use: Symmetric
    Advertised auto-negotiation: Yes
    Advertised FEC modes: None
    Speed: 25000Mb/s
    Duplex: Full
    Port: Direct Attach Copper
    PHYAD: 0
    Transceiver: internal
    Auto-negotiation: on
    Supports Wake-on: d
    Wake-on: d
    Current message level: 0x00000004 (4)
                   link
    Link detected: yes

Full docs on Ubuntu's docs site: Configuring networks

I guess 00-installer-config.yaml is created at system install time, and since this card wasn't present then, its interfaces don't show up there. Ah well. I could create 99-mellanox.yaml and tack the config on that way, but as this hardware change is likely permanent(ish), I'm happy just throwing it in the installer config file.
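
For reference, the separate-file route could look something like the sketch below. It relies on netplan reading every `*.yaml` file in /etc/netplan in lexical order and merging them, so a `99-` file can add interfaces on top of the installer config. `netplan_dhcp4_dropin` is a hypothetical helper, and I haven't tried this variant on the NAS.

```shell
# Emit a netplan drop-in that enables IPv4 DHCP on each interface given as an argument.
# netplan_dhcp4_dropin is a hypothetical helper; it only prints YAML to stdout.
netplan_dhcp4_dropin() {
  printf 'network:\n  version: 2\n  ethernets:\n'
  for iface in "$@"; do
    printf '    %s:\n      dhcp4: true\n' "$iface"
  done
}
```

It could then be written out with `netplan_dhcp4_dropin enP4p1s0f0 enP4p1s0f1 | sudo tee /etc/netplan/99-mellanox.yaml`, followed by `sudo netplan apply`.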