Wow, awesome, thanks for considering this project. If possible, try running OpenWRT and see if you can route 2.5Gbps from one port to another (LAN to WAN), or rather how much bandwidth it can handle. If it can do 2.5Gbps (not sure where the bottleneck will be), it means the Pi can act as a 2.5Gbps router, something many people would love to use once the OpenWRT build for the Pi 4 is stable. For now you'll have to try snapshots (though in my experience they are already very stable), which AFAIK already include a kmod for the Realtek controller.
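One way to measure that routing throughput, as a rough sketch (the IP address below is a placeholder, and it assumes one test machine on each side of the Pi so traffic actually gets routed LAN-to-WAN):
# On a machine behind the WAN port:
iperf3 -s
# On a machine on the LAN side, pointed at the WAN-side machine through the Pi:
iperf3 -c 203.0.113.10 -t 30 -P 4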
Just a heads up, TP-Link recently released the world's first 8-port 2.5GbE switch. I'm sure other manufacturers will follow suit now, about damn time. https://www.tp-link.com.cn/product_1775.html https://www.chiphell.com/thread-2284077-1-1.html
So you wouldn't need to re-wire everything; your current wiring will 100% support 2.5GbE around the house. No need to convert everything to 10GbE, since SFP+ to RJ45 10GbE adapters are expensive, just the main stuff like your main computer/NAS, etc.
@vegedb - Yeah; and I've noticed a few motherboard manufacturers have slowly been introducing built-in 2.5G ports. I'm hopeful that within 3-5 years we'll see most 'low-end' gear go 2.5 Gbps so people can start getting better-than-1 Gbps performance on existing networks.
It seems like the chipsets are not that expensive, and most of the time they consume the same single (x1) PCIe lane, so it's not a huge burden to switch.
@geerlingguy The Pi 4 has 4Gbps shared across its USB 3.0 ports. Have you tested any USB 3.0 2.5GbE adapters? That would be more realistic for those who don't have the Compute Module.
@vegedb - Something like this CableCreation adapter might work with a Pi 4 model B, but I haven't tried one. In the case of this project, I'm testing different PCIe devices for two reasons:
@geerlingguy from your solution I think you may have helped solve the problem for the USB version too, unknowingly. https://www.raspberrypi.org/forums/viewtopic.php?t=278985
Can't be sure until you or someone else tests it on the USB versions.
@vegedb - Interesting! I just posted a follow-up comment in that forum topic, too.
@geerlingguy Nice! Hope someone follows up.
Btw, regarding your iperf stopgap solution installed via the Merlin firmware, you could work around this by plugging your MikroTik's 10Gbps>RJ45 2.5G port into the AX86U's jack. You can't miss it; it only has one 2.5GbE port.
Doing this will let you run some real-world tests between clients, like transferring large files.
Inserted card, booted Pi OS, and checked:
$ lspci
00:00.0 PCI bridge: Broadcom Limited Device 2711 (rev 20)
01:00.0 PCI bridge: ASMedia Technology Inc. Device 1182
02:03.0 PCI bridge: ASMedia Technology Inc. Device 1182
02:07.0 PCI bridge: ASMedia Technology Inc. Device 1182
03:00.0 Ethernet controller: Realtek Semiconductor Co., Ltd. Device 8125
04:00.0 Ethernet controller: Realtek Semiconductor Co., Ltd. Device 8125
And in dmesg:
[ 1.058583] brcm-pcie fd500000.pcie: host bridge /scb/pcie@7d500000 ranges:
[ 1.060253] brcm-pcie fd500000.pcie: No bus range found for /scb/pcie@7d500000, using [bus 00-ff]
[ 1.062017] brcm-pcie fd500000.pcie: MEM 0x0600000000..0x0603ffffff -> 0x00f8000000
[ 1.063834] brcm-pcie fd500000.pcie: IB MEM 0x0000000000..0x00ffffffff -> 0x0100000000
[ 1.099518] brcm-pcie fd500000.pcie: link up, 5 GT/s x1 (SSC)
[ 1.100748] brcm-pcie fd500000.pcie: PCI host bridge to bus 0000:00
[ 1.101732] pci_bus 0000:00: root bus resource [bus 00-ff]
[ 1.102680] pci_bus 0000:00: root bus resource [mem 0x600000000-0x603ffffff] (bus address [0xf8000000-0xfbffffff])
[ 1.104584] pci 0000:00:00.0: [14e4:2711] type 01 class 0x060400
[ 1.105768] pci 0000:00:00.0: PME# supported from D0 D3hot
[ 1.110069] pci 0000:00:00.0: bridge configuration invalid ([bus ff-ff]), reconfiguring
[ 1.112107] pci 0000:01:00.0: [1b21:1182] type 01 class 0x060400
[ 1.113198] pci 0000:01:00.0: enabling Extended Tags
[ 1.114322] pci 0000:01:00.0: PME# supported from D0 D3hot D3cold
[ 1.118526] pci 0000:01:00.0: bridge configuration invalid ([bus 00-00]), reconfiguring
[ 1.120912] pci 0000:02:03.0: [1b21:1182] type 01 class 0x060400
[ 1.122062] pci 0000:02:03.0: enabling Extended Tags
[ 1.123179] pci 0000:02:03.0: PME# supported from D0 D3hot D3cold
[ 1.124744] pci 0000:02:07.0: [1b21:1182] type 01 class 0x060400
[ 1.125828] pci 0000:02:07.0: enabling Extended Tags
[ 1.126885] pci 0000:02:07.0: PME# supported from D0 D3hot D3cold
[ 1.130418] pci 0000:02:03.0: bridge configuration invalid ([bus 00-00]), reconfiguring
[ 1.132134] pci 0000:02:07.0: bridge configuration invalid ([bus 00-00]), reconfiguring
[ 1.134106] pci 0000:03:00.0: [10ec:8125] type 00 class 0x020000
[ 1.135113] pci 0000:03:00.0: reg 0x10: [io 0x0000-0x00ff]
[ 1.136078] pci 0000:03:00.0: reg 0x18: [mem 0x00000000-0x0000ffff 64bit]
[ 1.137856] pci 0000:03:00.0: reg 0x20: [mem 0x00000000-0x00003fff 64bit]
[ 1.139659] pci 0000:03:00.0: reg 0x30: [mem 0x00000000-0x0000ffff pref]
[ 1.141753] pci 0000:03:00.0: supports D1 D2
[ 1.142676] pci 0000:03:00.0: PME# supported from D0 D1 D2 D3hot D3cold
[ 1.146839] pci_bus 0000:03: busn_res: [bus 03-ff] end is updated to 03
[ 1.148016] pci 0000:04:00.0: [10ec:8125] type 00 class 0x020000
[ 1.149054] pci 0000:04:00.0: reg 0x10: [io 0x0000-0x00ff]
[ 1.150045] pci 0000:04:00.0: reg 0x18: [mem 0x00000000-0x0000ffff 64bit]
[ 1.151876] pci 0000:04:00.0: reg 0x20: [mem 0x00000000-0x00003fff 64bit]
[ 1.153788] pci 0000:04:00.0: reg 0x30: [mem 0x00000000-0x0000ffff pref]
[ 1.156025] pci 0000:04:00.0: supports D1 D2
[ 1.157005] pci 0000:04:00.0: PME# supported from D0 D1 D2 D3hot D3cold
[ 1.161200] pci_bus 0000:04: busn_res: [bus 04-ff] end is updated to 04
[ 1.162211] pci_bus 0000:02: busn_res: [bus 02-ff] end is updated to 04
[ 1.163156] pci_bus 0000:01: busn_res: [bus 01-ff] end is updated to 04
[ 1.164104] pci 0000:00:00.0: BAR 8: assigned [mem 0x600000000-0x6001fffff]
[ 1.165925] pci 0000:01:00.0: BAR 8: assigned [mem 0x600000000-0x6001fffff]
[ 1.167732] pci 0000:01:00.0: BAR 7: no space for [io size 0x2000]
[ 1.168666] pci 0000:01:00.0: BAR 7: failed to assign [io size 0x2000]
[ 1.169609] pci 0000:02:03.0: BAR 8: assigned [mem 0x600000000-0x6000fffff]
[ 1.171394] pci 0000:02:07.0: BAR 8: assigned [mem 0x600100000-0x6001fffff]
[ 1.173214] pci 0000:02:03.0: BAR 7: no space for [io size 0x1000]
[ 1.174249] pci 0000:02:03.0: BAR 7: failed to assign [io size 0x1000]
[ 1.175226] pci 0000:02:07.0: BAR 7: no space for [io size 0x1000]
[ 1.176177] pci 0000:02:07.0: BAR 7: failed to assign [io size 0x1000]
[ 1.177118] pci 0000:03:00.0: BAR 2: assigned [mem 0x600000000-0x60000ffff 64bit]
[ 1.178945] pci 0000:03:00.0: BAR 6: assigned [mem 0x600010000-0x60001ffff pref]
[ 1.180737] pci 0000:03:00.0: BAR 4: assigned [mem 0x600020000-0x600023fff 64bit]
[ 1.182646] pci 0000:03:00.0: BAR 0: no space for [io size 0x0100]
[ 1.183647] pci 0000:03:00.0: BAR 0: failed to assign [io size 0x0100]
[ 1.184670] pci 0000:02:03.0: PCI bridge to [bus 03]
[ 1.185696] pci 0000:02:03.0: bridge window [mem 0x600000000-0x6000fffff]
[ 1.187643] pci 0000:04:00.0: BAR 2: assigned [mem 0x600100000-0x60010ffff 64bit]
[ 1.189617] pci 0000:04:00.0: BAR 6: assigned [mem 0x600110000-0x60011ffff pref]
[ 1.191614] pci 0000:04:00.0: BAR 4: assigned [mem 0x600120000-0x600123fff 64bit]
[ 1.193667] pci 0000:04:00.0: BAR 0: no space for [io size 0x0100]
[ 1.194702] pci 0000:04:00.0: BAR 0: failed to assign [io size 0x0100]
[ 1.195720] pci 0000:02:07.0: PCI bridge to [bus 04]
[ 1.196740] pci 0000:02:07.0: bridge window [mem 0x600100000-0x6001fffff]
[ 1.198778] pci 0000:01:00.0: PCI bridge to [bus 02-04]
[ 1.199835] pci 0000:01:00.0: bridge window [mem 0x600000000-0x6001fffff]
[ 1.201876] pci 0000:00:00.0: PCI bridge to [bus 01-04]
[ 1.202902] pci 0000:00:00.0: bridge window [mem 0x600000000-0x6001fffff]
Recompiling on 5.10.y with Device Drivers > Network device support > Ethernet driver support > Realtek 8169/8168/8101/8125 ethernet support
as in #40.
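For reference, a rough sketch of the native rebuild on 64-bit Pi OS (the branch, job count, and install steps here are assumptions; see the Raspberry Pi kernel build docs for the full procedure):
sudo apt install git bc bison flex libssl-dev make
git clone --depth=1 --branch rpi-5.10.y https://github.com/raspberrypi/linux
cd linux
make bcm2711_defconfig
make menuconfig      # enable Realtek 8169/8168/8101/8125 support (CONFIG_R8169)
make -j4 Image modules dtbs && sudo make modules_install
sudo cp arch/arm64/boot/Image /boot/kernel8.img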
Nice!
3: eth1: <NO-CARRIER,BROADCAST,MULTICAST,UP> mtu 1500 qdisc pfifo_fast state DOWN group default qlen 1000
link/ether 00:13:3b:b0:30:9c brd ff:ff:ff:ff:ff:ff
4: eth2: <NO-CARRIER,BROADCAST,MULTICAST,UP> mtu 1500 qdisc pfifo_fast state DOWN group default qlen 1000
link/ether 00:13:3b:b0:30:9d brd ff:ff:ff:ff:ff:ff
$ sudo ethtool eth1
Settings for eth1:
Supported ports: [ TP MII ]
Supported link modes: 10baseT/Half 10baseT/Full
100baseT/Half 100baseT/Full
1000baseT/Full
2500baseT/Full
Supported pause frame use: Symmetric Receive-only
Supports auto-negotiation: Yes
Supported FEC modes: Not reported
Advertised link modes: 10baseT/Half 10baseT/Full
100baseT/Half 100baseT/Full
1000baseT/Full
2500baseT/Full
Advertised pause frame use: Symmetric Receive-only
Advertised auto-negotiation: Yes
Advertised FEC modes: Not reported
Speed: Unknown!
Duplex: Unknown! (255)
Port: MII
PHYAD: 0
Transceiver: internal
Auto-negotiation: on
Supports Wake-on: pumbg
Wake-on: d
Link detected: no
$ sudo ethtool eth2
Settings for eth2:
Supported ports: [ TP MII ]
Supported link modes: 10baseT/Half 10baseT/Full
100baseT/Half 100baseT/Full
1000baseT/Full
2500baseT/Full
Supported pause frame use: Symmetric Receive-only
Supports auto-negotiation: Yes
Supported FEC modes: Not reported
Advertised link modes: 10baseT/Half 10baseT/Full
100baseT/Half 100baseT/Full
1000baseT/Full
2500baseT/Full
Advertised pause frame use: Symmetric Receive-only
Advertised auto-negotiation: Yes
Advertised FEC modes: Not reported
Speed: Unknown!
Duplex: Unknown! (255)
Port: MII
PHYAD: 0
Transceiver: internal
Auto-negotiation: on
Supports Wake-on: pumbg
Wake-on: d
Link detected: no
Next up, benchmarking. After that, figuring out OpenWRT.
Building OpenWRT is very friendly. It's a good idea to include LuCI, luci-app-sqm, and the correct Realtek kmod so configuring it through the GUI is simple. Use an Ubuntu server VM, and build with a single core; that's generally recommended for OpenWRT, since I sometimes get errors when building with multiple cores.
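A rough sketch of that from-source build (the target and package names are assumptions about what the bcm27xx/bcm2711 target provides; kmod-r8169 covers the RTL8125 in recent kernels):
git clone https://git.openwrt.org/openwrt/openwrt.git && cd openwrt
./scripts/feeds update -a && ./scripts/feeds install -a
make menuconfig   # Target: Broadcom BCM27xx, Subtarget: BCM2711, plus luci, luci-app-sqm, and kmod-r8169
make -j1          # single job, per the note above about multi-core build errors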
They all have IPs!
$ ip address
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN group default qlen 1000
link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
inet 127.0.0.1/8 scope host lo
valid_lft forever preferred_lft forever
inet6 ::1/128 scope host
valid_lft forever preferred_lft forever
2: eth0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc mq state UP group default qlen 1000
link/ether b8:27:eb:5c:89:43 brd ff:ff:ff:ff:ff:ff
inet 10.0.100.120/24 brd 10.0.100.255 scope global dynamic noprefixroute eth0
valid_lft 86087sec preferred_lft 75287sec
inet6 fe80::3e45:6bd1:11b9:d24/64 scope link
valid_lft forever preferred_lft forever
3: eth1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UP group default qlen 1000
link/ether 00:13:3b:b0:30:9c brd ff:ff:ff:ff:ff:ff
inet 10.0.100.218/24 brd 10.0.100.255 scope global dynamic noprefixroute eth1
valid_lft 86094sec preferred_lft 75294sec
inet6 fe80::a052:42c2:e806:d265/64 scope link
valid_lft forever preferred_lft forever
4: eth2: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UP group default qlen 1000
link/ether 00:13:3b:b0:30:9d brd ff:ff:ff:ff:ff:ff
inet 10.0.100.219/24 brd 10.0.100.255 scope global dynamic noprefixroute eth2
valid_lft 86097sec preferred_lft 75297sec
inet6 fe80::3573:ea16:20b1:c7f2/64 scope link
valid_lft forever preferred_lft forever
5: wlan0: <BROADCAST,MULTICAST> mtu 1500 qdisc noop state DOWN group default qlen 1000
link/ether b8:27:eb:74:f2:6c brd ff:ff:ff:ff:ff:ff
I edited dhcpcd.conf with sudo nano /etc/dhcpcd.conf and added:
interface eth1
static ip_address=192.168.0.10/24
static ip6_address=fd51:42f8:caae:d92e::ff/64
static routers=10.0.100.1
static domain_name_servers=10.0.100.1
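Then restart dhcpcd and confirm the address took (a quick sketch; the service name assumes stock Pi OS):
sudo systemctl restart dhcpcd
ip -4 addr show dev eth1   # should now list 192.168.0.10/24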
Then on my Mac, I added a second interface with the same hardware so I could add a separate IP address on it (see MacMiniWorld's article).
But... that doesn't seem to be working—so I might have to pull out one of the Mellanox cards and set up my PC desktop as the 2nd network endpoint for iperf3.
So benchmarking is... fun. I can't find a good way to get two separate interfaces on the Pi routing traffic to one network card on my Mac, so I now have a janky setup where I have a DAC running between my Windows PC (which was in the middle of being reworked a bit for some testing anyways) with a Mellanox ConnectX-2 card in it, plus a port going to my Mac's TB3 adapter using 10GBASE-T, and then the other two MikroTik ports going to the 2x 2.5 GbE connections on the Pi.
So... with that sorted, I can confirm I can get 5.56 Gbps between my Mac and the Windows 10 PC, totally unoptimized—I'm using Windows' built-in driver because I had trouble installing the older Mellanox driver on Windows 10 Home (and there could be a variety of reasons for that).
So next step is to make it so one of the interfaces is on 192.168.x.x and the other on 10.0.100.x, and each one can reach either my Mac or the Windows 10 PC, and then run iperf3 in server mode on each of those two.
Sheesh.
On my Mac, in Terminal:
iperf3 -s --bind 10.0.100.144
On the Raspberry Pi, in two separate SSH sessions:
# First make sure the internal network interface is down.
sudo ip link set eth0 down
# Run iperf3 server on 192.168.0.10.
iperf3 -s --bind 192.168.0.10
# To Mac over main network DHCP.
iperf3 -c 10.0.100.144
On Windows, in Powershell:
# To Raspberry Pi with manual IP address.
./iperf3.exe -c 192.168.0.10
Results:
[ 5] 0.00-10.00 sec 1.75 GBytes 1.50 Gbits/sec receiver
and
[ 5] 0.00-10.04 sec 813 MBytes 680 Mbits/sec receiver
Total: 2.18 Gbps (without jumbo frames) across two interfaces. Not that impressive yet.
Jumbo frames enabled between Mac and Pi (but on Windows I got wildly inconsistent results with Jumbo Frame in the advanced settings set to 9014 or any higher, so I kept it at 1514):
# Pi to Mac:
[ 5] 0.00-10.01 sec 2.88 GBytes 2.47 Gbits/sec receiver
# Windows to Pi:
[ 5] 0.00-10.04 sec 681 MBytes 569 Mbits/sec receiver
Total: 3.04 Gbps
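For reference, jumbo frames on the Pi side just mean raising the MTU on each interface (a minimal sketch; the exact value used above isn't stated, 9000 is a typical choice, and the other end has to match):
sudo ip link set dev eth1 mtu 9000
sudo ip link set dev eth2 mtu 9000
ip link show eth1 | grep mtu   # verify the new MTU took effect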
For some reason, I can't get the PC to give very consistent performance—and I don't know if it's the Mellanox ConnectX-2 card, the driver, the DAC cable (my fiber cable and transceivers would not light up the port), or what, so I'm guessing I could eke out another 100-200 Mbps since the IRQs are not a bottleneck according to atop.
This setup is horrifically annoying as I now have a spiderweb of cables through my office (I've only tripped on them once), so I don't think I'll keep it set up like this for benchmarking purposes. Suffice it to say, you're not getting much more than 3.1-3.2 Gbps through both 2.5 Gbps interfaces at once, so I'll put a pin in that benchmarking task for now.
Next up is to see what I can do with OpenWRT...
I couldn't leave well-enough alone. I downloaded the Windows WinOF Driver—which states it works with ConnectX-3, but in fact does also work with ConnectX-2—for Windows 10 64-bit, and finally got it installed.
With that driver in place, I still couldn't get stable Jumbo Frame support, but I did see more stable speeds overall, especially with a private static IP, which for some reason would cause Windows' own driver to barf sometimes.
And the results?
Total: 3.195 Gbps (with Jumbo Frames on only one of the two interfaces)
I was only ever able to get 3.220 Gbps total across 4x 1 Gbps ports on the Intel I340-T4, so I'm going to say ~3.20 Gbps is right around the upper limit of total network bandwidth you can get on the Pi for most network cards with more than one network interface.
Indeed, even the straight 10 Gbps ASUS card can only punch through to 3.26 Gbps (see preliminary results in https://github.com/geerlingguy/raspberry-pi-pcie-devices/issues/15).
So yeah. The Pi's not going to go beyond about 4.1-4.2 total Gbps of network throughput (onboard NIC included), and I think after testing like 8 different networking scenarios, I can say that with 99% confidence (outside some nutso building some I2C network interface and pumping through a couple more megabits!).
Got my hands on this NIC today. Unfortunately, I can't recommend it, because it uses an old revision of the RTL8125 controller, which lacks RSS, hardware RX hashing, and multiple TX queue support. That means the driver can't leverage the multiple A72 cores the RPi has: all the interrupts of both controllers are tied to core #0, and Linux can't distribute RX flow processing between cores efficiently. I'm going to try to get an IOCREST 2.5Gbps NIC, which is based on the RTL8125B and should support all of those features. Still, I'm not sure the RPi's interrupt controller will allow moving IRQ processing to different cores...
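One way to check whether a given RTL8125 revision exposes multiple queues/RSS to the in-kernel driver (a sketch, using the eth1 name from earlier in this thread; interrupt names in /proc/interrupts may differ by driver):
ethtool -l eth1                      # RSS-capable NICs report more than one combined channel
ls /sys/class/net/eth1/queues/       # the rx-*/tx-* queue directories tell the same story
grep eth1 /proc/interrupts           # shows which CPU is taking all of the NIC's IRQs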
Anyway, I've noticed a couple of things that might improve performance:
Attaching a patch that allows using the Realtek vendor driver under OpenWrt. Don't forget to enable it in make menuconfig and disable the in-kernel one. rtl8125_openwrt.zip
@dmitriarekhta - After a ton of back-and-forth discussing the IRQ affinity issues for the Intel i340, we found out that it is impossible to spread interrupts over multiple cores on the Pi, so that's always going to be a limiting factor.
That's why, unless the hardware supports some of those more advanced features or you use jumbo frames, you can't saturate the Pi's ~3.4 Gbps PCIe lane with network packets :( At a 1500-byte MTU, ~3.4 Gbps works out to roughly 280,000 packets per second all landing on a single core; 9000-byte jumbo frames cut that to around 47,000.
The 10G ASUS adapter I tested seems to be able to support higher speeds with normal frames.
@dmitriarekhta - Hmm, after re-reading your comment—were you able to get affinity spread out over all four cores? If so, that sounds like it would be a huge boost for performance.
@geerlingguy did you ever try this dual 2.5GB card using OpenWrt?
I've been using wolfy's OpenWrt build on a Pi 4 paired with a USB3 Realtek 1GbE NIC, with great success routing 1 Gbps at wire speed. I had to manually assign CPU affinity for the NICs, which was trivial on OpenWrt. The CPU cores sit at ~35% during heavy routing workloads.
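For anyone trying the same thing, pinning NIC IRQs on OpenWrt is just writing CPU bitmasks into /proc (a sketch; the IRQ numbers below are hypothetical, look up the real ones in /proc/interrupts first):
grep eth /proc/interrupts            # find the IRQ number(s) for each NIC
echo 4 > /proc/irq/38/smp_affinity   # mask 0x4 = CPU2 (hypothetical IRQ 38)
echo 8 > /proc/irq/39/smp_affinity   # mask 0x8 = CPU3 (hypothetical IRQ 39)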
My hope is that the Syba dual 2.5GbE card can route around 1.5 Gbps; that's the link my ISP provides via a cable modem (with a 2.5GbE port), and Redshirt Dan wants all of the bandwidth that I'm paying for :-)
@That-Dude - I have not yet, but still have it in my list of 'projects I want to try out'.
Cool I'll keep an eye out. Love your YouTube channel man.
great stuff! I'm eagerly awaiting any additional information!
I am using the card on the Pi 5 with OpenWrt and it works great with 1 gig fiber internet. It would be great if someone with a faster connection could test it. I tested locally with OpenSpeedTest, and during the test it transferred 170 GiB from WAN to LAN without any problem.
root@openwrt:/# lspci
0000:00:00.0 PCI bridge: Broadcom Inc. and subsidiaries Device 2712 (rev 21)
0000:01:00.0 PCI bridge: ASMedia Technology Inc. ASM1182e 2-Port PCIe x1 Gen2 Packet Switch
0000:02:03.0 PCI bridge: ASMedia Technology Inc. ASM1182e 2-Port PCIe x1 Gen2 Packet Switch
0000:02:07.0 PCI bridge: ASMedia Technology Inc. ASM1182e 2-Port PCIe x1 Gen2 Packet Switch
0000:03:00.0 Ethernet controller: Realtek Semiconductor Co., Ltd. RTL8125 2.5GbE Controller
0000:04:00.0 Ethernet controller: Realtek Semiconductor Co., Ltd. RTL8125 2.5GbE Controller
0001:00:00.0 PCI bridge: Broadcom Inc. and subsidiaries Device 2712 (rev 21)
0001:01:00.0 Ethernet controller: Device 1de4:0001
@leobsky - If you have another computer on the network that's wired with a > 1 Gbps connection, you can try running iperf3 -s on it, then iperf3 -c [ip address of that other computer] on the Pi to test the bandwidth on your LAN.
Well, if one is good (see #40), two is surely better, right?
Originally inspired by chinmaythosar's comment, then bolstered by ServeTheHome's review, and finally encouraged by the ease of testing (besides accidentally blowing the magic smoke out of the first card I bought) in #40, I've decided to buy a new Syba Dual 2.5 Gbps Ethernet PCIe NIC, which has not one but two Realtek RTL8125s. On top of that, it has a built-in PCIe switch, the ASMedia ASM1182e.
So it would be neat to see if this card works out of the box with the same ease as the Rosewill card I tested, which has no PCIe switch and only one port.
If it does work, it will be interesting to see how many bits I can pump through (especially testing overclock and jumbo frames)—can it match the 4.15 Gbps performance of the Intel I340-T4 + internal Gigabit interface? I'm guessing not, at least not by itself, because it seems there's a hard limit on the PCIe bus that's reached well before the 5 Gbps PCIe gen 2 x1 lane limit.