Microchip-Ethernet / EVB-KSZ9477

Repository for using Microchip EVB-KSZ9477 board. Product Supported: KSZ9477, KSZ9567, KSZ9897, KSZ9896, KSZ8567, KSZ8565, KSZ9893, KSZ9563, KSZ8563, LAN9646, Phys(KSZ9031/9131, LAN8770

KSZ8567 + IMX6ULL - packet drop at 100Mbps #100

Open AaronElijah opened 9 months ago

AaronElijah commented 9 months ago

(This will be a very long post as I wanted to provide as much detail as possible - if I have made mistakes in my understanding, feel free to point them out)

Problem:

We are trying to develop a board that uses the IMX6ULL as the CPU and KSZ8567 as the switch chip. The IMX6ULL Ethernet interface is connected to port 6 on the KSZ8567 in RMII mode. Port 6 is configured in RMII Normal Mode (i.e. the 50 MHz RMII REFCLK is received at the RXC pin and the IMX6 outputs a 50 MHz clock; we've confirmed this with an oscilloscope).

The issue arises when we set up a Linux bridge that includes all the switch ports. We expect two hosts connected to switch ports to ping each other successfully at 100Mbps full duplex, and an iperf test between them to achieve a relatively high bitrate (90+ Mbps). Instead, ICMP packets are often dropped, and an iperf test connects but fails to transfer data at any meaningful speed.

Hardware breakdown

Given that we are using an IMX6ULL on our board, which only has a 10/100 Ethernet controller, Port 6 cannot be set to 1Gbit RGMII as is commonly seen in KSZ switch designs (see the device tree snippet below).

&fec1 {
    pinctrl-names = "default";
    pinctrl-0 = <&pinctrl_enet1>;
    phy-mode = "rmii";
    status = "okay";

    fixed-link {
        speed = <100>;
        full-duplex;
    };
};

&i2c1 {
    clock-frequency = <100000>;
    pinctrl-names = "default";
    pinctrl-0 = <&pinctrl_i2c1>;
    status = "okay";

    switch@5f {
        compatible = "microchip,ksz8567";
        reg = <0x5f>;

        ports {
            #address-cells = <1>;
            #size-cells = <0>;

            port@0 {
                reg = <0x00>;
                label = "lan1";
            };

            port@1 {
                reg = <0x01>;
                label = "lan2";
            };

            port@2 {
                reg = <0x02>;
                label = "lan3";
            };

            port@3 {
                reg = <0x03>;
                label = "lan4";
            };

            port@4 {
                reg = <0x04>;
                label = "lan5";
            };

            port@5 {
                reg = <0x05>;
                label = "cpu";
                ethernet = <&fec1>;
                phy-mode = "rmii";

                fixed-link {
                    speed = <100>;
                    full-duplex;
                };
            };
        };
    };
};

As far as I know, the KSZ8567 is equivalent to the KSZ9567 but is non-gigabit. The KSZ9567 is supported by the DSA driver in the Linux kernel (6.1.x). Note that we are using linux-imx, whose Microchip DSA driver is nominally very similar to the one in the mainline Linux 6.1 release. Hence adding support for the KSZ8567 is (I think) mostly a case of copying the implementation for the KSZ9567 but without Gbit capability (I've attached a patch file with a minimal implementation): 0001-fix-Add-KSZ8567-support-in-Microchip-KSZ-NET-DSA-dev.patch

Testing

Devices used during tests:

1) First, 2x Raspberry Pi CM4 on Raspberry Pi IO boards

2) Second, 1x Orange Pi CM4 on Orange Pi IO board + 1x Orange Pi Zero3 1Gb

The first part of the test was setting up the software bridge on our board where each port on the KSZ8567 would be a slave device. I used the following commands.

ip link set eth0 up
ip link set lan5 up
ip link set lan4 up
ip link set lan3 up
ip link set lan2 up
ip link set lan1 up
ip link add name br0 type bridge vlan_filtering 1
ip link set dev lan5 master br0 
ip link set dev lan4 master br0 
ip link set dev lan3 master br0 
ip link set dev lan2 master br0 
ip link set dev lan1 master br0 
ip addr add 10.1.1.100/24 dev br0
ip link set dev br0 up 

Note that this is similar to Microchip's KSZ DSA Application Note.

Then we connected two devices (first 2x Raspberry Pi, later 2x Orange Pi devices), each to one of the slave ports, gave them static IPs in the address range 10.1.1.0-10.1.1.255, turned off EEE, and left auto-negotiation enabled. Both slave ports successfully auto-negotiated to 100Mbps, full duplex (confirmed with ethtool {ethernet interface}).
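
The host-side setup described above could look roughly like the sketch below. It is an illustration under assumptions, not necessarily the exact commands used: the interface name eth0, the address 10.1.1.2/24, and the helper names run and setup_host are placeholders. By default the script only prints the commands; set DRY_RUN=0 to execute them as root.

```shell
#!/bin/sh
# Sketch of the per-host setup: static address, EEE off, auto-negotiation on.
# Prints the commands by default; DRY_RUN=0 executes them (needs root,
# iproute2 and ethtool).
DRY_RUN="${DRY_RUN:-1}"

# run: hypothetical helper that echoes a command in dry-run mode, else runs it.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# setup_host IFACE ADDR/PREFIX (both arguments are placeholders)
setup_host() {
    iface="$1"
    addr="$2"
    run ip addr add "$addr" dev "$iface"
    run ip link set dev "$iface" up
    run ethtool --set-eee "$iface" eee off   # disable Energy-Efficient Ethernet
    run ethtool -s "$iface" autoneg on       # leave auto-negotiation enabled
    run ethtool "$iface"                     # shows negotiated speed and duplex
}
```

For example, `setup_host eth0 10.1.1.2/24` on one host and `setup_host eth0 10.1.1.3/24` on the other.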

Here is a readout of what bridge fdb shows on the IMX6ULL.

[Screenshot 2023-10-05 at 16 30 59]

When running a ping test between the two devices, we found that there was an occasional packet drop.

[Screenshot 2023-10-05 at 16 46 18]

From the screenshot above, you can see that an ICMP echo request (seq 41) is sent by one device but never received by the other.

Furthermore, when we perform a TCP iperf test, it connects but the transfer is 0Mbps.

[Screenshot 2023-10-05 at 16 13 11]

When we do a UDP iperf test, there is packet transfer but with significant packet drop, almost 40%.

[Screenshot 2023-10-05 at 16 12 27]
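
For anyone trying to reproduce the TCP and UDP tests, a rough sketch follows. It assumes iperf3 (the post only says "iperf", so the exact tool and options may differ), a 10-second run, and placeholder addresses; the helper names run, start_server, tcp_test and udp_test are illustrative. The script prints the commands unless DRY_RUN=0 is set.

```shell
#!/bin/sh
# Sketch of the throughput tests between the two hosts, assuming iperf3.
DRY_RUN="${DRY_RUN:-1}"

# run: hypothetical helper that echoes a command in dry-run mode, else runs it.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# On the receiving host: start the iperf3 server.
start_server() {
    run iperf3 -s
}

# On the sending host: TCP test against SERVER_IP for 10 seconds.
tcp_test() {
    run iperf3 -c "$1" -t 10
}

# On the sending host: UDP test against SERVER_IP at offered load RATE (e.g. 100M).
udp_test() {
    run iperf3 -c "$1" -u -b "$2" -t 10
}
```

For example, `start_server` on 10.1.1.3, then `tcp_test 10.1.1.3` and `udp_test 10.1.1.3 100M` on the other host.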

We can see from a tcpdump of each device's Ethernet interface during these tests that ICMP echo requests and replies are being successfully forwarded (apart from the packet drop mentioned earlier). Also, when running tcpdump on the IMX6ULL Ethernet interface connected to the KSZ8567 host port, we can't see any ICMP echo request/reply packets being forwarded to the host port, which leads me to believe that hardware offloading is occurring in the KSZ8567 switch. Correct me if I'm wrong here.
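
One way to cross-check the offloading assumption and to narrow down where frames are lost is to look at the bridge FDB flags and the per-port counters on the IMX6ULL. The sketch below is only a suggestion under assumptions: the helper names run and inspect_switch are illustrative, the counter names printed by ethtool -S depend on the driver, and the script prints the commands unless DRY_RUN=0 is set.

```shell
#!/bin/sh
# Sketch: collect FDB state and counters on the board running the bridge.
DRY_RUN="${DRY_RUN:-1}"

# run: hypothetical helper that echoes a command in dry-run mode, else runs it.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# inspect_switch BRIDGE MASTER PORT...
#   BRIDGE: bridge device (br0 in this post)
#   MASTER: CPU-side Ethernet interface (eth0 in this post)
#   PORT:   one or more switch port interfaces (lan1..lan5 in this post)
inspect_switch() {
    bridge_name="$1"
    master="$2"
    shift 2
    # FDB entries; iproute2 marks hardware-offloaded entries with "offload".
    run bridge fdb show br "$bridge_name"
    # Interface-level RX/TX and error counters on the CPU-side interface.
    run ip -s link show dev "$master"
    # Driver-specific counters for each switch port.
    for port in "$@"; do
        run ethtool -S "$port"
    done
}
```

For example, `inspect_switch br0 eth0 lan1 lan2 lan3 lan4 lan5`, run once before and once after an iperf test so the counter values can be compared.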

Note that when we turned off auto-negotiation and set the speed to 10Mbps full duplex, the ping and iperf tests seemed to work fine (i.e. we get 9.6Mbps at 10Mbps, full duplex).

So the question is how to get 100Mbps working properly without the heavy packet loss. Until that is solved, the device is pretty much unusable at 100Mbps.

Things we've tried

1) The KSZ8567 errata document an issue related to EEE, so we disabled EEE on all link partners. We still saw this packet drop at 100Mbps.

2) We also wondered if it was related to TCP segmentation offload. While we believe we are seeing hardware FDB offloading, maybe packets were being segmented anyway (?), which would be problematic on the host port, where packets are tail-tagged by the IMX6. However, after turning TSO off with

ethtool -K eth0 tso off

and rerunning the 100Mbps test, we noticed no difference.

3) Turning off auto-negotiation and forcing the speed to 100Mbps, full duplex. No difference.

4) We wondered whether the RMII link between the host port and the IMX6 might be the issue. We have noted that many board designs use RGMII as the interface to the host CPU. Could this be related? We don't think it should be, because hardware offloading ought to handle most of the packets regardless, and it's worth noting that RGMII wouldn't alleviate congestion issues for the KSZ9567 either (which has Gbit slave ports and a Gbit host port).

5) We tried setting static FDB entries before connecting the devices to the ports, in case the bridge wasn't doing hardware offloading correctly. See the image below.

[Screenshot 2023-09-27 at 17 22 17]

Again, this made no difference in the 100Mbps test.
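
For reference, attempts 3) and 5) above can be sketched roughly as follows. This is a hedged illustration, not necessarily the exact commands used: the interface names, the MAC address 02:00:00:00:00:01 and the helper names run, force_link and add_static_fdb are placeholders, and the FDB command omits any VLAN argument. The script prints the commands unless DRY_RUN=0 is set.

```shell
#!/bin/sh
# Sketch of two of the attempted workarounds.
DRY_RUN="${DRY_RUN:-1}"

# run: hypothetical helper that echoes a command in dry-run mode, else runs it.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# force_link IFACE SPEED: disable auto-negotiation and force SPEED Mbps, full duplex.
# Run on the link partner (and on the switch port if the driver allows it).
force_link() {
    run ethtool -s "$1" speed "$2" duplex full autoneg off
}

# add_static_fdb MAC PORT: add a static bridge FDB entry for MAC on switch port PORT.
add_static_fdb() {
    run bridge fdb add "$1" dev "$2" master static
}
```

For example, `force_link eth0 100` on each link partner, and `add_static_fdb 02:00:00:00:00:01 lan1` on the IMX6ULL with the real MAC address of the host attached to lan1.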

I appreciate this is a very long post, so thank you for reading if you've made it this far! If anyone at Microchip has ideas on what could be causing this issue (or if I've made an error in the hardware/software design), please let me know!