frank-w / BPI-Router-Linux

Linux kernel 4.14+ for BPI-R2, 5.4+ for R64, 6.1+ for R2Pro and R3
Other
136 stars 47 forks source link

SFP interface does not work on you 6.9-rc branch #118

Open jcdutton opened 8 months ago

jcdutton commented 8 months ago

Board: BPI-R3 sfp sfp-1: module OEM SFP-2.5G-T rev 1.0 sn REDACTED dc REDACTED ethtool looks like something it connected, but the link is always down: 3: eth1: <NO-CARRIER,BROADCAST,MULTICAST,UP> mtu 1500 qdisc mq state DOWN mode DEFAULT group default qlen 1000 link/ether 7a:3b:38:39:40:41 brd ff:ff:ff:ff:ff:ff root@bpi-r3:/etc/hostapd# ethtool eth1 Settings for eth1: Supported ports: [ MII ] Supported link modes: 2500baseX/Full 2500baseT/Full Supported pause frame use: Symmetric Receive-only Supports auto-negotiation: Yes Supported FEC modes: Not reported Advertised link modes: 2500baseX/Full 2500baseT/Full Advertised pause frame use: Symmetric Receive-only Advertised auto-negotiation: No Advertised FEC modes: Not reported Speed: 2500Mb/s Duplex: Full Auto-negotiation: off Port: MII PHYAD: 0 Transceiver: internal Current message level: 0x000000ff (255) drv probe link timer ifdown ifup rx_err tx_err Link detected: no

By comparison. The SFP works on openwrt. Also, commands take a long time to execute, so something is blocking.

frank-w commented 8 months ago

Make sure the pcs driver (lynxi on r3/r4,xfi on r4 for 10g) is selected.

jcdutton commented 8 months ago

The lynxi is selected. But no link is established.

jcdutton commented 8 months ago

The attached patch fixes the SFP for me. sfp-fix1.diff.txt

frank-w commented 8 months ago

So you use the oem 2g5 sfp i also have?

Your patch basicly reverts erics patch right?

https://github.com/frank-w/BPI-Router-Linux/commit/dd3dd754cdc7cb1e61f6588eaa6b8ee15896b42b

Have you added realtek phy driver from him?

jcdutton commented 8 months ago

I don't think my transceiver uses a realtek phy. What command do I use to identify the phy?

jcdutton commented 8 months ago

The SFP transceiver presents I2C interface on 0x50, 0x51, 0x56. So, it looks to me that it supports multiple different protocols over I2C. For example, 0x51 has this: CNS8TUTAAC30-1410-04V04 and GLC-T which looks strangely similar to: Cisco GLC-T 1000BASE-T SFP Module - 30-1410-04 CNS8TUTAAC

I don't think it supports MDIO. I think this confuses the sfp.c probing, that results in it choosing the swphy. A software emulation for a phy.

jcdutton commented 8 months ago
i2cdump -y 0 0x56
No size specified (using byte-data access)
     0  1  2  3  4  5  6  7  8  9  a  b  c  d  e  f    0123456789abcdef
00: 10 00 79 49 4f 51 ea 19 11 e1 00 00 00 64 20 01    ?.yIOQ????...d ?
10: 00 00 02 00 00 00 00 00 00 00 00 00 00 00 20 00    ..?........... .
20: 00 62 00 00 00 00 00 00 08 2c 00 00 00 00 00 00    .b......?,......
30: 00 00 00 00 00 00 00 00 00 00 00 00 a0 00 00 00    ............?...
40: 09 40 41 c9 4f 51 ea 19 00 20 00 00 00 00 00 00    ?@A?OQ??. ......
50: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 80 00    ..............?.
60: 00 00 20 02 00 04 00 00 70 28 00 00 00 00 00 00    .. ?.?..p(......
70: 00 00 00 00 00 00 00 00 00 00 00 00 30 fc 80 01    ............0???
80: ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff    ................
90: ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff    ................
a0: ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff    ................
b0: ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff    ................
c0: ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff    ................
d0: ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff    ................
e0: ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff    ................
f0: ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff    ................

Whereas someone else saw this: At offset 4 it has: 0x001cc849 , which means that we have this PHY at 0x56:

        PHY_ID_MATCH_EXACT(0x001cc849),
        .name           = "RTL8221B-VB-CG 2.5Gbps PHY",

I see offset 4 has: 0x4f51ea19 instead of 0x001cc849 so I need to find with PHY 0x4f51ea19 is.

frank-w commented 8 months ago

You should post this to bpi-forum where eric can take a look at it. I'm no phy specialist :p but 0x56 looks like rollball protocol and good you can access this address...sometimes it is 0xff only

You have quoted eric above from here: https://forum.banana-pi.org/t/sfp-oem-sfp-2-5g-t-kernel-phy/15872/13?u=frank-w

jcdutton commented 8 months ago

I tried with the rollball and I added some debug. It comes back saying rollball not found.

jcdutton commented 8 months ago

@frank-w Apparently the 0x4f51ea19 is a motorcomm YT8821 PHY.

frank-w commented 7 months ago

Have you deleted the post with the mii patch?

jcdutton commented 7 months ago

Yes, I did delete it. It did not actually work. I am working on a different patch. Just trying to work out where exactly the *2 needs to happen in the code between C22 reg and i2c messages.

jcdutton commented 7 months ago

Well, on the positive side, I have the yt8821 phy working using C22. Notice the added "<< 1" to do the reg * 2. On the negative side, I have not worked out how to do the force auto neg off, so that the interface passes traffic. The remote side is seeing link up.

static int i2c_mii_read_default_c22(struct mii_bus *bus, int phy_id, int reg)
{
        return i2c_mii_read_default_c45(bus, phy_id, -1, reg << 1);
}

static int i2c_mii_write_default_c22(struct mii_bus *bus, int phy_id, int reg,
                                     u16 val)
{
        return i2c_mii_write_default_c45(bus, phy_id, -1, reg << 1, val);
}

Here is the output from the C22:

ethtool sfp1                                                                                                                                                            
Settings for sfp1:                                                                                                                                                                     
        Supported ports: [ TP MII ]
        Supported link modes:   10baseT/Half 10baseT/Full
                                100baseT/Half 100baseT/Full
                                1000baseT/Full
        Supported pause frame use: Symmetric Receive-only
        Supports auto-negotiation: Yes
        Supported FEC modes: Not reported
        Advertised link modes:  10baseT/Half 10baseT/Full     
                                100baseT/Half 100baseT/Full
                                1000baseT/Full
        Advertised pause frame use: Symmetric Receive-only
        Advertised auto-negotiation: Yes
        Advertised FEC modes: Not reported
        Speed: 2500Mb/s
        Duplex: Full
        Auto-negotiation: on
        Port: Twisted Pair
        PHYAD: 22
        Transceiver: external                                                              
        MDI-X: Unknown
        Current message level: 0x000000ff (255)
                               drv probe link timer ifdown ifup rx_err tx_err
        Link detected: no                       
jcdutton commented 7 months ago

In C22 mode, the motorcomm yt8821 phy is setting link up in phydev->link = 1. I have not yet found out how to propagate that to the ethtool "Link detected" status. Does anyone have an idea how?

ericwoud commented 7 months ago

I2c is byte oriented, mii is word oriented, so this is where the difference comes from.

jcdutton commented 7 months ago

I have it almost fixed. The PHY link is up, the MAC link is down. For ethtool to show a "Link detected" both have to be up. Does anyone know when the MAC link should be brought up and how to do that? Which function to call where? I guess the options are: 1) When a user does "ifconfig sfp1 up" or similar the MAC link could come up. The the MAC link being up would be ready for the PHY link coming up. 2) The MAC link stays down, saving power, until the PHY link is up, and then activates the MAC link to match.

In: drivers/net/phy/phylink.c: phylink_resolve()
if (pl->phydev)
         link_state.link &= pl->phy_state.link;
pl->phy_state.link == 1      <--- The PHY state
link_state.link == 0             <--- The MAC state.
ericwoud commented 7 months ago

It probably doesn't come up, because 2500base-x inband is not supported by mac.

jcdutton commented 7 months ago

I just need the MAC to come up at 2500, irrespective of what the PHY is doing. The PHY does the rate adjustment.

jcdutton commented 7 months ago

Is there an IRC or similar channel we can chat on. It might be quicker to resolve this?

jcdutton commented 7 months ago

I2c is byte oriented, mii is word oriented, so this is where the difference comes from.

So is this a bug? Will putting the "reg << 1" in the code, fix C22 mode for multiple Transceivers? Thus not needing rollball any more? Can anyone test with other transceivers. I only have one make of transceiver here. I put this in the fixup to force it to try C22: sfp->mdio_protocol = MDIO_I2C_MARVELL_C22; sfp->id.base.extended_cc = SFF8024_ECC_2_5GBASE_T;

frank-w commented 7 months ago

i'm not on irc

i had added erics patches where the last disables inband for 2.5G, maybe you only need a quirk setting the 2.5g for your SFP like we have done for other SFPs

ericwoud commented 7 months ago

Will putting the "reg << 1" in the code, fix C22 mode for multiple Transceivers? Thus not needing rollball any more?

Cannot add << 1, it will break every C22 phy out there. Other modules will still need RollBall protocol. This module, I'll have to take a closer look, what's really going on.

jcdutton commented 7 months ago

@ericwoud Just to clarify. Say we wish to read the ID from device 0x56 using C22. (as is done in phy_device.c: get_phy_c22_id()

define MII_PHYSID1 0x02

define MII_PHYSID2 0x03

The read of 0x02 needs to be converted into reading offset 0x04 (2 octets) from i2c device 0x56. The read of 0x03 needs to be converted into reading offset 0x06 (2 octets) from i2c device 0x56.

So, a conversion from 0x02 to 0x04 has to happen somewhere. Where should that happen?

jcdutton commented 7 months ago

I have not managed to coax the MAC to turn on its link while the yt8821 PHY has a link. So, even with the transceiver reporting link up and link speed, the MAC is not turning on so no traffic passes. Its a bit like wack a mole. The PHY sends the netif_carrier_on() to make the link up, and the MAC phy_link code turns it off again because the MAC is not UP. For traffic to pass both MAC and PHY links need to be up. Is there a command line to use to force the MAC to be UP ?

frank-w commented 7 months ago

Have you tried the quirk approach to set the speed to 2500baseT for the sfp?

jcdutton commented 7 months ago

I have tried various quirks without luck. Please suggest a code snippet of the quirk you are suggesting

frank-w commented 7 months ago

similar to this, but with your vendor/product

https://github.com/frank-w/BPI-Router-Linux/commit/e27aca3760c08b7b05aea71068bd609aa93e7b35

can you pls post part of your dmesg where sfp is recognized for vendor-string and/or output of ethtool -m ethX

jcdutton commented 7 months ago

Hi. With that e27aca3 the sfp passes ethernet traffic, but no SFP PHY is detected because no PHY is even looked for.

If I change the quirks for force a C22 PHY, it detects the yt8821 phy and uses it, but the MAC is disabled so no traffic flows.

I guess I am looking for a quirk that forces "pl->cur_link_an_mode=0x1" instead of "pl->cur_link_an_mode=0x2" I.e. "Fixed" instead of "in-band".

ericwoud commented 7 months ago

guess I am looking for a quirk that forces "pl->cur_link_an_mode=0x1" instead of "pl->cur_link_an_mode=0x2" I.e. "Fixed" instead of "in-band".

I already pointed you the patch that does that in the forum

And you want mode phy instead of fixed to replace inband

jcdutton commented 7 months ago

I have diagnosed the problem. When the yt8821 is being used, it tries to configure the mac interface type from "0x17:2500base-x" to "0x4:sgmii" and this stops the MAC working, as it wishes to be left at "0x17:2500base-x". I think the code in phylink.c is a little confusing. It does not keep state of the PHY separate from the state of the MAC, and thus when the PHY changes, it tries to set the MAC to match it, and this obviously fails. Note, of course, maybe SFP transceivers wish the PHY to match the MAC, but that will not work with the BPI-R3 and the particular transceiver I have. In my case, the PHY is in a different interface mode, and the MAC should be fixed at 2500base-x, no matter what the PHY is doing. This is probably a lot of work to do in the phylink.c in order to keep separate the state of the PHY and the MAC, and also keep details of what the MAC should do depending on the PHY. Note: In order to prove the above, I put in some code around whenever the MAC is asked to change, that if it was a particular interface name, suppress the changing of the MAC rate away from the 0x17 value. Its not a good fix, but proved the point. Here is an example of the problem code. It is happily mixing PHY and MAC interface state:

        if (pl->cur_link_an_mode != mode ||
            pl->link_config.interface != state->interface) {
                pl->cur_link_an_mode = mode;
                pl->link_config.interface = state->interface;
                pl->phy_state.interface = state->interface;

                changed = true;
                phylink_info(pl, "switched to %s/%s link mode\n",
                             phylink_an_mode_str(mode),
                             phy_modes(state->interface));
        }

        if (changed && !test_bit(PHYLINK_DISABLE_STOPPED,
                                 &pl->phylink_disable_state))
                phylink_mac_initial_config(pl, false);

Now, there are some other things I need to fix with regards to the yt8821 code as it does not appear to be reading available and advertised link speeds correctly from the PHY, but that is for another day.

ericwoud commented 7 months ago
ethtool sfp1                                                                                                                                                            
Settings for sfp1:                                                                                                                                                                     
        Supported ports: [ TP MII ]
        Supported link modes:   10baseT/Half 10baseT/Full
                                100baseT/Half 100baseT/Full
                                1000baseT/Full

There is no 2500base-x here, So phylink will not use it.

You need to set the 2500base-x or 2500-base-t bit here in .get_features() of phy driver. Try which of the 2 is really needed

jcdutton commented 7 months ago

Hi. I think the point I am trying to make is this YT8821 transceiver should not need any quirks. Its data sheet says:

YT8821 Key features (PHY):
Supports clause 22 and clause 45 MDC/MDIO management interface
Supports rate adaptor for SERDES
Selectable SERDES interface to MAC control(SGMII /2500BASE-X)
Supports 3.3/1.8V signaling for MDIO access

From this the MTK MAC does: SERDES interface to MAC control only does (2500BASE-X)

So, the MAC needs to adjust the MAC-PHY link to what is compatible with the PHY, in this case fix it at 2500BASE-X and reject SGMII. (I don't think the Mediatek MAC can do SGMII, please correct me if I am wrong) A separate handling of the PHY-Cable negotiation, capabilities needs to happen. So, the Linux kernel should auto discover it without needing quirks, and handle the state of the PHY-Cable link and the MAC-PHY link separately and not mix them up.

For example, the Linux kernel, on detection of a speed change on the PHY-Cable link, will force the MAC-PHY link to go down. It makes no sense.

So far, I have got C22 working, but I have not managed to get C45 working, even though the YT8821 datasheet says it supports it. It supports C45 type commands and gets responses, but the responses do not make much sense yet so I think the C45 requests are crafted wrongly over the i2c-algo-bit.

jcdutton commented 7 months ago

By way of comparison,

The RTL 8221 key features (PHY):
Supports clause 22 and Clause 45 MDC/MDIO management interface
Supports rate adaptor for SerDes
Selectable SERDES interface to MAC control (SGMII/HiSGMII/2500 BASE-X)
Selectable 3.3/1.8V signaling for MMD access

So, the RTL 8821 PHY should not need to talk ROLLBALL, it should be able to do C22 or C45, just like the YT8821.

ericwoud commented 7 months ago

So, the MAC needs to adjust the MAC-PHY link to what is compatible with the PHY, in this case fix it at 2500BASE-X and reject SGMII. (I don't think the Mediatek MAC can do SGMII, please correct me if I am wrong)

To start with yes, but when it all works, I'll look into serdes switching according to linkspeed. I see in the code we both found from the internet that is has auto switching capabilities.

Mediatek MAC can do SGMII.

If you find a way to eliminate the use of rollball protocol, then you are very welcome. But honestly, if it was this simple, someone would have found out already...

jcdutton commented 7 months ago

Observations: Regarding 0x56. I have 2 transceivers. On one it has values in "i2cdump -y 2 0x56 i", the other has all 0xff in "i2cdump -y 2 0x56 i". But, if I do an "ifconfig sfp2 up" the "i2cdump -y 2 0x56 i" starts outputting normal values. If I do "ifconfig sfp2 down" the "i2cdump -y 2 0x56 i" is back to all 0xff again. I also noted that the differences between ifconfig up and down makes no i2c messages to cause transition between normal values and all 0xff. So, something else is causing access to 0x56 page to work or not. I don't know what the something else is, but I thought this observation might help other people.

jcdutton commented 7 months ago

Its is not a final version, but my first draft at the yt8821 support is attached. add-yt8821-phy.diff.txt