raspberrypi / linux

Kernel source tree for Raspberry Pi-provided kernel builds. Issues unrelated to the linux kernel should be posted on the community forum at https://forums.raspberrypi.com/

Pi 4 Ethernet Network fails to connect to some 100Mb Switches #3122

Open noafterglow opened 5 years ago

noafterglow commented 5 years ago

We make a 100Mb switch which we used with Pi 2s and Pi 3s. It's a "HAT", based on the Realtek RTL8306MB. We have shipped thousands of them with Pi 3s. They are bulletproof.

Pi4 will not connect to any port on this switch. The physical link will not establish. We can faintly see a little blink from the lower left LED on the socket, but no connectivity and no link lights.

I suspect this is an autonegotiation problem. It may be related to #3108 and #3121

Any other combination of switches and Pi versions, including some other 100Mb switches not based on the RTL8306MB, seems to work OK, but we have not stress-tested the connection. Obviously we would like to avoid redesigning the HAT for our application, so this is an issue for us. I suspect it will also crop up in a lot of other places.

kernel is 4.19.57.

If there are other registers we can dump to be helpful, I can do that.....

pi@raspberrypi:~ $ sudo mii-tool -vv eth0
Using SIOCGMIIPHY=0x8947
eth0: no link
  registers for MII PHY 1: 
    1140 7949 600d 84a2 0de1 0000 0066 0000
    0000 0300 0000 0000 0000 0000 0000 3000
    0000 0000 0000 00a1 0000 0000 0000 0000
    0400 7800 240e fff1 3403 0000 0000 0000
  product info: vendor 18:03:61, model 10 rev 2
  basic mode:   autonegotiation enabled
  basic status: no link
  capabilities: 1000baseT-HD 1000baseT-FD 100baseTx-FD 100baseTx-HD 10baseT-FD 10baseT-HD
  advertising:  1000baseT-HD 1000baseT-FD 100baseTx-FD 100baseTx-HD 10baseT-FD 10baseT-HD flow-control

I can change the advertisement with no effect, but if I force it the connection is immediately up.

pi@raspberrypi:~ $ sudo mii-tool -R eth0
resetting the transceiver...
pi@raspberrypi:~ $ sudo mii-tool -F 100baseTx-FD eth0
pi@raspberrypi:~ $ sudo mii-tool -vv eth0
Using SIOCGMIIPHY=0x8947
eth0: 100 Mbit, full duplex, link ok
  registers for MII PHY 1: 
    2100 794d 600d 84a2 01e1 0000 0066 0000
    0000 0200 0000 0000 0000 0000 0000 3000
    0000 0000 0000 007b 0000 0000 0000 0000
    0400 7d04 040e fff1 3403 0000 0000 0000
  product info: vendor 18:03:61, model 10 rev 2
  basic mode:   100 Mbit, full duplex
  basic status: link ok
  capabilities: 1000baseT-FD 100baseTx-FD 100baseTx-HD 10baseT-FD 10baseT-HD
  advertising:  1000baseT-FD 100baseTx-FD 100baseTx-HD 10baseT-FD 10baseT-HD
pi@raspberrypi:~ $ 
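
The two register dumps above can be read by hand. The following is a minimal sketch, assuming the PHY follows the standard IEEE 802.3 clause 22 layout for registers 0, 1, 4 and 9 (the bit values match the constants in Linux's `<linux/mii.h>`); the function names `parse_dump` and `describe` are made up for illustration. It shows that before forcing, auto-negotiation is enabled but has not completed and there is no link, and that after `mii-tool -F` the PHY is set to a fixed 100 Mbit full duplex with the link up.

```python
# Bit values follow IEEE 802.3 clause 22 (same numbers as Linux's <linux/mii.h>).
BMCR_SPEED1000 = 0x0040     # register 0: speed select, high bit
BMCR_FULLDPLX = 0x0100      # register 0: full duplex
BMCR_ANENABLE = 0x1000      # register 0: auto-negotiation enabled
BMCR_SPEED100 = 0x2000      # register 0: speed select, low bit
BMSR_LSTATUS = 0x0004       # register 1: link is up
BMSR_ANEGCOMPLETE = 0x0020  # register 1: auto-negotiation finished

# Registers 0-15 copied from the two dumps in this comment.
BEFORE = """1140 7949 600d 84a2 0de1 0000 0066 0000
            0000 0300 0000 0000 0000 0000 0000 3000"""
AFTER = """2100 794d 600d 84a2 01e1 0000 0066 0000
           0000 0200 0000 0000 0000 0000 0000 3000"""


def parse_dump(text):
    """Convert the hex words printed by mii-tool into a list of integers."""
    return [int(word, 16) for word in text.split()]


def describe(regs):
    """Summarise control (0), status (1) and advertisement (4, 9) registers."""
    bmcr, bmsr, anar, ctrl1000 = regs[0], regs[1], regs[4], regs[9]
    forced = None
    if not bmcr & BMCR_ANENABLE:
        # The speed and duplex bits only take effect when auto-negotiation is off.
        if bmcr & BMCR_SPEED1000:
            speed = 1000
        elif bmcr & BMCR_SPEED100:
            speed = 100
        else:
            speed = 10
        forced = (speed, "full" if bmcr & BMCR_FULLDPLX else "half")
    # Register 9 carries the gigabit advertisement, register 4 the 10/100 one.
    advertised = [name for bit, name in (
        (0x0200, "1000baseT-FD"), (0x0100, "1000baseT-HD")) if ctrl1000 & bit]
    advertised += [name for bit, name in (
        (0x0100, "100baseTx-FD"), (0x0080, "100baseTx-HD"),
        (0x0040, "10baseT-FD"), (0x0020, "10baseT-HD"),
        (0x0400, "pause"), (0x0800, "asym-pause")) if anar & bit]
    return {
        "autoneg": bool(bmcr & BMCR_ANENABLE),
        "forced": forced,
        "link": bool(bmsr & BMSR_LSTATUS),
        "aneg_complete": bool(bmsr & BMSR_ANEGCOMPLETE),
        "advertised": advertised,
    }


for label, dump in (("before", BEFORE), ("after -F", AFTER)):
    print(label, describe(parse_dump(dump)))
```

The decoded fields agree with the summary lines that mii-tool prints, which suggests the standard layout is a reasonable assumption for these registers. The vendor-specific registers (16 and up) are not interpreted here.
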
JamesH65 commented 5 years ago

Does the workaround described in this issue help ? https://github.com/raspberrypi/linux/issues/3108#issuecomment-518851625

noafterglow commented 5 years ago

I presume you are referring to the workaround with genet.force_reneg=y ???

NO.

It has no effect. The autonegotiation seems to fail quickly. It does not appear to be a "sometimes" thing.

I can help with hardware debugging, (have tools, but need guidance on what to look for.) Could not find much on the web since "ethernet" is a very common term.

pelwell commented 5 years ago

Have you updated to the latest rpi-update firmware? If dmesg | grep negot returns nothing then the workaround is not enabled.

ghost commented 5 years ago

> I presume you are referring to the workaround with genet.force_reneg=y ???

That workaround currently requires that you first update the firmware and kernel using sudo rpi-update, then reboot with that option in cmdline.txt; otherwise the option will have no effect.
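
The two checks discussed in the last few comments (is the option on the kernel command line, and did the driver log a renegotiation) can be combined. This is a minimal sketch: the helper name `workaround_status` and the sample command line are hypothetical, and the sample log line is the elided one quoted later in this thread.

```python
def workaround_status(cmdline, dmesg):
    """Return (requested, triggered) for the genet.force_reneg workaround.

    requested: the option appears as its own word on the kernel command line.
    triggered: some kernel log line contains "negot", mirroring `dmesg | grep negot`.
    """
    requested = "genet.force_reneg=y" in cmdline.split()
    triggered = any("negot" in line for line in dmesg.splitlines())
    return requested, triggered


# Illustrative inputs only; real values would come from the running system.
sample_cmdline = "console=tty1 rootwait genet.force_reneg=y"
sample_dmesg = "[ 11.599] bcmgenet ... Forcing Renegotiation"

print(workaround_status(sample_cmdline, sample_dmesg))
```

On a Pi the first argument could be read from /proc/cmdline and the second from the output of dmesg. A result of (True, False) would correspond to the situation where the option is set but the link never comes up far enough for the renegotiation to be attempted.
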

noafterglow commented 5 years ago

After rpi-update, same: no effect, no connection. Now running kernel 4.19.64 #1250.

pelwell commented 5 years ago

And what does dmesg | grep negot return?

noafterglow commented 5 years ago

OK, so here's what happens.

Case 1: plugged into the 100Mb Realtek switch. After boot, it returns nothing.

Case 2: plugged into the GS108 1Gb switch. It returns something like [ 11.599] bcmgenet ... Forcing Renegotiation

If I unplug the cable from the Case 1 switch after a while (about 3 minutes) and plug it into the GS108, the output ends up looking like Case 2, except that the timestamp is much larger, around 250 seconds.

pelwell commented 5 years ago

Thanks - that helps. This is clearly a different problem to that reported in #3108, although they may be related. The Forcing renegotiation message is displayed when the renegotiation occurs, which is at the point where the link comes up for the first time (although "up" may not be fully working). In the case of the Realtek switch it sounds like the connection never gets that far.

noafterglow commented 5 years ago

Yes. Is the phy programmable? Normally connection is pretty automatic with no SW involved. I couldn't find any usable data sheets... sigh...

pelwell commented 5 years ago

The PHY is programmable. Watching the lights on the switch, there is a link very soon after power is applied, then when the kernel starts to boot the link gets reset (not as the result of my workaround) and auto-negotiation starts again. I'm wondering if preventing this reset would change the behaviour.

pelwell commented 5 years ago

Investigating the resets to prevent a re-negotiation led to an alternative workaround - skipping a reset step in the driver. See #3108 for more details.

pelwell commented 5 years ago

The latest rpi-update firmware includes the new workaround.

noafterglow commented 5 years ago

Upgrading to 4.19.65 has no effect for me. The issue remains. It appears that the problem is with the setup of the chip itself. As I mentioned before, I never get a link unless I force the PHY to 100Mb. I suspect that the timing of auto-negotiation is off, and that is what is causing these problems. Unfortunately the Pi Foundation picked another NDA-required chip to use as a PHY, so I can't debug it myself... Perhaps a call to Broadcom is in order?

JamesH65 commented 4 years ago

@noafterglow Is this issue still present on the very latest Raspbian/firmware?

noafterglow commented 4 years ago

I will check it out and let you know.

noafterglow commented 4 years ago

Sorry for the delay, guys. I can confirm the problem still exists. I updated to the latest kernel, 5.4.51 #1325, and it still behaves as before: no Ethernet connectivity without forcing the link with mii-tool -F 100baseTx-FD eth0.
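
For anyone scripting this manual workaround, here is a minimal sketch that decides from mii-tool's status line whether the link is up and builds the forcing command used in this thread. The helper names `link_is_up` and `force_command` are hypothetical, the parsing assumes the output format shown in the transcripts above, and nothing is executed by the sketch itself.

```python
def link_is_up(mii_output, iface="eth0"):
    """Return True if the interface's status line from mii-tool ends in 'link ok'."""
    for line in mii_output.splitlines():
        if line.startswith(iface + ":"):
            return line.rstrip().endswith("link ok")
    raise ValueError(f"no status line for {iface}")


def force_command(iface="eth0", media="100baseTx-FD"):
    """Build the argument list for forcing the link, as done by hand in this thread."""
    return ["mii-tool", "-F", media, iface]


# Status lines copied from the transcripts in the first comment.
no_link = "Using SIOCGMIIPHY=0x8947\neth0: no link"
link_ok = "Using SIOCGMIIPHY=0x8947\neth0: 100 Mbit, full duplex, link ok"

for output in (no_link, link_ok):
    if not link_is_up(output):
        # On a real system this list could be passed to subprocess.run as root.
        print("would run:", " ".join(force_command()))
    else:
        print("link already up")
```

The forced setting lives in the PHY, so it presumably has to be reapplied after every boot or PHY reset; that is an assumption, not something confirmed in this thread.
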

In addition, I can now confirm that Realtek RTL8367-CG chipsets also fail to connect. I'm waiting for more info on this issue from our CM in China.

Happy to provide hardware for the debugging effort, or to run whatever you like on the boards.

noafterglow commented 4 years ago

James,

Sorry for the delay. I updated the GitHub case. Sadly, the problem still exists, and we have now found another chipset (this time 1Gb rather than 100Mb) which also experiences the same issue.

noafterglow commented 4 years ago

Is there anything I can do to help this along? This appears to affect quite a few Realtek switch chips, though not all of them.

@JamesH65 ? @pelwell ? It seems there are other threads which also show these kinds of Pi 4 problems.

noafterglow commented 4 years ago

Found an interesting effect. When running mii-tool multiple times, it occasionally reports the advertised capabilities of the remote peer, apparently correctly. No link is established, but the PHY does apparently discover the remote peer and is able to report it to the MAC. Thus this may not be a PHY timing issue, but rather a timing issue between the PHY and the MAC.

Here is the dump of 4 consecutive executions:

pi@raspberrypi:~ $ sudo mii-tool eth0 -vvvvvv
Using SIOCGMIIPHY=0x8947
eth0: no link
  registers for MII PHY 1: 
    1140 7949 600d 84a2 01e1 0000 0066 2001
    0000 0200 0000 0000 0000 0000 0000 3000
    1000 0021 0000 0027 0000 0180 0180 0000
    0400 7d00 040e fff1 34aa 0477 0000 0000
  product info: vendor 18:03:61, model 10 rev 2
  basic mode:   autonegotiation enabled
  basic status: no link
  capabilities: 1000baseT-FD 100baseTx-FD 100baseTx-HD 10baseT-FD 10baseT-HD
  advertising:  1000baseT-FD 100baseTx-FD 100baseTx-HD 10baseT-FD 10baseT-HD
pi@raspberrypi:~ $ sudo mii-tool eth0 -vvvvvv
Using SIOCGMIIPHY=0x8947
eth0: no link
  registers for MII PHY 1: 
    1140 7949 600d 84a2 01e1 0000 0064 2001
    0000 0200 0000 0000 0000 0000 0000 3000
    1000 0001 0000 0000 0000 0180 0180 0000
    0400 1000 0000 fff1 34aa 0477 0000 0000
  product info: vendor 18:03:61, model 10 rev 2
  basic mode:   autonegotiation enabled
  basic status: no link
  capabilities: 1000baseT-FD 100baseTx-FD 100baseTx-HD 10baseT-FD 10baseT-HD
  advertising:  1000baseT-FD 100baseTx-FD 100baseTx-HD 10baseT-FD 10baseT-HD
pi@raspberrypi:~ $ sudo mii-tool eth0 -vvvvvv
Using SIOCGMIIPHY=0x8947
eth0: no link
  registers for MII PHY 1: 
    1140 7949 600d 84a2 01e1 c5e1 006f 2001
    4002 0200 0000 0000 0000 0000 0000 3000
    1000 0001 0000 0000 0000 0180 0180 0000
    0400 3818 0400 fff1 34aa 04a8 0000 0000
  product info: vendor 18:03:61, model 10 rev 2
  basic mode:   autonegotiation enabled
  basic status: no link
  capabilities: 1000baseT-FD 100baseTx-FD 100baseTx-HD 10baseT-FD 10baseT-HD
  advertising:  1000baseT-FD 100baseTx-FD 100baseTx-HD 10baseT-FD 10baseT-HD
  **link partner: 100baseTx-FD 100baseTx-HD 10baseT-FD 10baseT-HD flow-control**
pi@raspberrypi:~ $ sudo mii-tool eth0 -vvvvvv
Using SIOCGMIIPHY=0x8947
eth0: no link
  registers for MII PHY 1: 
    1140 7949 600d 84a2 01e1 0000 0066 2001
    0000 0200 0000 0000 0000 0000 0000 3000
    1000 0021 0000 0019 0000 0180 0180 0000
    0400 6d00 040e fff1 34aa 04a8 0000 0000
  product info: vendor 18:03:61, model 10 rev 2
  basic mode:   autonegotiation enabled
  basic status: no link
  capabilities: 1000baseT-FD 100baseTx-FD 100baseTx-HD 10baseT-FD 10baseT-HD
  advertising:  1000baseT-FD 100baseTx-FD 100baseTx-HD 10baseT-FD 10baseT-HD
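
The difference between the four runs is visible in registers 5 and 6. The sketch below decodes them under the assumption that they are the standard clause 22/28 link partner ability and auto-negotiation expansion registers (bit values as in Linux's `<linux/mii.h>`); the name `decode` is hypothetical. It shows that only the third run captured the partner's base page.

```python
# Register 5 (link partner ability) bits, listed in mii-tool's print order.
LPA_BITS = [
    (0x0100, "100baseTx-FD"),
    (0x0080, "100baseTx-HD"),
    (0x0040, "10baseT-FD"),
    (0x0020, "10baseT-HD"),
    (0x0400, "flow-control"),
]
LPA_LPACK = 0x4000       # partner acknowledged our advertisement
EXPANSION_NWAY = 0x0001  # register 6: partner is able to auto-negotiate

# (register 5, register 6) taken from the four dumps in this comment.
RUNS = [(0x0000, 0x0066), (0x0000, 0x0064), (0xC5E1, 0x006F), (0x0000, 0x0066)]


def decode(lpa, expansion):
    """Decode what the PHY has learned about its link partner."""
    return {
        "abilities": [name for bit, name in LPA_BITS if lpa & bit],
        "partner_ack": bool(lpa & LPA_LPACK),
        "partner_autoneg": bool(expansion & EXPANSION_NWAY),
    }


for number, (lpa, expansion) in enumerate(RUNS, start=1):
    print(number, decode(lpa, expansion))
```

In the third run the partner's abilities and its acknowledge bit are both present, yet register 1 still reports no link, which is consistent with the observation that negotiation gets partway and then fails.
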
jayanta525 commented 3 years ago

Any updates?

Sven-v-Beuningen commented 3 years ago

I think I've been hit by exactly this issue. I built a device with an internal 5-port switch. I think it is similar to the HAT you're talking about. On one side there are 4 Ethernet sockets. To one of them I connected an Arduino with an Ethernet shield. That one works fine. The other 3 sockets are connected to 3 Raspberry Pi 4Bs. None of them gets a link. Only the LEDs of the Ethernet sockets on the Pis flash from time to time. Is there anything I can do to get this working?

noafterglow commented 3 years ago

While I have not confirmed this beyond a single unit of a 1Gb switch chip, it may be that the RPi's Ethernet chip is either extremely sensitive to the switch's crystal frequency, or is itself calibrated very poorly. It would be nice if you could experiment with your XTAL loading capacitors and report back whether you are able to get it working and, if so, what frequency range actually works for you.

noafterglow commented 3 years ago

Root cause appears to be negotiation timing. I have now tested this on 2 peers with out-of-spec timing, which is apparently fairly common. The Pi 4 is apparently extremely sensitive to timing during negotiation, so much so that it will fail to establish a link if the peer's crystal is out of spec. While I don't have a full analysis of how all of that happens, I can now report that the peer's crystal must be in spec for the Pi 4 to negotiate a connection. Also note that the crystal can be wildly out of spec, and the Pi 4 will happily communicate at either 100Mb or 1Gb speeds once told to do so with a manual connection setup. This suggests that timing will be an issue going forward on some Pi 4s as crystals invariably age and drift.
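
When reporting crystal measurements, it helps to express the error in parts per million. This is a minimal sketch of that arithmetic: the 25 MHz nominal frequency and the 50 ppm limit are placeholder assumptions for illustration only, and the real values should come from the switch chip's datasheet. The function names are hypothetical.

```python
def ppm_offset(measured_hz, nominal_hz=25_000_000):
    """Frequency error of a crystal in parts per million (positive means fast)."""
    return (measured_hz - nominal_hz) * 1e6 / nominal_hz


def in_spec(measured_hz, nominal_hz=25_000_000, tolerance_ppm=50.0):
    """True if the error is within the assumed +/- tolerance_ppm window."""
    return abs(ppm_offset(measured_hz, nominal_hz)) <= tolerance_ppm


# Hypothetical frequency-counter readings, not measurements from this thread.
for reading in (25_000_500, 25_002_500, 24_996_000):
    print(reading, round(ppm_offset(reading), 1), in_spec(reading))
```

With the placeholder values, a reading of 25,000,500 Hz is 20 ppm fast and passes, while 25,002,500 Hz (100 ppm fast) and 24,996,000 Hz (160 ppm slow) do not.
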

kaneelschep commented 2 years ago

I ran into the same issue after upgrading from a Pi 3 to a Pi 4. Nothing changed here but the Pi and, like you, I have no LAN connection. No light on the port. The same Pi 4 works fine on all my other switches. A friend's Pi 4 had exactly the same issue. The switch is a 3Com 3cm-3cgsu08(b).

noafterglow commented 1 year ago

ADDITIONAL INFO: It seems that Energy Efficient Ethernet can cause problems with SOME Pi variants. We ran into this on the RTL8367N, which has this as a pin strapping option or an option which can be read in from EEPROM. It is possible that SOME switches using this chip will have EEE enabled. The symptoms on a Pi 4 Model B rev 1.2 were: Connection auto negotiates, then drops several times at 1Gb, then connection gets established at 100Mb, BUT under load it drops again over and over. There are several episodes of carrier lost in the log files.
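
To compare switches or settings, the number of link drops can be counted from a log. This is a minimal sketch, assuming the drops appear as lines containing the interface name followed by "carrier lost" (the wording dhcpcd typically uses); the sample log is made up for illustration and is not taken from an affected system, and `count_drops` is a hypothetical helper.

```python
def count_drops(log_text, iface="eth0", marker="carrier lost"):
    """Count log lines that report a lost carrier on the given interface."""
    return sum(
        1
        for line in log_text.splitlines()
        if f"{iface}:" in line and marker in line
    )


# Made-up sample; on a real system the text might come from the system journal.
SAMPLE_LOG = """\
dhcpcd[412]: eth0: carrier acquired
dhcpcd[412]: eth0: carrier lost
dhcpcd[412]: eth0: carrier acquired
dhcpcd[412]: wlan0: carrier lost
dhcpcd[412]: eth0: carrier lost
"""

print(count_drops(SAMPLE_LOG))
```

Running the same count before and after disabling EEE on the switch would give a rough indication of whether EEE is involved, assuming the load pattern is comparable.
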

distancerunner commented 1 year ago

I used an SD card in a Pi Zero (bought in 2018). The system was a little old. Now I transferred the SD card to a Pi 4 (the keyboard model, bought in 2023). The Wi-Fi and eth0 were missing on the keyboard Pi model:

reibuehl commented 1 year ago

I have what seems to be the same issue with an Edimax ES5500M switch. Older Raspberry Pis and other systems work fine with the switch but two RPi4 show the same behavior with no connectivity and no link lights. If I connect them to a UniFi switch, they work fine. I did run rpi-update and added genet.force_reneg=y to cmdline.txt but that did not help.