raspberrypi / rpi-eeprom

Installation scripts and binaries for the Raspberry Pi 4 and Raspberry Pi 5 bootloader EEPROMs
https://www.raspberrypi.com/documentation/computers/raspberry-pi.html#raspberry-pi-boot-eeprom
Other
1.27k stars 203 forks source link

Network (Re)Boot doesn't work anymore with Linux 5.16+ #417

Closed mripard closed 2 years ago

mripard commented 2 years ago

Describe the bug

This issue is a regression following the introduction of commit c3a4c69360ab43560f212eed326c9d8bde35b14c in Linux 5.16.

With that commit and if we reboot, the bootloader seems to wait for the ethernet link forever, thus breaking any network boot setup. With a power cycle (or a revert of the commit above), everything works fine.

Steps to reproduce the behaviour

Device (s)

Raspberry Pi 4 Mod. B

Bootloader configuration.

[all]
BOOT_UART=0
WAKE_ON_GPIO=1
POWER_OFF_ON_HALT=0
BOOT_ORDER=0xf21

System

pi@raspberrypi:~$ cat /etc/rpi-issue 
Raspberry Pi reference 2021-10-30
Generated using pi-gen, https://github.com/RPi-Distro/pi-gen, 288b21fc27e128ea6b330777aca68e0061ebf4fe, stage4
pi@raspberrypi:~$ vcgencmd bootloader_version 
Apr 29 2021 17:11:25
version c2f8c388c4ee37ad709ace403467d163e8dd91ce (release)
timestamp 1619712685
update-time 1627302434
capabilities 0x0000001f
pi@raspberrypi:~$ vcgencmd version 
Aug 19 2021 12:27:01 
Copyright (c) 2012 Broadcom
version ef2c018dccdeb94b0376db62a2ea4c882f9b500d (clean) (release) (start)

Bootloader logs

PM_RSTS: 0x00000020
RPi: BOOTLOADER release VERSION:c2f8c388 DATE: Apr 29 2021 TIME: 17:11:29 BOOTMODE: 0x00000006 part: 0 BUILD_TIMESTAMP=1619712685 0xb7f44626 0x00c03111 0x000a4fc4
PM_RSTS: 0x00000020
part 00000000 reset_info 00000000
uSD voltage 3.3V
Initialising SDRAM 'Micron' 16Gb x2 total-size: 32 Gbit 3200
PCI reset
PCI reset
PCI reset
XHCI-STOP
xHC ver: 256 HCS: 05000420 fc000031 00e70004 HCC: 002841eb
xHC ver: 256 HCS: 05000420 fc000031 00e70004 HCC: 002841eb
xHC ports 5 slots 32 intrs 4
Reset USB port-power 1000 ms
xhci_set_port_power 1 0
xhci_set_port_power 2 0
xhci_set_port_power 3 0
xhci_set_port_power 4 0
xhci_set_port_power 5 0
xhci_set_port_power 1 1
xhci_set_port_power 2 1
xhci_set_port_power 3 1
xhci_set_port_power 4 1
xhci_set_port_power 5 1
XHCI-STOP
xHC ver: 256 HCS: 05000420 fc000031 00e70004 HCC: 002841eb
XHCI-STOP
xHC ver: 256 HCS: 05000420 fc000031 00e70004 HCC: 002841eb
xHC ver: 256 HCS: 05000420 fc000031 00e70004 HCC: 002841eb
xHC ports 5 slots 32 intrs 4
Boot mode: SD (01) order f2
SD HOST: 250000000 CTL0: 0x00000000 BUS: 100000 Hz actual: 100000 HZ div: 2500 (1250) status: 0x1fff0000 delay: 1080
SD HOST: 250000000 CTL0: 0x00000f00 BUS: 100000 Hz actual: 100000 HZ div: 2500 (1250) status: 0x1fff0000 delay: 1080
EMMC
SD retry 1 oc 0
SD HOST: 250000000 CTL0: 0x00000000 BUS: 100000 Hz actual: 100000 HZ div: 2500 (1250) status: 0x1fff0000 delay: 1080
SD retry 2 oc 0
SD HOST: 250000000 CTL0: 0x00000000 BUS: 100000 Hz actual: 100000 HZ div: 2500 (1250) status: 0x1fff0000 delay: 1080
SDV1
SD retry 3 oc 0
SD HOST: 250000000 CTL0: 0x00000000 BUS: 100000 Hz actual: 100000 HZ div: 2500 (1250) status: 0x1fff0000 delay: 1080
SD CMD: 0x371a0010 (55) 0x0 0x1fff0001
Failed to open device: 'sdcard' (cmd 371a0010 status 1fff0001)
Retry SD 1
SD HOST: 250000000 CTL0: 0x00000000 BUS: 100000 Hz actual: 100000 HZ div: 2500 (1250) status: 0x1fff0000 delay: 1080
SD HOST: 250000000 CTL0: 0x00000f00 BUS: 100000 Hz actual: 100000 HZ div: 2500 (1250) status: 0x1fff0000 delay: 1080
EMMC
SD retry 1 oc 0
SD HOST: 250000000 CTL0: 0x00000000 BUS: 100000 Hz actual: 100000 HZ div: 2500 (1250) status: 0x1fff0000 delay: 1080
SD retry 2 oc 0
SD HOST: 250000000 CTL0: 0x00000000 BUS: 100000 Hz actual: 100000 HZ div: 2500 (1250) status: 0x1fff0000 delay: 1080
SDV1
SD retry 3 oc 0
SD HOST: 250000000 CTL0: 0x00000000 BUS: 100000 Hz actual: 100000 HZ div: 2500 (1250) status: 0x1fff0000 delay: 1080
SD CMD: 0x371a0010 (55) 0x0 0x1fff0001
Failed to open device: 'sdcard' (cmd 371a0010 status 1fff0001)
Boot mode: NETWORK (02) order f
GENET: RESET_PHY
NET_BOOT: dc:a6:32:0e:f7:01 wait for link TFTP: 0.0.0.0
Stopping network
RX: 0 IP: 0 IPV4: 0 MAC: 0 UDP: 0 UDP RECV: 0 IP_CSUM_ERR: 0 UDP_CSUM_ERR: 0
RX_DISCARD: 0
  GR64: 0x00000000   GR64: 0x00000000  GR127: 0x00000000  GR255: 0x00000000 
 GR511: 0x00000000 GR1023: 0x00000000 GR1518: 0x00000000  GRMGV: 0x00000000 
GR2047: 0x00000000 GR4095: 0x00000000 GR9216: 0x00000000  GRPKT: 0x00000000 
 GRBYT: 0x00000000  GRMCA: 0x00000000  GRBCA: 0x00000000  GRFCS: 0x00000000 
 GRXCF: 0x00000000  GRXPF: 0x00000000  GRXUO: 0x00000000  GRALN: 0x00000000 
 GRFLR: 0x00000000  GRCDE: 0x00000000  GRFCR: 0x00000000  GROVR: 0x00000000 
 GRJBR: 0x00000000 GRMTUE: 0x00000000  GRPOK: 0x00000000   GRUC: 0x00000000 
 GRPPP: 0x00000000  GRCRC: 0x00000000 

  TR64: 0x00000000  TR127: 0x00000000  TR255: 0x00000000  TR511: 0x00000000 
TR1023: 0x00000000 TR1518: 0x00000000  TRMGV: 0x00000000 TR2047: 0x00000000 
TR4095: 0x00000000 TR9216: 0x00000000  GTPKT: 0x00000000  GTMCA: 0x00000000 
 GTBCA: 0x00000000  GTXPF: 0x00000000  GTXCF: 0x00000000  GTFCS: 0x00000000 
 GTOVR: 0x00000000  GTDRF: 0x00000000  GTEDF: 0x00000000  GTSCL: 0x00000000 
 GTMCL: 0x00000000  GTLCL: 0x00000000  GTXCL: 0x00000000  GTFRG: 0x00000000 
 GTNCL: 0x00000000  GTJBR: 0x00000000  GTBYT: 0x00000000  GTPOK: 0x00000000 
  GTUC: 0x00000000 
GENET STOP: 0
NETBOOT CANCEL
NETBOOT init failed
Boot mode: RESTART (0f) order 0
Restart 0 max -1
Boot mode: SD (01) order f2
SD HOST: 250000000 CTL0: 0x00000000 BUS: 100000 Hz actual: 100000 HZ div: 2500 (1250) status: 0x1fff0000 delay: 1080
SD HOST: 250000000 CTL0: 0x00000f00 BUS: 100000 Hz actual: 100000 HZ div: 2500 (1250) status: 0x1fff0000 delay: 1080
EMMC
SD retry 1 oc 0
SD HOST: 250000000 CTL0: 0x00000000 BUS: 100000 Hz actual: 100000 HZ div: 2500 (1250) status: 0x1fff0000 delay: 1080
SD retry 2 oc 0
SD HOST: 250000000 CTL0: 0x00000000 BUS: 100000 Hz actual: 100000 HZ div: 2500 (1250) status: 0x1fff0000 delay: 1080
SDV1
SD retry 3 oc 0
SD HOST: 250000000 CTL0: 0x00000000 BUS: 100000 Hz actual: 100000 HZ div: 2500 (1250) status: 0x1fff0000 delay: 1080
SD CMD: 0x371a0010 (55) 0x0 0x1fff0001
Failed to open device: 'sdcard' (cmd 371a0010 status 1fff0001)
Retry SD 1
SD HOST: 250000000 CTL0: 0x00000000 BUS: 100000 Hz actual: 100000 HZ div: 2500 (1250) status: 0x1fff0000 delay: 1080
SD HOST: 250000000 CTL0: 0x00000f00 BUS: 100000 Hz actual: 100000 HZ div: 2500 (1250) status: 0x1fff0000 delay: 1080
EMMC
SD retry 1 oc 0
SD HOST: 250000000 CTL0: 0x00000000 BUS: 100000 Hz actual: 100000 HZ div: 2500 (1250) status: 0x1fff0000 delay: 1080
SD retry 2 oc 0
SD HOST: 250000000 CTL0: 0x00000000 BUS: 100000 Hz actual: 100000 HZ div: 2500 (1250) status: 0x1fff0000 delay: 1080
SDV1
SD retry 3 oc 0
SD HOST: 250000000 CTL0: 0x00000000 BUS: 100000 Hz actual: 100000 HZ div: 2500 (1250) status: 0x1fff0000 delay: 1080
SD CMD: 0x371a0010 (55) 0x0 0x1fff0001
Failed to open device: 'sdcard' (cmd 371a0010 status 1fff0001)
Boot mode: NETWORK (02) order f
Stopping network
Stopping network
NET_BOOT: dc:a6:32:0e:f7:01 wait for link TFTP: 0.0.0.0

At that point it will just loop forever

USB boot

No response

NVMe boot

No response

Network (TFTP boot)

No response

timg236 commented 2 years ago

@pelwell @mripard Please can you add a link to the commit. The Ethernet PHY is supposed to be reset by a watchdog reset so we probably want to revert this kernel commit in the downstream kernel because we can't expect people to upgrade their bootloaders (even if there is a workaround).

mripard commented 2 years ago

I was expecting GitHub to fill it up automatically, but I guess it doesn't work across repos. Here it is: https://github.com/raspberrypi/linux/commit/c3a4c69360ab43560f212eed326c9d8bde35b14c

pelwell commented 2 years ago

Reverted on rpi-5.16.y and rpi-5.17.y.

timg236 commented 2 years ago

Thanks. Unusually Google didn't find it

I think this is the culprit "PHY_BRCM_AUTO_PWRDWN_ENABLE "

The bootloader uses MDIO to do link negotiation but it doesn't turn the PHY on, hopefully, you can turn the PHY on again via MDIO :)

ffainelli commented 2 years ago

Yes, with IDDQ-SR you can still access the PHY over MDIO. The proper fix, rather than revert the change would be to make sure that the PHY is fully functional, but that might be difficult with code masked in the ROM.

Another way to fix this would be to introduce a reboot hook that calls back into the firmware and in turn have the VPU firmware ensure that the PHY is not left in IDDQ-SR.

The power savings with IDDQ-SR are very real and in the order of several hundreds of milli Watts, so I would be inclined to keep them.

pelwell commented 2 years ago

We can drop the reversions once there is some mitigation in the firmware.

timg236 commented 2 years ago

Both the firmware and bootloader will need to be updated for this and this can take a long time to filter through. Enabling this via an overlay might be an option might be useful for anyone who wants this functionality earlier.

pelwell commented 2 years ago

We'll never ship 5.16 builds because it isn't an LTS, but I'm open to back-porting the patch to 5.15 once the dust has settled.

timg236 commented 2 years ago

Sounds good. There probably won't be an immediate update for bootloader / firmware though. Possibly, bootloader first so that it gets included in the next manufacturing release that supports network-install

timg236 commented 2 years ago

@mripard I have an experimental bootloader fix attached. Please could you try this with the kernel revert removed.

However, I'm finding that on 5.18 with the revert-reverted it looks like the auto-power was already disabled by the time bootloader was running.

# Bit 1 is zero so auto power down of DLL is ENABLED
sudo ./miitool w eth0 0x1c 0x1400
141d
GENET: RESET_PHY 0
CTL 1140 PHY ID 600d 84a2
MII SPARE CTRL3: 0x001f

The bootloader has always cleared the bit11 in MII register zero so I'm assuming that the auto-power down bits in the shadow registers had survived OR there was some thing wrong in the sequencing of PHY reset / clearing power-down.

pieeprom-issue-417.zip

timg236 commented 2 years ago

Actually, I think this only fails on Pi4B 1.1 or older because I can reproduce the problem on that board.

ffainelli commented 2 years ago

Was the system you tested before and the Pi4B 1.1 or older use different Ethernet PHYs?

timg236 commented 2 years ago

Was the system you tested before and the Pi4B 1.1 or older use different Ethernet PHYs?

All Pi4 model B use the same ethernet PHY, however, the external reset implementation changed and I think that affects both the ethernet and PCIe. I'm going to try clearing this from the firmware before the reboot. Unfortunately, just setting the reset bit in ctrl 0 or disabling APD in the bootloader isn't enough. The link is up but no packets sent/received.

mripard commented 2 years ago

I tested your new eeprom build, and it's still stuck at boot waiting for the link. It's not clear to me how you can get the revision, but if I'm to trust the machine model passed to Linux, I have a rev 1.1

lurch commented 2 years ago

It's not clear to me how you can get the revision

https://www.raspberrypi.com/documentation/computers/raspberry-pi.html#raspberry-pi-revision-codes

timg236 commented 2 years ago

Thanks @mripard the board revision is on the second line of the bootloader log output (0x00c03111)

I think this confirms that this is a 1.1 revision issue which is good because it doesn't impact CM4 but the fix is not clear yet.

mripard commented 2 years ago

The board revision is c03111 so RPi4 1.1 indeed

timg236 commented 2 years ago

@mripard I was able to reproduce the issue on a RPI4 1.1 and this version seems to resolve the issue. Please could you try it in your setup? We'll need to add the reboot hook for older bootloaders as well.

pieeprom-genet-reset.zip

mripard commented 2 years ago

Unfortunately, it doesn't work either. If that makes any difference, I flashed it that way (using the current RaspberryPi OS build):

root@raspberrypi:~# cat bootconf.txt 
[all]
BOOT_UART=1
WAKE_ON_GPIO=1
POWER_OFF_ON_HALT=0
BOOT_ORDER=0x21
root@raspberrypi:~# md5sum pieeprom.bin 
8922fcb817c841d8dcb8e23e1c6886d0  pieeprom.bin
root@raspberrypi:~# rpi-eeprom-config --out netboot-pieeprom.bin --config bootconf.txt pieeprom.bin
root@raspberrypi:~# rpi-eeprom-update -d -f ./netboot-pieeprom.bin
*** INSTALLING ./netboot-pieeprom.bin  ***

   CURRENT: Wed 13 Apr 15:58:38 UTC 2022 (1649865518)
    UPDATE: Wed 13 Apr 15:58:38 UTC 2022 (1649865518)
    BOOTFS: /boot

EEPROM updates pending. Please reboot to apply the update.
To cancel a pending update run "sudo rpi-eeprom-update -r".

I rebooted, waited for the system to boot, and then rebooted again into a 5.18 kernel I built (using TFTP).

RPi: BOOTLOADER release VERSION:c68f0c06 DATE: 2022/04/13 TIME: 16:58:38 BOOTMODE: 0x00000006 part: 0 BUILD_TIMESTAMP=1649865518 0xb7f44626 0x00c03111 0x0005bb58
PM_RSTS: 0x00001000
part 00000000 reset_info 00000000
uSD voltage 3.3V
Initialising SDRAM 'Micron' 16Gb x2 total-size: 32 Gbit 3200
DDR 3200 1 0 32 152

XHCI-STOP
xHC ver: 256 HCS: 05000420 fc000031 00e70004 HCC: 002841eb
USBSTS 11
xHC ver: 256 HCS: 05000420 fc000031 00e70004 HCC: 002841eb
xHC ports 5 slots 32 intrs 4
Reset USB port-power 1000 ms
xhci_set_port_power 1 0
xhci_set_port_power 2 0
xhci_set_port_power 3 0
xhci_set_port_power 4 0
xhci_set_port_power 5 0
xhci_set_port_power 1 1
xhci_set_port_power 2 1
xhci_set_port_power 3 1
xhci_set_port_power 4 1
xhci_set_port_power 5 1
XHCI-STOP
xHC ver: 256 HCS: 05000420 fc000031 00e70004 HCC: 002841eb
USBSTS 18
XHCI-STOP
xHC ver: 256 HCS: 05000420 fc000031 00e70004 HCC: 002841eb
USBSTS 19
xHC ver: 256 HCS: 05000420 fc000031 00e70004 HCC: 002841eb
xHC ports 5 slots 32 intrs 4
Boot mode: SD (01) order 2
USB2[1] 400202e1 connected
USB2 root HUB port 1 init
DEV [01:00] 2.16 000000:01 class 9 VID 2109 PID 3431
HUB init [01:00] 2.16 000000:01
SD HOST: 250000000 CTL0: 0x00000000 BUS: 200000 Hz actual: 200000 HZ div: 1250 (625) status: 0x1fff0000 delay: 540
SD HOST: 250000000 CTL0: 0x00000f00 BUS: 200000 Hz actual: 200000 HZ div: 1250 (625) status: 0x1fff0000 delay: 540
EMMC
SD retry 1 oc 0
SD HOST: 250000000 CTL0: 0x00000000 BUS: 200000 Hz actual: 200000 HZ div: 1250 (625) status: 0x1fff0000 delay: 540
SD retry 2 oc 0
SD HOST: 250000000 CTL0: 0x00000000 BUS: 200000 Hz actual: 200000 HZ div: 1250 (625) status: 0x1fff0000 delay: 540
SDV1
SD retry 3 oc 0
SD HOST: 250000000 CTL0: 0x00000000 BUS: 200000 Hz actual: 200000 HZ div: 1250 (625) status: 0x1fff0000 delay: 540
SD CMD: 0x371a0010 (55) 0x0 0x1fff0001
Failed to open device: 'sdcard' (cmd 371a0010 status 1fff0001)
Retry SD 1
SD HOST: 250000000 CTL0: 0x00000000 BUS: 200000 Hz actual: 200000 HZ div: 1250 (625) status: 0x1fff0000 delay: 540
SD HOST: 250000000 CTL0: 0x00000f00 BUS: 200000 Hz actual: 200000 HZ div: 1250 (625) status: 0x1fff0000 delay: 540
EMMC
SD retry 1 oc 0
SD HOST: 250000000 CTL0: 0x00000000 BUS: 200000 Hz actual: 200000 HZ div: 1250 (625) status: 0x1fff0000 delay: 540
SD retry 2 oc 0
SD HOST: 250000000 CTL0: 0x00000000 BUS: 200000 Hz actual: 200000 HZ div: 1250 (625) status: 0x1fff0000 delay: 540
SDV1
SD retry 3 oc 0
SD HOST: 250000000 CTL0: 0x00000000 BUS: 200000 Hz actual: 200000 HZ div: 1250 (625) status: 0x1fff0000 delay: 540
SD CMD: 0x371a0010 (55) 0x0 0x1fff0001
Failed to open device: 'sdcard' (cmd 371a0010 status 1fff0001)
Boot mode: NETWORK (02) order 0
GENET: RESET_PHY 0
CTL 1140 PHY ID 600d 84a2
MII_CONTROL 1140
GENET: RESET_PHY 0
CTL 1000 PHY ID 600d 84a2
MII_CONTROL 1000
NET_BOOT: dc:a6:32:0e:f7:01 wait for link TFTP: 0.0.0.0
HDMI0 edid block 0 offset 0
00ffffffffffff005304000001010101
003a010380000078ea53a5a756529c26
1150540108008100814081809500b300
8bc0a9400101023a801871382d40582c
450000000000001e7f2156aa51001e30
468f3300142b2100001e000000fd0038
3e1e5319000a202020202020000000fc
0048444d490a2020202020202020017d
HDMI0 edid block 1 offset 128
020317f1444504030223097f07830100
0065030c001000023a801871382d4058
2c4500142b2100001e011d007251d01e
206e285500142b2100001e8c0ad08a20
e02d10103e9600142b21000018011d80
18711c1620582c2500142b2100009e00
00000000000000000000000000000000
00000000000000000000000000000075
HDMI0: best-mode 1 (limit 1) 1280x720 60 Hz CEA modes 1c000100000000002000000000000000
LINK STATUS: speed: 1000 full duplex
Link ready
GENET START: 64 16 32
GENET: UMAC_START 0xdca6320e 0xf7010000
RX: 0 IP: 0 IPV4: 0 MAC: 0 UDP: 0 UDP RECV: 0 IP_CSUM_ERR: 0 UDP_CSUM_ERR: 0
DHCP src: 5e:0f:5b:3f:b8:0d 192.168.20.254
YI_ADDR 192.168.20.87
OPTIONS:-
    op: 53 len:   1 DHCP recv OFFER (2) expect OFFER
    op: 54 len:   4 192.168.20.254
    op: 51 len:   4 
    op:  1 len:   4 255.255.255.0
    op:  3 len:   4 192.168.20.254
DHCP src: b4:2e:99:3d:09:1d 192.168.20.10
OPTIONS:-
    op: 53 len:   1 DHCP recv OFFER (2) expect OFFER
    op: 54 len:   4 192.168.20.10
    op: 60 len:   9 
    op: 97 len:  17 
    op: 43 len:  32 
        PXE op:  6 len:   1 
        PXE op: 10 len:   4 
        PXE op:  9 len:  20 
00000000  52 61 73 70 62 65 72 72  79 20 50 69 20 42 6f 6f |Raspberry Pi Boo|
00000010  74 ff ff 00                                      |t...            |

NET 192.168.20.87 255.255.255.0 gw 0.0.0.0 tftp 192.168.20.10
ARP 192.168.20.10 b4:2e:99:3d:09:1d
NET 192.168.20.87 255.255.255.0 gw 0.0.0.0 tftp 192.168.20.10
RX: 13 IP: 0 IPV4: 2 MAC: 2 UDP: 2 UDP RECV: 2 IP_CSUM_ERR: 0 UDP_CSUM_ERR: 0
DHCP src: 5e:0f:5b:3f:b8:0d 192.168.20.254
YI_ADDR 192.168.20.87
OPTIONS:-
    op: 53 len:   1 DHCP recv ACK (5) expect ACK
    op: 54 len:   4 192.168.20.254
    op: 51 len:   4 
    op:  1 len:   4 255.255.255.0
    op:  3 len:   4 192.168.20.254
    op:  6 len:   4 1.1.1.1
TFTP_GET: b4:2e:99:3d:09:1d 192.168.20.10 b7f44626/start4.elf

RX: 14 IP: 0 IPV4: 3 MAC: 3 UDP: 3 UDP RECV: 3 IP_CSUM_ERR: 0 UDP_CSUM_ERR: 0
RX: 15 IP: 0 IPV4: 4 MAC: 4 UDP: 4 UDP RECV: 4 IP_CSUM_ERR: 0 UDP_CSUM_ERR: 0
TFTP_GET: b4:2e:99:3d:09:1d 192.168.20.10 b7f44626/config.txt

RX: 15 IP: 0 IPV4: 4 MAC: 4 UDP: 4 UDP RECV: 4 IP_CSUM_ERR: 0 UDP_CSUM_ERR: 0
TFTP: complete 1540
TFTP: disconnect: timeouts 1
RX: 21 IP: 0 IPV4: 7 MAC: 7 UDP: 7 UDP RECV: 7 IP_CSUM_ERR: 0 UDP_CSUM_ERR: 0
Read b7f44626/config.txt bytes     1540 hnd 0x0
TFTP_GET: b4:2e:99:3d:09:1d 192.168.20.10 b7f44626/vl805.sig

RX: 21 IP: 0 IPV4: 7 MAC: 7 UDP: 7 UDP RECV: 7 IP_CSUM_ERR: 0 UDP_CSUM_ERR: 0
TFTP: complete 65
TFTP: disconnect: timeouts 1
RX: 23 IP: 0 IPV4: 9 MAC: 9 UDP: 9 UDP RECV: 9 IP_CSUM_ERR: 0 UDP_CSUM_ERR: 0
SIG b7f44626/vl805.sig 6246230ecd5b472902e6a49c95e857a5e3190c4fa6c462d6a8867e9a5e523a7c 0
TFTP_GET: b4:2e:99:3d:09:1d 192.168.20.10 b7f44626/vl805.bin

RX: 23 IP: 0 IPV4: 9 MAC: 9 UDP: 9 UDP RECV: 9 IP_CSUM_ERR: 0 UDP_CSUM_ERR: 0
TFTP: complete 99224
TFTP: disconnect: timeouts 1
RX: 121 IP: 0 IPV4: 107 MAC: 107 UDP: 107 UDP RECV: 107 IP_CSUM_ERR: 0 UDP_CSUM_ERR: 0
VL805 EEPROM is up to date
TFTP_GET: b4:2e:99:3d:09:1d 192.168.20.10 b7f44626/pieeprom.sig

RX: 121 IP: 0 IPV4: 107 MAC: 107 UDP: 107 UDP RECV: 107 IP_CSUM_ERR: 0 UDP_CSUM_ERR: 0
TFTP 1: file /srv/tftp/b7f44626/pieeprom.sig not found 
TFTP: disconnect: timeouts 2
RX: 123 IP: 0 IPV4: 108 MAC: 108 UDP: 108 UDP RECV: 108 IP_CSUM_ERR: 0 UDP_CSUM_ERR: 0
Read b7f44626/pieeprom.upd failed
TFTP_GET: b4:2e:99:3d:09:1d 192.168.20.10 b7f44626/recover4.elf

RX: 123 IP: 0 IPV4: 108 MAC: 108 UDP: 108 UDP RECV: 108 IP_CSUM_ERR: 0 UDP_CSUM_ERR: 0
TFTP 1: file /srv/tftp/b7f44626/recover4.elf not found 
TFTP: disconnect: timeouts 2
RX: 124 IP: 0 IPV4: 109 MAC: 109 UDP: 109 UDP RECV: 109 IP_CSUM_ERR: 0 UDP_CSUM_ERR: 0
Read b7f44626/recover4.elf failed
TFTP_GET: b4:2e:99:3d:09:1d 192.168.20.10 b7f44626/recovery.elf

RX: 124 IP: 0 IPV4: 109 MAC: 109 UDP: 109 UDP RECV: 109 IP_CSUM_ERR: 0 UDP_CSUM_ERR: 0
TFTP 1: file /srv/tftp/b7f44626/recovery.elf not found 
TFTP: disconnect: timeouts 2
RX: 125 IP: 0 IPV4: 110 MAC: 110 UDP: 110 UDP RECV: 110 IP_CSUM_ERR: 0 UDP_CSUM_ERR: 0
Read b7f44626/recovery.elf failed
TFTP_GET: b4:2e:99:3d:09:1d 192.168.20.10 b7f44626/start4.elf

RX: 125 IP: 0 IPV4: 110 MAC: 110 UDP: 110 UDP RECV: 110 IP_CSUM_ERR: 0 UDP_CSUM_ERR: 0
ARP TX resp: sender: dc:a6:32:0e:f7:01 192.168.20.87 target: 192.168.20.10
Duplicate block 2153 expected 2154
TFTP: complete 2240512
TFTP: disconnect: timeouts 3
RX: 16 IP: 0 IPV4: 253 MAC: 253 UDP: 253 UDP RECV: 253 IP_CSUM_ERR: 0 UDP_CSUM_ERR: 0
Read b7f44626/start4.elf bytes  2240512 hnd 0x0
TFTP_GET: b4:2e:99:3d:09:1d 192.168.20.10 b7f44626/fixup4.dat

RX: 16 IP: 0 IPV4: 253 MAC: 253 UDP: 253 UDP RECV: 253 IP_CSUM_ERR: 0 UDP_CSUM_ERR: 0
TFTP: complete 5406
TFTP: disconnect: timeouts 3
RX: 23 IP: 0 IPV4: 4 MAC: 4 UDP: 4 UDP RECV: 4 IP_CSUM_ERR: 0 UDP_CSUM_ERR: 0
Read b7f44626/fixup4.dat bytes     5406 hnd 0x0
Firmware: ef2c018dccdeb94b0376db62a2ea4c882f9b500d Aug 19 2021 12:27:01
0x00c03111 0x00000000 0x00000fff
MEM GPU: 76 ARM: 948 TOTAL: 1024
Starting start4.elf @ 0xfec00200 partition -1
Stopping network
RX: 23 IP: 0 IPV4: 4 MAC: 4 UDP: 4 UDP RECV: 4 IP_CSUM_ERR: 0 UDP_CSUM_ERR: 0
RX: 23 IP: 0 IPV4: 4 MAC: 4 UDP: 4 UDP RECV: 4 IP_CSUM_ERR: 0 UDP_CSUM_ERR: 0
GENET STOP: 0
XHCI-STOP
xHC ver: 256 HCS: 05000420 fc000031 00e70004 HCC: 002841eb
USBSTS 18
+
[    0.000000] Booting Linux on physical CPU 0x0000000000 [0x410fd083]

I rebooted once again, and here's the output:

RPi: BOOTLOADER release VERSION:c68f0c06 DATE: 2022/04/13 TIME: 16:58:38 BOOTMODE: 0x00000006 part: 0 BUILD_TIMESTAMP=1649865518 0xb7f44626 0x00c03111 0x0005bb58
PM_RSTS: 0x00001020
part 00000000 reset_info 00000000
uSD voltage 3.3V
Initialising SDRAM 'Micron' 16Gb x2 total-size: 32 Gbit 3200
DDR 3200 1 0 32 152

XHCI-STOP
xHC ver: 256 HCS: 05000420 fc000031 00e70004 HCC: 002841eb
USBSTS 1
xHC ver: 256 HCS: 05000420 fc000031 00e70004 HCC: 002841eb
xHC ports 5 slots 32 intrs 4
Reset USB port-power 1000 ms
xhci_set_port_power 1 0
xhci_set_port_power 2 0
xhci_set_port_power 3 0
xhci_set_port_power 4 0
xhci_set_port_power 5 0
xhci_set_port_power 1 1
xhci_set_port_power 2 1
xhci_set_port_power 3 1
xhci_set_port_power 4 1
xhci_set_port_power 5 1
XHCI-STOP
xHC ver: 256 HCS: 05000420 fc000031 00e70004 HCC: 002841eb
USBSTS 18
XHCI-STOP
xHC ver: 256 HCS: 05000420 fc000031 00e70004 HCC: 002841eb
USBSTS 19
xHC ver: 256 HCS: 05000420 fc000031 00e70004 HCC: 002841eb
xHC ports 5 slots 32 intrs 4
Boot mode: SD (01) order 2
USB2[1] 400202e1 connected
USB2 root HUB port 1 init
DEV [01:00] 2.16 000000:01 class 9 VID 2109 PID 3431
HUB init [01:00] 2.16 000000:01
SD HOST: 250000000 CTL0: 0x00000000 BUS: 200000 Hz actual: 200000 HZ div: 1250 (625) status: 0x1fff0000 delay: 540
SD HOST: 250000000 CTL0: 0x00000f00 BUS: 200000 Hz actual: 200000 HZ div: 1250 (625) status: 0x1fff0000 delay: 540
EMMC
SD retry 1 oc 0
SD HOST: 250000000 CTL0: 0x00000000 BUS: 200000 Hz actual: 200000 HZ div: 1250 (625) status: 0x1fff0000 delay: 540
SD retry 2 oc 0
SD HOST: 250000000 CTL0: 0x00000000 BUS: 200000 Hz actual: 200000 HZ div: 1250 (625) status: 0x1fff0000 delay: 540
SDV1
SD retry 3 oc 0
SD HOST: 250000000 CTL0: 0x00000000 BUS: 200000 Hz actual: 200000 HZ div: 1250 (625) status: 0x1fff0000 delay: 540
SD CMD: 0x371a0010 (55) 0x0 0x1fff0001
Failed to open device: 'sdcard' (cmd 371a0010 status 1fff0001)
Retry SD 1
SD HOST: 250000000 CTL0: 0x00000000 BUS: 200000 Hz actual: 200000 HZ div: 1250 (625) status: 0x1fff0000 delay: 540
SD HOST: 250000000 CTL0: 0x00000f00 BUS: 200000 Hz actual: 200000 HZ div: 1250 (625) status: 0x1fff0000 delay: 540
EMMC
SD retry 1 oc 0
SD HOST: 250000000 CTL0: 0x00000000 BUS: 200000 Hz actual: 200000 HZ div: 1250 (625) status: 0x1fff0000 delay: 540
SD retry 2 oc 0
SD HOST: 250000000 CTL0: 0x00000000 BUS: 200000 Hz actual: 200000 HZ div: 1250 (625) status: 0x1fff0000 delay: 540
SDV1
SD retry 3 oc 0
SD HOST: 250000000 CTL0: 0x00000000 BUS: 200000 Hz actual: 200000 HZ div: 1250 (625) status: 0x1fff0000 delay: 540
SD CMD: 0x371a0010 (55) 0x0 0x1fff0001
Failed to open device: 'sdcard' (cmd 371a0010 status 1fff0001)
Boot mode: NETWORK (02) order 0
GENET: RESET_PHY 0
CTL 0800 PHY ID 600d 84a2
MII_CONTROL 0800
GENET: RESET_PHY 0
CTL 1000 PHY ID 600d 84a2
MII_CONTROL 1000
NET_BOOT: dc:a6:32:0e:f7:01 wait for link TFTP: 0.0.0.0
HDMI0 edid block 0 offset 0
00ffffffffffff005304000001010101
003a010380000078ea53a5a756529c26
1150540108008100814081809500b300
8bc0a9400101023a801871382d40582c
450000000000001e7f2156aa51001e30
468f3300142b2100001e000000fd0038
3e1e5319000a202020202020000000fc
0048444d490a2020202020202020017d
HDMI0 edid block 1 offset 128
020317f1444504030223097f07830100
0065030c001000023a801871382d4058
2c4500142b2100001e011d007251d01e
206e285500142b2100001e8c0ad08a20
e02d10103e9600142b21000018011d80
18711c1620582c2500142b2100009e00
00000000000000000000000000000000
00000000000000000000000000000075
HDMI0: best-mode 1 (limit 1) 1280x720 60 Hz CEA modes 1c000100000000002000000000000000
Stopping network
RX: 0 IP: 0 IPV4: 0 MAC: 0 UDP: 0 UDP RECV: 0 IP_CSUM_ERR: 0 UDP_CSUM_ERR: 0
GENET STOP: 0
NETBOOT CANCEL
NETBOOT init failed
Boot mode: SD (00) order 0
Insert SD-CARD
SD HOST: 250000000 CTL0: 0x00000000 BUS: 200000 Hz actual: 200000 HZ div: 1250 (625) status: 0x1fff0000 delay: 540
SD HOST: 250000000 CTL0: 0x00000f00 BUS: 200000 Hz actual: 200000 HZ div: 1250 (625) status: 0x1fff0000 delay: 540
EMMC
SD retry 1 oc 0
SD HOST: 250000000 CTL0: 0x00000000 BUS: 200000 Hz actual: 200000 HZ div: 1250 (625) status: 0x1fff0000 delay: 540
SD retry 2 oc 0
SD HOST: 250000000 CTL0: 0x00000000 BUS: 200000 Hz actual: 200000 HZ div: 1250 (625) status: 0x1fff0000 delay: 540
SDV1
SD retry 3 oc 0
SD HOST: 250000000 CTL0: 0x00000000 BUS: 200000 Hz actual: 200000 HZ div: 1250 (625) status: 0x1fff0000 delay: 540
SD CMD: 0x371a0010 (55) 0x0 0x1fff0001
Failed to open device: 'sdcard' (cmd 371a0010 status 1fff0001)
Boot mode: SD (00) order 0
Insert SD-CARD
timg236 commented 2 years ago

Thanks. It's interesting that the PHY is in the power down case in the failure 'MII_CONTROL 0800' (bit 11).

I was testing from an SD boot then rebooting into network boot and also forced the power off state via mii_tool. I'll try a reboot from within a network boot to see what's different.

timg236 commented 2 years ago

I think I can see what's happening here. Pi4 R1.1 does not do a hardware-reset of the ethernet PHY on a reboot, however, having a UHS SD card present may cause a PMIC reset for the voltage switch which might have the same effect.

I was able to show the unrecoverable failure case by setting IDDQ-LP bit of the IDDQ expansion register.

However, since I think the kernel is only setting the IDDQ-SD and IDDQ_SR bits this should be possible to get out of this state by following the correct soft-reset sequence which the bootloader should now do. The busier switch that I have my test board connected to at PiTowers made it appear to work because it never entered the ULPS mode.

This seems to work for me and shows the same CTL status but I'll test this more next week. Thanks for the testing! CTL 0800 PHY ID 600d 84a2 MII_CONTROL 0800 MII APD 0021 SCTL3 0021

pieeprom-507b2360.zip

mripard commented 2 years ago

The latest build works perfectly, thanks!

timg236 commented 2 years ago

Thanks that's a relief. I'd removed the explicit clearing of the IDDQ register thinking it was unnecessary and the soft-reset was sufficient ... later, I found the requirement elsewhere in the docs :)