Hexxeh / rpi-firmware

Firmware files for the Raspberry Pi

PI 4 cannot boot from USB anymore #258

Closed Thomas-8-Bit closed 3 years ago

Thomas-8-Bit commented 3 years ago

After updating from 5.4 to 5.10.20-v7l+, the Pi 4 gets stuck at the color splash screen when booting from USB. The green LED flashes 7 times, which indicates that kernel.img was not found, but the files are still present in the boot partition. After downgrading to 5.4.x it works again!

Booting from SD works all the time with 5.10.20. Only USB Boot is not working anymore.

jordipalet commented 3 years ago

Got the same problem today after using rpi-update

CaptainMidnight commented 3 years ago

I suspect the latest commit (i.e. sudo rpi-update 48570ba954a318feee348d4e642ebd2b58d9dd97) may be what is breaking USB/microSD card boot in my testing.

To update the kernel to 5.10.20, I suggest using: sudo rpi-update e150906874ff8b9fb6271971fa4238997369f790

Reference: https://www.raspberrypi.org/forums/viewtopic.php?f=29&t=288234&start=550#p1831846

Thomas-8-Bit commented 3 years ago

To achieve a kernel update to 5.10.20 suggest using - sudo rpi-update e150906

Reference: https://www.raspberrypi.org/forums/viewtopic.php?f=29&t=288234&start=550#p1831846

Yes. That worked. Seems to be the latest commit.

timg236 commented 3 years ago

I've attached a possible firmware fix. it works for me but it would be useful to get feedback before updating rpi-update.

usb-bulk-read-fix.zip

pelwell commented 3 years ago

Anyone uncertain about applying the trial firmware can confirm that this is indeed a firmware issue by running sudo rpi-update to get the latest version and then (no need to reboot first) sudo SKIP_KERNEL=1 rpi-update e150906 to downgrade just the firmware.

CaptainMidnight commented 3 years ago

I've attached a possible firmware fix. it works for me but it would be useful to get feedback before updating rpi-update.

usb-bulk-read-fix.zip

Currently testing..... working but not efficiently - this fix allows the reboot after rpi-update to be successful BUT adds in approximately a 90sec delay on any subsequent boot.

Pre-fix condition: running 5.10.20-v8+ using commit e150906, standard boot time approx. 6.8 sec.
Post-fix condition: running 5.10.20-v8+ using latest build + firmware fix, boot time now approx. 95 sec.

CaptainMidnight commented 3 years ago

@timg236 give me a shout if you have any additional testing that I could assist with.

timg236 commented 3 years ago

The output of "vcdbg log msg" and "vcgencmd version" after a slow boot would be useful

CaptainMidnight commented 3 years ago

The output of "vcdbg log msg" and "vcgencmd version" after a slow boot would be useful

Is "vcdbg" installed by default or in a specific directory - command not found.

pi@phoenix-pi-x64:~ $ vcgencmd version
Mar 7 2021 12:29:36
Copyright (c) 2012 Broadcom
version cc98c2d0c5ad2f5fa1a1c24c6db89fb420d5840d (tainted) (release) (start)

pelwell commented 3 years ago

The binary should be in /opt/vc/bin/, with a symbolic link installed to /usr/bin/vcdbg.

CaptainMidnight commented 3 years ago

The binary should be in /opt/vc/bin/, with a symbolic link installed to /usr/bin/vcdbg.

For whatever reason, vcdbg exists in /opt/vc/bin/ but there is no link installed in /usr/bin/vcdbg, even copying vcdbg to /usr/bin/ fails to run. The vcgencmd file in /opt/vc/bin/ is also different in size and date to the version in /usr/bin/ - it isn't a link it's an actual file.

Any other suggestions on how to run vcdbg?

pelwell commented 3 years ago

/opt/vc/bin/vcdbg log msg?

Which OS are you running?

CaptainMidnight commented 3 years ago

PiOS 64bit, that command gives the same output - no such file or directory.

pelwell commented 3 years ago

It's failing because of missing libraries. Try this static build: https://drive.google.com/file/d/1HS9E5vnxxNqrizB4mEYrnFoQQ1axSRKm/view?usp=sharing

CaptainMidnight commented 3 years ago

The command output should be here: https://www.dropbox.com/s/3xq97rfeb74xfo9/output.txt?dl=0

timg236 commented 3 years ago

Thanks. It looks as though the USB issue is fixed but something weird is happening with HDMI EDID read with the new HDMI I2C driver, unless your display has a large EDID and is also reporting errors.

The short-term workaround is to use hdmi_ignore_edid and hdmi_hotplug (or revert the firmware) https://www.raspberrypi.org/documentation/configuration/config-txt/video.md

@6by9 Does the number of EDID blocks look plausible to you?

6by9 commented 3 years ago

067250.735: hdmi: HDMI:EDID bad checksum 184 in block 82 retries: 9
067250.773: hdmi: 00ffffff ffffff00 4c2d0c07 00000000
067250.810: hdmi: 0b140103 80301b78 0a3581a6 56489a24
067250.846: hdmi: 00ffffff ffffff00 4c2d0c07 00000000
067250.884: hdmi: 0b140103 80301b78 0a3581a6 56489a24
067250.919: hdmi: 00ffffff ffffff00 4c2d0c07 00000000
067250.956: hdmi: 0b140103 80301b78 0a3581a6 56489a24
067250.993: hdmi: 00ffffff ffffff00 4c2d0c07 00000000
067251.030: hdmi: 0b140103 80301b78 0a3581a6 56489a24
...
067277.147: hdmi: 020323f1 4b930405 14031210 1f202122
067277.184: hdmi: 23090707 83010000 e2000f67 030c0010
067277.221: hdmi: 020323f1 4b930405 14031210 1f202122
067277.259: hdmi: 23090707 83010000 e2000f67 030c0010
067277.296: hdmi: 020323f1 4b930405 14031210 1f202122
067277.332: hdmi: 23090707 83010000 e2000f67 030c0010
067277.370: hdmi: 020323f1 4b930405 14031210 1f202122
067277.407: hdmi: 23090707 83010000 e2000f67 030c0010

I2C has to do 32-byte reads, and it looks like that monitor isn't incrementing the EDID read address for the subsequent reads: the pattern repeats every 32 bytes when trying to do a 128-byte read. I thought there were only 2 blocks to be read, so it is odd that it tries blocks up to 154. Time to reread the spec.
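The repetition described above can be checked directly from the logged words. This is a minimal Python sketch, assuming the first eight hex lines of the log excerpt form one 128-byte block read; the variable names are illustrative and nothing here is firmware code.

```python
# First eight hex lines of the log excerpt, assumed to be one 128-byte read.
LOGGED = """
00ffffff ffffff00 4c2d0c07 00000000
0b140103 80301b78 0a3581a6 56489a24
00ffffff ffffff00 4c2d0c07 00000000
0b140103 80301b78 0a3581a6 56489a24
00ffffff ffffff00 4c2d0c07 00000000
0b140103 80301b78 0a3581a6 56489a24
00ffffff ffffff00 4c2d0c07 00000000
0b140103 80301b78 0a3581a6 56489a24
"""

# Strip whitespace and turn the hex text into raw bytes.
block = bytes.fromhex("".join(LOGGED.split()))

# Split into the 32-byte units that the I2C reads are done in.
chunks = [block[i:i + 32] for i in range(0, len(block), 32)]
repeats = all(chunk == chunks[0] for chunk in chunks)

# A valid EDID block sums to 0 modulo 256; anything else is a bad checksum.
checksum = sum(block) % 256

# Byte 126 of the base block is the extension block count.
ext_count = block[126]

print(len(block), repeats, checksum, ext_count)  # 128 True 184 154
```

The computed 184 agrees with the value in the logged "bad checksum 184" message, and byte 126 of the repeated data is 0x9a (154), which is consistent with the firmware attempting to read blocks up to 154.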

CaptainMidnight commented 3 years ago

Thanks. It looks as though the USB issue is fixed but something weird is happening with HDMI EDID read with the new HDMI I2C driver, unless your display has a large EDID and is also reporting errors.

The short-term workaround is to use hdmi_ignore_edid and hdmi_hotplug (or revert the firmware) https://www.raspberrypi.org/documentation/configuration/config-txt/video.md

@6by9 Does the number of EDID blocks look plausible to you?

I can report after adding the following to the /boot/config.txt configuration, boot speed is back to normal at 7sec approx.

hdmi_force_hotplug=1
hdmi_ignore_edid=0xa5000080

Thanks for the short-term workaround solution and firmware fix files.

6by9 commented 3 years ago

@CaptainMidnight Could you dump your EDID out and attach it please?

The number of extension blocks is in byte 126 of the base EDID. I suspect it's the misreading of the base EDID that means we get an invalid number of extensions, so it keeps on going and trying to read the wrong number of blocks (probably the value that is in byte 30 due to the repetition).

CaptainMidnight commented 3 years ago

Using: -

tvservice -d edid.bin
base64 edid.bin

I get the following: -

AP///////wBMLbAFAAAAAA0TAQOAEAl4Cu6Ro1RMmSYPUFS/74BxT4EAgUCBgJUAlQ+pQLMAAjqA
GHE4LUBYLEUAoFoAAAAeAR0AvFLQHiC4KFVAoFoAAAAeAAAA/QAYSxpRFwAKICAgICAgAAAA/ABT
eW5jTWFzdGVyCiAgAdECAyPxS5MEBRQDEhAfICEiIwkHB4MBAADiAA9nAwwAEAC4LQEdgNByHBYg
ECwlgKBaAAAAngEdgBhxHBYgWCwlAKBaAAAAngEdAHJR0B4gbihVAKBaAAAAHowK0JAgQDEgDEBV
AKBaAAAAGIwK0Iog4C0QED6WAKBaAAAAGAAAVw==

For info, my setup actually consists of a dual port HDMI KVM switch unit, although any potential hidden EDID issues have not materialised under any previous firmware builds. I can also upload the actual edid.bin file and/or provide the output with a direct connection to the Samsung FHD TV if required.

Update: edid.bin link https://www.dropbox.com/s/2ubhjep15r8pnqo/edid.bin?dl=0 (KVM in circuit)

CaptainMidnight commented 3 years ago

My setup is also configured to force only FHD (1920x1080) mode. Although the HDMI KVM is always attached, the TV/monitor is generally powered off for remote headless operation.

Now using: -

 hdmi_force_hotplug=1
 hdmi_ignore_edid=0xa5000080

Provides the following vcdbg output with zero additional boot time increase: -

pi@phoenix-pi-x64:~ $ sudo vcdbg log msg
004186.136: brfs: File read: /mfs/sd/config.txt
004186.901: brfs: File read: 1093 bytes
004252.832: HDMI1:EDID error reading EDID block 0 attempt 0
004258.855: HDMI1:EDID error reading EDID block 0 attempt 1
004264.879: HDMI1:EDID error reading EDID block 0 attempt 2
004270.900: HDMI1:EDID error reading EDID block 0 attempt 3
004276.923: HDMI1:EDID error reading EDID block 0 attempt 4
004282.945: HDMI1:EDID error reading EDID block 0 attempt 5
004288.968: HDMI1:EDID error reading EDID block 0 attempt 6
004294.989: HDMI1:EDID error reading EDID block 0 attempt 7
004301.013: HDMI1:EDID error reading EDID block 0 attempt 8
004307.035: HDMI1:EDID error reading EDID block 0 attempt 9
004308.049: HDMI1:EDID giving up on reading EDID block 0
004309.081: brfs: File read: /mfs/sd/config.txt
004324.600: brfs: File read: 1093 bytes
004814.394: gpioman: gpioman_get_pin_num: pin DISPLAY_DSI_PORT not defined
004816.703: *** Restart logging
004830.917: hdmi: HDMI1:EDID error reading EDID block 0 attempt 0
004836.943: hdmi: HDMI1:EDID error reading EDID block 0 attempt 1
004842.966: hdmi: HDMI1:EDID error reading EDID block 0 attempt 2
004851.872: hdmi: HDMI1:EDID error reading EDID block 0 attempt 3
004857.902: hdmi: HDMI1:EDID error reading EDID block 0 attempt 4
004866.864: hdmi: HDMI1:EDID error reading EDID block 0 attempt 5
004873.059: hdmi: HDMI1:EDID error reading EDID block 0 attempt 6
004879.090: hdmi: HDMI1:EDID error reading EDID block 0 attempt 7
004885.120: hdmi: HDMI1:EDID error reading EDID block 0 attempt 8
004891.151: hdmi: HDMI1:EDID error reading EDID block 0 attempt 9
004892.169: hdmi: HDMI1:EDID giving up on reading EDID block 0
004897.224: hdmi: HDMI1:EDID error reading EDID block 0 attempt 0
004903.256: hdmi: HDMI1:EDID error reading EDID block 0 attempt 1
004909.285: hdmi: HDMI1:EDID error reading EDID block 0 attempt 2
004915.315: hdmi: HDMI1:EDID error reading EDID block 0 attempt 3
004921.345: hdmi: HDMI1:EDID error reading EDID block 0 attempt 4
004927.377: hdmi: HDMI1:EDID error reading EDID block 0 attempt 5
004933.405: hdmi: HDMI1:EDID error reading EDID block 0 attempt 6
004939.437: hdmi: HDMI1:EDID error reading EDID block 0 attempt 7
004945.467: hdmi: HDMI1:EDID error reading EDID block 0 attempt 8
004951.499: hdmi: HDMI1:EDID error reading EDID block 0 attempt 9
004952.517: hdmi: HDMI1:EDID giving up on reading EDID block 0
004952.539: hdmi: HDMI:hdmi_get_state is deprecated, use hdmi_get_display_state instead
004952.556: HDMI0: hdmi_pixel_encoding: 600000000
004952.571: HDMI1: hdmi_pixel_encoding: 162000000
004957.639: dtb_file 'bcm2711-rpi-4-b.dtb'
004960.418: brfs: File read: /mfs/sd/bcm2711-rpi-4-b.dtb
004960.437: Loading 'bcm2711-rpi-4-b.dtb' to 0x100 size 0xbfc2
004976.150: brfs: File read: 49090 bytes
004982.111: brfs: File read: /mfs/sd/overlays/overlay_map.dtb
005060.617: brfs: File read: 1523 bytes
005062.216: brfs: File read: /mfs/sd/config.txt
005062.383: dtparam: i2c_arm=off
005072.808: dtparam: i2s=off
005082.474: dtparam: spi=off
005092.615: brfs: File read: 1093 bytes
005101.039: brfs: File read: /mfs/sd/overlays/vc4-kms-v3d-pi4.dtbo
005178.792: Loaded overlay 'vc4-kms-v3d'
005369.147: brfs: File read: 3831 bytes
005371.445: brfs: File read: /mfs/sd/overlays/disable-bt.dtbo
005389.003: Loaded overlay 'disable-bt'
005428.751: brfs: File read: 1073 bytes
005429.416: brfs: File read: /mfs/sd/overlays/disable-wifi.dtbo
005440.881: Loaded overlay 'disable-wifi'
005441.059: dtparam: watchdog=on
005451.866: dtparam: act_led_trigger=timer
005490.660: brfs: File read: 387 bytes
005491.743: brfs: File read: /mfs/sd/cmdline.txt
005491.783: Read command line from file 'cmdline.txt':
005491.797: 'root=PARTUUID=66f19b0b-02 rootfstype=ext4 elevator=deadline fsck.repair=yes rootwait loglevel=3 quiet logo.nologo'
006386.172: brfs: File read: 114 bytes
006694.101: brfs: File read: /mfs/sd/kernel8.img
006694.124: Loading 'kernel8.img' to 0x80000 size 0x771c73
008070.038: Kernel relocated to 0x200000
008070.054: Device tree loaded to 0x2eff3a00 (size 0xc59c)
008074.309: bfs_xhci_stop
008074.319: XHCI-STOP
008074.552: xHC ver: 256 HCS: 05000420 fc000031 00e70004 HCC: 002841eb
008074.602: USBSTS 18
009113.439: vchiq_core: vchiq_init_state: slot_zero = 0xdf000000, is_master = 1
009117.180: TV service:host side not connected, dropping notification 0x00000002, 0x00000001, 0x00000010

6by9 commented 3 years ago

Thanks, that just allows me to confirm that my suspicions were right. The hex dump of the start of that EDID is

00 ff ff ff ff ff ff 00 4c 2d b0 05 00 00 00 00
0d 13 01 03 80 10 09 78 0a ee 91 a3 54 4c 99 26

Your log reported querying up to block 154, which is hex 0x9a. The 0x99 as the penultimate byte there is the number of blocks after this one, so 154 blocks total.
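The arithmetic can be reproduced from the base64 dump posted earlier in the thread. This is a minimal Python sketch; the `misread` construction is only an assumed model of a monitor that returns the first 32 bytes for every 32-byte read, not a description of what the firmware does.

```python
import base64

# Base64 EDID dump as posted in the thread (KVM in circuit).
EDID_B64 = """
AP///////wBMLbAFAAAAAA0TAQOAEAl4Cu6Ro1RMmSYPUFS/74BxT4EAgUCBgJUAlQ+pQLMAAjqA
GHE4LUBYLEUAoFoAAAAeAR0AvFLQHiC4KFVAoFoAAAAeAAAA/QAYSxpRFwAKICAgICAgAAAA/ABT
eW5jTWFzdGVyCiAgAdECAyPxS5MEBRQDEhAfICEiIwkHB4MBAADiAA9nAwwAEAC4LQEdgNByHBYg
ECwlgKBaAAAAngEdgBhxHBYgWCwlAKBaAAAAngEdAHJR0B4gbihVAKBaAAAAHowK0JAgQDEgDEBV
AKBaAAAAGIwK0Iog4C0QED6WAKBaAAAAGAAAVw==
"""

edid = base64.b64decode("".join(EDID_B64.split()))
base = edid[:128]  # base EDID block

# Correctly read, byte 126 holds the number of extension blocks.
true_ext = base[126]

# Assumed failure model: every 32-byte read returns bytes 0..31 again,
# so byte 126 of the assembled block is really byte 30 of the base EDID.
misread = base[:32] * 4
bogus_ext = misread[126]

print(true_ext, hex(bogus_ext), bogus_ext + 1)  # 1 0x99 154
```

With a correct read the dump declares a single extension block, while the repeated read yields 0x99 (153) extensions, i.e. 154 blocks in total.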

Now to work out why the EDID reading is failing when we're now doing almost exactly the same as the kernel is doing. More bizarre is that tvservice has then reported the EDID correctly (unless that is with a reverted firmware).

CaptainMidnight commented 3 years ago

Your log reported querying up to block 154, which is hex 0x9a. The 0x99 as the penultimate byte there is the number of blocks after this one, so 154 blocks total.

Now to work out why the EDID reading is failing when we're now doing almost exactly the same as the kernel is doing. More bizarre is that tvservice has then reported the EDID correctly (unless that is with a reverted firmware).

Just to clarify, the original vcdbg output provided (output.txt) was for the original /boot/config.txt configuration but with the latest rpi-update and suggested firmware fix installed.

The latest vcdbg output in my last comment and the edid.bin data are from the same setup, with the only change being hdmi_ignore_edid=0xa5000080 added to /boot/config.txt (I already had hdmi_force_hotplug=1 in my config).

pelwell commented 3 years ago

Perhaps you can continue the EDID discussion at https://github.com/raspberrypi/firmware/issues/1548 so we can close the original USB boot issue?

CaptainMidnight commented 3 years ago

No objection from me: after applying the interim firmware fix, the original boot failure has turned into an EDID issue.

pelwell commented 3 years ago

Thanks - consider this issue closed (I don't actually have permission on this repo).

timg236 commented 3 years ago

@popcornmix I can't close this either

popcornmix commented 3 years ago

rpi-update firmware now contains the fix

jordipalet commented 3 years ago

Question: if something like this happens again (no boot from USB), is there a way to "go back" to the previous firmware by mounting the USB drive on another Linux machine?

popcornmix commented 3 years ago

You can run rpi-update <hash> on another linux machine using BOOT_PATH and ROOT_PATH. Then you can copy firmware files from boot directory over. See: https://github.com/Hexxeh/rpi-update

jordipalet commented 3 years ago

You can run rpi-update <hash> on another linux machine using BOOT_PATH and ROOT_PATH. Then you can copy firmware files from boot directory over. See: https://github.com/Hexxeh/rpi-update

I always keep a copy of my USB boot drive on an SD card, using rpi-clone. So if I understood correctly, I can unplug the USB drive, boot from the SD card, plug the USB drive back in, mount its boot partition, copy everything from the SD /boot to the USB /boot, and reboot, right?

(Of course, I will need to edit cmdline.txt to use sda.)

Or would copying just the kernel files, or something else, be enough?

Tks!

popcornmix commented 3 years ago

That should work. Copying all files should be fine, but it's the start.elf and fixup.dat files that contain the regression (and the fix, after the update).
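For the recovery path discussed above, here is a minimal Python sketch of copying only the firmware files between two mounted boot partitions. The function name, the mount points in the usage comment and the `start*.elf` / `fixup*.dat` glob patterns are assumptions (the thread names only start.elf and fixup.dat), so adjust them to the actual layout.

```python
import glob
import os
import shutil


def copy_firmware(src_boot, dst_boot, patterns=("start*.elf", "fixup*.dat")):
    """Copy firmware blobs from one mounted boot partition to another.

    Returns the list of file names that were copied.
    """
    copied = []
    for pattern in patterns:
        for src in sorted(glob.glob(os.path.join(src_boot, pattern))):
            name = os.path.basename(src)
            # copyfile copies contents only, which avoids permission
            # metadata problems on FAT boot partitions.
            shutil.copyfile(src, os.path.join(dst_boot, name))
            copied.append(name)
    return copied


# Hypothetical usage, after booting from the SD card and mounting the
# USB boot partition at /mnt/usb-boot (run with sufficient privileges):
#   copy_firmware("/boot", "/mnt/usb-boot")
```

Copying only these files leaves the kernel and config.txt on the USB drive untouched, which may be preferable when only the firmware needs to be rolled back.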