raspberrypi / rpi-eeprom

Installation scripts and binaries for the Raspberry Pi 4 and Raspberry Pi 5 bootloader EEPROMs
https://www.raspberrypi.com/documentation/computers/raspberry-pi.html#raspberry-pi-boot-eeprom

USB ethernet dongles not connecting on boot but do connect if physically reconnected. Since 2022-02 bootloaders #472

Closed michael-suissa closed 1 year ago

michael-suissa commented 1 year ago

Describe the bug

On a Raspberry Pi 4B with 4 GB RAM, with any stable bootloader released after 2022-01-25, the Pi cannot properly connect to or use a TP-Link UE300 or TP-Link UE200 at boot. It can do so if the USB device is physically removed and reinserted.

The symptoms of the UE300 (a USB3 device) and UE200 (a USB2 device) are different. The UE300 appears in the lsusb output, but with the wrong product ID (8151 instead of 6001). The UE200 doesn't appear at all.

I have documented my investigation in my questions on the forum: https://forums.raspberrypi.com/viewtopic.php?p=2094094#p2094094

The issues are present on Ubuntu 20.04 (the OS I use in production) and on Raspberry Pi OS, which I used for testing and for upgrading the bootloader to the beta versions to isolate the version where the problem starts.

I found that for the following bootloaders:
2022-02-04: both UE200 and UE300 worked fine.
2022-02-16: UE200 failed to appear but UE300 worked fine.
2022-02-28: both failed to appear. The UE300 would appear if removed and reinserted but the UE200 would not.

Steps to reproduce the behaviour

With a Pi 4B 4GB with the latest bootloader (2023-01-11):
Plug a TP-Link UE300 (and/or TP-Link UE200) into the USB ports of the Pi.
Boot the Pi into Raspberry Pi OS or Ubuntu 20.04.
Either directly with screen and keyboard, or via SSH in a terminal, run: lsusb

lsusb would show:

    Bus 003 Device 001: ID 1d6b:0002 Linux Foundation 2.0 root hub
    Bus 002 Device 002: ID 2357:8151 TP-Link USB 10/100/1000 LAN
    Bus 002 Device 001: ID 1d6b:0003 Linux Foundation 3.0 root hub
    Bus 001 Device 002: ID 2109:3431 VIA Labs, Inc. Hub
    Bus 001 Device 001: ID 1d6b:0002 Linux Foundation 2.0 root hub

The UE300 device would be recognized at boot, but with the wrong product ID (8151 instead of 6001). A workaround is to run:

    sudo usb_modeswitch -v 2357 -p 8151 --reset-usb

This workaround will not work for the UE200, which doesn't appear and so cannot be reset.
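
The check-then-reset workaround for the UE300 can be scripted. The following is only a minimal sketch under stated assumptions: the function name `ue300_in_wrong_mode` is made up for illustration, and the IDs and the `usb_modeswitch` invocation are the ones quoted in this report. The function reads `lsusb` output on standard input, so the matching logic can be tried without the hardware.

```shell
#!/bin/sh
# Hypothetical helper (not part of any package): succeeds when lsusb output
# on stdin lists TP-Link vendor ID 2357 with the unexpected product ID 8151.
ue300_in_wrong_mode() {
    grep -q 'ID 2357:8151'
}

# Demo on the line reported in this issue (no hardware needed).
sample='Bus 002 Device 002: ID 2357:8151 TP-Link USB 10/100/1000 LAN'
if printf '%s\n' "$sample" | ue300_in_wrong_mode; then
    echo "reset needed"
fi

# On the Pi itself the check could be wired up like this:
#   if lsusb | ue300_in_wrong_mode; then
#       sudo usb_modeswitch -v 2357 -p 8151 --reset-usb
#   fi
```

Running such a check from a boot-time service would be one way to automate the workaround, but it does nothing for the UE200, which never enumerates.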

Reverting to the 2022-01-25 bootloader resolves the problem. Both USB devices are visible and work fine.
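
For anyone wanting to repeat the revert, here is a hedged sketch of locating an older bootloader image by date. The directory layout (`/lib/firmware/raspberrypi/bootloader/<release>/pieeprom-YYYY-MM-DD.bin`) and the release names are assumptions about how the rpi-eeprom package was installed at the time, and newer packages may use a different directory. `find_eeprom_image` is a hypothetical helper name.

```shell
#!/bin/sh
# Assumption: images are installed under this directory; override
# FIRMWARE_ROOT if your rpi-eeprom package uses another location.
FIRMWARE_ROOT="${FIRMWARE_ROOT:-/lib/firmware/raspberrypi/bootloader}"

# Hypothetical helper: print the path of the first pieeprom image found for
# the given date (YYYY-MM-DD); fail if none of the release directories has one.
find_eeprom_image() {
    fei_date=$1
    for fei_release in critical stable beta; do
        fei_candidate="$FIRMWARE_ROOT/$fei_release/pieeprom-$fei_date.bin"
        if [ -f "$fei_candidate" ]; then
            printf '%s\n' "$fei_candidate"
            return 0
        fi
    done
    return 1
}

# Demo: report whether the last known-good image is present on this machine.
find_eeprom_image 2022-01-25 || echo "no 2022-01-25 image under $FIRMWARE_ROOT"

# On the Pi, the image could then be scheduled for flashing at next reboot:
#   image=$(find_eeprom_image 2022-01-25) && sudo rpi-eeprom-update -d -f "$image"
```

The `-f` option of `rpi-eeprom-update` installs the given file, and `-d` uses the configuration embedded in that file rather than carrying over the current one. Check `rpi-eeprom-update -h` on your system before relying on this.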

Device(s)

Raspberry Pi 4 Mod. B

Bootloader configuration.

    [all]
    BOOT_UART=0
    WAKE_ON_GPIO=1
    ENABLE_SELF_UPDATE=1
    BOOT_ORDER=0xf41

I also tried with:

    [all]
    BOOT_UART=0
    WAKE_ON_GPIO=1
    ENABLE_SELF_UPDATE=1
    BOOT_ORDER=0xf41
    USB_MSD_PWR_OFF_TIME=0

System

Raspberry Pi reference 2023-02-01 Generated using pi-gen, https://github.com/pi-gen, f2d385517c9631f2ded876deb1115725d0c75995, stage4

    $ vcgencmd bootloader_version
    2023/01/11 17:40:52
    version 8ba17717fbcedd4c3b6d4bce7e50c7af4155cba9 (release)
    timestamp 1673458852
    update-time 1679424592
    capabilities 0x0000007f

    $ vcgencmd version
    Jan 5 2023 10:46:54
    Copyright (c) 2012 Broadcom
    version 8ba17717fbcedd4c3b6d4bce7e50c7af4155cba9 (clean) (release) (start)

    $ uname -a
    Linux raspberrypi 5.15.84.v7l+ #1613 SMP Thu Jan 5 12:01:26 GMT 2023 armv7l GNU/Linux
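
Since the problem depends on the bootloader date, it can help to check a unit automatically. This is a sketch only: `bootloader_date` and `is_after_last_good` are hypothetical helper names, and the 2022-01-25 cut-off is simply the last release reported as working in this issue.

```shell
#!/bin/sh
# Hypothetical helper: read `vcgencmd bootloader_version` output on stdin and
# print the build date from its first line as YYYYMMDD.
bootloader_date() {
    sed -n '1s|^\([0-9]\{4\}\)/\([0-9]\{2\}\)/\([0-9]\{2\}\).*|\1\2\3|p'
}

# Hypothetical helper: succeed when the given YYYYMMDD date is later than
# 2022-01-25, the last bootloader reported as working in this issue.
is_after_last_good() {
    [ -n "$1" ] && [ "$1" -gt 20220125 ]
}

# Demo with the bootloader_version output quoted in this report.
sample='2023/01/11 17:40:52
version 8ba17717fbcedd4c3b6d4bce7e50c7af4155cba9 (release)'
found=$(printf '%s\n' "$sample" | bootloader_date)
if is_after_last_good "$found"; then
    echo "bootloader $found is newer than 2022-01-25"
fi

# On the Pi:
#   vcgencmd bootloader_version | bootloader_date
```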

Bootloader logs

I'm happy to set up a netconsole if it is necessary. In the meantime here are the boot logs from dmesg.

    [Mar16 23:10] Booting Linux on physical CPU 0x0000000000 [0x410fd083]
    [ +0.000000] Linux version 5.4.0-1081-raspi (buildd@bos01-arm64-012) (gcc version 9.4.0 (Ubuntu 9.4.0-1ubuntu1~20.04.1)) #92-Ubunt>
    [ +0.000000] Machine model: Raspberry Pi 4 Model B Rev 1.5
    [ +0.000000] efi: Getting EFI parameters from FDT:
    [ +0.000000] efi: UEFI not found.
    ... ...
    [ +0.176510] usb_phy_generic phy: phy supply vcc not found, using dummy regulator
    [ +0.100416] dwc2 fe980000.usb: fe980000.usb supply vusb_d not found, using dummy regulator
    [ +0.000063] dwc2 fe980000.usb: fe980000.usb supply vusb_a not found, using dummy regulator
    [ +0.018774] usb 1-1.1: new full-speed USB device number 3 using xhci_hcd
    [ +0.080214] usb 1-1.1: device descriptor read/64, error -32
    [ +0.107800] dwc2 fe980000.usb: EPs: 8, dedicated fifos, 4080 entries in SPRAM
    [ +0.000299] dwc2 fe980000.usb: DWC OTG Controller
    [ +0.000017] dwc2 fe980000.usb: new USB bus registered, assigned bus number 3
    [ +0.000027] dwc2 fe980000.usb: irq 18, io mem 0xfe980000
    [ +0.000149] usb usb3: New USB device found, idVendor=1d6b, idProduct=0002, bcdDevice= 5.04
    [ +0.000006] usb usb3: New USB device strings: Mfr=3, Product=2, SerialNumber=1
    [ +0.000003] usb usb3: Product: DWC OTG Controller
    [ +0.000004] usb usb3: Manufacturer: Linux 5.4.0-1081-raspi dwc2_hsotg
    [ +0.000003] usb usb3: SerialNumber: fe980000.usb
    [ +0.000758] hub 3-0:1.0: USB hub found
    [ +0.000029] hub 3-0:1.0: 1 port detected
    [ +0.086836] usb 1-1.1: device descriptor read/64, error -32
    [ +0.191779] usb 1-1.1: new full-speed USB device number 4 using xhci_hcd
    [ +0.080197] usb 1-1.1: device descriptor read/64, error -32
    [ +0.191998] usb 1-1.1: device descriptor read/64, error -32
    [ +0.112220] usb 1-1-port1: attempt power cycle
    [ +0.603594] usb 1-1.1: new full-speed USB device number 5 using xhci_hcd
    [ +0.000242] usb 1-1.1: Device not responding to setup address.
    [ +0.015739] raid6: neonx8 gen() 3822 MB/s
    [ +0.047993] raid6: neonx8 xor() 3573 MB/s
    [ +0.048019] raid6: neonx4 gen() 3222 MB/s
    [ +0.047989] raid6: neonx4 xor() 2784 MB/s
    [ +0.048005] raid6: neonx2 gen() 2733 MB/s
    [ +0.000192] usb 1-1.1: Device not responding to setup address.
    [ +0.047805] raid6: neonx2 xor() 2280 MB/s
    [ +0.048021] raid6: neonx1 gen() 2208 MB/s
    [ +0.047983] raid6: neonx1 xor() 2057 MB/s
    [ +0.047986] raid6: int64x8 gen() 2122 MB/s
    [ +0.016016] usb 1-1.1: device not accepting address 5, error -71
    [ +0.032005] raid6: int64x8 xor() 1515 MB/s
    [ +0.048002] raid6: int64x4 gen() 1665 MB/s
    [ +0.011992] usb 1-1.1: new full-speed USB device number 6 using xhci_hcd
    [ +0.000216] usb 1-1.1: Device not responding to setup address.
    [ +0.035780] raid6: int64x4 xor() 1267 MB/s
    [ +0.048030] raid6: int64x2 gen() 1404 MB/s
    [ +0.047962] raid6: int64x2 xor() 931 MB/s
    [ +0.048050] raid6: int64x1 gen() 1093 MB/s
    [ +0.028148] usb 1-1.1: Device not responding to setup address.
    [ +0.019806] raid6: int64x1 xor() 800 MB/s
    [ +0.000003] raid6: using algorithm neonx8 gen() 3822 MB/s
    [ +0.000003] raid6: .... xor() 3573 MB/s, rmw enabled
    [ +0.000003] raid6: using neon recovery algorithm
    [ +0.003322] xor: measuring software checksum speed
    [ +0.036664] 8regs : 7128.000 MB/sec
    [ +0.039998] 32regs : 8137.000 MB/sec
    [ +0.040002] arm64_neon: 7339.000 MB/sec
    [ +0.000003] xor: using function: 32regs (8137.000 MB/sec)
    [ +0.001617] async_tx: api initialized (async)
    [ +0.066435] usb 1-1.1: device not accepting address 6, error -71
    [ +0.006517] usb 1-1-port1: unable to enumerate USB device
    [ +0.068196] Btrfs loaded, crc32c=crc32c-generic

USB boot

No response

NVMe boot

No response

Network (TFTP boot)

No response

timg236 commented 1 year ago

What board revision is it? (See /proc/cpuinfo.)

Unfortunately it sounds like these devices are particularly sensitive to how long USB power is off during boot, so it was working by luck.

You could try increasing USB_MSD_PWR_OFF_TIME to something like 5 seconds. There’s also USB_MSD_STARTUP_DELAY if you are booting from USB (delay after HC init).

I don’t think there will be a software fix since the bootloader ignores anything that’s not a USB or HID descriptor. PCIe is also reset before Linux starts, so there is no state to inherit. I’m open to user-configurable delays though, which may help if the USB endpoint firmware doesn’t handle resets. That would explain the PID anomaly.

timg236 commented 1 year ago

Looks like this can happen on other systems. I wonder if the 8151 PID is returned by a generic TP-Link ROM.

https://community.ipfire.org/t/2nd-usb-ethernet-adapter-not-recognized-after-reboot-unless-replugged/8142

michael-suissa commented 1 year ago

The board revision I am aiming for is c03115, but I also have a c03111 which I can test on. They both produce the same error with the UE300. I tried setting USB_MSD_PWR_OFF_TIME=5000 but that didn't fix it.

Strangely, when I ran lsusb after a reboot with both UE200 and UE300 connected, the UE300 appeared twice (once with the product ID 8151 and once with 6001), and the UE200 didn't appear at all. When I removed the UE200, one of the UE300 entries disappeared from lsusb (the 6001). When I then plugged it back in again, nothing appeared. When I removed and reinserted the UE300, lsusb showed the correct values (with product ID 6001).

timg236 commented 1 year ago

It might be worth trying a separate powered hub for these devices. You could also try setting NET_INSTALL_ENABLED=0. The bootloader looks for HID devices by default, which will perturb the USB enumeration timings.

michael-suissa commented 1 year ago

Great News!

Thanks for the rapid feedback. I have done many further tests. One of them made it evident that one of my UE200 TP-Link dongles had a hardware failure, so I have retested things with a working one. I can definitely say that setting USB_MSD_PWR_OFF_TIME to 5000 or to 0 has no effect on the USB detection.

Setting NET_INSTALL_ENABLED=0 seems to resolve the detection for both UE300 and UE200 dongles. That is a real WIN. Thanks :-)
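
For a single Pi, the setting can be applied by editing the bootloader config and writing it back. The sketch below is an illustration rather than an official procedure: `set_bootconf_key` is a hypothetical helper, and it assumes the config has a single `[all]` section so that appending the key at the end is safe. It also drops blank lines.

```shell
#!/bin/sh
# Hypothetical helper: set KEY=VALUE in a bootloader config file by removing
# any existing KEY= line (and blank lines) and appending the new line.
# Assumption: the file has one [all] section, so the last line belongs to it.
set_bootconf_key() {
    sbk_file=$1
    sbk_key=$2
    sbk_value=$3
    [ -f "$sbk_file" ] || return 1
    sbk_tmp=$(mktemp) || return 1
    # grep exits non-zero when nothing is left, which is not an error here.
    grep -v -e "^${sbk_key}=" -e '^$' "$sbk_file" > "$sbk_tmp" || true
    printf '%s=%s\n' "$sbk_key" "$sbk_value" >> "$sbk_tmp"
    # Write back in place so the original file keeps its permissions.
    cat "$sbk_tmp" > "$sbk_file"
    rm -f "$sbk_tmp"
}

# Demo on a copy of the config from this issue.
demo=$(mktemp)
printf '[all]\nBOOT_UART=0\nWAKE_ON_GPIO=1\nENABLE_SELF_UPDATE=1\nBOOT_ORDER=0xf41\n' > "$demo"
set_bootconf_key "$demo" NET_INSTALL_ENABLED 0
cat "$demo"
rm -f "$demo"

# On the Pi (takes effect after the next reboot):
#   rpi-eeprom-config > bootconf.txt
#   set_bootconf_key bootconf.txt NET_INSTALL_ENABLED 0
#   sudo rpi-eeprom-config --apply bootconf.txt
```

`sudo -E rpi-eeprom-config --edit` does the same interactively in an editor.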

My next task is finding a simple way of delivering this fix to my deployment team. I have about 50 Pis to update, and the people doing it now and in the future aren't very comfortable typing at the command line. Is it possible to make an image file based on the latest bootloader that contains the default config with the NET_INSTALL_ENABLED=0 option? I guess it would be like making one of the numerous options you already have (sd, usb, network), but it would be an sd-no-network, or something. I can see you have an automated system for producing these images. Can you add that to your list, or provide me with guidance on creating my own zip file that contains an img file to be used by the Raspberry Pi Imager?

timg236 commented 1 year ago

The scripts for generating the imager release are checked into the rpi-eeprom repo. You need to run them on Linux with sudo privileges because they use loopback filesystems.

https://github.com/raspberrypi/rpi-eeprom/tree/master/imager

It's possible to use the .zips directly, but there are some limitations in the ROM on maximum cluster sizes, so it might fail with a very large capacity SD card. That's why we create the exact FAT image.

michael-suissa commented 1 year ago

I want to thank you for your help on this. I found the time to make a Jenkins job to automate the generation of a bootloader image with the right config for us. Here is the script in case others find it useful:

    #!/bin/bash
    # make sure kpartx is installed
    if ! command -v kpartx >/dev/null 2>&1
    then
        sudo apt-get install -y kpartx
    fi
    # fetch the scripts and stop if the checkout is not where we expect it
    git clone https://github.com/raspberrypi/rpi-eeprom.git
    cd rpi-eeprom/imager || exit 1
    # modify the config for the default image:
    # strip blank lines, append the setting, then restore the trailing blank line
    sed -i '/^$/d' boot-conf-default.txt
    echo "NET_INSTALL_ENABLED=0" >> boot-conf-default.txt
    echo "" >> boot-conf-default.txt
    # make the image zips
    ./make-imager-release
    # remove the unused ones
    rm release/rpi*-network.zip
    rm release/rpi*sd.zip
    rm release/rpi*usb.zip
    # make the disk image for the default config
    sudo ./make-recovery-images

I can then use the RPi Imager to select a custom image, and write the generated image (in rpi-eeprom/imager/images/) to an SD card and update the bootloader.
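
With around 50 units to update, a quick check that each Pi actually picked up the new config may be worthwhile. The following is a sketch under assumptions: `config_has_line` is a hypothetical helper, and it relies on `rpi-eeprom-config` printing the current bootloader config when run without arguments (it may need sudo on some setups).

```shell
#!/bin/sh
# Hypothetical helper: succeed when the config text on stdin contains a line
# that is exactly equal to the first argument.
config_has_line() {
    grep -q -x -F -- "$1"
}

# Demo on a sample config (no Pi needed).
sample='[all]
BOOT_UART=0
NET_INSTALL_ENABLED=0'
if printf '%s\n' "$sample" | config_has_line 'NET_INSTALL_ENABLED=0'; then
    echo "OK: network install disabled"
else
    echo "MISSING: NET_INSTALL_ENABLED=0"
fi

# On an updated Pi:
#   rpi-eeprom-config | config_has_line 'NET_INSTALL_ENABLED=0' && echo OK
```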