raspberrypi / linux

Kernel source tree for Raspberry Pi-provided kernel builds. Issues unrelated to the linux kernel should be posted on the community forum at https://forums.raspberrypi.com/
Other
11.14k stars 4.99k forks source link

xhci: Regression for cdc_acm device (bisected) #4061

Open henzef opened 3 years ago

henzef commented 3 years ago

Hi, I recently ran into issues with a serial USB device (cdc_acm driver) on a Raspberry Pi 4B 2GB Revision 1.1. I bisected this on the Hexxeh/rpi-firmware repository and f9a31df39c03911ae42e8b43b89313e25c30661c is the first bad commit (e530832 last good):

commit f9a31df39c03911ae42e8b43b89313e25c30661c (refs/bisect/bad)
Author: popcornmix <popcornmix@gmail.com>
Date:   Mon Oct 26 17:47:48 2020 +0000

    kernel: Switch to gcc-8
    See: https://github.com/raspberrypi/Raspberry-Pi-OS-64bit/issues/4

    kernel:  rpisense-fb: Set pseudo_pallete to prevent crash on fbcon takeover
    See: https://github.com/raspberrypi/linux/pull/3931

    kernel: xhci: quirks: add link TRB quirk for VL805
    See: https://github.com/raspberrypi/linux/pull/3929

    kernel: Allow the headphone output to be disabled
    See: https://github.com/raspberrypi/linux/pull/3928

As it's USB related, a strong candidate would be #3929 (pinging @P33M).

Steps to reproduce:

I have a piece of software that keeps the serial device /dev/ttyACM0 open for continuous communication. From time to time the other communication partner is resetted/restarted, which is normally detected as unplugging and replugging in dmesg:

usb 1-1.2.3: USB disconnect, device number 25
cdc_acm 1-1.2.3:1.0: failed to set dtr/rts
usb 1-1.2.3: new full-speed USB device number 27 using xhci_hcd
usb 1-1.2.3: New USB device found, idVendor=1234, idProduct=5678, bcdDevice= 1.00
usb 1-1.2.3: New USB device strings: Mfr=1, Product=2, SerialNumber=0
usb 1-1.2.3: Product: MyDevice
usb 1-1.2.3: Manufacturer: Dummy
cdc_acm 1-1.2.3:1.0: ttyACM0: USB ACM device

When upgrading to commit f9a31df or newer, no unplugging is detected and the communication just appears dead, but is not closed. The unplugging and replugging is logged in dmesg as soon as my piece of software closes /dev/ttyACM0 due to timeouts.

P33M commented 3 years ago

Is the failure instant (i.e. the first time a disconnect is singalled)? Can you supply a cutdown script/program that can replicate the issue with any arbitrary cdc_acm device? My intent is to try to reproduce with a pi zero configured as a serial gadget.

henzef commented 3 years ago

The failure occurs most of the time, but not in 100% of the cases, so it might require a few tries. To reproduce it, it should suffice to open the serial port and trying to read from it in a loop. I can try to put together a minimal example tomorrow. Unfortunately I don't have a Pi0 ...

henzef commented 3 years ago

A minimal test case for my setup is the following: yes $'\xc0' | tr -d '\n' | pv -L 1k > /dev/ttyACM0 and dmesg -w in a second terminal This just sends an endless stream of 0xc0 to my device and limits its bandwidth to prevent flooding the device. I used 0xc0, because the device uses slip encoding and 0xc0 is not a violation of the protocol.

In the success case, pv exits with pv: write failed: Input/output error as soon as the cdc_acm device is restarted. In the error case, pv continues copying data without noticing the disconnect. In my current run it's at 1.9MB, lets see if it ever aborts ;-) When the error occurs, killing pv causes the kernel to notice the disconnect immediately.

Note that this method may require some more retries to trigger the error condition, but no more than 5 tries were enough for me so far.

henzef commented 3 years ago

A few more observations:

henzef commented 3 years ago

Can you reproduce it with the Pi0? If not I can try to get my hands on one

P33M commented 3 years ago

I have limited bandwidth to look at issues such as these right now. I will update the issue when I have had a chance to try reproducing it.

henzef commented 3 years ago

I reproduced the issue with a Pi0. Running the script I attached below on the pi4 works after a few attempts. Note that it behaves slightly different compared to my other serial device, as the kernel does not notice the disconnect condition even when the pv process exits.

#!/usr/bin/env bash

SERIAL_PORT="/dev/ttyACM0"

while sleep 0.1 ; do
    if [[ -e "$SERIAL_PORT" ]]; then
        yes $'\xc0' | tr -d '\n' | pv -L 1k > "$SERIAL_PORT" &
        sleep 5
        ssh pizero sudo modprobe -r g_serial
        echo -e "\ndisconnected"
        wait  # hangs here if the error occured, because line 7 does not terminate
    else
        ssh pizero sudo modprobe g_serial
        echo "connected"
    fi
done
P33M commented 3 years ago

Is the SSH connection over wifi (as in are you using a Pi0W)?

henzef commented 3 years ago

Yes

henzef commented 3 years ago

If this is not easily fixable, would it be possible to add a kernel cmdline parameter to toggle to the old behaviour? I am kind of stuck with an old kernel version here ... :-/

P33M commented 3 years ago

I've not had time to look at this. There's no way to sensibly disable a quirk once applied, as there's no sysfs node to override the value used at probe time.

Another quirk applied to the VLI controller is causing issues elsewhere (#3981) so for now I think both need to be opt-in for testing.

P33M commented 3 years ago

I've pushed a commit that should allow you to restore the pre-Oct 2020 behaviour to https://github.com/P33M/linux/tree/nobble-vli-quirks Pushing this out will cause regressions for other users, so for now can you build a kernel from this tree locally, and test it? To switch on the code path that's believed to be causing your issue, add xhci_hcd.quirks=0x10000000000 to /boot/cmdline.txt (and reboot). Testing this will ensure that only the downstream patch has caused the regression and not a combination of upstream + downstream changes.

To build and install, follow the guide here: https://www.raspberrypi.org/documentation/linux/kernel/building.md

henzef commented 3 years ago

I've installed the kernel and will test in the next few days

henzef commented 3 years ago

XHCI_EP_CTX_BROKEN_DCS and XHCI_AVOID_DQ_ON_LINK are (1<<41) and (1<<42) respectively, so I think the 0x10000000000 quirk is wrong. Or do I miss something?

P33M commented 3 years ago

Ah. Rebasing the patch in 5.10 shifted the bits up as upstream started using them. 0x200_0000_0000. and 0x400_0000_0000 respectively.

henzef commented 3 years ago

Ok, I finally had time to prepare a test setup to reproduce it quickly. I tested the following setups:

And the strange thing is, that all trigger the errorcase. Do you have an suggestion what I should try next? cherry-pick your patch the original 5.4 kernel that triggered the bug the first time? or maybe its firmware rather than kernel related?

P33M commented 3 years ago

That's confusing. If the commit in October 2020 is responsible then at least one of the quirks options you used should fix it. It's unlikely that an upstream change would have broken cdc_acm in exactly the same way as the quirk does.

Are you sure that you're booting with the correct kernel each time?

henzef commented 3 years ago

Yes, I explicitly set CONFIG_LOCALVERSION to verify that. Only the dbt and dbto files are unchanged, but these should not matter. Also

$ dmesg | grep quirk
[    1.360376] xhci_hcd 0000:01:00.0: hcc params 0x002841eb hci version 0x100 quirks 0x0000000000000890

without /boot/cmdline.txt modification should verify that your patch was indeed applied (otherwise the quirks would be 0x0000060000000890, right?)

Nathoufresh commented 3 years ago

Same issue, did anyone find a fix for this?

EDIT: Might not exactly be the same issue (though the same behaviour) as downgrading to https://github.com/Hexxeh/rpi-firmware/commit/e530832e9058ab73b0d45457aea06f0ea62a08a8 doesn't fix the issue (RPi 4).

dmesg does'nt report an unplug event when the USB device doesn't respond but it sometimes shows this error [ 128.223884] cdc_acm 1-1.1:1.0: failed to set dtr/rts

The rpi 3B+ is not affected.

P33M commented 3 years ago

Please run rpi-update and test. This issue may be related to #3981 - I see stall events when running the test script on a Pi 0W.

Nathoufresh commented 3 years ago

Th issue is still there after rpi-update on my side.

henzef commented 2 years ago

I finally managed to spend some time on more detailed testing and it turns out that my previous test (https://github.com/raspberrypi/linux/issues/4061#issuecomment-878237270) was flawed by a broken USB hub ;-( With that out of the way, I cherry-picked your patch from https://github.com/P33M/linux/commit/6867c2d672e82f6791d868591eb718647343e988 onto raspberrypi/linux@7136c8f9c83cf1d4ed5a2262cc74118672835218 and got the following results:

Where do we go from here @p33m? It would be great to have a solution that doesn't require a patched kernel (especially now that bullseye is released)

One more observation:

P33M commented 2 years ago

So using AVOID_DQ_ON_LINK breaks only if an external hub is used? Please post a full sudo lsusb -v with the hub connected.

Can you confirm that the Pi 0 testcase also triggers the issue with an external hub, and that the Pi0 connects at high speed?

henzef commented 2 years ago

So using AVOID_DQ_ON_LINK breaks only if an external hub is used?

Yes. I will double-check this tomorrow to be 100% sure

Please post a full sudo lsusb -v with the hub connected.

See the lsusb output below. Please ignore the "1a86:7523 QinHeng Electronics CH340 serial converter" device, it's unrelated

can't get debug descriptor: Resource temporarily unavailable

Bus 002 Device 001: ID 1d6b:0003 Linux Foundation 3.0 root hub
Device Descriptor:
  bLength                18
  bDescriptorType         1
  bcdUSB               3.00
  bDeviceClass            9 Hub
  bDeviceSubClass         0 
  bDeviceProtocol         3 
  bMaxPacketSize0         9
  idVendor           0x1d6b Linux Foundation
  idProduct          0x0003 3.0 root hub
  bcdDevice            5.10
  iManufacturer           3 Linux 5.10.63-v7l-usbfix+ xhci-hcd
  iProduct                2 xHCI Host Controller
  iSerial                 1 0000:01:00.0
  bNumConfigurations      1
  Configuration Descriptor:
    bLength                 9
    bDescriptorType         2
    wTotalLength       0x001f
    bNumInterfaces          1
    bConfigurationValue     1
    iConfiguration          0 
    bmAttributes         0xe0
      Self Powered
      Remote Wakeup
    MaxPower                0mA
    Interface Descriptor:
      bLength                 9
      bDescriptorType         4
      bInterfaceNumber        0
      bAlternateSetting       0
      bNumEndpoints           1
      bInterfaceClass         9 Hub
      bInterfaceSubClass      0 
      bInterfaceProtocol      0 Full speed (or root) hub
      iInterface              0 
      Endpoint Descriptor:
        bLength                 7
        bDescriptorType         5
        bEndpointAddress     0x81  EP 1 IN
        bmAttributes            3
          Transfer Type            Interrupt
          Synch Type               None
          Usage Type               Data
        wMaxPacketSize     0x0004  1x 4 bytes
        bInterval              12
        bMaxBurst               0
Hub Descriptor:
  bLength              12
  bDescriptorType      42
  nNbrPorts             4
  wHubCharacteristic 0x0009
    Per-port power switching
    Per-port overcurrent protection
  bPwrOn2PwrGood       10 * 2 milli seconds
  bHubContrCurrent      0 milli Ampere
  bHubDecLat          0.0 micro seconds
  wHubDelay             0 nano seconds
  DeviceRemovable    0x00
 Hub Port Status:
   Port 1: 0000.02a0 5Gbps power Rx.Detect
   Port 2: 0000.02a0 5Gbps power Rx.Detect
   Port 3: 0000.02a0 5Gbps power Rx.Detect
   Port 4: 0000.02a0 5Gbps power Rx.Detect
Binary Object Store Descriptor:
  bLength                 5
  bDescriptorType        15
  wTotalLength       0x000f
  bNumDeviceCaps          1
  SuperSpeed USB Device Capability:
    bLength                10
    bDescriptorType        16
    bDevCapabilityType      3
    bmAttributes         0x02
      Latency Tolerance Messages (LTM) Supported
    wSpeedsSupported   0x0008
      Device can operate at SuperSpeed (5Gbps)
    bFunctionalitySupport   3
      Lowest fully-functional device speed is SuperSpeed (5Gbps)
    bU1DevExitLat           4 micro seconds
    bU2DevExitLat         231 micro seconds
Device Status:     0x0001
  Self Powered

Bus 001 Device 005: ID 1234:5678 My Device
Device Descriptor:
  bLength                18
  bDescriptorType         1
  bcdUSB               2.00
  bDeviceClass            2 Communications
  bDeviceSubClass         0 
  bDeviceProtocol         0 
  bMaxPacketSize0        64
  idVendor           0x1234 My Device
  idProduct          0x5678 
  bcdDevice            1.00
  iManufacturer           1 My Manufacturer
  iProduct                2 My Product
  iSerial                 0 
  bNumConfigurations      1
  Configuration Descriptor:
    bLength                 9
    bDescriptorType         2
    wTotalLength       0x0046
    bNumInterfaces          2
    bConfigurationValue     1
    iConfiguration          0 
    bmAttributes         0xc0
      Self Powered
    MaxPower              480mA
    Interface Association:
      bLength                 8
      bDescriptorType        11
      bFirstInterface         0
      bInterfaceCount         2
      bFunctionClass          2 Communications
      bFunctionSubClass       2 Abstract (modem)
      bFunctionProtocol       1 AT-commands (v.25ter)
      iFunction               0 
    Interface Descriptor:
    can't get device qualifier: Resource temporarily unavailable
can't get debug descriptor: Resource temporarily unavailable
  bLength                 9
      bDescriptorType         4
      bInterfaceNumber        0
      bAlternateSetting       0
      bNumEndpoints           1
      bInterfaceClass         2 Communications
      bInterfaceSubClass      2 Abstract (modem)
      bInterfaceProtocol      1 AT-commands (v.25ter)
      iInterface              0 
      CDC Union:
        bMasterInterface        0
        bSlaveInterface         1 
      CDC Header:
        bcdCDC               1.01
      CDC ACM:
        bmCapabilities       0x0e
          connection notifications
          sends break
          line coding and serial state
      Endpoint Descriptor:
        bLength                 7
        bDescriptorType         5
        bEndpointAddress     0x86  EP 6 IN
        bmAttributes            3
          Transfer Type            Interrupt
          Synch Type               None
          Usage Type               Data
        wMaxPacketSize     0x0040  1x 64 bytes
        bInterval               1
    Interface Descriptor:
      bLength                 9
      bDescriptorType         4
      bInterfaceNumber        1
      bAlternateSetting       0
      bNumEndpoints           2
      bInterfaceClass        10 CDC Data
      bInterfaceSubClass      0 
      bInterfaceProtocol      0 
      iInterface              0 
      Endpoint Descriptor:
        bLength                 7
        bDescriptorType         5
        bEndpointAddress     0x01  EP 1 OUT
        bmAttributes            2
          Transfer Type            Bulk
          Synch Type               None
          Usage Type               Data
        wMaxPacketSize     0x0040  1x 64 bytes
        bInterval               0
      Endpoint Descriptor:
        bLength                 7
        bDescriptorType         5
        bEndpointAddress     0x82  EP 2 IN
        bmAttributes            2
          Transfer Type            Bulk
          Synch Type               None
          Usage Type               Data
        wMaxPacketSize     0x0040  1x 64 bytes
        bInterval               0
Device Status:     0x0001
  Self Powered

Bus 001 Device 004: ID 1a40:0101 Terminus Technology Inc. Hub
Device Descriptor:
  bLength                18
  bDescriptorType         1
  bcdUSB               2.00
  bDeviceClass            9 Hub
  bDeviceSubClass         0 
  bDeviceProtocol         1 Single TT
  bMaxPacketSize0        64
  idVendor           0x1a40 Terminus Technology Inc.
  idProduct          0x0101 Hub
  bcdDevice            1.11
  iManufacturer           0 
  iProduct                1 USB 2.0 Hub
  iSerial                 0 
  bNumConfigurations      1
  Configuration Descriptor:
    bLength                 9
    bDescriptorType         2
    wTotalLength       0x0019
    bNumInterfaces          1
    bConfigurationValue     1
    iConfiguration          0 
    bmAttributes         0xe0
      Self Powered
      Remote Wakeup
    MaxPower              100mA
    Interface Descriptor:
      bLength                 9
      bDescriptorType         4
      bInterfaceNumber        0
      bAlternateSetting       0
      bNumEndpoints           1
      bInterfaceClass         9 Hub
      bInterfaceSubClass      0 
      bInterfaceProtocol      0 Full speed (or root) hub
      iInterface              0 
      Endpoint Descriptor:
        bLength                 7
        bDescriptorType         5
        bEndpointAddress     0x81  EP 1 IN
        bmAttributes            3
          Transfer Type            Interrupt
          Synch Type               None
          Usage Type               Data
        wMaxPacketSize     0x0001  1x 1 bytes
        bInterval              12
Hub Descriptor:
  bLength               9
  bDescriptorType      41
  nNbrPorts             4
  wHubCharacteristic 0x0000
    Ganged power switching
    Ganged overcurrent protection
    TT think time 8 FS bits
  bPwrOn2PwrGood       50 * 2 milli seconds
  bHubContrCurrent    100 milli Ampere
  DeviceRemovable    0x00
  PortPwrCtrlMask    0xff
 Hub Port Status:
   Port 1: 0000.0100 power
   Port 2: 0000.0100 power
can't get debug descriptor: Resource temporarily unavailable
can't get debug descriptor: Resource temporarily unavailable
   Port 3: 0000.0100 power
   Port 4: 0000.0103 power enable connect
Device Qualifier (for other device speed):
  bLength                10
  bDescriptorType         6
  bcdUSB               2.00
  bDeviceClass            9 Hub
  bDeviceSubClass         0 
  bDeviceProtocol         0 Full speed (or root) hub
  bMaxPacketSize0        64
  bNumConfigurations      1
Device Status:     0x0001
  Self Powered

Bus 001 Device 003: ID 1a86:7523 QinHeng Electronics CH340 serial converter
Device Descriptor:
  bLength                18
  bDescriptorType         1
  bcdUSB               1.10
  bDeviceClass          255 Vendor Specific Class
  bDeviceSubClass         0 
  bDeviceProtocol         0 
  bMaxPacketSize0         8
  idVendor           0x1a86 QinHeng Electronics
  idProduct          0x7523 CH340 serial converter
  bcdDevice            2.64
  iManufacturer           0 
  iProduct                2 USB Serial
  iSerial                 0 
  bNumConfigurations      1
  Configuration Descriptor:
    bLength                 9
    bDescriptorType         2
    wTotalLength       0x0027
    bNumInterfaces          1
    bConfigurationValue     1
    iConfiguration          0 
    bmAttributes         0x80
      (Bus Powered)
    MaxPower               98mA
    Interface Descriptor:
      bLength                 9
      bDescriptorType         4
      bInterfaceNumber        0
      bAlternateSetting       0
      bNumEndpoints           3
      bInterfaceClass       255 Vendor Specific Class
      bInterfaceSubClass      1 
      bInterfaceProtocol      2 
      iInterface              0 
      Endpoint Descriptor:
        bLength                 7
        bDescriptorType         5
        bEndpointAddress     0x82  EP 2 IN
        bmAttributes            2
          Transfer Type            Bulk
          Synch Type               None
          Usage Type               Data
        wMaxPacketSize     0x0020  1x 32 bytes
        bInterval               0
      Endpoint Descriptor:
        bLength                 7
        bDescriptorType         5
        bEndpointAddress     0x02  EP 2 OUT
        bmAttributes            2
          Transfer Type            Bulk
          Synch Type               None
          Usage Type               Data
        wMaxPacketSize     0x0020  1x 32 bytes
        bInterval               0
      Endpoint Descriptor:
        bLength                 7
        bDescriptorType         5
        bEndpointAddress     0x81  EP 1 IN
        bmAttributes            3
          Transfer Type            Interrupt
          Synch Type               None
          Usage Type               Data
        wMaxPacketSize     0x0008  1x 8 bytes
        bInterval               1
Device Status:     0x0000
  (Bus Powered)

Bus 001 Device 002: ID 2109:3431 VIA Labs, Inc. Hub
Device Descriptor:
  bLength                18
  bDescriptorType         1
  bcdUSB               2.10
  bDeviceClass            9 Hub
  bDeviceSubClass         0 
  bDeviceProtocol         1 Single TT
  bMaxPacketSize0        64
  idVendor           0x2109 VIA Labs, Inc.
  idProduct          0x3431 Hub
  bcdDevice            4.21
  iManufacturer           0 
  iProduct                1 USB2.0 Hub
  iSerial                 0 
  bNumConfigurations      1
  Configuration Descriptor:
    bLength                 9
    bDescriptorType         2
    wTotalLength       0x0019
    bNumInterfaces          1
    bConfigurationValue     1
    iConfiguration          0 
    bmAttributes         0xe0
      Self Powered
      Remote Wakeup
    MaxPower              100mA
    Interface Descriptor:
      bLength                 9
      bDescriptorType         4
      bInterfaceNumber        0
      bAlternateSetting       0
      bNumEndpoints           1
      bInterfaceClass         9 Hub
      bInterfaceSubClass      0 
      bInterfaceProtocol      0 Full speed (or root) hub
      iInterface              0 
      Endpoint Descriptor:
        bLength                 7
        bDescriptorType         5
        bEndpointAddress     0x81  EP 1 IN
        bmAttributes can't get debug descriptor: Resource temporarily unavailable
can't get device qualifier: Resource temporarily unavailable
can't get debug descriptor: Resource temporarily unavailable
           3
          Transfer Type            Interrupt
          Synch Type               None
          Usage Type               Data
        wMaxPacketSize     0x0001  1x 1 bytes
        bInterval              12
Hub Descriptor:
  bLength               9
  bDescriptorType      41
  nNbrPorts             4
  wHubCharacteristic 0x00e0
    Ganged power switching
    Ganged overcurrent protection
    TT think time 32 FS bits
    Port indicators
  bPwrOn2PwrGood       50 * 2 milli seconds
  bHubContrCurrent    100 milli Ampere
  DeviceRemovable    0x00
  PortPwrCtrlMask    0xff
 Hub Port Status:
   Port 1: 0000.0103 power enable connect
   Port 2: 0000.0503 highspeed power enable connect
   Port 3: 0000.0100 power
   Port 4: 0000.0100 power
Binary Object Store Descriptor:
  bLength                 5
  bDescriptorType        15
  wTotalLength       0x002a
  bNumDeviceCaps          3
  USB 2.0 Extension Device Capability:
    bLength                 7
    bDescriptorType        16
    bDevCapabilityType      2
    bmAttributes   0x00000002
      HIRD Link Power Management (LPM) Supported
  SuperSpeed USB Device Capability:
    bLength                10
    bDescriptorType        16
    bDevCapabilityType      3
    bmAttributes         0x00
    wSpeedsSupported   0x000e
      Device can operate at Full Speed (12Mbps)
      Device can operate at High Speed (480Mbps)
      Device can operate at SuperSpeed (5Gbps)
    bFunctionalitySupport   1
      Lowest fully-functional device speed is Full Speed (12Mbps)
    bU1DevExitLat           4 micro seconds
    bU2DevExitLat         231 micro seconds
  Container ID Device Capability:
    bLength                20
    bDescriptorType        16
    bDevCapabilityType      4
    bReserved               0
    ContainerID             {30eef35c-07d5-2549-b001-802d79434c30}
Device Status:     0x0001
  Self Powered

Bus 001 Device 001: ID 1d6b:0002 Linux Foundation 2.0 root hub
Device Descriptor:
  bLength                18
  bDescriptorType         1
  bcdUSB               2.00
  bDeviceClass            9 Hub
  bDeviceSubClass         0 
  bDeviceProtocol         1 Single TT
  bMaxPacketSize0        64
  idVendor           0x1d6b Linux Foundation
  idProduct          0x0002 2.0 root hub
  bcdDevice            5.10
  iManufacturer           3 Linux 5.10.63-v7l-usbfix+ xhci-hcd
  iProduct                2 xHCI Host Controller
  iSerial                 1 0000:01:00.0
  bNumConfigurations      1
  Configuration Descriptor:
    bLength                 9
    bDescriptorType         2
    wTotalLength       0x0019
    bNumInterfaces          1
    bConfigurationValue     1
    iConfiguration          0 
    bmAttributes         0xe0
      Self Powered
      Remote Wakeup
    MaxPower                0mA
    Interface Descriptor:
      bLength                 9
      bDescriptorType         4
      bInterfaceNumber        0
      bAlternateSetting       0
      bNumEndpoints           1
      bInterfaceClass         9 Hub
      bInterfaceSubClass      0 
      bInterfaceProtocol      0 Full speed (or root) hub
      iInterface              0 
      Endpoint Descriptor:
        bLength                 7
        bDescriptorType         5
        bEndpointAddress     0x81  EP 1 IN
        bmAttributes            3
          Transfer Type            Interrupt
          Synch Type               None
          Usage Type               Data
        wMaxPacketSize     0x0004  1x 4 bytes
        bInterval              12
Hub Descriptor:
  bLength               9
  bDescriptorType      41
  nNbrPorts             1
  wHubCharacteristic 0x0009
    Per-port power switching
    Per-port overcurrent protection
    TT think time 8 FS bits
  bPwrOn2PwrGood       10 * 2 milli seconds
  bHubContrCurrent      0 milli Ampere
  DeviceRemovable    0x02
  PortPwrCtrlMask    0xff
 Hub Port Status:
   Port 1: 0000.0503 highspeed power enable connect
Device Status:     0x0001
  Self Powered

Can you confirm that the Pi 0 testcase also triggers the issue with an external hub, and that the Pi0 connects at high speed?

Unfortunately that can take a bit longer as I don't have the neccessary hardware here at the moment

henzef commented 2 years ago

So using AVOID_DQ_ON_LINK breaks only if an external hub is used?

Yes. I will double-check this tomorrow to be 100% sure

Double check was successful. When booting the patched kernel with xhci_hcd.quirks=0x60000000000 the issue only occurs if the cdc_acm device is connected via an USB hub (D-Link DUB-H4).

P33M commented 2 years ago

Using a Pi0W the script appears to hang after the first loop, hub or not. If I look at the g_serial module refcount - it goes negative:

g_serial               16384  -1
libcomposite           57344  2 g_serial,usb_f_acm
henzef commented 2 years ago

I got my hands on the Pi0W again and used the same SD card as in https://github.com/raspberrypi/linux/issues/4061#issuecomment-839950846 (so an old installation). I was able to reproduce the symptoms I described above after several retries (not sure how long it took exactly, but it took less than 10 minutes), so at least that testing wasn't flawed by the broken USB hub 👍 The bad news it that it seems to be an entirely different bug, because it is not fixed by disabling the quirk :-(

My /boot/config.txt contains dtoverlay=dwc2 and /boot/cmdline.txt contains modules-load=dwc2,g_serial but I don't remember if I modifed that manually.

uname -a output on the Pi0W is Linux pizero 5.10.17+ #1403 Mon Feb 22 11:26:13 GMT 2021 armv6l GNU/Linux

henzef commented 2 years ago

One more thing: The Pi0 bug also occurs without the external USB hub