raspberrypi / linux

Kernel source tree for Raspberry Pi-provided kernel builds. Issues unrelated to the linux kernel should be posted on the community forum at https://forums.raspberrypi.com/
Other
11.11k stars 4.98k forks source link

IN Token of USB Endpoint 1 stopped occasionally on Pi 4 #4685

Closed JulianGu001 closed 2 years ago

JulianGu001 commented 2 years ago

Describe the bug In our currently BLE project, we use the Pi4 as the BLE host, sometimes, controller's evently buffer exhausted after sent to USB layer(BLE event transfered via endpoint 1). We captured the USB infer log bug Lecroy sniffer and finding the the IN token of endpoint 1. On host side, we have confirmed that URB has been submitted to endpoint1 successfully, but no IN token sent out.

To reproduce establlish connection with a mass of BLE devices repeatlly(> 80 devices ).

Expected behaviour all success

Actual behaviour sometimes, no HCI events reported to Host, and controller reports "ALLOC_EVENT_NOBUFF", with Lecroy USB sniffer, no IN Token sent out on Endpoint 1.

System PRETTY_NAME="Raspbian GNU/Linux 10 (buster)" NAME="Raspbian GNU/Linux" VERSION_ID="10" VERSION="10 (buster)" VERSION_CODENAME=buster ID=raspbian ID_LIKE=debian HOME_URL="http://www.raspbian.org/" SUPPORT_URL="http://www.raspbian.org/RaspbianForums" BUG_REPORT_URL="http://www.raspbian.org/RaspbianBugs"

Logs //from the timestamp "487.569599", after the last "xhci_urb_enqueue: ep1in-intr", no IN Token sent out

-0 [000] d.s. 487.545584: xhci_queue_trb: INTR: Buffer 000000040405cb40 length 16 TD size 0 intr 0 type 'Normal' flags b:i:I:c:s:I:e:c -0 [000] d.s. 487.545585: xhci_ring_ep_doorbell: Ring doorbell for Slot 2 ep1in -0 [000] ..s. 487.546581: xhci_urb_enqueue: ep1in-intr: urb 992aeedf pipe 1077969792 slot 2 length 0/16 sgs 0/0 stream 0 flags 00010200 -0 [000] d.s. 487.546583: xhci_queue_trb: INTR: Buffer 000000040405cb40 length 16 TD size 0 intr 0 type 'Normal' flags b:i:I:c:s:I:e:c -0 [000] d.s. 487.546584: xhci_ring_ep_doorbell: Ring doorbell for Slot 2 ep1in -0 [000] ..s. 487.547600: xhci_urb_enqueue: ep1in-intr: urb 992aeedf pipe 1077969792 slot 2 length 0/16 sgs 0/0 stream 0 flags 00010200 -0 [000] d.s. 487.547605: xhci_queue_trb: INTR: Buffer 000000040405cb40 length 16 TD size 0 intr 0 type 'Normal' flags b:i:I:c:s:I:e:c -0 [000] d.s. 487.547607: xhci_ring_ep_doorbell: Ring doorbell for Slot 2 ep1in -0 [000] ..s. 487.567604: xhci_urb_enqueue: ep1in-intr: urb 992aeedf pipe 1077969792 slot 2 length 0/16 sgs 0/0 stream 0 flags 00010200 -0 [000] d.s. 487.567611: xhci_queue_trb: INTR: Buffer 000000040405cb40 length 16 TD size 0 intr 0 type 'Normal' flags b:i:I:c:s:I:e:c -0 [000] d.s. 487.567613: xhci_ring_ep_doorbell: Ring doorbell for Slot 2 ep1in -0 [000] ..s. 487.568586: xhci_urb_enqueue: ep1in-intr: urb 992aeedf pipe 1077969792 slot 2 length 0/16 sgs 0/0 stream 0 flags 00010200 -0 [000] d.s. 487.568589: xhci_queue_trb: INTR: Buffer 000000040405cb40 length 16 TD size 0 intr 0 type 'Normal' flags b:i:I:c:s:I:e:c -0 [000] d.s. 487.568590: xhci_ring_ep_doorbell: Ring doorbell for Slot 2 ep1in -0 [000] ..s. 487.569599: xhci_urb_enqueue: ep1in-intr: urb 992aeedf pipe 1077969792 slot 2 length 0/16 sgs 0/0 stream 0 flags 00010200 -0 [000] d.s. 487.569604: xhci_queue_trb: INTR: Buffer 000000040405cb40 length 16 TD size 0 intr 0 type 'Normal' flags b:i:I:c:s:I:e:c -0 [000] d.s. 487.569605: xhci_ring_ep_doorbell: Ring doorbell for Slot 2 ep1in -0 [000] .ns. 487.672186: xhci_urb_enqueue: ep2in-bulk: urb abbde4e2 pipe 3225486208 slot 2 length 0/4096 sgs 0/0 stream 0 flags 00010200 -0 [000] dns. 487.672192: xhci_queue_trb: BULK: Buffer 0000000403f82000 length 4096 TD size 0 intr 0 type 'Normal' flags b:i:I:c:s:I:e:c -0 [000] dns. 487.672194: xhci_ring_ep_doorbell: Ring doorbell for Slot 2 ep2in -0 [000] .ns. 487.755046: xhci_urb_enqueue: ep2in-bulk: urb abbde4e2 pipe 3225486208 slot 2 length 0/4096 sgs 0/0 stream 0 flags 00010200 -0 [000] dns. 487.755053: xhci_queue_trb: BULK: Buffer 0000000403f82000 length 4096 TD size 0 intr 0 type 'Normal' flags b:i:I:c:s:I:e:c -0 [000] dns. 487.755055: xhci_ring_ep_doorbell: Ring doorbell for Slot 2 ep2in Conn_Thread-843 [000] .... 488.933191: xhci_urb_enqueue: ep0out-control: urb d93b4aef pipe 2147484416 slot 2 length 0/3 sgs 0/0 stream 0 flags 00000000 Conn_Thread-843 [000] d... 488.933198: xhci_queue_trb: CTRL: bRequestType 20 bRequest 00 wValue 0000 wIndex 0000 wLength 3 length 8 TD size 0 intr 0 type 'Setup Stage' flags I:i:c Conn_Thread-843 [000] d... 488.933200: xhci_queue_trb: CTRL: Buffer c020cb1cc000200e length 3 TD size 0 intr 0 type 'Data Stage' flags I:i:c:s:i:e:C Conn_Thread-843 [000] d... 488.933201: xhci_queue_trb: CTRL: Buffer 0000000000000000 length 0 TD size 0 intr 0 type 'Status Stage' flags I:c:e:C Conn_Thread-843 [000] d... 488.933203: xhci_ring_ep_doorbell: Ring doorbell for Slot 2 ep0in -0 [000] .ns. 489.831245: xhci_urb_enqueue: ep2in-bulk: urb abbde4e2 pipe 3225486208 slot 2 length 0/4096 sgs 0/0 stream 0 flags 00010200 -0 [000] dns. 489.831257: xhci_queue_trb: BULK: Buffer 0000000403f82000 length 4096 TD size 0 intr 0 type 'Normal' flags b:i:I:c:s:I:e:c -0 [000] dns. 489.831260: xhci_ring_ep_doorbell: Ring doorbell for Slot 2 ep2in -0 [000] .ns. 489.881219: xhci_urb_enqueue: ep2in-bulk: urb abbde4e2 pipe 3225486208 slot 2 length 0/4096 sgs 0/0 stream 0 flags 00010200 -0 [000] dns. 489.881227: xhci_queue_trb: BULK: Buffer 0000000403f82000 length 4096 TD size 0 intr 0 type 'Normal' flags b:i:I:c:s:I:e:c -0 [000] dns. 489.881229: xhci_ring_ep_doorbell: Ring doorbell for Slot 2 ep2in -0 [000] .ns. 489.931215: xhci_urb_enqueue: ep2in-bulk: urb abbde4e2 pipe 3225486208 slot 2 length 0/4096 sgs 0/0 stream 0 flags 00010200 -0 [000] dns. 489.931221: xhci_queue_trb: BULK: Buffer 0000000403f82000 length 4096 TD size 0 intr 0 type 'Normal' flags b:i:I:c:s:I:e:c If applicable, add the relevant output from `dmesg` or similar. **Additional context** Have you encounter simliar problem before? Thank you Julian
P33M commented 2 years ago

Are the BLE devices being accessed via USB?

If applicable, add the relevant output from dmesg or similar.

Is there anything in dmesg?

Also, you reference a "Lecroy USB sniffer" but your post doesn't contain any further details on packets you have captured. Are there any errors reported by the tool?

JulianGu001 commented 2 years ago

Are the BLE devices being accessed via USB? Yes.

If applicable, add the relevant output from dmesg or similar. yes, we have USB host controller trace usb_trace.txt you can check the line 32106, the URB of endpoint 1 has been submitted successfully, but after that, no IN token sent out

Is there anything in dmesg? I cleared "dmesg -c" before the issue happen, after then, dmesg got nothing.

Also, you reference a "Lecroy USB sniffer" but your post doesn't contain any further details on packets you have captured. Are there any errors reported by the tool? the sniffer log file is too big, so I just filter out the point that issue happening: normally, IN token should polling every 1ms, image but, after sometimes, after submmit the URB, IN token stopped, like this picture shows: image

it seems something of HW goes wrong, please check the log and give some advice.

Thanks

JulianGu001 commented 2 years ago

By the way, we reproduce this issue easily on Pi 4, but we haven't see it happen on Pi 3 as so far.

JulianGu001 commented 2 years ago

From our analysis, after URB of Endpoint 1 submitted,the doorbell reg was ring up successfully, but the IN token did not sent out by capturing the USB sniffer log.

what kinds of reason may cause this situation?

P33M commented 2 years ago

Either the controller did not fetch the TRB on the doorbell ring, or the TRB was somehow invalid when the controller read it.

Please capture another instance of failure with a USB driver trace as before, but also dump the contents of the files (and the associated filenames) in /sys/kernel/debug/usb/xhci/0000:01:00.0/devices/02 when the first failing transfer happens - these contain the transfer rings and enqueue/dequeue pointers for the device.

JulianGu001 commented 2 years ago

Hi P33M,

Thanks for your quick response, I checked the files under the directory you mentioned, there're so many folders and files. Do you need all of them? I'm not sure if the token issue happen, when I dump thest files, the contents still keep freeze just at that failing moment, will they change dynamically? image

P33M commented 2 years ago

Ah. I was expecting at most one OUT and one IN endpoint. Please capture just the contents of ep-context and all the files in the ep02 directory.

If you can manually determine that a hang has occured, it is very likely that the endpoint state will be preserved as the transfer rings take a long time to wrap around and overwrite past data.

JulianGu001 commented 2 years ago

Hi Sir,

I reproduced this issue twice and upload the log here, please have a check : you can address this issue point around the last "xhci_urb_enqueue: ep1in-intr:" , which should not stop, because endpoint 1 shuold keep polling every 1ms. 1.) usb_trace_001.txt xhci_debug.zip

2.) xhci_debug.zip usb_trace_002.txt

JulianGu001 commented 2 years ago

Hi Sir, I checked the contents of /sys/kernel/debug/usb/xhci/0000:01:00.0/devices/02/ep02 in the xhci_debug.zip which was dumpped when Endpoint 1 IN token stopped, found that: image but in the trbs: image Is that normal?

BR Thanks

P33M commented 2 years ago

I did notice that, but it's possible the TRB was turned into a no-op by error recovery operations on the ring.

It would be useful if you could try several more times to replicate the hang, and see where ep02's dequeue pointer is each time - is it always just after a Link TRB?

Also please post a full sudo lsusb -v of the BLE adapter.

JulianGu001 commented 2 years ago

I did notice that, but it's possible the TRB was turned into a no-op by error recovery operations on the ring. ok, got it

It would be useful if you could try several more times to replicate the hang, and see where ep02's dequeue pointer is each time - is it always just after a Link TRB? Yes, we have reproduced this issue so many times, the dequeue pointer is always after a LINK TRB.

Also please post a full sudo lsusb -v of the BLE adapter. the lsusb infor attached: usb.zip

zhuhangtian commented 2 years ago

pi4-usb-vl805-IN-token-issue-log-5segs.zip

Some more logs about this issue, tried allocate 5 segments for the endpoint (interrupt IN) instead of 2, the stuck trb is always the first trb of ring last segment (when 2 segments, it's also the first trb of 2nd segment). for example, this endpiont allocated: segment0->trbs dma addr: 0x41b61d000 segment1->trbs dma addr: 0x41b61c000 segment2->trbs dma addr: 0x41b61b000 segment3->trbs dma addr: 0x41b61a000 segment4->trbs dma addr: 0x41b619000

when the issue happens, dump the ring deq pointer: image the buffer 0(No-op) is set AFTER issue happens when stopping ep in td_to_noop() function, it's not the cause of the issue. trb deq pointer: 0x41b619000, the 1st trb of last segment.

added some logs in xhci-hcd. image at the issue point, xhci set doorbell register, ring->enqueue trb is 0x41b619000, and the buffer(field1 field0) is 402b67e40

ftrace logs: image xhci ring doorbell for ep1in, but no more interrupt for ep1 anymore, since host controller is not sending IN token.

P33M commented 2 years ago

There's a known bug in the Via controller where it can't handle a SET_DEQUEUE_POINTER command on a stopped ring where the TRB pointed to is a Link TRB. The ring isn't being stopped in this case, it should just be "parked" as it's an interrupt ring with no assigned TRBs. Ringing the doorbell should cause the controller to advance past the Link TRB and read the (now valid) first TRB in the next ring segment.

The same code flow is used for both bulk and interrupt traffic, and HID behaves the same way as your driver (submit a single URB at a time, which translates to a single active TRB on the ring), so why don't keyboards etc stop working?

As an experiment, what happens if you constrain the length of the buffer you use in your URB to exactly the maxpacket size of the endpoint - 16 bytes?

JulianGu001 commented 2 years ago

There's a known bug in the Via controller where it can't handle a SET_DEQUEUE_POINTER command on a stopped ring where the TRB pointed to is a Link TRB. The ring isn't being stopped in this case, it should just be "parked" as it's an interrupt ring with no assigned TRBs. Ringing the doorbell should cause the controller to advance past the Link TRB and read the (now valid) first TRB in the next ring segment.

Could this bug cause USB host to stop sending IN token of Interrupt Endpoint?And if it has been resolved in the latest kernel? We have updated the kernel ot 5.10.77, but we can still reproduce this issue.

The same code flow is used for both bulk and interrupt traffic, and HID behaves the same way as your driver (submit a single URB at a time, which translates to a single active TRB on the ring), so why don't keyboards etc stop working?

As an experiment, what happens if you constrain the length of the buffer you use in your URB to exactly the maxpacket size of the endpoint - 16 bytes? The maxpact size of Enpoint 1 currently is 16Byte, For BLE, this value could be confiured to be 64Bytes, we have tied this, but no help.

P33M commented 2 years ago

There's a known bug in the Via controller where it can't handle a SET_DEQUEUE_POINTER command on a stopped ring where the TRB pointed to is a Link TRB. The ring isn't being stopped in this case, it should just be "parked" as it's an interrupt ring with no assigned TRBs. Ringing the doorbell should cause the controller to advance past the Link TRB and read the (now valid) first TRB in the next ring segment.

Could this bug cause USB host to stop sending IN token of Interrupt Endpoint?And if it has been resolved in the latest kernel? We have updated the kernel ot 5.10.77, but we can still reproduce this issue.

I'm saying there's a possibility that this is a variant of an existing bug, but the existing bug is not in evidence in this case.

One thing I have noticed that is unusual: the transfer ring for EP02 has 5 segments, so it has grown in size from the default 2. Edit: I see you have noticed the same. Can you instrument the xhci_mem.c : xhci_ring_expansion() function to see if this is being called repeatedly?

There should be no reason for a low-bandwidth ring like this one to get expanded.

zhuhangtian commented 2 years ago

The 5 segments for EP02 is experimental change on xhci_endpoint_init(), allocate 5 segments for ep_index 2, type 3(INTR). To prove the symptom of this issue is: host stop sending IN token for the first trb of ring last segment (xhci_ring_expansion() is not called).

attach logs for default 2segments case: syslog_2segs_default.zip trace_2segs_default.zip 2segs-default.zip

segment0 dma addr: 41b617000 segment1 dma addr: 41b616000

dequeue pointer when issue happens: image

kernel log: image

ftrace: image

BTW, I noticed that xhci spec says there are 244 trbs used in the 2nd segment of the transfer ring, but xhci driver use 255 trbs for each segment always. image

I modified segment size to 128: diff --git a/drivers/usb/host/xhci.h b/drivers/usb/host/xhci.h index 38650e4a9..6336cd16c 100644 --- a/drivers/usb/host/xhci.h +++ b/drivers/usb/host/xhci.h @@ -1510,7 +1510,7 @@ static inline const char *xhci_trb_type_string(u8 type)

seems the issue cannot be replicated, but still hard to explain why, attach log for 128 segment size: syslog_128_trbs_per_seg.zip trace_128_trbs_per_seg.zip 128-trbs-per-seg-no-issue.zip

zhuhangtian commented 2 years ago

For the "SET_DEQUEUE_POINTER", I added log and replicated with default: 256trbs*2segs, the pointer is set correctly.

log: set-deq-ptr.zip

before starting test, there is ep stop, and set EP02 deq pointer command: image

and set deq pointer completion: the deq pointer is moved to 41b616630 image

then till the issue happens, there is no set deq pointer command anymore. image

P33M commented 2 years ago

The Set Deq Pointer command can only be issued on a stopped endpoint. It shouldn't be used when the endpoint is running. The absence of any commands in the time between starting your test and the bug occurring is normal behaviour, so the previous bugfix will have no effect here.

What is the total system load when performing the test? Are CPUs heavily loaded? Is there significant Ethernet activity?

The fact that reducing the segment size to 128 prevents the bug occurring is interesting. This means that both segments can fit inside a single 4k page, and this is the case in your xhci dumps in 128-trbs-per-seg zip. This will have several effects but one of these could be a concurrency issue.

In updating the xHC's ring state before writing to the respective doorbell, the host has to ensure that the updated TRB is visible to the controller. The doorbell write goes via a separate path to the memory write, so there's a potential for a race condition if ordering is not sufficiently guaranteed by the placement of write barriers in the driver.

Please try the attached patch which will force a time delay between modifying the TRB and ringing the doorbell.

Apply to a rpi-5.10.y kernel tree with patch -p1 < 0001-xhci-Add-doorbell-delay-parameter-to-throttle-writes.patch

Can you adjust the /sys/module/xhci_hcd/parameters/db_delay module parameter to a reasonable number (range 0-50) to see if the issue disappears?

0001-xhci-Add-doorbell-delay-parameter-to-throttle-writes.zip

zhuhangtian commented 2 years ago

Will test this patch tomorrow morning.

The fact that reducing the segment size to 128 prevents the bug occurring is interesting. This means that both segments can fit inside a single 4k page, and this is the case in your xhci dumps in 128-trbs-per-seg zip. This will have several effects but one of these could be a concurrency issue.

With 128 trbs per segment, the two segments are still in different pages. segment0 dma addr: 41b617000 ~ 41b6177e0 segment1 dma addr: 41b616000 ~ 41b6167e0

but leave 2K unused in each page. Actually I also tested more values of TRBS_PER_SEGMENT, such as 252, which I also can't replicate the issue.

What is the total system load when performing the test? Are CPUs heavily loaded? Is there significant Ethernet activity?

Not high loading if I remove the logs. There will be some traffics on Ethernet, can provide more info tomorrow.

In updating the xHC's ring state before writing to the respective doorbell, the host has to ensure that the updated TRB is visible to the controller.

I've printed the trb content right before writing doorbell register. I also added a timer with 50ms timeout when the issue hits, in it's callback read trb content, it's still the same as before writing the register.

The doorbell write goes via a separate path to the memory write, so there's a potential for a race condition if ordering is not sufficiently guaranteed by the placement of write barriers in the driver.

you mean fill trb and writing doorbell register? they should be in same hi tasklet context (bh->bh) while urb->complete() called usb_submit_urb(). furthermore, I tested single CPU case by add "maxcpus=0" in /proc/cmdline.txt, the issue happens.

Thanks for the reply.

P33M commented 2 years ago

you mean fill trb and writing doorbell register? they should be in same hi tasklet context (bh->bh) while urb->complete() called usb_submit_urb(). furthermore, I tested single CPU case by add "maxcpus=0" in /proc/cmdline.txt, the issue happens.

The TRB manipulation and doorbell write happen under the xhci spinlock and on a single CPU. The stores to memory and the controller register have no firm guarantee of ordering once they have left the ARM bus fabric. Adding a small but constant delay between updating the memory context and writing to the doorbell will be diagnostic - it will have the effect of enforcing ordering.

zhuhangtian commented 2 years ago

50us 100us 200us added before writing dbaddr still replicate this issue. 50us.zip 100us.zip 200us.zip

50us case: image 36.270974 to 36.271035 there is 61us (there is 10~12us delay between writing db and urb->complete() exit original)

100us case: image 61.049068 to 61.049178 there is 110us delay

200us case: image 743.019029 to 743.019238 there is 209us delay

zhuhangtian commented 2 years ago

stats while testing(remove logs):

image

P33M commented 2 years ago

OK, that rules out one hard-to-debug cause and implies that the state is static in the window where the xhci controller "hangs".

Can you capture the xhci registers and ring states in the 5-second window where no IN tokens are being issued but before the xhci driver resets the endpoint?

Can you add a timeout in your user code of ~10ms for the expected URB completion, and somehow trigger your script to capture the XHCI ring state at the right time? In particular, I want to know the contents of the TRB that the controller stalls on (that reads back as no-op after the xhci driver resets the endpoint) and the contents of the endpoint context.

Meanwhile I have not been able to reproduce the issue here. I'm using a USB2.0 network dongle, which has an IN Interrupt endpoint with approximately the same properties as the one on your device. Even if I force the endpoint to report Short Packets, or artificially decrease the polling interval to 2 microframes, I never get a hang.

P33M commented 2 years ago

I get temporary pauses of ~2 seconds in ring execution if

However the pauses occur at random places on the ring, not restricted to the first TRB of a ring segment. So missing a service interval (in your case, taking more than 1ms to resubmit the URB) should be benign.

zhuhangtian commented 2 years ago

dump_regs.zip

dumping regs with interval 1s in script without stopping ep1 after issue hits

image issue hits around 1161 s last data packet starts with 78 79 7a 7b you can find dequeue ptr dump_xhci_regs_1162.08.txt later on logs image

usb sniffer, last ep1in data packet starts with 78 79 7a 7b, there are more packets on ep2in(bulk), but no IN token sent on ep1 image

zhuhangtian commented 2 years ago

summarize observations on this issue:

  1. IN token always stuck on lowest dma addr trb, no matter 2 segments or 5 segments (segment address descending: 41b617000 41b616000 etc.), or reverse link segments by experimental change on xhci_alloc_segments_for_ring() make segments addr ascending (first alloc, last segment. 41b616000 41b617000). reverse link segments: image xhci: image
  2. dma_pool_create() and TRBS_PER_SEGMENT:

cat /sys/devices/platform/scb/fd500000.pcie/pci0000:00/0000:00:00.0/0000:01:00.0/pools

size 4K align 4K boundary 4K: not good --- default 256 xHCI ring segments 45 45 4096 45 xHCI ring segments size 1000 allocation 1000 boundary 1000 pages 45 --- debug log segment0: [41b617000, 41b617fe0, 41b617ff0(link trb)] segment1: [41b616000, 41b616fe0, 41b616ff0(link trb)] coherent boundary

size 2K align 2K boundary 4K: good --- TRBS_PER_SEGMENT 128, tested hours, not replicate xHCI ring segments 45 90 2048 45 ring segments size 800 allocation 1000 boundary 1000 pages 45 segment0: [41b617000, 41b6177e0, 41b6177f0(link trb)] segment1: [41b616000, 41b6167e0, 41b6167f0(link trb)]

size 2K align 4K boundary 4K: good --- TRBS_PER_SEGMENT 128, modify align 4K, not enough longterm test xHCI ring segments 45 45 4096 45 xHCI ring segments size 1000 allocation 1000 boundary 1000 pages 45 segment0: [41b617000, 41b6177e0, 41b6177f0(link trb)] segment1: [41b616000, 41b6167e0, 41b6167f0(link trb)]

size 2K align 2K boundary 2K: not good --- seems even with 128, if 2 segments with coherent boundary, issue hits xHCI ring segments 45 46 2048 23 xHCI ring segments size 800 allocation 1000 boundary 800 pages 23 segment0: [41b602000, 41b6027e0, 41b6027f0(link trb)] coherent boundary segment1: [41b602800, 41b602fe0, 41b602ff0(link trb)]

P33M commented 2 years ago

Thank you for the debug output - it appears you have captured the critical window.

However the /sys/kernel/debug/usb/xhci/0000:01:00.0/devices/02/ep02/dequeue file is a software-maintained inference of the dequeue pointer updated by hardware. Please capture the register dump again, but include the contents of /sys/kernel/debug/usb/xhci/0000:01:00.0/devices/02/ep-context on each iteration. This is the output from the actual hardware and I need to see what address the controller last read. Did it advance past the last Link TRB on a doorbell ring, or did it halt on the Link TRB?

When you say the ring crosses a coherent boundary, what boundary is this? Do you mean adjacent cache lines?

Edit: Please also share the patch you used to reverse the link order of the ring segments.

P33M commented 2 years ago

I found that the ordering of segments can be reversed if my device is unplugged and plugged in several times. With a link TRB that does not have the Toggle bit set and pointing to a segment with a lower DMA address, I still don't get a hang when replicating your endpoint setup (1ms poll rate, endpoint responds with 16B packets, single URB submitted at a time).

Your test fails after the link TRB has been traversed some 2000 times - is there anything that makes the failure more likely to happen such as a particular type of system load?

It would also be useful to know what flags you are passing to usb_submit_urb - can you share the function call where you first start the interrupt transfer, and the URB's callback function?

zhuhangtian commented 2 years ago

dump_regs_ep_context.zip

image

issue happens at 788 s

you can find reg dump log: dump_xhci_regs_789.97.txt and later logs

zhuhangtian commented 2 years ago

syslog.zip the flags is always 0x10200, URB_DMA_MAP_SINGLE(by usb_hcd_map_urb_for_dma()) | USB_DIR_IN (by usb_submit_urb()) below is first urb, the code is nothing special: set urb->transfer_flags/actual_length/status to 0, then usb_fill_int_urb() with bInterval, and usb_submit_urb() image

P33M commented 2 years ago

It took between 3-10 hours but my Ethernet dongle shows the same symptoms as your BLE dongle. The Interrupt transfers stalled in the same manner - after a Link TRB to a lower DMA address.

If I manually ring the slot doorbell with the endpoint index, then transfers resume as normal.

So now we need to find a way of causing the bug to happen faster than the ~hours in my case and the 10-15 minutes in your case.

P33M commented 2 years ago

I have captured the failure with associated PCIe traffic and something strange is happening. For several seconds prior to the Interrupt IN endpoint failure, the controller is repeatedly polling the lowest DMA address of a Bulk endpoint ring segment at a 1us interval, causing significant bus traffic. It may be that the combination of a Bulk IN or OUT endpoint needs to be active at the same time as an Interrupt IN, and the broken polling of the Bulk endpoint is required, and this seems to be the case with my Ethernet dongle test - if the Ethernet cable is disconnected, there are no bulk packets and the Interrupt endpoint so far has not stalled.

It looks like the controller haphazardly reads the Interrupt IN trb that is associated with the stall throughout the broken period of poilling but never signals completion.

P33M commented 2 years ago

Please try the following experiment:

Boot the Pi with no USB devices plugged in. Issue the command sudo setpci -s 01:00.0 0xd4.B=0x40 to turn off link power saving and verify that ASPM is disabled with sudo lspci -s 01:00.0 -vvv | grep LnkCtl. Replicate the test conditions that would give you a failure, and see if the Interrupt IN endpoint stalls.

zhuhangtian commented 2 years ago
  1. boot up with no USB device plugged in
  2. sudo setpci -s 01:00.0 0xd4.B=0x40
  3. sudo lspci -s 01:00.0 -vvv | grep LnkCtl image
  4. plug USB BLE device, start test
  5. replicated the issue same. image image
P33M commented 2 years ago

Latest rpi-update firmware contains a potential fix for this issue. Please test and report back.

JulianGu001 commented 2 years ago

Hi

We have updated the kernel to 5.10.77-v7l and still reproduced this issue. Could you please tell us the commit ID for the fix you mentioned, then we can double confirm if our kernel has contained the fix?

popcornmix commented 2 years ago

The kernel you want is 5.10.82-v7l+ Did you use rpi-update to update?

JulianGu001 commented 2 years ago

No, we down kernel with: git clone --depth=1 https://github.com/raspberrypi/linux

build and update it.

Is it not the lastest kernel? If not what method you suggest to update our Pi to the latest kernel ?

JamesH65 commented 2 years ago

It won't be in the release kernel yet, try

sudo rpi-update

to get the latest test release. It may have other changes.

JulianGu001 commented 2 years ago

Thanks very much, may I know the commit ID about the fix? We're very curious about the root cause for this issue.

P33M commented 2 years ago

https://github.com/raspberrypi/linux/commit/a1d0f808d4a0b7f7053095cd4ab97a4276bed9ff

The commit text details the observed symptoms down to individual reads that the controller performs, but without detailed knowledge of the internal architecture we won't find a root cause as such.

JulianGu001 commented 2 years ago

Hi

I have update our Pi to "5.10.82-v7l+" But it's failed to compile our USB driver with error: image

Seems it's lack of kernel header, I used the command "sudo apt install raspberrypi-kernel-headers", but not work

zhuhangtian commented 2 years ago

my very first debug item on this issue is also trying to alloc single segment for the endpoint, but there will be xhci_ring_expansion() while num_trbs_in_deq_seg is 254 in room_on_ring(), so there still will be 2 segments(dma addr separated between 2 segs, but other endpoint segments pages may still be "coherent" with these 2 segs) running for endpoints, same issue will happen.

image

segment0 addr: 41b607000 segment1 addr: 41b003000 (expansion)

issue happens at segment0 first trb: image

Please consider use this quirk(or maybe new macro?) to select 128 trbs per segment which we already tested for long term. use 128 trbs, each segment page will leave 2K unused, all endpoints segments are "isolated" (not know why this xHC need this) image

Please also consider xhci spec about segment1 (not sure if HW has other arrangements for the space of seg1 tail): image

P33M commented 2 years ago

my very first debug item on this issue is also trying to alloc single segment for the endpoint, but there will be xhci_ring_expansion() while num_trbs_in_deq_seg is 254 in room_on_ring(), so there still will be 2 segments(dma addr separated between 2 segs, but other endpoint segments pages may still be "coherent" with these 2 segs) running for endpoints, same issue will happen.

The other patch in the linked merge request addresses this.

Please consider use this quirk(or maybe new macro?) to select 128 trbs per segment which we already tested for long term. use 128 trbs, each segment page will leave 2K unused, all endpoints segments are "isolated" (not know why this xHC need this)

Please also consider xhci spec about segment1 (not sure if HW has other arrangements for the space of seg1 tail)

Altering the constant for the number of TRBs in a segment will have multiple effects for address calculation everywhere, and will shrink the size of rings that really shouldn't be shrunk - the event ring and USB3.0 stream rings in particular. The spec diagrams are informative, not normative. It is fine to have a single segment with 254 TRBs in.

Hi

I have update our Pi to "5.10.82-v7l+" But it's failed to compile our USB driver with error: image

Seems it's lack of kernel header, I used the command "sudo apt install raspberrypi-kernel-headers", but not work

Rpi-update firmware does not ship headers. As you're testing an out-of-tree driver anyway, you need to the compile kernel and headers on the Pi itself and install them manually. https://www.kernel.org/doc/Documentation/kbuild/headers_install.txt

pelwell commented 2 years ago

You may find our rpi-source utility useful for building modules to match an rpi-update kernel: https://github.com/RPi-Distro/rpi-source

zhuhangtian commented 2 years ago

Did your device not replicate the issue? Here it happens the same way, now only 1 segment for each endpoint ring. image

image

JulianGu001 commented 2 years ago

sorry, this issue still exist, closed it by mistake operation

P33M commented 2 years ago

I have managed to repeat the hang but after a much longer period of testing. I would guess that the error condition is merely harder to hit.

With a TRBS_PER_SEGMENT of 128, the behaviour of the bulk IN endpoint is still strange after the Link TRB is read and the ring wraps to the lower address. The controller still continuously polls the lowest address in the segment which leads me to believe there could still be a collision between endpoint fetches when the broken polling is still occurring. Also, if I specify a segment size of 4K and shift the start of a segment up to a 2k boundary, the controller still has broken polling.

I'd want to explain why the controller is repeatedly fetching TRBs for a ring where there are many spare buffers (usbnet submits 10+ URBs at once).

P33M commented 2 years ago

The controller reads off the end of a ring segment when prefetching TRBs. busreads This is the Bulk IN ring activity immediately before the Link TRB is fetched. The controller fetches 12 Dwords (3 TRBs) starting at 0x41ad4afd0 and then immediately fetches 16 dwords starting at 0x41ad4b000 - which is in the next page and in the ring segment for the IN endpoint.

My bet is the controller does not tag fetched addresses with endpoint number, so when the AGU for the Interrupt IN endpoint requests data for the start of the transfer ring, it insteads hits the cached TRBs fetched by the Bulk IN endpoint activity.

This is why padding either the start or end of the transfer segments prevents the bug from occurring - over-reads hit unallocated memory which is never referenced by another dequeue pointer.

I am testing a workaround that doesn't require wasting 50% of all memory allocated for transfer rings - if I put the Link TRB at position 239 instead of 255 in a segment, then a 16 dword readahead should not read off the end of a page.

zhuhangtian commented 2 years ago

Thanks for the findings, I was thinking implement 256+244 changes and verify locally.

Was pulled out for other stuffs in recent 7 days, so no more clues and debug on this issue.