Closed xobs closed 4 years ago
More information:
Here's a gdb trace of the last 7 queued transfer sizes and endpoints, along with the order in which xfer_complete()
was called:
(gdb) p xfer_log_offset
$38 = 52 '4'
(gdb) p queue_log_offset
$39 = 8 '\b'
(gdb) p/x xfer_log[45]@7
$40 = {{
ep_num = 0x83,
size = 0x200
}, {
ep_num = 0x83,
size = 0xd
}, {
ep_num = 0x3,
size = 0x1f
}, {
ep_num = 0x83,
size = 0x200
}, {
ep_num = 0x3,
size = 0x1f
}, {
ep_num = 0x83,
size = 0xd
}, {
ep_num = 0x83,
size = 0x200
}}
(gdb) p/x queue_log[0]@7
$41 = {{
ep_num = 0x83,
size = 0x200
}, {
ep_num = 0x83,
size = 0xd
}, {
ep_num = 0x3,
size = 0x1f
}, {
ep_num = 0x83,
size = 0x200
}, {
ep_num = 0x83,
size = 0xd
}, {
ep_num = 0x3,
size = 0x1f
}, {
ep_num = 0x83,
size = 0x200
}}
(gdb)
Additionally, here is a trace:
The last thing tinyusb tried to do was transfer 499
bytes.
The order of dcd_xfer_complete
is always important and it should respect the order of actual packet transferred in the bus. Host won't send the next scsi command (2) before it receive the status of current command (1). Why would your port call the dcd_xfer_complete
for the (2) before (1) while the actual event on bus is (1) then (2).
The IRQ is shared by both IN
and OUT
transfers, so in the ISR it first polls https://rm.fomu.im/usb.html#usb-out-ev-pending to see if there's an OUT
event pending, and then it polls https://rm.fomu.im/usb.html#usb-in-ev-pending to see if there's an IN
event. In this case, the OUT
event is short enough that both bits are set when the ISR runs, so it doesn't know which came first.
Perhaps the correct solution is to add a bit to the hardware to determine which packet came first -- IN
or OUT
-- and process that one first.
This sounds like how STM32 FSDEV operates. In it, all of the EP have separate single-packet buffers. There is no way to know which packet was transmitted first across endpoints, through there is some extra logic for EP0 relating to setup packets. There is no way to know if a IN or OUT packet came first. The class driver must be written to know in what order to read/write the endpoints. I've noticed that the USBTMC class seems to issue BULK IN requests before the BULK OUT transfer requesting the data. (the BULK IN is NAKed until the BULK OUT is processed).
In the interrupt, one should process SETUP/EP0 first (since it may ask for an EP to be halted), and then move onto the various other EP.
(I'm not familiar with MSC, so I'm not sure what the issue here is, or if comments are helpful)
The IRQ is shared by both
IN
andOUT
transfers, so in the ISR it first polls https://rm.fomu.im/usb.html#usb-out-ev-pending to see if there's anOUT
event pending, and then it polls https://rm.fomu.im/usb.html#usb-in-ev-pending to see if there's anIN
event. In this case, theOUT
event is short enough that both bits are set when the ISR runs, so it doesn't know which came first.Perhaps the correct solution is to add a bit to the hardware to determine which packet came first --
IN
orOUT
-- and process that one first.
Indeed, it could be an issue if ISR isn't running fast enough. This is particular with MSC class driver, at the end of data phase, it will dcd_xfer an status phase (13 bytes), then dcd_xfer the next command phase right away. https://github.com/hathach/tinyusb/blob/master/src/class/msc/msc_device.c#L554 https://github.com/hathach/tinyusb/blob/master/src/class/msc/msc_device.c#L571
Yeah, IRQ may not run immediately depending on the mcu and rtos, maybe something is running in critical context. So this is a valid scenario, I will make a change to msc driver on separate for you to test with.
@pigrew yeah control normally has it own way to handle setup and/or status. This happens since IN + OUT combine to have an sequence of SCSI command. Most of the time the IN & OUT are relatively Independence to each other. This only happens since msc class driver queue out transfer right after an IN without waiting for its completion. Your involvement is always helpful :)
Thanks for the feedback! I added a register to indicate which type of packet should be handled next, and that seems to solve the problem with the current setup: https://rm.fomu.im/usb.html#usb-next-ev
@xobs wow, you are designing an FGPA hardware, I was thinking you are porting the stack to an mcu. Awesome :smile:
Turns out tinyusb is a great way to find all sorts of implementation bugs. But it's always a question of where the bug lies -- is it in the FPGA hardware, the DCD driver, or in tinyusb? This is the first issue I've found in tinyusb so far, and it actually ended up being a hardware issue!
This is also a riscv port, which turned out to not be the difficult part at all! https://github.com/xobs/tinyusb/tree/valentyusb-eptri
@xobs this is not really entirely hardware issue, tinyusb shouldn't rely on ISR running fast enough to work. Like @pigrew mentioned, most of the DCD only has register indicating which endpoint is complete, and not their order of completion. If the ISR function latency is long enough ( 1ms for Fullspeed, 125us for Highspeed), there is no way to tell the order. TinyUSB aims to support all ports, so It shouldn't rely much on hardware :).
This is also a riscv port, which turned out to not be the difficult part at all! https://github.com/xobs/tinyusb/tree/valentyusb-eptri
This is great to hear, which risv and/or fpga board you are running on, I may try to find one to have some tests as well on these platforms.
This is running on a Vexriscv CPU in an ICE40UP5k inside a Fomu. I'm using Fomu EVT so I can attach a debugger to the softcore to debug the CPU -- normally we run the debugger over USB, but that doesn't work so well when the USB core isn't stable!
This is running on a Vexriscv CPU in an ICE40UP5k inside a Fomu. I'm using Fomu EVT so I can attach a debugger to the softcore to debug the CPU -- normally we run the debugger over USB, but that doesn't work so well when the USB core isn't stable!
sound like It needs a bit of learning curve to work on these platforms :D. Thanks for using tinyUSB, I am glad to hear you didn't find that many bugs from the project lol :laughing:
@xobs Let's give this a try to see if this patch #208 fixes your issue
With the addition of https://github.com/im-tomu/valentyusb/commit/2f240d9c8a5c9aa0adfe96799b7df47942f2c5c0 (and https://github.com/im-tomu/valentyusb/commit/d615069b45535b10464b7ade725de1aa9ab25e11 after doing some more thinking), I'm able to use the msc example without issue without #208, even if I'm running unoptimized debug code from extremely slow flash.
However, #208 seems like a good idea, especially the unit tests, so I think you should merge it anyway.
@xobs thanks for your feedback. The unit test take more time to write than the actual code. The code changes is easy enough, but I take the chance to write unit test since those are nice to have. Plan to write more unit tests in the future when having time.
It appears as though the order that irq events are handled matters, and this order is undocumented.
I'm porting tinyusb to a new platform, and right now it's running slowly. This architecture can handle IN and OUT endpoints at the same time, and as a result its possible to handle both inside the isr (provided it's running slowly enough).
As a result, I get odd behavior in examples such as the MSC example. The following sequence of events appears to trigger the issue:
My ISR and processes OUT packets first, since I thought OUT and IN endpoints were separate and order doesn't matter. However, it appears to matter at least in this case.
During (2),the Device queues 512 bytes of data to be sent to the Host. This blocks, since the IN packet from (1) hasn't been finished. Then, execution continues and we process IN packets. The Device sees that the 13 byte transfer from (1) has finished, so it calls xfer_complete.
The odd behavior is that tinyusb will now queue 499 bytes to be sent, on top of the 512 bytes it's already sent.
Does the order matter? Should that be documented anywhere? I notice that many drivers handle OUT packets last, but is that merely coincidence or is it a requirement?
I will post traces and logs as soon as my home Internet connection gets fixed.