dmitrystu / libusb_stm32

Lightweight USB device Stack for STM32 microcontrollers
Apache License 2.0
707 stars 160 forks source link

F405 BULK IN fails when length > 4 #73

Closed jhbe closed 3 years ago

jhbe commented 3 years ago

My F405 firmware sends bytes over a BULK IN endpoint to a PC every 10ms. If it send more than four bytes, then after a few hundred successful sends the USB pipe stops with a PROTOCOL_ERROR; see the attached pcap file line 470. This was captured on a Debian 10 host. The device is a Matek F405STD board.

I'm using the usbd_stm32f429_otgfs "driver". The otgfs BULK endpoint is not in a stalled state, according to ep_isstalled(). The device is not stuck. The protocol error is sent to the host exactly when I would expect the normal byte transfer to take place, so it seems the call to usbd_ep_write() caused the condition. My EP1 size is 64, as is my EP0.

I am at a loss as to what to look at next, please advise. The device and host code can be found here.

dmitrystu commented 3 years ago

Hi.

Tried to build and test your code. I built and upload usbtest-device.axf then built and run usbtest-host. Can't reproduce the problem. Setup:

lsusb ``` >uname -vro 4.15.0-124-generic #127-Ubuntu SMP Fri Nov 6 10:54:43 UTC 2020 GNU/Linux >lsusb -v -d3456:1150 Bus 002 Device 010: ID 3456:1150 Device Descriptor: bLength 18 bDescriptorType 1 bcdUSB 1.10 bDeviceClass 0 (Defined at Interface level) bDeviceSubClass 0 bDeviceProtocol 0 bMaxPacketSize0 64 idVendor 0x3456 idProduct 0x1150 bcdDevice 1.00 iManufacturer 1 Johan iProduct 2 F405 UsbTest iSerial 0 bNumConfigurations 1 Configuration Descriptor: bLength 9 bDescriptorType 2 wTotalLength 32 bNumInterfaces 1 bConfigurationValue 1 iConfiguration 0 bmAttributes 0x80 (Bus Powered) MaxPower 100mA Interface Descriptor: bLength 9 bDescriptorType 4 bInterfaceNumber 0 bAlternateSetting 0 bNumEndpoints 2 bInterfaceClass 255 Vendor Specific Class bInterfaceSubClass 255 Vendor Specific Subclass bInterfaceProtocol 255 Vendor Specific Protocol iInterface 0 Endpoint Descriptor: bLength 7 bDescriptorType 5 bEndpointAddress 0x01 EP 1 OUT bmAttributes 2 Transfer Type Bulk Synch Type None Usage Type Data wMaxPacketSize 0x0040 1x 64 bytes bInterval 1 Endpoint Descriptor: bLength 7 bDescriptorType 5 bEndpointAddress 0x81 EP 1 IN bmAttributes 2 Transfer Type Bulk Synch Type None Usage Type Data wMaxPacketSize 0x0040 1x 64 bytes bInterval 1 Device Status: 0x0000 (Bus Powered) ```
Wireshark ``` No. Time Source Destination Protocol Length Info 51 2.547974 host 2.0.0 USB 64 SET ADDRESS Request 53 2.575987 host 2.10.0 USB 64 GET DESCRIPTOR Request DEVICE 54 2.576115 2.10.0 host USB 82 GET DESCRIPTOR Response DEVICE 55 2.576157 host 2.10.0 USB 64 GET DESCRIPTOR Request CONFIGURATION 56 2.576391 2.10.0 host USB 73 GET DESCRIPTOR Response CONFIGURATION 57 2.576429 host 2.10.0 USB 64 GET DESCRIPTOR Request CONFIGURATION 58 2.576618 2.10.0 host USB 96 GET DESCRIPTOR Response CONFIGURATION 59 2.576660 host 2.10.0 USB 64 GET DESCRIPTOR Request STRING 60 2.576891 2.10.0 host USB 68 GET DESCRIPTOR Response STRING 61 2.576929 host 2.10.0 USB 64 GET DESCRIPTOR Request STRING 62 2.577119 2.10.0 host USB 90 GET DESCRIPTOR Response STRING 63 2.577158 host 2.10.0 USB 64 GET DESCRIPTOR Request STRING 64 2.577393 2.10.0 host USB 76 GET DESCRIPTOR Response STRING 65 2.577615 host 2.10.0 USB 64 SET CONFIGURATION Request 66 2.577745 2.10.0 host USB 64 SET CONFIGURATION Response 277 45.364418 host 2.10.1 USB 64 URB_BULK in 278 45.364583 2.10.1 host USB 84 URB_BULK in 279 45.364674 host 2.10.1 USB 64 URB_BULK in 280 45.371472 2.10.1 host USB 84 URB_BULK in ........ 1297 50.351658 host 2.10.1 USB 64 URB_BULK in 1298 50.361577 2.10.1 host USB 84 URB_BULK in 1299 50.361667 host 2.10.1 USB 64 URB_BULK in 1300 50.371580 2.10.1 host USB 84 URB_BULK in ........ 4968 68.261995 2.10.1 host USB 84 URB_BULK in 4969 68.262086 host 2.10.1 USB 64 URB_BULK in 4970 68.271996 2.10.1 host USB 84 URB_BULK in 4971 68.272087 host 2.10.1 USB 64 URB_BULK in 4972 68.277622 2.10.1 host USB 64 URB_BULK in ``` Last packet with ESHUTDOWN URB status (Program was interrupted)
dmitrystu commented 3 years ago

From my experience, if your device has a strange behavior, check the device power first for unexpected spikes and drops.

jhbe commented 3 years ago

Huge thanks for trying the code!

Swapped the USB cable for another and powered the device from a bench power supply; no change.

I did notice that, with constant power to the device, it was enough to pull the USB cable and put it back in again on the PC side to get the device to enumerate and send another few hundred strings before it failed as before. What does libusb_stm32 do the the cable is pulled and put back in? Something is clearing some state somewhere...

dmitrystu commented 3 years ago

In general, the STM USB HW engine doesn't support a special disconnect event, so no special actions performed when the cable unplugged. When you plug the cable, the host will issue "BUS RESET" first and the USB engine will be resetted and reconfigured.

jhbe commented 3 years ago

I also noticed that longer packets fail quicker. 64 bytes packets fail after ~100 transactions. 5 byte packets last for ~4000 transactions. 4 byte packets never fails. Almost smells like a FIFO problem.

Is there any device or EP status register that I can check for STALL/HALT or whatever is causing this?

dmitrystu commented 3 years ago

Checked with a logic analyzer on the bad transaction. Screenshot from 2020-11-30 14-28-23 Really, i don't understand WTF is going on. Why It issues BAD_CRC in the middle of the packet.

dmitrystu commented 3 years ago

This is the normal transaction. Screenshot from 2020-11-30 14-39-26

dmitrystu commented 3 years ago

Looks like this is a false EOP (see chapter 8.7.3 USB2.0) but why? And why I see no similar with CDC loopback demo at heavy load.

/dev/ttyACM1: count for this session: elapsed=2030 sec, rx=1231213440, tx=1231224832, rx err=0
/dev/ttyACM1: transfer rate rx=606509 cps, 4852072 bps, tx=606514 cps, 4852112 bps.
/dev/ttyACM1: TIOCGICOUNT: ret=0, rx=0, tx=0, frame = 0, overrun = 0, parity = 0, brk = 0, buf_overrun = 0
dmitrystu commented 3 years ago

I found a possible root cause of this issue. Since your code uses USB interrupt for the events and async USB EP write call in the main thread, a conflict with shared access to the USB registers may be possible. I just disabled IRQs till the writing finished and see no more errors on the bus.

jhbe commented 3 years ago

I'll try that! What's the correct way of disabling and re-enabling the interrupt?

dmitrystu commented 3 years ago

I used this code for the test.

__disable_irq();
usbd_ep_write(&usbdDevice, EP1_IN, buffer, 40);
__enable_irq();

This is not a good idea to disable all interrupts, it was done just for the test. I think you need to redesign the code to avoid switching to OTG_FS_IRQHandler when usbd_ep_write() in progress.

jhbe commented 3 years ago

USB_OTG_GINTSTS_SOF was the one interrupt that came in while I was writing AND it came in just as the pipe broke. When I define USBD_SOF_DISABLED, the problem goes away. I can't say I understand why; we only clear the interrupt bit and move on...

Is the problem that we are writing to a register while writing (like you said) or is it a SOF issue?

dmitrystu commented 3 years ago

I think this is the hardware internals. For example, writing to GINTSTS or maybe other registers when pushing to the EP FIFO not yet completed will break a correct writing sequence and may corrupt an internal state. And when you disabled SOF you reduced the probability of this error but did not exclude.

jhbe commented 3 years ago

Closing this issue. The workaround (disable interrupt white writing or write from the interrupt scope) works. Thank you so much for your help @dmitrystu , you have written an excellent stm32 USB library!