IntergatedCircuits / USBDevice

Highly flexible Composite USB Device Library
Apache License 2.0
216 stars 45 forks source link

CDC bulk transfer stalls when USBD_CDC_Transmit called too often #23

Open VaZso opened 4 years ago

VaZso commented 4 years ago

Dear Benedek,

I am connecting my STM32F407VET6 microcontroller to a PC running Debian Linux. On the controller, I have a relatively huge ring buffer where I put data for sending through USB.

Initially, I had a check of data count in ring buffer in every "round" and if so, I have called USBD_CDC_Transmit and if it was returned USBD_E_OK, I thought it was sent. Otherwise, USBD_CDC_Transmit is called again in the next round.

When I write to my buffer too often, I am experiencing the communication halt until I reconnect the USB device (device may remain powered).

My solution calls USBD_CDC_Transmit with a buffer directly pointing to my ring buffer position which will not be rewritten while it still working on it (it is large). When the data to be sent is greater than the remaining space till the end of my ring buffer memory, then USBD_CDC_Transmit is called only till the end of the buffer and in the very next round, it is called from the beginning of my ring buffer till the remaining data count.

I think this was the point when I have experienced most of my communication halts, so when I called transmit routine, then the very next time (so just when it became not busy) called it again to send remaining data.

If I print to my buffer and call USBD_CDC_Transmit in every round, it halts even sooner.

Now I put a 50ms delay of calling transmit routine and now it seems communication does not halt. Later: it halted after maybe an hour. When I have called it in every 20ms, it was halted maybe in 10-20 minutes. If I set it to be called in every "1 ms" then it halts after a few seconds.

However, I see USBD_CDC_Transmit has USBD_E_OK result only when (ep->State == USB_EP_STATE_IDLE) and (ep->Type == USB_EP_TYPE_ISOCHRONOUS). If the result is not USBD_E_OK, my routine does nothing but tries to send data in the next call of it. Also USBD_CDC_Transmit (USBD_EpSend) does not do anything this case.

...but if ep->State equals to USB_EP_STATE_IDLE, it should be ready for sending another data, but it seems it is not always ready yet.

Maybe it is something like a corner case so it is not necessarily a bug in USBDevice itself. Do you have any ideas of a potential solution?

I mean it would be good if the communication could keep working somehow even if USBD_CDC_Transmit is called more often.

As of the Wireshark capture below, I see device requests an interrupt even when host stops receiving new data, then host sends the interrupt request (which is as should), but data is not arriving from device.

I have also checked the software still calls USBD_CDC_Transmit which returns BUSY as ep->State is USB_EP_STATE_DATA, my data count in buffer is increasing, but it seems state machine does not go out of USB_EP_STATE_DATA.


This is my older message:

I have done a Wireshark capture. When I open the port at package 95, it sends out the internal buffer. From package 153, it sends only the counter ("Counter: 770"). At package 10151, I see my last received counter value ("Counter: 1093"). I don't know why I see the same data again at 10153, however I also saw it earlier like at package 9427.

So after that, I have not receiving any data, but I see "URB_INTERRUPT in" messages from package 10189 until I close opened port maybe near at package 10451.

So it seems to be unrelated to 64 byte-specific problem.

Anyway, in capture, I see "URB Interrupt in" messages maybe in every 242 messages, so device sends it to host then host sends it to device and a messages comes right after in the next package. However, when the connection freezes, I see the same "URB Interrupt in" ping-pont but without the message.

So it seems to me the message interrupt comes to host but message is not arriving.

ttyacm.pcapng.gz

VaZso commented 4 years ago

I have changed calling interval of my routine which checks for data available and calls USBD_CDC_Transmit from 50 ms to 80 ms and so far it still lives after hours without halt.

So it sends "Counter: 338662770\r\n"-style messages frequently and checks if any data in my buffer in every 80 ms, so I have a burst in every 80 ms. I put the output in a file, increased my buffer and I could even send around 26 MB of data in a minute, around 428 kB/s.

...but if I do smaller bursts like sending in every 20 ms, communication will halt sooner or later, especially if I call it in every ms. Am I crazy and it is not a valid usage of CDC-ACM or something else is the cause of this problem?

Earlier I have used USBDevice for debug-like purposes on a measuring unit which measures some parameters and have a new row in every 8 ms - nearly 200 byte per row but it depends on the measured parameters and the presence of GPS signal so I may or may not ran in the 64-byte length / ZLP problem recently solved and also this other problem of frequent calls (the final output of that system is passed on Ethernet anyway).

benedekkupper commented 4 years ago

What is the maximum length that you call USBD_CDC_Transmit() with?

VaZso commented 4 years ago

Hmm, that is a good question. Basically it does not have a limit check but a currently 10 kB buffer filled with data and it passes the length of data in the buffer (or less if the buffer memory turns around but that case it is called again to send remaining bytes in the next round). What limit should I use?

I think when I had an even larger buffer and more often write and call of USBD_CDC_Transmit() in every 80 ms, then it was called with a length of around 34 kB but that case it did not halt....

Now I have compiled another code which prints the counter when difference of ms timer is greater than 1 and I call the check of my buffer length in every round which calls USBD_CDC_Transmit() if anything in the buffer (so basically new data arrives in every 2 ms).

So that way it counted up to 61587 before halt. This row has a length of 17 bytes but I have not checked the actual data length - however, I think it should be no more than this 17 bytes now.

Also tried to limit the running frequency of buffer check, so USBD_CDC_Transmit() was called in every 2 ms on new data (in the same round anyway) and now it counted up to 260999, then halted. Also this time it should have definitively no more than 34 bytes in the buffer (maybe 33 bytes when the buffer just turned around in previous round and another line was generated in this run).

Anyway, when I call buffer check in every 80 ms and I have a new data in every 2 ms then it should have around 680 bytes in the buffer.

I can easily put in a limit of length anyway but what is the limit I should use here?

VaZso commented 4 years ago

I left the program running with 80 ms buffer check (where it calls USBD_CDC_Transmit() function) and row generation of 2 ms. Now it has this value "Counter: 10883997"

It means it is running now a bit more than 6 hours without halt and is still running.

Anyway, this way it definitively has more than 64 bytes to be sent every time. I don't know if I generate shorter or longer messages that would cause any problems or not.

Maybe I will try sending one row in every 80 ms to see if shorter than 64 bytes of length will cause the halt of communication in longer term or it is not related.

VaZso commented 4 years ago

So I had a software running which sent only less than 64 bytes in every 80 ms.

This was the same counter starting from 0 and it was halted on this value: "Counter: 101599" ...that means it halted after around 135 minutes.

...interesting...

benedekkupper commented 4 years ago

I've checked the hardware limitation, the USB OTG is theoretically capable of handling transfers up to 524 287 bytes (see XFRSIZ field), so that's unlikely to be the problem here.

I don't really know what to look at here, maybe cross-check with the ST stack, and if it's reproducible there as well, then report it to them.