Xilinx / embeddedsw

Xilinx Embedded Software (embeddedsw) Development
Other
941 stars 1.07k forks source link

XUsbPs_ConfigureDevice: all sorts of bugs related to DMA buffer size and alignment #50

Open skoehler opened 6 years ago

skoehler commented 6 years ago

The documentation of XUsbPs_ConfigureDevice mentions a macro called XUsbPs_DeviceMemRequired() but it doesn't exist. A macro of the same name is mentioned in the examples as well. So the required size of the buffer is unknown. In the examples, I find a buffer size of 64*1024. But it is unclear where this "magic constant" came from or whether is enough for a given XUsbPs_DeviceConfig.

Also, XUsbPs_ConfigureDevice does currently not check the buffer size which seems like a bad idea to me. Also, the buffer size is currently not passed to XUsbPs_ConfigureDevice, since the XUsbPs_DeviceConfig structure lacks a field for the buffer size.

The computation of DeviceConfig.PhysAligned is wrong. It should round to the next multiple of 2048, but it will round 2048 to 4096, 4096 to (4096+2048), and so forth. Add only XUSBPS_dQH_BASE_ALIGN - 1 before masking out the lower bits. See the code below, which lacks the -1 and was taken from the latest version of xusbps_endpoint.c:

    /* Align the buffer to a 2048 byte (XUSBPS_dQH_BASE_ALIGN) boundary.*/
    InstancePtr->DeviceConfig.PhysAligned =
        (InstancePtr->DeviceConfig.DMAMemPhys +
                     XUSBPS_dQH_BASE_ALIGN) &
                        ~(XUSBPS_dQH_BASE_ALIGN -1);

Also, all examples pass an 32-byte aligned buffer - which is then rounded up to 2048-byte alignment. Shouldn't the examples pass a buffer with proper alignment instead?

eorojas commented 4 years ago

Their USB example leaves a lot to be desired. Were you able to resolve any of this? I'm finding it painful to try and understand the code and the documentation.

skoehler commented 4 years ago

I have changed and partly rewritten lots of the USB driver. The biggest issue of this driver probably is that you have to process data in the interrupt handler. Typically, the USB controller works its way through a linked list (typically a ring) of DMA buffers. Before the USB controller writes to a DMA buffer, the driver has to mark that buffer as active. If a DMA buffer is full, then it is marked as inactive by the USB controller and an interrupt happens. The interrupt handler of the current USB driver always marks the DMA buffer as active after passing it to your application's interrupt handler. So either you process the data in the application's interrupt handler or the data is overwritten by the USB controller!

This effectively circumvents the most basic principle of USB bulk transfers: the flow control! That means, that such a simple thing as CDC ACM devices (aka virtual serial ports) cannot be implemented with this driver. Let me explain:

If the host sends too much data, then the Zynq's USB controller must tell the host to resend the data later. With the USB controller of the Zynq that happens only if the USB controller runs out of active DMA buffers. That, as I explain above, cannot happen since the driver masks inactive buffer as active as fast as possible.

Virtual serial ports rely on the flow control of USB bulk transfers! That is very very basic. Your Windows or Linux application writes the data to the virtual serial port as fast as it can, and, at some point, the flow control of the USB bulk transfer kicks in and makes sure that no data is lost.

So I changed the driver so that the DMA buffers are only marked as active after the application has actually processed the data. That processing may not necessarily happen in interrupt context. Typically, it will happen in the main loop.

I found an example on xilinx.com where somebody showed how to implement a virtual serial port that forwards data to an actual UART. And guess what: that example would throw data away, if the host would send data too fast. That is NOT how virtual serial ports work. Have you ever noticed that your trusty USB UART adapter (ft232, cp210x, ch340g, etc.) throws data away? I haven't - and yes, they also use USB bulk transfers for the data.

Needless to say that Linux has a proper driver for this device without such issues. So the hardware is certainly capable, but the Xilinx standalone driver design is certainly misguided.

Stay away from this driver unless you know what you are doing. This driver is broken by design.

eorojas commented 4 years ago

Yes, I see that. I'll look at that issue with marking the buffers active. Thanks for the heads you up on that.

I'm trying to have the data available call block on a queue. When we have more time I think we'll port the libusb code.

More tomorrow.

On Wed, Mar 18, 2020, 19:40 skoehler notifications@github.com wrote:

I have changed and partly rewritten lots of the USB driver. The biggest issue of this driver probably is that you have to process data in the interrupt handler. Typically, the USB controller works its way through a linked list (typically a ring) of DMA buffers. Before the USB controller writes to a DMA buffer, the driver has to mark that buffer as active. If a DMA buffer is full, then it is marked as inactive by the USB controller and an interrupt happens. The interrupt handler of the current USB driver always marks the DMA buffer as active after passing it to your application's interrupt handler. So either you process the data in the application's interrupt handler or the data is overwritten by the USB controller!

This effectively circumvents the most basic principle of USB bulk transfers: the flow control! That means, that such a simple thing as CDC ACM devices (aka virtual serial ports) cannot be implemented with this driver. Let me explain:

If the host sends too much data, then the Zynq's USB controller must tell the host to resend the data later. With the USB controller of the Zynq that happens only if the USB controller runs out of active DMA buffers. That, as I explain above, cannot happen since the driver masks inactive buffer as active as fast as possible.

Virtual serial ports rely on the flow control of USB bulk transfers! That is very very basic. Your Windows or Linux application writes the data to the virtual serial port as fast as it can, and, at some point, the flow control of the USB bulk transfer kicks in and makes sure that no data is lost.

So I changed the driver so that the DMA buffers are only marked as active after the application has actually processed the data. That processing may not necessarily happen in interrupt context. Typically, it will happen in the main loop.

I found an example on xilinx.com where somebody showed how to implement a virtual serial port that forwards data to an actual UART. And guess what: that example would throw data away, if the host would send data too fast. That is NOT how virtual serial ports work. Have you ever noticed that your trusty USB UART adapter (ft232, cp210x, ch340g, etc.) throws data away? I haven't - and yes, they also use USB bulk transfers for the data.

Needless to say that Linux has a proper driver for this device without such issues. So the hardware is certainly capable, but the Xilinx standalone driver design is certainly misguided.

Stay away from this driver unless you know what you are doing. This driver is broken by design.

— You are receiving this because you commented. Reply to this email directly, view it on GitHub https://github.com/Xilinx/embeddedsw/issues/50#issuecomment-600956929, or unsubscribe https://github.com/notifications/unsubscribe-auth/AG3CCTUB2EX4O3RMKAQXCMTRIGAYFANCNFSM4FOTSQWA .

eorojas commented 4 years ago

Well, I just got it working with a queue notification via a FreeRTOS (FR) queue. Thanks for your tip about releasing the buffer.
I've hacked into the EP3 bulk handling, it ain't pretty but it working. FreeRTOS is so hacky. If you are interested in details let me know.

eorojas commented 4 years ago

@sthokal, thanks for your previous comments, they were very helpful, but I'm running into a problem despite moving the buffer-release to after I've consumed the data. This only happens when I deliberately blast the USB port with data. The interesting thing that happens is that after some 200+ transactions (in this mode only) XUsbPs_EpBufferReceive() returns a buffer length of zero, but still claims SUCCESS via the return code. I get exactly 16 of these, which is what is allocated by the init. I'm wondering if this sounds familiar or you have any suggestions on how to track down the problem? Can you share anything else about what you did? Is what you did in the public domain or otherwise available? thanks

skoehler commented 4 years ago

A virtual serial port will send zero length packets sometimes. That is normal. Are you calling XUsbPs_EpBufferReceive() from a task or from an interrupt handler? What you are experiencing sounds like a race condition.

eorojas commented 4 years ago

Hmm, I XUsbPs_EpBufferReceive() from the interrupt handler, I package the results and the handle into a message data structure and send it via a FreeRTOS queue to a single task (temporary for dev). I copy the data and release the data and send it back the other way. Nothing else is happening.
It occurs to me that it might be that I'm getting blocked on the send side. I have not explored that side of what the code is doing. I assume your changes are not available, is that correct?

skoehler commented 4 years ago

I worked on the driver in the context of a commercial project. My changes are not available in public domain. I did a lot of changes and reviewed all code related to bulk transfers. I think there were several cases where a buffer was marked as active before all attributes of the buffer were set. I also recall cache invalidation issues. I also looked into the Linux driver to see what it was doing.

I am not using the driver in the context of FreeRTOS, yet. But what you did sounds about right.

veljkosvica commented 1 year ago

Just have to add to this topic that this issue is very much still alive (struggling with it right now). How they are not prioritizing having a (better) working example for the world's most used port - beats me. If anyone has any driver examples that work better than their example - I would be very gratefull!

eorojas commented 1 year ago

Try this guy, https://www.suburbanembedded.com/, t/he/y did a great job for me.

anirudha1977 commented 1 year ago

Hi everyone, Sorry to learn abt the gaps and issues. We went through them, and all the concerns look valid. We have taken note of this and have created a tracking item in our internal system. Will address them soon in the future.

Thanks again for raising the concerns.

regards