dmitrystu / libusb_stm32

Lightweight USB device Stack for STM32 microcontrollers
Apache License 2.0

STM32L433: Make double buffer work for bulk endpoints in ep_write #120

Open Solartraveler opened 2 years ago

Solartraveler commented 2 years ago

Hello, thank you very much for the library. While using the lib to implement USB mass storage, I failed to get a bulk endpoint to work with USB_EPTYPE_DBLBUF set in usbd_ep_config for endpoint 0x81. It turns out the endpoint is in the state USB_EP_TX_VALID and not USB_EP_TX_NAK; data can be written in both states. Without double buffering, the transfer rate to the host was limited to ~100 KB/s. With double buffering, I can reach ~1100 KB/s. I tested the code with an STM32L452. I guess some of the other drivers need a similar fix; if required, I could test it with an STM32F042 too.

(Should anyone look through the pull request in order to figure out how to use the lib correctly (as I did), my mass storage implementation can be found here: https://github.com/Solartraveler/UniversalboxArm/tree/main/src/stm32l452/06-usb-mass-storage )

Malte

dmitrystu commented 1 year ago

I hope it was tested for both single-buffered and double-buffered writes. BTW, with linux-serial-test I got about 8 Mbps in total for the TX and RX channels (~4 Mbps each) on the standard driver / loopback demo code. I need to test with the changes.

Solartraveler commented 1 year ago

Thanks for the reply and for pointing me to the linux-serial-test program :) I would expect this change to affect only double-buffered writes, because for single-buffered endpoints USB_EP_KIND would not be set. I compiled your CDC demo for my STM32F042 with polling and without the HID combo. linux-serial-test manages to send about 98000 bytes/s, which is about the same as I get with my mass storage program on the STM32L452. But both are USB 1.1 devices. I expect a USB 2.0 device to be a lot faster even without double buffering. Interestingly, linux-serial-test finds errors every few seconds for me.
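For scale, the rates quoted in this thread can be compared with the full-speed bulk ceiling. The following is a rough sketch, not part of the pull request: it assumes a 64-byte bulk packet size (the thread does not state the endpoint's packet size) and the commonly cited limit of 19 such bulk transactions per 1 ms frame. The function names are made up for illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Full-speed USB uses 1 ms frames. The values below are assumptions for this
 * sketch: 64-byte bulk packets, at most 19 bulk transactions per frame. */
enum {
    FS_FRAMES_PER_SECOND = 1000,
    FS_BULK_PACKET_BYTES = 64,
    FS_BULK_MAX_PACKETS_PER_FRAME = 19
};

/* Theoretical full-speed bulk ceiling in bytes per second. */
static uint32_t fs_bulk_ceiling(void) {
    return (uint32_t)FS_BULK_MAX_PACKETS_PER_FRAME
         * FS_BULK_PACKET_BYTES
         * FS_FRAMES_PER_SECOND;
}

/* Share of the ceiling used by a measured rate, in percent, rounded down. */
static uint32_t fs_bulk_utilisation_percent(uint32_t bytes_per_second) {
    assert(bytes_per_second <= fs_bulk_ceiling());
    return (uint32_t)(((uint64_t)bytes_per_second * 100u) / fs_bulk_ceiling());
}
```

Under these assumptions, 98000 bytes/s is roughly one and a half 64-byte packets per frame, while ~1100 KB/s (taking 1 KB as 1000 bytes) is close to the ceiling.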

Maybe I will find time to rewrite the CDC sample to use double buffering and then check whether the data rate improves there too.

Note: While implementing double buffering, I first used the

usbd_reg_event(dev, usbd_evt_eptx, EndpointEventTx);

callback to know when to queue the next packet, starting with two packets. This worked until I put more load on the bus system (switched from polling to DMA for SPI). Then there was sometimes only one callback for two packets sent, which caused a fallback to single-buffer speed. I fixed this by reading the status register to find out whether I can queue one or two packets:

//function copied from usb stack:
inline static volatile uint16_t *EPR(uint8_t ep) {
    return (uint16_t*)((ep & 0x07) * 4 + USB_BASE);
}

void EndpointEventTx(usbd_device *dev, uint8_t event, uint8_t ep) {
    if ((ep == USB_ENDPOINT_TOHOST) && (event == usbd_evt_eptx)) {
        g_storageState.toHostFree++;
#ifdef USB_USE_DOUBLEBUFFERING
        g_storageState.toHostFree = MIN(g_storageState.toHostFree, 2);
        if (g_storageState.toHostFree == 1) {
            uint16_t reg = *EPR(USB_ENDPOINT_TOHOST);
            uint16_t swbuf = reg & USB_EP_DTOG_RX; //used as swbuf bit with double buffering
            uint16_t dtog = reg & USB_EP_DTOG_TX;
            if (((swbuf) && (dtog)) || ((swbuf == 0) && (dtog == 0))) {
                g_storageState.toHostFree = 2;
            }
        }
#endif
        // <code sending 1 or 2 packets goes here>
    }
}
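
The register comparison in the callback above can also be pulled out into a pure function, which makes it easy to check on a host without hardware. This is only a sketch of that idea: the function name and the local mask names are hypothetical, and the mask values are assumed to match USB_EP_DTOG_TX and USB_EP_DTOG_RX from the STM32 device header. It mirrors the logic shown above (equal bits are treated as "both buffers free") and does not claim anything beyond that.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed to equal USB_EP_DTOG_TX in the STM32 device header. */
#define EP_DTOG_TX_MASK   0x0040u
/* Assumed to equal USB_EP_DTOG_RX, which serves as the software buffer
 * bit for a double-buffered TX endpoint. */
#define EP_SW_BUF_TX_MASK 0x4000u

/* Hypothetical helper: given a snapshot of the endpoint register and the
 * free-buffer count derived from callbacks (already clamped to 0..2),
 * return the corrected number of free TX buffers. */
static unsigned tx_free_buffers(uint16_t epr, unsigned counted_free) {
    assert(counted_free <= 2u);
    int sw_buf = (epr & EP_SW_BUF_TX_MASK) != 0;
    int dtog = (epr & EP_DTOG_TX_MASK) != 0;
    /* One callback may have been missed: if both bits agree, the callback
     * above treats both buffers as free. */
    if (counted_free == 1u && sw_buf == dtog) {
        return 2u;
    }
    return counted_free;
}
```

In the callback, this would be called with the value read through EPR(USB_ENDPOINT_TOHOST) after incrementing and clamping toHostFree.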

If you would like to see any additional types of tests, let me know.