zephyrproject-rtos / zephyr

Primary Git Repository for the Zephyr Project. Zephyr is a new generation, scalable, optimized, secure RTOS for multiple hardware architectures.
https://docs.zephyrproject.org
Apache License 2.0

USB CDC_ACM locks system or at least slows it down heavily when no terminal is connected (USB DEVICE-NEXT) #78104

Closed SynchronicIT closed 3 weeks ago

SynchronicIT commented 2 months ago

Describe the bug When running a device (using the device_next drivers / subsystem) with the CDC ACM class, the complete system can be slowed down or even locked up. In this case we used a self-powered device with an nRF52833, but it looks like this applies to any chipset.

When data is sent to the cdc_acm and no terminal is connected, the tx_fifo worker keeps running in circles.

At zephyr/subsys/usb/device_next/class/usbd_cdc_acm.c:545, the work handler (static void cdc_acm_tx_fifo_handler(struct k_work *work)) resubmits its own work item when no buffer is available. This results in an infinite loop. As the USB stack runs at priority -2, nothing else can be executed anymore.
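To make the reported pattern concrete, here is a minimal host-side model in plain C. It is not the Zephyr code: `sim_work`, `sim_pool`, `sim_tx_handler` and `sim_run_queue` are hypothetical names invented for this sketch, and the work queue is reduced to a single pending flag. It only illustrates why a handler that resubmits itself on allocation failure never lets the queue go idle.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for one work item on a work queue. */
struct sim_work {
	bool pending;       /* true while the item is queued */
	unsigned int runs;  /* how many times the handler executed */
};

/* Hypothetical buffer pool with a fixed number of free buffers. */
struct sim_pool {
	unsigned int free_bufs;
};

static bool sim_pool_alloc(struct sim_pool *pool)
{
	if (pool->free_bufs == 0) {
		return false;
	}
	pool->free_bufs--;
	return true;
}

/* Models the reported behaviour: on allocation failure the handler
 * resubmits itself immediately, although nothing has changed.
 */
static void sim_tx_handler(struct sim_work *work, struct sim_pool *pool)
{
	work->runs++;
	if (!sim_pool_alloc(pool)) {
		work->pending = true; /* immediate self-resubmit */
		return;
	}
	/* A real handler would fill and enqueue the buffer here. */
}

/* Runs the queue until it is idle or max_iterations is reached.
 * Returns the total number of handler executions.
 */
static unsigned int sim_run_queue(struct sim_work *work,
				  struct sim_pool *pool,
				  unsigned int max_iterations)
{
	unsigned int i;

	assert(work != NULL && pool != NULL);
	for (i = 0; i < max_iterations && work->pending; i++) {
		work->pending = false;
		sim_tx_handler(work, pool);
	}
	return work->runs;
}
```

With an empty pool, `sim_run_queue()` only stops because of the iteration cap and the work item is still pending afterwards. On a real work queue thread there is no such cap, so the thread would never yield.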

I noticed this with a Linux PC, but I can imagine the same will happen when no PC is connected at all.

To Reproduce Build the passthrough example (zephyr/samples/drivers/uart/passthrough) with an actual UART on one end and the cdc-acm on the other. Send a lot of data over the physical UART, which will start filling the ring buffers and FIFO on the USB side. Connect the board to a (Linux) PC and open a terminal app. You will receive data, and as soon as you close the terminal, the application gets stuck in the USB driver layer. (E.g. add printk("+") and printk("-") in the callback functions: two ring buffer blocks will be forwarded and then everything stops.)

Expected behavior If the host does not consume the data in the endpoint (the cdc-acm part is not serviced) and you run out of net buffers for this endpoint, do not constantly retry, but let the other part of the USB device driver check whether there is something in the tx_fifo as soon as a net buf is freed.
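The expected behaviour can be sketched with a similar host-side model in plain C. This is an illustration under stated assumptions, not the fix that was later proposed: `ev_ctx`, `ev_tx_handler`, `ev_buf_freed` and `ev_run_queue` are hypothetical names, and the model assumes the whole FIFO content fits into a single buffer. The handler gives up when no buffer is available, and the buffer-free path resubmits the work only if data is still waiting.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical state for one CDC ACM instance in this model. */
struct ev_ctx {
	unsigned int free_bufs;    /* buffers available for the endpoint */
	unsigned int fifo_bytes;   /* data waiting in the TX FIFO */
	bool work_pending;         /* true while the TX work is queued */
	unsigned int handler_runs; /* how many times the handler executed */
};

/* TX handler: sends if possible, otherwise returns without retrying. */
static void ev_tx_handler(struct ev_ctx *ctx)
{
	ctx->handler_runs++;
	if (ctx->fifo_bytes == 0) {
		return; /* nothing to send */
	}
	if (ctx->free_bufs == 0) {
		return; /* no buffer: wait for ev_buf_freed() instead of retrying */
	}
	ctx->free_bufs--;
	ctx->fifo_bytes = 0; /* assumption: the FIFO fits into one buffer */
}

/* Called when a transfer completes and its buffer is released. */
static void ev_buf_freed(struct ev_ctx *ctx)
{
	ctx->free_bufs++;
	if (ctx->fifo_bytes > 0) {
		ctx->work_pending = true; /* data is waiting, retry exactly once */
	}
}

/* Runs the queue until it is idle or max_iterations is reached.
 * Returns the total number of handler executions.
 */
static unsigned int ev_run_queue(struct ev_ctx *ctx,
				 unsigned int max_iterations)
{
	unsigned int i;

	assert(ctx != NULL);
	for (i = 0; i < max_iterations && ctx->work_pending; i++) {
		ctx->work_pending = false;
		ev_tx_handler(ctx);
	}
	return ctx->handler_runs;
}
```

In this model the handler runs once while no buffer is available, and the queue then goes idle so other work can proceed. The retry happens only after `ev_buf_freed()` is called, which stands in for the transfer-completion path.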

Impact This is currently a showstopper for several USB-based applications, as performance is killed or unpredictable when using CDC-ACM.

Environment (please complete the following information):

jfischer-no commented 2 months ago

Build the passthrough example (zephyr/samples/drivers/uart/passthrough)

This sample uses the UART API incorrectly, and I am pretty sure you cannot use it with the CDC ACM implementation, as it checks in what context uart_fifo_fill() is called. Please provide information on how to reproduce this with a sample in tree that uses the UART API correctly.

SynchronicIT commented 2 months ago

Here is an example to reproduce the issue: passthrough_irq_usb_uart.zip

I've made a potential fix, to the best of my knowledge of the device_next implementation. I'll open a pull request for it (and I'll include this example too).

SynchronicIT commented 2 months ago

I am using a board / DTS with two CDC-ACM instances. We enable the interrupts and register the callback on only one of them.

There are two major things which I noticed.

This may or may not be related to the issue I've seen. I am still investigating this.

mmahadevan108 commented 2 months ago

@SynchronicIT, can you please retry with the fix for https://github.com/zephyrproject-rtos/zephyr/issues/76642