zephyrproject-rtos / zephyr

Primary Git Repository for the Zephyr Project. Zephyr is a new generation, scalable, optimized, secure RTOS for multiple hardware architectures.
https://docs.zephyrproject.org
Apache License 2.0
10.6k stars 6.49k forks source link

cdc acm drops packets silently #72786

Open Explorer001 opened 4 months ago

Explorer001 commented 4 months ago

Describe the bug When sending usb packets continuously apparently some are dropped. There are no logs from the usb subsystem that bytes were dropped. I was testing on a nrf52840dk with the cdc_acm_echo sample.

To Reproduce Steps to reproduce the behavior:

  1. edit build the cdc_acm sample
  2. send chunks of increasing bytes (e.g 128 bytes) and read them back in a loop
  3. observe missing bytes in the sequence after some time

Adapted interrupt handler

uint8_t tx_byte = 0x00;
bool rx_err = false;
bool tx_err = false;

static void interrupt_handler(const struct device *dev, void *user_data)
{
    ARG_UNUSED(user_data);

    while (uart_irq_update(dev) && uart_irq_is_pending(dev)) {
        if (!rx_throttled && uart_irq_rx_ready(dev)) {
            int recv_len, rb_len;
            uint8_t buffer[64];
            size_t len = MIN(ring_buf_space_get(&ringbuf),
                     sizeof(buffer));

            if (len == 0) {
                /* Throttle because ring buffer is full */
                uart_irq_rx_disable(dev);
                rx_throttled = true;
                continue;
            }

            recv_len = uart_fifo_read(dev, buffer, len);
            if (recv_len < 0) {
                LOG_ERR("Failed to read UART FIFO");
                recv_len = 0;
            };

            for (int i = 0; i < recv_len; i++) {
                if (!rx_err && (buffer[i] != rx_byte)) {
                    LOG_ERR("Received byte does not match expected byte! %02X != %02X", buffer[i], rx_byte);
                    rx_err = true;
                }
                rx_byte++;
            }

            rb_len = ring_buf_put(&ringbuf, buffer, recv_len);
            if (rb_len < recv_len) {
                LOG_ERR("Drop %u bytes", recv_len - rb_len);
            }

            LOG_DBG("tty fifo -> ringbuf %d bytes", rb_len);
            if (rb_len) {
                uart_irq_tx_enable(dev);
            }
        }

        if (uart_irq_tx_ready(dev)) {
            uint8_t buffer[64];
            int rb_len, send_len;

            rb_len = ring_buf_get(&ringbuf, buffer, sizeof(buffer));
            if (!rb_len) {
                LOG_DBG("Ring buffer empty, disable TX IRQ");
                uart_irq_tx_disable(dev);
                continue;
            }

            for (int i = 0; i < rb_len; i++) {
                if (!tx_err && (buffer[i] != tx_byte)) {
                    LOG_ERR("Sent byte does not match expected byte! %02X != %02X", buffer[i], tx_byte);
                    tx_err = true;
                }
                tx_byte++;
            }

            if (rx_throttled) {
                uart_irq_rx_enable(dev);
                rx_throttled = false;
            }

            send_len = uart_fifo_fill(dev, buffer, rb_len);
            if (send_len < rb_len) {
                LOG_ERR("Drop %d bytes", rb_len - send_len);
            }

            LOG_DBG("ringbuf -> tty fifo %d bytes", send_len);
        }
    }
}```

**Expected behavior**
No bytes being dropped

**Impact**
Showstopper for a usb to serial converter application

**Logs and console output**
I added byte counters to the sample to observe on which side the bytes are being dropped.
Apparently they are dropped on the usb receive side.

Booting Zephyr OS build v3.6.0-3610-g5cbf7b1b9d13 [00:00:00.250,183] cdc_acm_echo: Wait for DTR [00:00:00.254,211] usb_cdc_acm: Device suspended [00:00:00.541,503] usb_cdc_acm: Device resumed [00:00:00.727,050] usb_cdc_acm: Device configured [00:00:02.751,922] cdc_acm_echo: DTR set [00:00:02.869,049] cdc_acm_echo: Baudrate 115200 [00:02:35.662,628] cdc_acm_echo: Received byte does not match expected byte! C0 != 80 [00:02:35.662,689] cdc_acm_echo: Sent byte does not match expected byte! C0 != 80


Also the logs from the test script:

Starting echo test on /dev/ttyACM2 (3) baud=115200, size=128 Received byte does not match expected byte: C0 != 80 Last sent on 3: 80 81 82 83 84 85 86 87 88 89 8A 8B 8C 8D 8E 8F 90 91 92 93 94 95 96 97 98 99 9A 9B 9C 9D 9E 9F A0 A1 A2 A3 A4 A5 A6 A7 A8 A9 AA AB AC AD AE AF B0 B1 B2 B3 B4 B5 B6 B7 B8 B9 BA BB BC BD BE BF C0 C1 C2 C3 C4 C5 C6 C7 C8 C9 CA CB CC CD CE CF D0 D1 D2 D3 D4 D5 D6 D7 D8 D9 DA DB DC DD DE DF E0 E1 E2 E3 E4 E5 E6 E7 E8 E9 EA EB EC ED EE EF F0 F1 F2 F3 F4 F5 F6 F7 F8 F9 FA FB FC FD FE FF

Last received on 3: C0 C1 C2 C3 C4 C5 C6 C7 C8 C9 CA CB CC CD CE CF D0 D1 D2 D3 D4 D5 D6 D7 D8 D9 DA DB DC DD DE DF E0 E1 E2 E3 E4 E5 E6 E7 E8 E9 EA EB EC ED EE EF F0 F1 F2 F3 F4 F5 F6 F7 F8 F9 FA FB FC FD FE Echo test terminated after 153 seconds



It seems like the first 64 bytes are dropped on the usb receive side of the nrf52.

**Environment (please complete the following information):**
 - OS of the test host: Debian 11 (bullseye)
 - Zephyr SDK; 0.16.3
 - Zephyr commit: 5cbf7b1b9d13a39f36ad8dabfaeaec21d98cf5ff
nordicjm commented 4 months ago

Have you tried https://github.com/zephyrproject-rtos/zephyr/pull/65112 ?

jfischer-no commented 4 months ago

I was testing on a nrf52840dk with the cdc_acm_echo sample.

[00:02:35.662,628] cdc_acm_echo: Received byte does not match expected byte! C0 != 80 [00:02:35.662,689] cdc_acm_echo: Sent byte does not match expected byte! C0 != 80

Which sample is it?

Explorer001 commented 4 months ago

Its the cdc_acm sample: samples/subsys/usb/cdc_acm

jfischer-no commented 4 months ago

Its the cdc_acm sample: samples/subsys/usb/cdc_acm

The log output clearly shows that it is not the sample in the tree, please add your modifications to the bug description.

Explorer001 commented 4 months ago

Added the modified interrupt handler

jfischer-no commented 4 months ago

No idea why git diff does not work for you, but assuming your changes are:

diff --git a/samples/subsys/usb/cdc_acm/src/main.c b/samples/subsys/usb/cdc_acm/src/main.c
index 91f7a8db7f7..012f1a6b893 100644
--- a/samples/subsys/usb/cdc_acm/src/main.c
+++ b/samples/subsys/usb/cdc_acm/src/main.c
@@ -108,6 +108,11 @@ static int enable_usb_device_next(void)
 }
 #endif /* IS_ENABLED(CONFIG_USB_DEVICE_STACK_NEXT) */

+uint8_t tx_byte = 0x00;
+uint8_t rx_byte = 0x00;
+bool rx_err = false;
+bool tx_err = false;
+
 static void interrupt_handler(const struct device *dev, void *user_data)
 {
        ARG_UNUSED(user_data);
@@ -132,6 +137,14 @@ static void interrupt_handler(const struct device *dev, void *user_data)
                                recv_len = 0;
                        };

+                       for (int i = 0; i < recv_len; i++) {
+                               if (!rx_err && (buffer[i] != rx_byte)) {
+                                       LOG_ERR("Received byte does not match expected byte! %02X != %02X", buffer[i], rx_byte);
+                                       rx_err = true;
+                               }
+                               rx_byte++;
+                       }
+
                        rb_len = ring_buf_put(&ringbuf, buffer, recv_len);
                        if (rb_len < recv_len) {
                                LOG_ERR("Drop %u bytes", recv_len - rb_len);
@@ -154,6 +167,14 @@ static void interrupt_handler(const struct device *dev, void *user_data)
                                continue;
                        }

+                       for (int i = 0; i < rb_len; i++) {
+                               if (!tx_err && (buffer[i] != tx_byte)) {
+                                       LOG_ERR("Sent byte does not match expected byte! %02X != %02X", buffer[i], tx_byte);
+                                       tx_err = true;
+                               }
+                               tx_byte++;
+                       }
+
                        if (rx_throttled) {
                                uart_irq_rx_enable(dev);
                                rx_throttled = false;

I cannot reproduce it on nRF52840, my steps are something like:

Python 3.11.9 (main, Apr 10 2024, 13:16:36) [GCC 13.2.0] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import serial
>>> dev = serial.Serial('/dev/ttyACM1', 115200, timeout=0.1, writeTimeout=0.1)
>>> data_out = serial.to_bytes(range(256))
>>> dev.writable()
True
>>> dev.write(data_out)
256
>>> dev.readable()
True
>>> data_in = dev.read(len(data_out))
>>> if data_out != data_in:
...     print("Error");
... 
>>> 
Explorer001 commented 4 months ago

The changes look correct. It does not happen on the first packet. Try sending and receiving in a loop. On my system the byte drop happens after several minutes.

jfischer-no commented 4 months ago

The changes look correct. It does not happen on the first packet. Try sending and receiving in a loop. On my system the byte drop happens after several minutes.

Well, with example above I do not see any issue running in a loop for about 10 minutes.

It seems like the first 64 bytes are dropped on the usb receive side of the nrf52.

What is the exact name of the SoC or development board?

Explorer001 commented 4 months ago

What is the exact name of the SoC or development board?

nRF21540-DK

jfischer-no commented 2 months ago

While I cannot reproduce it on nRF52840, I noticed that the commit messages in #73828 mention a different platform, RP2040. On rpi_pico I can reproduce it with pyserial script using it as I have outlined above. It looks like some bytes get lost in the to-host direction (https://github.com/zephyrproject-rtos/zephyr/pull/75739). But I do not see k_work_submit in usb_transfer() failing, which is also not to be expected.

diff --git a/subsys/usb/device/usb_transfer.c b/subsys/usb/device/usb_transfer.c
index 95cf3198ab6..6ce15837bd1 100644
--- a/subsys/usb/device/usb_transfer.c
+++ b/subsys/usb/device/usb_transfer.c
@@ -251,7 +251,8 @@ int usb_transfer(uint8_t ep, uint8_t *data, size_t dlen, unsigned int flags,

        if (flags & USB_TRANS_WRITE) {
                /* start writing first chunk */
-               k_work_submit_to_queue(&USB_WORK_Q, &trans->work);
+               ret = k_work_submit_to_queue(&USB_WORK_Q, &trans->work);
+               __ASSERT(ret >= 0, "Failed to submit new transfer %d", ret);
        } else {
                /* ready to read, clear NAK */
                ret = usb_dc_ep_read_continue(ep);
github-actions[bot] commented 3 weeks ago

This issue has been marked as stale because it has been open (more than) 60 days with no activity. Remove the stale label or add a comment saying that you would like to have the label removed otherwise this issue will automatically be closed in 14 days. Note, that you can always re-open a closed issue at any time.