embassy-rs / embassy

Modern embedded framework, using Rust and async.
https://embassy.dev
Apache License 2.0
5.52k stars 768 forks source link

ERROR panicked at 'BufferedUarte overrun' on NRF52840 #2181

Closed sureshjoshi closed 12 months ago

sureshjoshi commented 12 months ago

I have a task that sits in a loop and tries to consume the UART RX buffer, but I seem to be running into the buffer overrun panic. It feels like a "slow leak" as depending on the RX buffer, it might take a few seconds to panic (128 byte buffer, up to 10-15 minutes with the 1024 byte buffer.

The UART is receiving 60 bytes every second during this test, no writes.

Four questions:

  1. At 60 bytes per second, how could I possibly get a buffer overrun when the task only fills and then consumes (with a log in the middle)?
  2. Is there a less aggressive option other than panic on an overrun? Is there no safe way to recover from that? Crashing the application feels aggressive.
  3. Does the choice of Timer or PPI impact this? Should I be configuring my UART differently? I can't use Timer0 as then my SoftDevice will assert out.
  4. Does the BufferedUart adhere to the CTS/RTS lines and apply back pressure to it?
let mut stm32_uart_config = uarte::Config::default();
    stm32_uart_config.baudrate = uarte::Baudrate::BAUD921600;
    stm32_uart_config.parity = uarte::Parity::EXCLUDED;

    let stm32_ring_buffer_rx = STM32_RING_BUFFER_RX.init_with(|| [0u8; 1024]);
    let stm32_ring_buffer_tx = STM32_RING_BUFFER_TX.init_with(|| [0u8; 1024]);

    let stm32_uart = STM32_BUFFERED_UARTE.init(BufferedUarte::new_with_rtscts(
        p.UARTE0,
        p.TIMER1,
        p.PPI_CH0,
        p.PPI_CH1,
        p.PPI_GROUP0,
        Irqs,
        p.P0_29,
        p.P1_13,
        p.P0_31,
        p.P1_12,
        stm32_uart_config,
        stm32_ring_buffer_rx,
        stm32_ring_buffer_tx,
    ));

#[embassy_executor::task]
pub async fn stm32_uart_reader(
    mut uart: BufferedUarteRx<'static, 'static, peripherals::UARTE0, peripherals::TIMER1>,
    mut receiver: drivers::stm32::UartReceiver,
) -> ! {
    info!("stm32_uart_rx: Starting...");
    loop {
        let buf_len = {
            let Ok(buf) = uart.fill_buf().await else {
                info!("stm32_uart_rx: Error filling buffer!");
                continue;
            };
            buf.len()
        };
        info!("Consuming {} bytes from buffer", buf_len);
        uart.consume(buf_len);
    }
}

This is a Rust re-write of an existing C++ application which has been working fine in the field for many years with a way higher data rate, so I'm confident in the hardware and code on the other side.

Dirbaio commented 12 months ago

can you try uncommenting all trace! in here and post trace-level logs of the issue? (DEFMT_LOG=trace)

At 60 bytes per second, how could I possibly get a buffer overrun when the task only fills and then consumes (with a log in the middle)?

Agreed this shouldn't happen, yep.

Is there a less aggressive option other than panic on an overrun? Is there no safe way to recover from that? Crashing the application feels aggressive.

Yes, it's possible in theory, I just didn't implement it because it's a bit tricky. It'd have to stop UART RX, reinit the "timer as counter", flush everything, then restart. All with no race conditions... PRs welcome :smiling_face_with_tear:

Does the choice of Timer or PPI impact this? Should I be configuring my UART differently? I can't use Timer0 as then my SoftDevice will assert out.

shouldn't matter, no...

Does the BufferedUart adhere to the CTS/RTS lines and apply back pressure to it?

Yes. You have to enable RTS/CTS on the other side of course.

sureshjoshi commented 12 months ago

can you try uncommenting all trace! in here and post trace-level logs of the issue?

Yep, I was just doing that when I saw they were all commented out at the moment. Will be posting shortly.

Yes, it's possible in theory, I just didn't implement it because it's a bit tricky. It'd have to stop UART RX, reinit the "timer as counter", flush everything, then restart. All with no race conditions... PRs welcome

Oof... Maybe after I can unblock myself on this ticket. I have a few private branches that I'm working on with PRs loaded up (only private because of all the proprietary ANT+ code, that I need to hide away first).

Yes. You have to enable RTS/CTS on the other side of course.

Yep, I'm a 4-wire UART kinda guy

sureshjoshi commented 12 months ago

Full trace attached: trace.log

Test using configuration from above, except 128 byte RX buffer, and iterating and printing each byte as received so that I can force replicate the problem faster:

let stm32_ring_buffer_rx = STM32_RING_BUFFER_RX.init_with(|| [0u8; 128]);

Last trace before panic.

6.865875 INFO  Consuming 14 bytes from buffer
└─ boards::uart::__stm32_uart_reader_task::{async_fn#0} @ src/uart.rs:32
6.866149 TRACE   ringbuf: pop 14
└─ embassy_hal_internal::atomic_ring_buffer::{impl#2}::pop_done @ /var/home/sj/.cargo/git/checkouts/embassy-9312dcb0ed774b29/b3367be/embassy-hal-internal/src/fmt.rs:117
6.866424 TRACE poll_read
└─ embassy_nrf::buffered_uarte::{impl#3}::inner_fill_buf::{async_fn#0}::{closure#0} @ /var/home/sj/.cargo/git/checkouts/embassy-9312dcb0ed774b29/b3367be/embassy-nrf/src/fmt.rs:117
6.866607 TRACE   rxdrdy count = 255
└─ embassy_nrf::buffered_uarte::{impl#3}::inner_fill_buf::{async_fn#0}::{closure#0} @ /var/home/sj/.cargo/git/checkouts/embassy-9312dcb0ed774b29/b3367be/embassy-nrf/src/fmt.rs:117
6.866851 TRACE   empty
└─ embassy_nrf::buffered_uarte::{impl#3}::inner_fill_buf::{async_fn#0}::{closure#0} @ /var/home/sj/.cargo/git/checkouts/embassy-9312dcb0ed774b29/b3367be/embassy-nrf/src/fmt.rs:117
6.867065 TRACE irq: start
└─ embassy_nrf::buffered_uarte::{impl#1}::on_interrupt @ /var/home/sj/.cargo/git/checkouts/embassy-9312dcb0ed774b29/b3367be/embassy-nrf/src/fmt.rs:117
6.867309 TRACE irq: end
└─ embassy_nrf::buffered_uarte::{impl#1}::on_interrupt @ /var/home/sj/.cargo/git/checkouts/embassy-9312dcb0ed774b29/b3367be/embassy-nrf/src/fmt.rs:117
6.867584 TRACE poll_read
└─ embassy_nrf::buffered_uarte::{impl#3}::inner_fill_buf::{async_fn#0}::{closure#0} @ /var/home/sj/.cargo/git/checkouts/embassy-9312dcb0ed774b29/b3367be/embassy-nrf/src/fmt.rs:117
6.867767 TRACE   rxdrdy count = 255
└─ embassy_nrf::buffered_uarte::{impl#3}::inner_fill_buf::{async_fn#0}::{closure#0} @ /var/home/sj/.cargo/git/checkouts/embassy-9312dcb0ed774b29/b3367be/embassy-nrf/src/fmt.rs:117
6.868041 TRACE   empty
└─ embassy_nrf::buffered_uarte::{impl#3}::inner_fill_buf::{async_fn#0}::{closure#0} @ /var/home/sj/.cargo/git/checkouts/embassy-9312dcb0ed774b29/b3367be/embassy-nrf/src/fmt.rs:117
7.849761 TRACE irq: start
└─ embassy_nrf::buffered_uarte::{impl#1}::on_interrupt @ /var/home/sj/.cargo/git/checkouts/embassy-9312dcb0ed774b29/b3367be/embassy-nrf/src/fmt.rs:117
7.850006 TRACE   irq_rx: rxstarted
└─ embassy_nrf::buffered_uarte::{impl#1}::on_interrupt @ /var/home/sj/.cargo/git/checkouts/embassy-9312dcb0ed774b29/b3367be/embassy-nrf/src/fmt.rs:117
7.850189 TRACE   ringbuf: push_buf 64..127
└─ embassy_hal_internal::atomic_ring_buffer::{impl#1}::push_buf @ /var/home/sj/.cargo/git/checkouts/embassy-9312dcb0ed774b29/b3367be/embassy-hal-internal/src/fmt.rs:117
7.850524 TRACE   irq_rx: rxstarted no buf
└─ embassy_nrf::buffered_uarte::{impl#1}::on_interrupt @ /var/home/sj/.cargo/git/checkouts/embassy-9312dcb0ed774b29/b3367be/embassy-nrf/src/fmt.rs:117
7.850708 TRACE irq: end
└─ embassy_nrf::buffered_uarte::{impl#1}::on_interrupt @ /var/home/sj/.cargo/git/checkouts/embassy-9312dcb0ed774b29/b3367be/embassy-nrf/src/fmt.rs:117
7.850952 TRACE poll_read
└─ embassy_nrf::buffered_uarte::{impl#3}::inner_fill_buf::{async_fn#0}::{closure#0} @ /var/home/sj/.cargo/git/checkouts/embassy-9312dcb0ed774b29/b3367be/embassy-nrf/src/fmt.rs:117
7.851135 TRACE   rxdrdy count = 58
└─ embassy_nrf::buffered_uarte::{impl#3}::inner_fill_buf::{async_fn#0}::{closure#0} @ /var/home/sj/.cargo/git/checkouts/embassy-9312dcb0ed774b29/b3367be/embassy-nrf/src/fmt.rs:117
7.851379 TRACE   uarte ringbuf: pop_buf 127..128
└─ embassy_nrf::buffered_uarte::{impl#3}::inner_fill_buf::{async_fn#0}::{closure#0} @ /var/home/sj/.cargo/git/checkouts/embassy-9312dcb0ed774b29/b3367be/embassy-nrf/src/fmt.rs:117
7.851715 INFO  stm32_uart_rx: Received byte: 18
└─ boards::uart::__stm32_uart_reader_task::{async_fn#0} @ src/uart.rs:27
7.851928 INFO  Consuming 1 bytes from buffer
└─ boards::uart::__stm32_uart_reader_task::{async_fn#0} @ src/uart.rs:32
7.852172 TRACE   ringbuf: pop 1
└─ embassy_hal_internal::atomic_ring_buffer::{impl#2}::pop_done @ /var/home/sj/.cargo/git/checkouts/embassy-9312dcb0ed774b29/b3367be/embassy-hal-internal/src/fmt.rs:117
7.852447 TRACE irq: start
└─ embassy_nrf::buffered_uarte::{impl#1}::on_interrupt @ /var/home/sj/.cargo/git/checkouts/embassy-9312dcb0ed774b29/b3367be/embassy-nrf/src/fmt.rs:117
7.852630 TRACE   irq_rx: rxstarted
└─ embassy_nrf::buffered_uarte::{impl#1}::on_interrupt @ /var/home/sj/.cargo/git/checkouts/embassy-9312dcb0ed774b29/b3367be/embassy-nrf/src/fmt.rs:117
7.852813 TRACE   ringbuf: push_buf 64..128
└─ embassy_hal_internal::atomic_ring_buffer::{impl#1}::push_buf @ /var/home/sj/.cargo/git/checkouts/embassy-9312dcb0ed774b29/b3367be/embassy-hal-internal/src/fmt.rs:117
7.853149 TRACE   irq_rx: starting second 64
└─ embassy_nrf::buffered_uarte::{impl#1}::on_interrupt @ /var/home/sj/.cargo/git/checkouts/embassy-9312dcb0ed774b29/b3367be/embassy-nrf/src/fmt.rs:117
7.853393 TRACE   ringbuf: push 64
└─ embassy_hal_internal::atomic_ring_buffer::{impl#1}::push_done @ /var/home/sj/.cargo/git/checkouts/embassy-9312dcb0ed774b29/b3367be/embassy-hal-internal/src/fmt.rs:117
7.853668 TRACE irq: end
└─ embassy_nrf::buffered_uarte::{impl#1}::on_interrupt @ /var/home/sj/.cargo/git/checkouts/embassy-9312dcb0ed774b29/b3367be/embassy-nrf/src/fmt.rs:117
7.853851 TRACE irq: start
└─ embassy_nrf::buffered_uarte::{impl#1}::on_interrupt @ /var/home/sj/.cargo/git/checkouts/embassy-9312dcb0ed774b29/b3367be/embassy-nrf/src/fmt.rs:117
7.854064 ERROR panicked at 'BufferedUarte overrun'
└─ embassy_nrf::buffered_uarte::{impl#1}::on_interrupt @ /var/home/sj/.cargo/git/checkouts/embassy-9312dcb0ed774b29/b3367be/embassy-nrf/src/fmt.rs:106
7.854217 ERROR panicked at /var/home/sj/.cargo/registry/src/index.crates.io-6f17d22bba15001f/defmt-0.3.5/src/lib.rs:368:5:
explicit panic
└─ panic_probe::print_defmt::print @ /var/home/sj/.cargo/registry/src/index.crates.io-6f17d22bba15001f/panic-probe-0.3.1/src/lib.rs:104
sureshjoshi commented 12 months ago

Working on another replication with 256 bytes buffer, so that way, the incoming message size is never higher than 1/2 the DMA buffer.

Edit: And obviously my previously unstable system has become magically stable 🤦🏽

~Though now that I've added TX's back in, those are failing to send periodically too - so that the subsequent message is 2x the size.~ (Edit: See below)

sureshjoshi commented 12 months ago

EDIT: This TX issue is not a bug, more due to me being a troglodyte. It's working as designed as per the documentation: https://docs.rs/embedded-io-async/latest/embedded_io_async/trait.Write.html#tymethod.write

The solution here was to import use embedded_io_async::Write; and then use write_all instead of write.

~Also, I think I found another quirk. Every 10 messages transmitted on my system, the connected device sees a parsing error, and the following message size is incorrect. I believe this happens on wraparound of the ring buffer:~

~This is on transmit - but I wonder if it's related to the panic on rx in the sense of weirdness around ring buffer rollover. I had a suspicion that was hard to prove that the RX panic on larger buffers happened on a bad rollover (e.g. related to an = or <,=,> situation).~

#[embassy_executor::task]
pub async fn stm32_uart_writer(
    mut uart: BufferedUarteTx<'static, 'static, peripherals::UARTE0, peripherals::TIMER1>,
    mut transmitter: drivers::stm32::UartTransmitter,
) -> ! {
    info!("stm32_uart_writer: Starting...");
    loop {
        let Some(bytes) = transmitter.next_message().await else {
            info!("stm32_uart_writer: Error receiving next message!");
            continue;
        };

        info!(
            "stm32_uart_writer: Sending bytes: {}, {:?}",
            bytes.len(),
            bytes.as_slice()
        );
        // let _ = uart.write(&bytes).await;
        let result = uart.write(&bytes).await;
        match result {
            Ok(len) => {
                info!("stm32_uart_writer: Successfully wrote {} bytes", len);
            }
            Err(e) => {
                info!("stm32_uart_writer: Error writing: {:?}", e);
            }
        }
    }
    // info!("stm32_uart_writer: Done!");
}

Successful TX: Expecting to send 98 bytes, successfully sending 98 bytes.

119.460388 INFO  UartTransmitter::next_message: 2
└─ app::drivers::stm32::{impl#2}::next_message::{async_fn#0} @ /var/home/sj/Developer/foo/myproject-nrf52/app/src/drivers/stm32.rs:99
119.461730 INFO  stm32_uart_writer: Sending bytes: 98, [97, 96, 18, 92, 8, 1, 42, 88, 26, 86, 10, 84, 118, 118, 118, 118, 118, 118, 118, 118, 118, 118, 118, 118, 118, 118, 118, 118, 118, 118, 118, 118, 118, 118, 118, 118, 118, 118, 118, 118, 118, 118, 118, 118, 118, 118, 118, 118, 118, 118, 118, 118, 118, 118, 118, 118, 118, 118, 118, 118, 118, 118, 118, 118, 118, 118, 118, 118, 118, 118, 118, 118, 118, 118, 118, 118, 118, 118, 118, 118, 118, 118, 118, 118, 118, 118, 118, 118, 118, 118, 118, 118, 118, 118, 118, 118, 216, 0]
└─ boards::uart::__stm32_uart_writer_task::{async_fn#0} @ src/uart.rs:50
119.464447 TRACE poll_write: 98
└─ embassy_nrf::buffered_uarte::{impl#3}::inner_write::{async_fn#0}::{closure#0} @ /var/home/sj/.cargo/git/checkouts/embassy-9312dcb0ed774b29/b3367be/embassy-nrf/src/fmt.rs:117
119.464721 TRACE   ringbuf: push_buf 874..1024
└─ embassy_hal_internal::atomic_ring_buffer::{impl#1}::push_buf @ /var/home/sj/.cargo/git/checkouts/embassy-9312dcb0ed774b29/b3367be/embassy-hal-internal/src/fmt.rs:117
119.465087 TRACE   ringbuf: push 98
└─ embassy_hal_internal::atomic_ring_buffer::{impl#1}::push_done @ /var/home/sj/.cargo/git/checkouts/embassy-9312dcb0ed774b29/b3367be/embassy-hal-internal/src/fmt.rs:117
119.465362 TRACE poll_write: queued 98
└─ embassy_nrf::buffered_uarte::{impl#3}::inner_write::{async_fn#0}::{closure#0} @ /var/home/sj/.cargo/git/checkouts/embassy-9312dcb0ed774b29/b3367be/embassy-nrf/src/fmt.rs:117
119.465637 TRACE irq: start
└─ embassy_nrf::buffered_uarte::{impl#1}::on_interrupt @ /var/home/sj/.cargo/git/checkouts/embassy-9312dcb0ed774b29/b3367be/embassy-nrf/src/fmt.rs:117
119.465850 TRACE   ringbuf: pop_buf 874..972
└─ embassy_hal_internal::atomic_ring_buffer::{impl#2}::pop_buf @ /var/home/sj/.cargo/git/checkouts/embassy-9312dcb0ed774b29/b3367be/embassy-hal-internal/src/fmt.rs:117
119.466217 TRACE   irq_tx: starting 98
└─ embassy_nrf::buffered_uarte::{impl#1}::on_interrupt @ /var/home/sj/.cargo/git/checkouts/embassy-9312dcb0ed774b29/b3367be/embassy-nrf/src/fmt.rs:117
119.466491 TRACE irq: end
└─ embassy_nrf::buffered_uarte::{impl#1}::on_interrupt @ /var/home/sj/.cargo/git/checkouts/embassy-9312dcb0ed774b29/b3367be/embassy-nrf/src/fmt.rs:117
119.466705 INFO  stm32_uart_writer: Successfully wrote 98 bytes

Failed rollover TX: Only sent 42 bytes!

120.460968 INFO  UartTransmitter::next_message: 2
└─ app::drivers::stm32::{impl#2}::next_message::{async_fn#0} @ /var/home/sj/Developer/foo/myproject-nrf52/app/src/drivers/stm32.rs:99
120.462310 INFO  stm32_uart_writer: Sending bytes: 98, [97, 96, 18, 92, 8, 1, 42, 88, 26, 86, 10, 84, 119, 119, 119, 119, 119, 119, 119, 119, 119, 119, 119, 119, 119, 119, 119, 119, 119, 119, 119, 119, 119, 119, 119, 119, 119, 119, 119, 119, 119, 119, 119, 119, 119, 119, 119, 119, 119, 119, 119, 119, 119, 119, 119, 119, 119, 119, 119, 119, 119, 119, 119, 119, 119, 119, 119, 119, 119, 119, 119, 119, 119, 119, 119, 119, 119, 119, 119, 119, 119, 119, 119, 119, 119, 119, 119, 119, 119, 119, 119, 119, 119, 119, 119, 119, 216, 0]
└─ boards::uart::__stm32_uart_writer_task::{async_fn#0} @ src/uart.rs:50
120.465026 TRACE poll_write: 98
└─ embassy_nrf::buffered_uarte::{impl#3}::inner_write::{async_fn#0}::{closure#0} @ /var/home/sj/.cargo/git/checkouts/embassy-9312dcb0ed774b29/b3367be/embassy-nrf/src/fmt.rs:117
120.465301 TRACE   ringbuf: push_buf 982..1024
└─ embassy_hal_internal::atomic_ring_buffer::{impl#1}::push_buf @ /var/home/sj/.cargo/git/checkouts/embassy-9312dcb0ed774b29/b3367be/embassy-hal-internal/src/fmt.rs:117
120.465667 TRACE   ringbuf: push 42
└─ embassy_hal_internal::atomic_ring_buffer::{impl#1}::push_done @ /var/home/sj/.cargo/git/checkouts/embassy-9312dcb0ed774b29/b3367be/embassy-hal-internal/src/fmt.rs:117
120.465942 TRACE poll_write: queued 42
└─ embassy_nrf::buffered_uarte::{impl#3}::inner_write::{async_fn#0}::{closure#0} @ /var/home/sj/.cargo/git/checkouts/embassy-9312dcb0ed774b29/b3367be/embassy-nrf/src/fmt.rs:117
120.466217 TRACE irq: start
└─ embassy_nrf::buffered_uarte::{impl#1}::on_interrupt @ /var/home/sj/.cargo/git/checkouts/embassy-9312dcb0ed774b29/b3367be/embassy-nrf/src/fmt.rs:117
120.466430 TRACE   ringbuf: pop_buf 982..1024
└─ embassy_hal_internal::atomic_ring_buffer::{impl#2}::pop_buf @ /var/home/sj/.cargo/git/checkouts/embassy-9312dcb0ed774b29/b3367be/embassy-hal-internal/src/fmt.rs:117
120.466796 TRACE   irq_tx: starting 42
└─ embassy_nrf::buffered_uarte::{impl#1}::on_interrupt @ /var/home/sj/.cargo/git/checkouts/embassy-9312dcb0ed774b29/b3367be/embassy-nrf/src/fmt.rs:117
120.467071 TRACE irq: end
└─ embassy_nrf::buffered_uarte::{impl#1}::on_interrupt @ /var/home/sj/.cargo/git/checkouts/embassy-9312dcb0ed774b29/b3367be/embassy-nrf/src/fmt.rs:117
120.467254 INFO  stm32_uart_writer: Successfully wrote 42 bytes
sureshjoshi commented 12 months ago

Reproduction here: https://github.com/sureshjoshi/embassy-2181

From the chat room - running in Release mode appears to be more reliable then in debug

sureshjoshi commented 12 months ago

Fixed on the reproduction repo - able to send megabytes of data at 1 byte packets or 128 byte packets, no hangs, no panics.