embassy-rs / embassy

Modern embedded framework, using Rust and async.
https://embassy.dev
Apache License 2.0
5.43k stars 749 forks source link

STM32 BufferedUart doesn't clear Idle interrupt flag (STM32F412RET6) #1556

Closed petegibson closed 1 year ago

petegibson commented 1 year ago

The interrupt handler for BufferedUart doesn't clear the idle interrupt flag when servicing the idle interrupt, which leads to continuously servicing the ISR, blocking the executor.

I added debug statements to the ISR, ran a separate task to tick every 100ms and then transmitted a single character from the host:

0.100433 INFO  tick
0.200500 INFO  tick
0.300567 INFO  tick
0.400634 INFO  tick
0.500701 INFO  tick
0.600769 INFO  tick
0.700836 INFO  tick
0.773651 DEBUG uart interrupt
0.773712 DEBUG sr.rxne() == true
0.773773 TRACE   ringbuf: push_buf 0..4
0.773864 TRACE   ringbuf: push 1
0.773925 DEBUG sr(r).read().txe() == true
0.773986 DEBUG uart interrupt
0.774047 DEBUG sr.idle() == true
0.774108 DEBUG sr(r).read().txe() == true
0.774169 DEBUG uart interrupt
0.774200 DEBUG sr.idle() == true
0.774261 DEBUG sr(r).read().txe() == true
0.774322 DEBUG uart interrupt
0.774383 DEBUG sr.idle() == true
0.774444 DEBUG sr(r).read().txe() == true

It continuously services the ISR and doesn't allow the "tick" task to run, until a second character is transmitted:

0.920776 DEBUG uart interrupt
0.920837 DEBUG sr.idle() == true
0.939147 DEBUG sr(r).read().txe() == true
0.939208 DEBUG uart interrupt
0.939239 DEBUG sr.rxne() == true <--------------- 2nd character received
0.939300 TRACE   ringbuf: push_buf 1..4
0.939392 TRACE   ringbuf: push 1
0.939483 DEBUG sr.idle() == true
0.939544 DEBUG sr(r).read().txe() == true
0.939605 INFO  tick
0.939666 DEBUG read poll_fn
0.939727 TRACE   ringbuf: pop_buf 0..2
0.939819 TRACE   ringbuf: pop 1
0.939910 DEBUG Poll::Ready(Ok(len));
0.939941 INFO  read result: [97]
0.940063 DEBUG read poll_fn
0.940093 TRACE   ringbuf: pop_buf 1..2
0.940216 TRACE   ringbuf: pop 1
0.940277 DEBUG Poll::Ready(Ok(len));
0.940338 INFO  read result: [97]
0.940490 DEBUG Poll::Pending
1.039642 INFO  tick
1.139709 INFO  tick
1.239776 INFO  tick
1.339843 INFO  tick
1.439910 INFO  tick

At which point the flag is cleared by the next DR read (although I'm not quite sure why another idle interrupt isn't generated following this character).

The datasheet says:

Bit 4 IDLE: IDLE line detected This bit is set by hardware when an Idle Line is detected. An interrupt is generated if the IDLEIE=1 in the USART_CR1 register. It is cleared by a software sequence (an read to the USART_SR register followed by a read to the USART_DR register).

So the interrupt flag should be cleared by reading DR when servicing the idle interrupt:

            if sr.idle() {
                state.rx_waker.wake();
                let _ = rdr(r).read_volatile();
            };

However @Dirbaio suggested in chat that this might lead to a race condition if DR is read to clear the idle flag right when a new character arrives (as this would discard the character and also clear the rxne interrupt flag). This would only occur if the ISR is not being serviced fast enough - equivalent to an overflow condition on two subsequent characters - with the difference that it may not trigger the overflow condition, so the data loss would go undetected.

@Dirbaio concluded that because of this race condition in the hardware, we should instead just disable the idle interrupt.

petegibson commented 1 year ago

Is all that's required for this:

Dirbaio commented 1 year ago

Yes, I think that's a good solution for now. Perf might suffer a bit, but at least it'll work, unlike now.

petegibson commented 1 year ago

It looks like there's also an issue with not clearing the error flags if it occurs without a corresponding rxne. This is sending the string "0123456789" from the host with a 4 byte RX buffer on BufferedUart:

1.501068 INFO  tick
1.601135 INFO  tick
1.701202 INFO  tick
1.801269 INFO  tick
1.901336 INFO  tick
2.001403 INFO  tick
2.101470 INFO  tick
2.201538 INFO  tick
2.230285 INFO  read result: [48]
2.230377 INFO  read result: [49]
2.230499 INFO  read result: [50]
2.230590 WARN  Overrun error
2.230682 INFO  read result: [51]
2.230773 WARN  Overrun error
2.230834 INFO  read result: [53]
2.230957 WARN  Overrun error
2.230987 WARN  Overrun error
2.231048 WARN  Overrun error
2.231109 WARN  Overrun error
2.231170 WARN  Overrun error
2.231231 WARN  Overrun error
2.231292 WARN  Overrun error
2.231353 WARN  Overrun error
2.231384 WARN  Overrun error

I had to move the error warning messages out of the if sr.rxne() conditional to see what was happening. It looks like sometimes the overrun error is accompanied by a rxne interrupt, but sometimes you just get the error - which means the error doesn't get cleared and you get the continuous ISR problem (mentioned here http://efton.sk/STM32/gotcha/g23.html).

The current code only clears the error flags if rxne is set AND if the rx buffer is not full. I think the best approach would be to read the status register and data register as close together as possible (any way to do this atomically?) and then process the SR flags and act accordingly.

Dirbaio commented 1 year ago

oof, damn, it's the same issue as with the IDLE flag :( Stupid design all over.

In this case, since there's been an error, maybe it is OK to do the dummy DR read. It might lose more data, but since we had an error the data is already incomplete anyway.

Actually, there is a solution I hadn't thought about before: instead of clearing the flag by a dummy SR read, disable the xxIE bit. This makes the irq stop firing in a loop. Then, after the next DR read, you know all previous flags have been cleared so you can reenable the xxIE bit.

This would work for IDLE. Not sure about the error flags, it seems there might be a race where you lose errors instead :(

petegibson commented 1 year ago

I've been doing some testing with the different approaches. It seems like the performance hit from waking on every character is fairly significant, at least in my application, and is enough to cause overrun errors (admittedly this is still with defmt enabled though).

Even disabling the idle interrupt until the next read appears to cause more overrun errors at 115200 baud (over just a dummy DR read). All I can think is that this is due to a longer ISR? Here's that code:

        // RX
        unsafe {
            let sr = sr(r).read();
            // Reading DR clears the rxne, error and idle interrupt flags on v1.
            let dr = if sr.ore() || sr.rxne() {
                Some(rdr(r).read_volatile())
            }
            else {
                None
            };
            clear_interrupt_flags(r, sr);

            if sr.idle() && !sr.rxne() {
                // disable idle interrupt until next char received
                r.cr1().modify(|w| {
                    w.set_idleie(false);
                });
            }

            if sr.pe() {
                warn!("Parity error");
            }
            if sr.fe() {
                warn!("Framing error");
            }
            if sr.ne() {
                warn!("Noise error");
            }
            if sr.ore() {
                warn!("Overrun error");
            }
            if sr.rxne() {
                let mut rx_writer = state.rx_buf.writer();
                let buf = rx_writer.push_slice();
                if !buf.is_empty() {
                    buf[0] = dr.unwrap();
                    rx_writer.push_done(1);
                } else {
                    // FIXME: Should we disable any further RX interrupts when the buffer becomes full.
                }

                r.cr1().modify(|w| {
                    w.set_idleie(true);
                });

                if state.rx_buf.is_full() {
                    state.rx_waker.wake();
                }
            }

            if sr.idle() {
                state.rx_waker.wake();
            }
        }

According to an ST Employee here https://community.st.com/s/feed/0D50X00009XkW2nSAF, DMA is the recommended way to get around the race conditions. Though whether the idle interrupt can still be used without race conditions is unclear.

For my application (XMODEM-1K transfers over RS485) the dummy DR read to clear OVE and IDLE with idle interrupt enabled was the only method that didn't drop RX characters.

xoviat commented 1 year ago

The real question is why are you not using dma? I honestly think irq driven buffered uart is a wontfix, and I don't think that's necessarily a problem. If you have problems with DMA, we can try to improve that interface.

Dirbaio commented 1 year ago

The irq-driven buffereduart is still nice because it needs less resources. It works fine for TX, and it can be uesful for RX if you dont need high perf, for example if it's for a terminal for a human to type on.

cumthugo commented 1 year ago

I add dummy DR read on the intterrupt, and it works fine.

if sr.idle() { // This read also clears the error and idle interrupt flags on v1. rdr(r).read_volatile(); state.rx_waker.wake(); };

@petegibson I'm not sure this patch can fix your issue or not.