Closed by petegibson 1 year ago
Is all that's required for this:
Yes, I think that's a good solution for now. Perf might suffer a bit, but at least it'll work, unlike now.
It looks like there's also an issue with not clearing the error flags if an error occurs without a corresponding rxne. This is sending the string "0123456789" from the host with a 4-byte RX buffer on BufferedUart:
1.501068 INFO tick
1.601135 INFO tick
1.701202 INFO tick
1.801269 INFO tick
1.901336 INFO tick
2.001403 INFO tick
2.101470 INFO tick
2.201538 INFO tick
2.230285 INFO read result: [48]
2.230377 INFO read result: [49]
2.230499 INFO read result: [50]
2.230590 WARN Overrun error
2.230682 INFO read result: [51]
2.230773 WARN Overrun error
2.230834 INFO read result: [53]
2.230957 WARN Overrun error
2.230987 WARN Overrun error
2.231048 WARN Overrun error
2.231109 WARN Overrun error
2.231170 WARN Overrun error
2.231231 WARN Overrun error
2.231292 WARN Overrun error
2.231353 WARN Overrun error
2.231384 WARN Overrun error
I had to move the error warning messages out of the if sr.rxne() conditional to see what was happening. It looks like the overrun error is sometimes accompanied by an rxne interrupt, but sometimes you just get the error, which means the error doesn't get cleared and you get the continuous ISR problem (described at http://efton.sk/STM32/gotcha/g23.html).
The current code only clears the error flags if rxne is set AND if the rx buffer is not full. I think the best approach would be to read the status register and data register as close together as possible (any way to do this atomically?) and then process the SR flags and act accordingly.
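A minimal sketch of that "snapshot SR, read DR straight away, then act on the snapshot" idea, run against a software model rather than real registers. `MockUsart`, `Status`, `Event` and `on_interrupt` are hypothetical names made up for this illustration and are not the embassy-stm32 API. The model assumes v1 behaviour: a DR read always clears rxne, and clears ore only when it follows an SR read.

```rust
// Hypothetical software model of a v1-style USART, for illustration only.
#[derive(Clone, Copy, Default)]
struct Status {
    rxne: bool,
    ore: bool,
}

#[derive(Default)]
struct MockUsart {
    sr: Status,
    dr: u8,
    sr_was_read: bool,
}

impl MockUsart {
    fn read_sr(&mut self) -> Status {
        self.sr_was_read = true;
        self.sr
    }

    fn read_dr(&mut self) -> u8 {
        // A DR read always clears rxne; ore is only cleared by SR-then-DR.
        self.sr.rxne = false;
        if self.sr_was_read {
            self.sr.ore = false;
        }
        self.sr_was_read = false;
        self.dr
    }
}

#[derive(Debug, PartialEq)]
enum Event {
    Byte(u8),
    Overrun,
}

fn on_interrupt(uart: &mut MockUsart, events: &mut Vec<Event>) {
    // Take the SR snapshot and read DR back to back, before any other work.
    let sr = uart.read_sr();
    let dr = if sr.rxne || sr.ore { Some(uart.read_dr()) } else { None };

    // Everything below works from the snapshot, never from a fresh SR read.
    if sr.ore {
        events.push(Event::Overrun);
    }
    if sr.rxne {
        if let Some(byte) = dr {
            events.push(Event::Byte(byte));
        }
    }
}

fn main() {
    // Overrun without rxne: the case that currently leaves the flag set.
    let mut uart = MockUsart::default();
    uart.sr.ore = true;
    let mut events = Vec::new();
    on_interrupt(&mut uart, &mut events);
    assert_eq!(events, vec![Event::Overrun]);
    assert!(!uart.sr.ore); // flag cleared, so the ISR would not re-fire
}
```

This does not make the two reads atomic; it only narrows the window between them and guarantees that an error flag seen in the snapshot is always followed by the DR read that clears it.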
oof, damn, it's the same issue as with the IDLE flag :( Stupid design all over.
In this case, since there's been an error, maybe it is OK to do the dummy DR read. It might lose more data, but since we had an error the data is already incomplete anyway.
Actually, there is a solution I hadn't thought about before: instead of clearing the flag with a dummy DR read, disable the xxIE bit. This stops the irq from firing in a loop. Then, after the next DR read, you know all previous flags have been cleared, so you can re-enable the xxIE bit.
This would work for IDLE. Not sure about the error flags, it seems there might be a race where you lose errors instead :(
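To make the IDLE case concrete, here is a rough simulation of the "mask IDLEIE until the next DR read" approach. It is a sketch against a hypothetical `MockUsart` model (not the embassy-stm32 API), and it assumes the RXNE interrupt is always enabled and that IDLE is cleared by an SR read followed by a DR read.

```rust
// Hypothetical model: the interrupt line is high while rxne is set, or while
// idle is set and its enable bit (idleie) is on.
#[derive(Default)]
struct MockUsart {
    rxne: bool,
    idle: bool,
    idleie: bool,
    dr: u8,
    sr_was_read: bool,
}

impl MockUsart {
    fn irq_pending(&self) -> bool {
        self.rxne || (self.idle && self.idleie)
    }

    fn read_sr(&mut self) -> (bool, bool) {
        self.sr_was_read = true;
        (self.rxne, self.idle)
    }

    fn read_dr(&mut self) -> u8 {
        self.rxne = false;
        if self.sr_was_read {
            self.idle = false;
        }
        self.sr_was_read = false;
        self.dr
    }

    fn receive(&mut self, byte: u8) {
        self.dr = byte;
        self.rxne = true;
    }
}

fn isr(uart: &mut MockUsart, rx: &mut Vec<u8>) {
    let (rxne, idle) = uart.read_sr();
    if rxne {
        // A real DR read: this also clears idle, so unmasking is safe again.
        rx.push(uart.read_dr());
        uart.idleie = true;
    } else if idle {
        // No dummy DR read: mask the interrupt instead of clearing the flag.
        uart.idleie = false;
    }
}

/// Runs the ISR while the interrupt line is high; returns how often it ran.
fn run_until_quiet(uart: &mut MockUsart, rx: &mut Vec<u8>, limit: usize) -> usize {
    let mut runs = 0;
    while uart.irq_pending() && runs < limit {
        isr(uart, rx);
        runs += 1;
    }
    runs
}

fn main() {
    let mut uart = MockUsart { idle: true, idleie: true, ..Default::default() };
    let mut rx = Vec::new();
    // The idle flag stays set, but the ISR runs once instead of looping.
    assert_eq!(run_until_quiet(&mut uart, &mut rx, 1000), 1);
    uart.receive(b'A');
    assert_eq!(run_until_quiet(&mut uart, &mut rx, 1000), 1);
    assert!(uart.idleie && !uart.idle);
}
```

The model only covers IDLE. It says nothing about the possible race on the error flags raised above.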
I've been doing some testing with the different approaches. It seems like the performance hit from waking on every character is fairly significant, at least in my application, and is enough to cause overrun errors (admittedly this is still with defmt enabled though).
Even disabling the idle interrupt until the next read appears to cause more overrun errors at 115200 baud (over just a dummy DR read). All I can think is that this is due to a longer ISR? Here's that code:
// RX
unsafe {
    let sr = sr(r).read();
    // Reading DR clears the rxne, error and idle interrupt flags on v1.
    let dr = if sr.ore() || sr.rxne() {
        Some(rdr(r).read_volatile())
    } else {
        None
    };
    clear_interrupt_flags(r, sr);

    if sr.idle() && !sr.rxne() {
        // disable idle interrupt until next char received
        r.cr1().modify(|w| {
            w.set_idleie(false);
        });
    }

    if sr.pe() {
        warn!("Parity error");
    }
    if sr.fe() {
        warn!("Framing error");
    }
    if sr.ne() {
        warn!("Noise error");
    }
    if sr.ore() {
        warn!("Overrun error");
    }

    if sr.rxne() {
        let mut rx_writer = state.rx_buf.writer();
        let buf = rx_writer.push_slice();
        if !buf.is_empty() {
            buf[0] = dr.unwrap();
            rx_writer.push_done(1);
        } else {
            // FIXME: Should we disable any further RX interrupts when the buffer becomes full.
        }

        r.cr1().modify(|w| {
            w.set_idleie(true);
        });

        if state.rx_buf.is_full() {
            state.rx_waker.wake();
        }
    }

    if sr.idle() {
        state.rx_waker.wake();
    }
}
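For a sense of the time budget at 115200 baud, the calculation below gives the time one character occupies on the wire. It assumes 8N1 framing (1 start bit, 8 data bits, 1 stop bit), which the thread does not state; `frame_time_us` is a helper name made up for this example.

```rust
/// Time on the wire for one UART frame, in microseconds.
fn frame_time_us(baud: u32, bits_per_frame: u32) -> f64 {
    bits_per_frame as f64 * 1_000_000.0 / baud as f64
}

fn main() {
    // Assumption: 8N1 framing, i.e. 1 start + 8 data + 1 stop = 10 bits.
    let t = frame_time_us(115_200, 10);
    println!("{:.1} us per character", t); // prints "86.8 us per character"
}
```

With back-to-back characters and a single-byte data register, the ISR (including any logging) has to finish within roughly that window or the next character overruns. The "read result" lines in the log above are spaced roughly 90 to 120 µs apart, which is in the same range.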
According to an ST employee (https://community.st.com/s/feed/0D50X00009XkW2nSAF), DMA is the recommended way to get around the race conditions, though it is unclear whether the idle interrupt can still be used without race conditions.
For my application (XMODEM-1K transfers over RS485), the dummy DR read to clear ORE and IDLE, with the idle interrupt enabled, was the only method that didn't drop RX characters.
The real question is why are you not using dma? I honestly think irq driven buffered uart is a wontfix, and I don't think that's necessarily a problem. If you have problems with DMA, we can try to improve that interface.
The irq-driven BufferedUart is still nice because it needs fewer resources. It works fine for TX, and it can be useful for RX if you don't need high perf, for example if it's for a terminal for a human to type on.
I added a dummy DR read in the interrupt handler, and it works fine.
if sr.idle() {
    // This read also clears the error and idle interrupt flags on v1.
    rdr(r).read_volatile();
    state.rx_waker.wake();
};
@petegibson I'm not sure whether this patch fixes your issue.
The interrupt handler for BufferedUart doesn't clear the idle interrupt flag when servicing the idle interrupt, which leads to continuously servicing the ISR, blocking the executor.
I added debug statements to the ISR, ran a separate task to tick every 100ms and then transmitted a single character from the host:
It continuously services the ISR and doesn't allow the "tick" task to run, until a second character is transmitted:
At which point the flag is cleared by the next DR read (although I'm not quite sure why another idle interrupt isn't generated following this character).
The datasheet says:
So the interrupt flag should be cleared by reading DR when servicing the idle interrupt:
However, @Dirbaio suggested in chat that this might lead to a race condition if DR is read to clear the idle flag right when a new character arrives (as this would discard the character and also clear the rxne interrupt flag). This would only occur if the ISR is not being serviced fast enough - equivalent to an overrun condition on two subsequent characters - with the difference that it may not trigger the overrun error, so the data loss would go undetected.
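The race can be reproduced in a small software model. The sketch below uses a hypothetical `MockUsart` (not the embassy-stm32 API) and injects a character between the SR read and the dummy DR read. It assumes a DR read always clears rxne, and that idle and ore are cleared by an SR read followed by a DR read.

```rust
// Hypothetical model of the receiver, for illustration only.
#[derive(Default)]
struct MockUsart {
    rxne: bool,
    idle: bool,
    ore: bool,
    dr: u8,
    sr_was_read: bool,
}

impl MockUsart {
    fn read_sr(&mut self) -> (bool, bool) {
        self.sr_was_read = true;
        (self.rxne, self.idle)
    }

    fn read_dr(&mut self) -> u8 {
        self.rxne = false;
        if self.sr_was_read {
            self.idle = false;
            self.ore = false;
        }
        self.sr_was_read = false;
        self.dr
    }

    fn receive(&mut self, byte: u8) {
        if self.rxne {
            // The previous byte was never read: this is a real overrun.
            self.ore = true;
        } else {
            self.dr = byte;
            self.rxne = true;
        }
    }
}

/// ISR that clears idle with a dummy DR read. `arriving` simulates a
/// character landing in the window between the SR read and the DR read.
fn isr_with_dummy_read(uart: &mut MockUsart, rx: &mut Vec<u8>, arriving: Option<u8>) {
    let (rxne, idle) = uart.read_sr();
    if let Some(byte) = arriving {
        uart.receive(byte);
    }
    if rxne {
        rx.push(uart.read_dr());
    } else if idle {
        // Dummy read, value discarded. If a byte just arrived, it is the
        // value being thrown away here.
        let _ = uart.read_dr();
    }
}

fn main() {
    let mut uart = MockUsart { idle: true, ..Default::default() };
    let mut rx = Vec::new();
    isr_with_dummy_read(&mut uart, &mut rx, Some(b'X'));
    assert!(rx.is_empty()); // 'X' was never delivered
    assert!(!uart.rxne); // and no interrupt remains pending for it
    assert!(!uart.ore); // and no overrun was flagged
}
```

In the model the character disappears with every flag clear afterwards, which is the undetected loss described above.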
@Dirbaio concluded that because of this race condition in the hardware, we should instead just disable the idle interrupt.