earlephilhower / arduino-pico

Raspberry Pi Pico Arduino core, for all RP2040 and RP2350 boards
GNU Lesser General Public License v2.1
2.05k stars 427 forks source link

UART-communication timing out since 1.9.9 -> 1.9.10 #468

Closed Ing-Dom closed 2 years ago

Ing-Dom commented 2 years ago

I just realized that there is some problem with the UART in 1.10.0 and could track it down to 1.9.10 - there it fails, where 1.9.9 works good.

I analyzed SerialUART.cpp and tried to understand it - I have one question regarding this comment https://github.com/earlephilhower/arduino-pico/blob/master/cores/rp2040/SerialUART.cpp#L246

with FIFO you mean the Hardware-FIFO ? does that mean when a byte arrives while the hardware-FIFO is has 7 or less bytes (32byte / 4 = 8byte) in it the byte is NOT transfered from HW-FIFO to the _queue and can therefor not be read?

that would explain my pains - maybe.

the byte-Timeout of the knx stack is 10ms. If the stack is in the middle of receiving a telegram, and no byte is received for 10ms, it is dropped.

But, at a baudrate of 19200 8E1 32 bit time should be about 1,7ms. So IF there is something in the HW-FIFO, it the ISR should be called at least after 1.7ms..

If I increase the byte timeout (just for testing, has side effects) to 15ms, it works.

earlephilhower commented 2 years ago

What I meant by the comment: // IRQ handler, called when FIFO > 1/4 full or when it had held unread data for >32 bit times was that the UART will trigger an interrupt if the FIFO is > 1/4 full (i.e. 2bytes like you said) or it has been 32 UART bit-times with <= 1/4 full w/o any additional bits coming in.

If you are timing out in less than the time it takes to send ~ 4 bytes (it's actually less than that because it takes ~10 bit times to send 1 byte w/start and stop bits) then yes, your app could misbehave. At 115.2K that's 32 / 115200 == < 1ms. At 19200 that's32 / 19200` == ~2ms, as you said.

There's a 1-line change to adjust that value to be "any data at all causes IRQ" by setting RXIFLSEL to 0 in init. That removes any delay but will cause 3x the interrupts in many cases. For 115.2K, no problem. Running at 1M it might be a bit ugly.

Let me review the code today and post a PR. There will be 1 file you should be able to just place over your installed copy to test things out. Since you have a failing case, your testing later would be much appreciated!

Ing-Dom commented 2 years ago

There's a 1-line change to adjust that value to be "any data at all causes IRQ" by setting RXIFLSEL to 0 in init. That removes any delay

are you sure? from the datasheet

Receive interrupt FIFO level select. The trigger points for the receive interrupt are as follows: b000 = Receive FIFO becomes >= 1 / 8 full b001 = Receive FIFO becomes >= 1 / 4 full b010 = Receive FIFO becomes >= 1 / 2 full b011 = Receive FIFO becomes >= 3 / 4 full b100 = Receive FIFO becomes >= 7 / 8 full b101-b111 = reserved

=> 1/8 full - so the 4th byte will trigger

Let me review the code today and post a PR.

should be that line you mean right? uart_get_hw(_uart)->ifls &= ~UART_UARTIFLS_RXIFLSEL_BITS;

btw, I just saw you removed that delay from read - I was pretty shocked when I saw that delay(1)...

earlephilhower commented 2 years ago

Good catch. I was confusing the FIFO sizes (the PIO SMs each have 8 total entries, but the UART has 32 bytes). Right now it's using the POR defaults of 0x2 in that field so 1/2 full or 16 bytes which does seem too much!

In any case, I'll look at it more thoroughly and post a PR. There is also some use of the divider that worries me a bit, too, as well as a very slim chance of memory reordering causing invalid data to be read. The PR will have all changes in SerialUART.cpp which you can then overwrite in your local install and try out the changes.

Ing-Dom commented 2 years ago

I just added that uart_get_hw(_uart)->ifls &= ~UART_UARTIFLS_RXIFLSEL_BITS; in begin and now it works. I modified my 1.9.10 installation, but that shouldn't matter.

What do you think about bypassing the queue and disabling isr when setFIFOSize(0) ?

earlephilhower commented 2 years ago

Good debug. Having another set of eyes on the code is always helpful!

IMHO:

I spotted a few spots which would be issues (but which I really doubt are) that I should also fix. Stay Tuned!

Ing-Dom commented 2 years ago

I just woke up with an idea in my mind... what about a "hybrid" solution? Use the IRQ to prevent the hw-fifo to overflow, but give the user the possibility (implicitly) to poll very fast.

int SerialUART::read() {
    CoreMutex m(&_mutex);
    if (!_running || !m) {
        return -1;
    }
    if (_writer != _reader) {
        auto ret = _queue[_reader];
        _reader = (_reader + 1) % _fifoSize;
        return ret;
    }
    else {
        // if the RAM-FIFO is empty, there may be some bytes in the HW-FIFO that not reached the trigger point yet.
        // if that is the case, read is likely called high frequency and the user can afford it. If it's empty, it's cheap anyway.
        if(uart_is_readable(_uart)) {
            noInterrupts();             // maybe there is a better solution to prevent _handleIRQ called at THIS moment any maybe scramble bytes around.
            //irq_set_enabled(_uart == uart0 ? UART0_IRQ : UART1_IRQ, false);   // this could be one
            char c = uart_getc(_uart);
            //irq_set_enabled(_uart == uart0 ? UART0_IRQ : UART1_IRQ, true);
            interrupts();
            // think about what is happening when _handleIRQ is triggered at this point - can the bytes somehow overtake "c" with serialevent ?
            return c;
    }
    return -1;
}
earlephilhower commented 2 years ago

orig

That's funny, I fell asleep with the same thought, mostly. On this multicore chip noInterrupts can stop this core's interrupts but not the remote one, so no dice there.

I am pretty sure I can, however, cause a serial IRQ via the IRQ controller if the FIFO's not empty and that will make whatever core registered the IRQ run it. Busy-wait while the irq possibly runs on another core and updates the FIFO as normal. Then re-run and you're guaranteed to get something. I'll give that a whirl.

updated

Well, no luck with the setting IRQ idea.

  1. The UART's IRQ set register is not writable, so I can't just write a 1 to the proper bit and cause the IRQ to get taken by the right core.
  2. The irq_set_pending call also won't work as it can only affect the current core's NVIC (IRQ director).

Protecting the FIFO with a CoreMutex should work, though. No longer lockless, which is less than optimal.

earlephilhower commented 2 years ago

@SirSydom can you give #471 a try? It's basically just moving the FIFO limit down to a more reasonable number. We can see about heroics for to skip the 32b timeout in another PR because it looks seriously involved to do it right.

Ing-Dom commented 2 years ago

471 is working. thumbsup

mutex in the isr, gnah, I don't know....

If there is no good hybrid way, then I would go for disabling interrupt-driven reading with setFIFOSize(0).

earlephilhower commented 2 years ago

I'll open a new bug about the non-FIFO mode. Small PRs make it easier to bisect when I break things later on. :)