raspberrypi / pico-sdk

BSD 3-Clause "New" or "Revised" License
3.7k stars 918 forks source link

Sleep on one core negatively impacts execution on the other core #873

Open JonnyHaystack opened 2 years ago

JonnyHaystack commented 2 years ago

Hi, I originally opened an issue for this in arduino-pico (see https://github.com/earlephilhower/arduino-pico/issues/637) before discovering the issue happens even using the Pico SDK directly. There are more details in that issue, but basically calling sleep_us() on one core while the other core is working seems to have a huge impact on the other core that interrupts its execution part way and is causing communication failures for me even though I'm using the PIO for my comms and it's 100% stable when I'm not using multicore.

Code to reproduce:

#include <pico/multicore.h>
#include <pico/stdlib.h>

bool gp1;

void blah();

int main(void) {
    gpio_init(1);
    gpio_init(2);
    gpio_set_dir(1, GPIO_OUT);
    gpio_set_dir(2, GPIO_OUT);

    multicore_reset_core1();
    multicore_launch_core1(blah);

    while (true) {
        gp1 = !gp1;
        gpio_put(1, gp1);
    }
}

void blah() {
    while (1) {
        gpio_put(2, 1);
        sleep_us(50);
        gpio_put(2, 0);
    }
}

Every time core 1 is about to come out of a sleep, core 0 appears to "stall" in some way. It's not exactly a stall, because it doesn't just continue where it left off. Instead, it actually never does things that it should have, or gets interrupted halfway through doing something and never finishes it.

Logic analyzer trace of the above code:

image

I just tested this with core 0 and core 1's roles swapped and the result is not exactly the same but there is clearly still some impact:

#include <pico/multicore.h>
#include <pico/stdlib.h>

bool gp1;

void blah();

int main(void) {
    gpio_init(1);
    gpio_init(2);
    gpio_set_dir(1, GPIO_OUT);
    gpio_set_dir(2, GPIO_OUT);

    multicore_reset_core1();
    multicore_launch_core1(blah);

    while (true) {
        gpio_put(2, 1);
        sleep_us(50);
        gpio_put(2, 0);
    }
}

void blah() {
    while (1) {
        gp1 = !gp1;
        gpio_put(1, gp1);
    }
}

image

kilograham commented 2 years ago

sleep_us will use an alarm in the default alarm pool (which is on core 0) to wake itself up after a period of time (it can stay ni a lower power state). 4us which you see above is presumably the alarm IRQ handler.

You can use busy_wait_us instead...

JonnyHaystack commented 2 years ago

I'm still somewhat confused how this is causing more issues than the USB timer alarm in arduino-pico. Like, I'll be in an pio_sm_put_blocking() on core 0 and somehow that can get interrupted halfway and never finishes. Even with interrupts it should still continue what it was doing before, right? If it did, it probably wouldn't be an issue, but in reality it's completely messing up what core 0 is doing.

And that workaround is a bit of a problem when I'm using libraries which are using sleep and I need to use them with multicore :-/

kilograham commented 2 years ago

The delay you see in your traces is expected. The fact that it is causing your real use case problems is surprising, unless your use case cannot handle the brief delay.

Indeed, the IRQ should have no effect on the interrupted code (other than the delay)... are you sure something else isn't happening (e.g. stack overflow - possibly during the exception handler)

JonnyHaystack commented 2 years ago

Ah it looks like it's not actually being interrupted while sending the response, but it's actually sending a shorter response than expected because it thinks it received a different command. The way core 1 is affecting core 0 somehow causes it to misread the received command byte. So, sometimes it misses responding to a poll command completely (probably because it looked like an invalid command), and sometimes it thinks it received a probe command and sends the wrong response (3 bytes instead of 8). But this doesn't really make sense to me either, because the actual reading of bits is handled by the PIO so it should not be affected unless the system clock speed is varying. If it was missing whole bytes that would be more explainable, but that doesn't seem to be the case here.

Example of correct behaviour (received poll command and responded correctly with 8 bytes of data): image

Example of received poll command but no response (only happening when using sleep on core 1): image

Example of received poll but responded with only 3 bytes (also only happens when using sleep on core 1): image

I've been able to do plenty of other things on core 1 with the same code running on core 0 and it hasn't been a problem, but anything with a sleep completely breaks it.

lurch commented 2 years ago

@JonnyHaystack Looking at https://github.com/raspberrypi/pico-sdk/blob/master/src/common/pico_time/time.c#L375 do you want to try setting PICO_TIME_DEFAULT_ALARM_POOL_DISABLED to 1, and seeing if that makes your code behave differently?

JonnyHaystack commented 2 years ago

@lurch unfortunately that didn't work at all. Seems to result in the signal from core 0 staying high forever and the signal from core 1 stays low forever. Also for my actual use case I'm using arduino-pico, which won't even compile if I set that flag (error below).

.pio\libdeps\pico\Adafruit TinyUSB Library\src\arduino\ports\rp2040\Adafruit_TinyUSB_rp2040.cpp: In function 'void setup_periodic_usb_hanlder(uint64_t)':
.pio\libdeps\pico\Adafruit TinyUSB Library\src\arduino\ports\rp2040\Adafruit_TinyUSB_rp2040.cpp:84:3: error: 'add_alarm_in_us' was not declared in this scope
   84 |   add_alarm_in_us(us, timer_task, NULL, true);
      |   ^~~~~~~~~~~~~~~