earlephilhower / arduino-pico

Raspberry Pi Pico Arduino core, for all RP2040 and RP2350 boards
GNU Lesser General Public License v2.1

Watchdog restarting prematurely on PICO W #1229

Closed manlyfoodcoop closed 1 year ago

manlyfoodcoop commented 1 year ago

Thanks so much for this PICO implementation - this is my first time programming against a microprocessor and you have made it so easy! So thanks heaps!

I want to take advantage of the watchdog, so I added a call to rp2040.wdt_begin(60000); in setup(), but the PICO W reboots after a few seconds. I have tried placing the call in the main loop and have tried larger values, but it still reboots shortly after starting.

Any help appreciated!

maxgerhardt commented 1 year ago

So the function is

https://github.com/earlephilhower/arduino-pico/blob/b7912f182bcb01672b8cc856bcaaf42476e981e4/cores/rp2040/RP2040Support.h#L324-L326

and you call rp2040.wdt_begin(60000);.

How can a value of 60000 ms be allowed? Documentation says

delay_ms | Number of milliseconds before watchdog will reboot without watchdog_update being called. Maximum of 0x7fffff, which is approximately 8.3 seconds

maxgerhardt commented 1 year ago

Okay this is weird. If the argument truly is in milliseconds, then "Maximum of 0x7fffff" is 8388607 milliseconds, which is 8388 seconds, which is 139 minutes. Something is not right here. Or something is measured in clock cycles instead of milliseconds.
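A quick sanity check on the units, as plain arithmetic. The function names are made up for illustration. Reading 0x7fffff as microseconds gives the documented ~8.3 seconds; reading it as milliseconds gives the ~139 minutes worked out above.

```cpp
#include <cstdint>

// Maximum quoted from the SDK documentation: 0x7fffff = 8,388,607.
constexpr uint32_t kDocMax = 0x7fffff;

// Duration in seconds if the raw value counts milliseconds.
constexpr double secondsIfMilliseconds(uint32_t raw) { return raw / 1000.0; }

// Duration in seconds if the raw value counts microseconds.
constexpr double secondsIfMicroseconds(uint32_t raw) { return raw / 1000000.0; }

// Convenience conversion for the "139 minutes" figure.
constexpr double minutesFromSeconds(double seconds) { return seconds / 60.0; }
```

So the "8.3 seconds" in the docs only works out if the limit is really a microsecond count.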

earlephilhower commented 1 year ago

This is kind of a mixup w/the SDK docs and the HW. Looking at the SDK code, it takes delay_ms and multiplies by 1000 before setting the WDT counter register.

The parameter should be an int16_t because 7fff really is the max milliseconds (the HW seems to have a bug and counts by 2 every tick).

https://github.com/raspberrypi/pico-sdk/blob/f396d05f8252d4670d4ea05c8b7ac938ef0cd381/src/rp2_common/hardware_watchdog/watchdog.c#L55-L63

I don't know what clock the WDT is running off of, but it seems like 1 MHz is implied according to the SDK docs. So no idea where 8.3 secs comes from (but it is, again, in the docs where it implies the WDT runs at a 1 MHz clock?!)

Assuming the WDT clock is the core clock, then 8.3 s @ 125 MHz (ROM clock speed) = 0x3dd6fe60, and multiplying by 2 you're just under 0x7fff_ffff. So maybe the WDT is set to run off of core clocks...

So, @manlyfoodcoop , since you seem to have a good use for the WDT, would you be able to do some experiments and report back? Set WDT to 5000(i.e. nominally 5 seconds) and report back on the actual time it takes to reboot? Empirically we can use that to adjust the docs/code/etc.
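For anyone following along, here is a host-side sketch of what the linked SDK code appears to do, going by the description above (multiply by 1000, counter counts by 2 per tick). The 24-bit counter limit of 0xffffff and the clamp are assumptions on my part, inferred from the documented 0x7fffff maximum. The function names are hypothetical, and the model uses 64-bit math internally, so it does not reproduce any 32-bit overflow for very large inputs.

```cpp
#include <cstdint>

// Assumed width of the hardware counter (24 bits). Not confirmed in this thread.
constexpr uint32_t kCounterMax = 0xffffff;

// Model of the load value: ms -> microsecond ticks, doubled because the
// counter is said to decrement twice per tick, then clamped to the counter width.
constexpr uint32_t wdtLoadValue(uint32_t delayMs) {
    const uint64_t ticks = static_cast<uint64_t>(delayMs) * 1000u * 2u;
    return ticks > kCounterMax ? kCounterMax : static_cast<uint32_t>(ticks);
}

// Timeout the hardware would actually give, in whole milliseconds.
constexpr uint32_t effectiveTimeoutMs(uint32_t delayMs) {
    return wdtLoadValue(delayMs) / 2u / 1000u;
}
```

Under these assumptions, any request above roughly 8388 ms collapses to the same timeout, which would explain why wdt_begin(60000) reboots after about 8 seconds.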

manlyfoodcoop commented 1 year ago

Thanks for your quick response. I tried the code below, but increasing the value from 10,000 to 1,000,000 gives the same result (a reboot after roughly 8 seconds)....

void setup() {
  u_int32_t x = 1000000; // values of 10,000 and 1,000,000 both result in the loop finishing at "33" (approx 8 seconds)

  delay(5000); // allow me time to connect serial monitor
  printf("setting WDT with %i\r\n", x);

  rp2040.wdt_begin(x);
}

void loop() {
  int loop=0;

  while(1) {
    printf("loop %d\r\n", loop++ );
    delay(250);
  }
}


earlephilhower commented 1 year ago

@manlyfoodcoop thanks for the report.

I did my own testing and the call is correct AFAICT. The upper limit, however, is capped at ~8.3 seconds by the HW. So if I run

void setup() {
delay(5000); // allow me time to connect serial monitor
}

void loop() {
  int loop=0;
  uint32_t x = 8000; // watchdog timeout in ms; stays under the ~8.3 second hardware cap
  rp2040.wdt_begin(x);
  while(1) {
    Serial.printf("%d\r\n", loop++ );
    delay(100);
  }
}

I get up to 79 (i.e. 8s) before reboot. And if I go to x = 1000 I get to 9 (i.e. 1s).

So, the core call and SDK are correct as I see it. Pass in 1000 and you get 1s between WDT resets. It's just a matter of the limits to be documented properly as far as I can see.
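The counts reported in this thread line up with simple arithmetic. As an idealized sketch (ignoring print time and USB latency, and assuming value k is printed k periods after wdt_begin), the last value seen before the reset can be estimated like this. The function name is made up, and the 8388 ms figure for the earlier 250 ms test assumes the hardware cap is 0x7fffff microseconds.

```cpp
#include <cstdint>

// Highest counter value printed strictly before the watchdog fires.
// Value k is printed at time k * periodMs; the reset lands at timeoutMs.
// Both arguments must be greater than zero.
constexpr uint32_t lastPrintedBeforeReset(uint32_t timeoutMs, uint32_t periodMs) {
    return (timeoutMs - 1u) / periodMs;
}
```

With a 100 ms delay this gives 79 for an 8000 ms timeout and 9 for 1000 ms, and with a 250 ms delay and an assumed ~8388 ms cap it gives 33, which matches the numbers reported above.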

I've updated the docs and I think we're good here. Thanks for helping clarify this!

manlyfoodcoop commented 1 year ago

Thanks - I will look at using the second core to create a watchdog process that can wait longer (I want around 5 minutes before deciding things have gone wrong).


manlyfoodcoop commented 1 year ago

For anyone who ends up here and needs a watchdog that operates above 8 seconds, I used the second core like this:

void setup1()  // using 1 after setup and loop indicates that this runs on second core. How easy is that!!!
{
    delay(5000);  // give other core time to startup
    rp2040.wdt_begin(8300);  // only get 8 seconds from this onboard WDT
}

void loop1() {  // loop for second core
    uint32_t x;
    int WatchdogLoop = 0;

    while (1) {
        WatchdogLoop++;
        while (rp2040.fifo.available()) {  // see if any food in the fifo queue
            rp2040.fifo.pop_nb(&x);
            WatchdogLoop = 0;  // something there so core0 is ok
        }

        if (WatchdogLoop > 250)  // four minute warning
            printf("2023 - watch dog warning count: %d\r\n", WatchdogLoop);

        if (WatchdogLoop < 300)  // while under 5 minutes, keep reporting to hardware watchdog that we are OK
            rp2040.wdt_reset();

        delay(1000);  // wait one second
    }
}

Then on your other core, report things are OK by calling this function:

void FeedTheDog() {  // report to watchdog that we are working OK
    rp2040.fifo.push_nb(0);  // push to other core via fifo queue
}
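If you want to check the counting logic without a board, the decision made in loop1() can be pulled out into a small class and run on a PC. This is only a sketch: SoftWatchdog and its members are hypothetical names, not part of the arduino-pico API, and the counter here saturates at the limit instead of growing without bound.

```cpp
#include <cstdint>

// Hypothetical helper mirroring the counter logic in loop1() above.
class SoftWatchdog {
public:
    // limitTicks: number of consecutive ticks without a heartbeat
    // after which the hardware watchdog should no longer be fed.
    explicit SoftWatchdog(uint32_t limitTicks) : limit_(limitTicks) {}

    // Call once per tick (one second in the code above).
    // heartbeatSeen: true if the other core pushed to the FIFO since the last tick.
    // Returns true while the hardware watchdog should still be fed.
    bool tick(bool heartbeatSeen) {
        if (heartbeatSeen) {
            missed_ = 0;          // other core is alive, start counting again
        } else if (missed_ < limit_) {
            ++missed_;            // count a missed tick, saturating at the limit
        }
        return missed_ < limit_;
    }

    // Consecutive ticks without a heartbeat so far.
    uint32_t missed() const { return missed_; }

private:
    uint32_t limit_;
    uint32_t missed_ = 0;
};
```

On the board, loop1() would construct SoftWatchdog with 300, pass in whether the FIFO had anything in it, and call rp2040.wdt_reset() only when tick() returns true.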