earlephilhower / arduino-pico

Raspberry Pi Pico Arduino core, for all RP2040 and RP2350 boards
GNU Lesser General Public License v2.1
2.03k stars 421 forks source link

enableDoubleResetBootloader interferes with the OTA Reboot sequence #1888

Closed johngavel closed 10 months ago

johngavel commented 10 months ago

I had the rp2040.enableDoubleResetBootloader() method in my code. I was then trying to enable OTA upgrades similar to the OTAFromFile example. However the reboot gets stuck in the USB Boot mode.

My supposition is that the double reset is hit first my the example code, then by the OTA boot loader completing the upgrade. Leading to my upgrade being stuck in the USB Boot mode.

First recommendation: Remove the rp2040.enableDoubleResetBootloader() from my code. --> Problem solved. Second recommendation: Add an rp2040.disableDoubleResetBootloader() to the arduino-pico library. --> I don't know how to do this. Third recommendation: Turn off the double reset bootloader code when OTA. --> This feels like too much work for such a niche problem.

Thanks, John

PS: I'm sorry if this does not meet any guidelines or protocols. I have just signed up for this today. Thanks again!

// This example overwrites itself with a serial blinker sketch using OTA
// In general, you will get a file from the Internet, over a serial port, etc.
// and not include it in a header like we do here for simplicity.
//
// You need to have at least 256K of filesystem configured in
// Tools->Flash Size
//
// Released to the public domain August 2022 by Earle F. Philhower, III

#include <PicoOTA.h>
#include <LittleFS.h>
#include "blink_100_1000.h"

void setup() {
  rp2040.enableDoubleResetBootloader();
  Serial.begin(115200);
  delay(5000);

  Serial.printf("Writing OTA image of blinker...");
  LittleFS.begin();
  File f = LittleFS.open("blink.bin", "w");
  if (sizeof(blink) != f.write(blink, sizeof(blink))) {
    Serial.printf("Unable to write OTA binary.  Is the filesystem size set?\n");
    return;
  }
  f.close();
  Serial.printf("done\n\n");
  Serial.printf("Programming OTA commands...");
  picoOTA.begin();
  picoOTA.addFile("blink.bin");
  picoOTA.commit();
  LittleFS.end();
  Serial.printf("done\n\n");
  Serial.printf("Rebooting in 5 seconds, should begin blinker instead of this app...\n");
  delay(5000);
  rp2040.reboot();
}

void loop() {
}
earlephilhower commented 10 months ago

Interesting problem and thanks for giving an MCVE that's easy to understand.

First recommendation: Remove the rp2040.enableDoubleResetBootloader() from my code. --> Problem solved.

👍

Second recommendation: Add an rp2040.disableDoubleResetBootloader() to the arduino-pico library. --> I don't know how to do this.

That's not actually possible. The call isn't doing anything other than making sure a "fake constructor" routine is called which does the magic to make this happen is linked in. It's executed before any general user code and not actually doing something like installing an exception/IRQ handler. https://github.com/raspberrypi/pico-sdk/blob/master/src/rp2_common/pico_bootsel_via_double_reset/pico_bootsel_via_double_reset.c

Third recommendation: Turn off the double reset bootloader code when OTA. --> This feels like too much work for such a niche problem.

This would be possible very simply, if the address of magic_location was available outside of the SDK. A new OTA command (there's room for lots of them) could just poison the magic_tokens there such that the main app's constructor call to the checker would think it was not a 2nd reset. Unfortunately, that variable is a local static and not available to the app to pass to OTA: https://github.com/raspberrypi/pico-sdk/blob/6a7db34ff63345a7badec79ebea3aaef1712f374/src/rp2_common/pico_bootsel_via_double_reset/pico_bootsel_via_double_reset.c#L44

Given that limitation, I'm not sure there's anything we can do here about it. I'll leave this open to see if anyone else has any ideas...

earlephilhower commented 10 months ago

@johngavel can you explain the exact sequence to reproduce the failure? I just realized that I override the SDK's double-reset code anyway and know exactly where the magic token is stored. But I can't reproduce the failure you described. Running the core in the first post it just updates and starts the blink/BLINK code...

johngavel commented 10 months ago

I have a distributed code base from what I was trying to simplify. I thought I did get that to replicate, but now I am not. It currently does replicate in my larger code base and I will try to simplify it down to just the issue.

My original code base includes several other libraries and accessories. I have the debug port on Serial1, using both cores, a w5500 wired ethernet controller, an SSD1306 OLED display, a seperate EEPROM, DHT11 sensor, and a GPIO Expander.

I put the enable double reset bootloader as the first line the setup code. I have a webserver that receives the file via a HTTP post. I save the file to the little fs with a fixed name. I then call the picoOTA commands and turn off my watchdog resets. This will cause a watchdog reboot in 8.3 seconds.

However after the reboot my system comes up in the USB boot mode and hangs.

Thanks, John

Main code base is located here: https://github.com/johngavel/PicoPowerSwitch/

Startup code was in setup as the first line called:

void setup() {
  rp2040.enableDoubleResetBootloader();
  startupMutex.take();
  Wire.begin();

I was then uploading the file file to my own http server via a post command and saving it to the LittleFS.

        case FILE_CONTENTS:
          if (receiveFile(watchdog, &uploadFile, fileLength)) state = UPLOAD_DONE;
          else state = ERROR_STATE;
          uploadFile.close();
          break;

This processes the incoming file and saves it while also petting the watchdog.

bool PowerServer::receiveFile(Watchdog* watchdog, File* file, unsigned int bytes) {
  Task timeout;
  unsigned int total = 0;

  timeout.setRefresh(1000);
  memset(fileBuffer, 0, BUFFER_SIZE);
  unsigned int receivedBytes = 0;
  while ((total < bytes) && !timeout.run()) {
    receivedBytes = client.read((uint8_t*)fileBuffer, BUFFER_SIZE);
    if ((total + receivedBytes) > bytes)
      receivedBytes = bytes - total;
    receivedBytes = file->write(fileBuffer, receivedBytes);
    total += receivedBytes;
    if (receivedBytes > 0) timeout.reset();
    memset(fileBuffer, 0, BUFFER_SIZE);
    watchdog->petWatchdog();
    delay(10);
  }
  debug("Received File: ");
  debug(String(total));
  debug("/");
  debugln(String(bytes));
  return (total == bytes);
}

Sends a HTML web page to the client who sent me the file. Actually closing the client happens later. Sets the watchdog so that it will not be petted in the future. This will cause a reboot in 8.3 seconds.

      if (state == UPLOAD_DONE) {
        sendProcessHTMLPage(&html, ethernet, "Processing File Upgrade....", 20);
        UPGRADE_SYSTEM();
        watchdog->reboot();

Setup the pico ota segments for the programming.

void UPGRADE_SYSTEM() {
  debugln();
  debugln("Starting Upgrade Process.....");
  picoOTA.begin();
  picoOTA.addFile(UPGRADE_FILE_NAME);
  picoOTA.commit();
  LittleFS.end();
  debugln("Reboot in progress.....");
}
earlephilhower commented 10 months ago

Unfortunately if it can't be repro'd there's not much to be done other than think it through.

However after the reboot my system comes up in the USB boot mode and hangs.

Is it in USB boot or is it hung? And if you powercycle things, does it come up properly (and with the new code)?

Where the call is placed makes 0 difference, it is there only to link in a function that the CRT will call before main (and well before setup) will be called.

All the code is in here: https://github.com/earlephilhower/arduino-pico/blob/3068cd0af004e2575f9d47b0851f3abab64124dc/cores/rp2040/sdkoverride/pico_bootsel_via_double_reset.c#L49-L87

On a reset the Pico runs ROM->Boot2->OTA->App, and only the app would have this code linked in.

    This function is registered to run automatically before main(). The
    algorithm is:

     1. Check for magic token in memory; enter BOOTSEL mode if found.
     2. Initialise that memory with that magic token.
     3. Do nothing for a short while (few hundred ms).
     4. Clear the magic token.
     5. Continue with normal boot.

Of note is that when the main app starts up the magic is already 0. When you call the WDT reset, the magic token is gone, it's cleared out before the main is ever called. Only place it could ever be the magic values is within this function, for 350ms during app startup. So, your HW must be resetting twice while in the app and inside this 350ms delay during startup.