earlephilhower / arduino-pico

Raspberry Pi Pico Arduino core, for all RP2040 boards
GNU Lesser General Public License v2.1

reboot halts but often USB serial won't reconnect #2223

Closed Gavin-Perry closed 3 weeks ago

Gavin-Perry commented 1 month ago

First off Earle, I'm so impressed with the tremendous amount of great work you (and your team?) have done supporting the PICO in Arduino (and the SDK docs). I couldn't do all my projects with my new (3 years) favorite MPU without it. Today's problem is that I want to stop the process and reboot in case of some failure (maybe in hardware, not code!). I've tried various versions of the code below, which runs when a Serial message of 'x' is received. Rarely it works and I can continue; mostly it hangs waiting for a hard reboot (ground the RUN button). This is running on core 1 (not core 0).

        case 'x': // STOP button pressed, stop the trial (need to close valves etc.)
          // pause processor 0 and shut all GPIO off from here, then reboot
          noInterrupts();         // stop interrupts first
          rp2040.idleOtherCore(); // Stop task
          SetALLlow();            // Now?
          // !!! First tone starts and never ends so don't use DEBUG
        #ifdef DEBUG
          tone(BUZZER_PIN, Note, 100);  // generates a 100ms beep note
          delay(200);                   // tone is non-blocking, so wait
          tone(BUZZER_PIN, Buzz, 100);  // generates a 100ms beep note D5
          delay(110);                   // tone is non-blocking, so wait
        #endif
          // Need this or is reboot enough?
          detachInterrupt(digitalPinToInterrupt(LickLeft));
          detachInterrupt(digitalPinToInterrupt(LickRight));
          detachInterrupt(digitalPinToInterrupt(SyncIn));
          SetALLlow();  // Try again, not all GPIO went to 0

          rp2040.reboot(); // Needs to reliably restart!
          break;

The tone just keeps on going when I added that debug section. The ISRs are very short, just setting flags, so no hangs there.

I appreciate any help or suggestions. Otherwise it's a big red hardwired button to the RUN pin and no stopping from the monitoring computer!

earlephilhower commented 1 month ago

Can you make a small self-contained sketch that shows the failure?

The reboot call is as simple as can be: it just tells the watchdog to fire after 10us and busy waits. 10us later the WD fires and resets the chip.

https://github.com/earlephilhower/arduino-pico/blob/4ab0ba61333ada306c8d8044883ef89cc0bacdc1/cores/rp2040/RP2040Support.h#L278-L283

FWIW, sometimes the USB port on the PC gets confused if the device resets in the middle of an operation, requiring a full unplug-plug cycle. That's on the host side, not the Pico, so there's nothing really we can do about it (especially since here we're asynchronously rebooting)...

Gavin-Perry commented 1 month ago

I was hoping that disabling interrupts or halting the other processor would help it work. I guessed it has to do with Windoze not always reconnecting to USB after a reboot. When you say "middle of an operation", do you mean something like serial communication from Pico to host or vice versa? I've tried waiting for that kind of thing to finish. Is Windoze constantly checking in some way whether the USB device is still there? It does chime on disconnect and connect; in this case, just the disconnect. Does it matter if loop1() is calling for the reboot vs. loop()? So it's the host side, as you say, but is there anything that can be done other than use a Linux host? I can try making a simple example if that helps.

Gavin



earlephilhower commented 1 month ago

In my experience, the Linux USB stack is even more finicky. USB is polled by the host (IIRC) 1000 times per second, and if there's nothing to send the device just reports back that it's idle. This always happens, whether you're sending USB serial chars or not.

Look at this to see what the USB 1200bps reboot does. It tries to do a safe USB disconnect, but even so it's not 100% foolproof for some folks. But, it's better than simply resetting and hoping...

https://github.com/earlephilhower/arduino-pico/blob/4ab0ba61333ada306c8d8044883ef89cc0bacdc1/cores/rp2040/SerialUSB.cpp#L182-L198

Gavin-Perry commented 1 month ago

Thanks for the tip on how to disconnect USB; maybe that will help. Will that code work in an Arduino INO file? I'm so old that I stay away from CPP when I can.



earlephilhower commented 1 month ago

Yes, you can cut-n-paste it into your sketch:

         // Disable NVIC IRQ, so that we don't get bothered anymore 
         irq_set_enabled(USBCTRL_IRQ, false); 
         // Reset the whole USB hardware block 
         reset_block(RESETS_RESET_USBCTRL_BITS); 
         unreset_block(RESETS_RESET_USBCTRL_BITS); 
         // Delay a bit, so the PC can figure out that we have disconnected. 
         busy_wait_ms(3); 
         rp2040.reboot();

It's worth a try and can't really make it any worse, I guess. :)

Gavin-Perry commented 1 month ago

reset_block and unreset_block "aren't declared in this scope". I tried:

#include // Not needed for 2 core stuff

#include "pico/multicore.h" // not needed or helpful

#include "hardware/gpio.h" // Nope not here either

Is there an easy way to find header files for undeclared procedures?

I'm also working on rp2040.idleOtherCore() (pause the other core) and rp2040.resumeOtherCore() (restart the other core), which work for core0 to pause core1 but not the other way around. An example program to test them all is in progress once I find reset_block.

earlephilhower commented 1 month ago

Add the includes to the top of the sketch, taken from the SerialUSB.cpp file:

#include <pico/time.h>
#include <pico/binary_info.h>
#include <pico/bootrom.h>
#include <hardware/irq.h>
#include <pico/mutex.h>
#include <hardware/watchdog.h>
#include <pico/unique_id.h>
#include <hardware/resets.h>

Gavin-Perry commented 3 weeks ago

I've now thoroughly tested two versions of a program that just tests the rebooting and core pausing functions. The only difference is swapping which core does what. Sorry if these are a bit long, but they provide easy testing of all the functions, and the results are peculiar. In the first case, CorePause.ino with core 0 "in charge", my short pause of core0 doesn't work while long pausing core 1 does work. Not surprisingly, the reverse happens when the core functions are swapped. So call it rp2040.idleCore1(), not rp2040.idleOtherCore()?

Rebooting with a hard reboot or with the extra code you provided works the same: about half the time. I'm attaching the INO files as .TXT since .INO isn't a supported file type on GitHub (yet?!).

CorePause.ino.txt CorePause1.ino.txt

Thank you for your support. Re the reboot, I think I'll add a hardware button to the RUN pin for a truly hard reboot. It still doesn't reconnect every time, but Windows is better at it.

earlephilhower commented 3 weeks ago

rp2040.idleOtherCore is agnostic to which core it's running on. But you're not allowed to freeze core 0 and expect any USB stuff to work. So if you try to Serial.print() in loop1 after an rp2040.idleOtherCore you will be a very unhappy camper. USB is on core 0, and you just froze it and disabled all IRQs except the resumeOtherCore one.

Core 0 is special and there's no way to reboot it only. It would kill whatever was running in core 1 anyway because core 0 would re-init the whole BSS and DATA sections, overwriting all the static local and globals in the app. Even restarting core 1 is going to give somewhat undefined behavior since static/globals would NOT be reset to app startup.

In general you should have no reason whatsoever to call rp2040.idleOtherCore for your app. Things like mutexes and volatile globals provide a safe way of handshaking between the cores w/o freezing stuff. Only if you are using the flash QSPI interface (i.e. BOOTSEL or writing to flash memory using the raw Pico SDK API) is it needed to use that heavy hammer 🔨 .
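As a rough illustration of the volatile-global handshake mentioned above, here is a minimal sketch. The names `stopRequested`, `requestStop` and `stopPending` are hypothetical, and letting core 0 perform the reboot (because it is the core that handles USB) is an assumption for this example, not something confirmed as a fix in this thread. The part inside `#ifdef ARDUINO` is sketch glue that only the Arduino toolchain compiles.

```cpp
// Hypothetical flag shared by both cores. `volatile` forces every read to go
// to memory, so core 0 sees the value written by core 1.
static volatile bool stopRequested = false;

// Called by whichever core notices the problem (core 1 in this thread).
void requestStop() { stopRequested = true; }

// Polled by core 0, which then performs the shutdown itself.
bool stopPending() { return stopRequested; }

#ifdef ARDUINO  // Sketch glue; skipped when compiled outside Arduino.
void setup() { Serial.begin(115200); }
void setup1() {}

void loop1() {
  // Core 1 only raises the flag; it never freezes core 0.
  if (Serial.available() > 0 && Serial.read() == 'x') {
    requestStop();
  }
}

void loop() {
  if (stopPending()) {
    // Placeholder: drive outputs to a safe state here, then reset the chip.
    rp2040.reboot();
  }
}
#endif
```

With this layout neither core is ever idled, so USB keeps being serviced right up to the reboot call. Whether that improves reconnection on the host is untested here.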

Gavin-Perry commented 3 weeks ago

Thanks Earle. As usual you are very helpful and explain things well. I wondered about the asymmetry between core0 and core1. So even if I start serial in core1, it really belongs to core0. I was being too lazy to figure out how to keep one core from outputting text while the other one is in the middle of it. My thinking now is that instead of using the lazy Arduino Serial.print() statements I should use printf() and get everything into one statement so that the built-in mutex can work its magic. Am I on the right track? I was happy for the globals not to be reset with a reboot, but I'm going to do it all another way.



earlephilhower commented 3 weeks ago

Serial is started on core 0 before any setup because without the serial port you can't actually upload another sketch. If the user didn't do a Serial.begin (e.g. the blink.ino example) they'd need to unplug the device, hold down BOOTSEL, and re-plug it in to upload another sketch.

You can use Serial from both cores, and as long as the data is sent in one call it won't get intermixed. If you do multiple prints to make up a line, then it is possible for the other core to come in and print something in between them. See the multicore examples. I am not sure off the top of my head how printf handles this. It may dump out byte-by-byte, in which case you could use sprintf(buff, "format blah %d %02x...", ...); Serial.print(buff); to ensure it all goes out in one go.
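The format-then-print idea could look something like the sketch below. The helper name `formatStatusLine` and its fields (core number, lick count, valve mask) are made up for illustration, and it uses `snprintf` rather than plain `sprintf` so that an oversized line cannot overrun the buffer. The part inside `#ifdef ARDUINO` is sketch glue that only the Arduino toolchain compiles.

```cpp
#include <stdio.h>
#include <stddef.h>

// Hypothetical helper: formats one complete status line into buf.
// Returns snprintf's result: the length the full line needs (excluding the
// terminating NUL), or a negative value on an encoding error.
int formatStatusLine(char *buf, size_t len, int core, unsigned licks, unsigned valveMask) {
  return snprintf(buf, len, "core%d licks=%u valves=0x%02x\n", core, licks, valveMask);
}

#ifdef ARDUINO  // Sketch glue; skipped when compiled outside Arduino.
void setup() { Serial.begin(115200); }

void loop() {
  char line[64];
  int n = formatStatusLine(line, sizeof(line), 0, 12u, 0x0au);
  // Send the whole line in a single call, and only if it fit in the buffer.
  if (n > 0 && n < (int)sizeof(line)) {
    Serial.print(line);
  }
  delay(1000);
}
#endif
```

Each core would use its own local buffer, so the only shared resource is the single Serial.print() call per line.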

earlephilhower commented 3 weeks ago

Closing as it looks like we're done here.