earlephilhower / arduino-pico

Raspberry Pi Pico Arduino core, for all RP2040 and RP2350 boards
GNU Lesser General Public License v2.1

setup and loop both running? #2309

Closed Gavin-Perry closed 2 months ago

Gavin-Perry commented 2 months ago

Hi Earle, I love your work implementing Pico for Arduino and have been using it for years. This is a problem where I can't imagine how it happens. BREAKING NEWS: after completing all this description I got it working. Skip to the bottom, OMG! Then, if you want, see all the headache and think about how to warn against this happening to anyone else... (Where would it be documented?)

I'm running Windows 10 with Arduino IDE 2.3.2 and I just updated the board package to 3.9.4. When this problem started I switched to testing on a new bare Raspberry Pi Pico instead of my board, to avoid possible power issues causing reboots. The device is ultimately going to run with Matlab providing a GUI and the Pico running single trials that Matlab keeps track of. I've added a ton of debug lines to try to find the problem; the print text starts with # to separate these lines from the legit code sent to Matlab for logging. Both processors are running: one is meant to handle the USB chatting with Matlab while the other handles the trial timing, stim and response. For now I'm testing with the Serial Monitor in the Arduino IDE.

So here is what happens (with all the debugger text):

The program is jumping out of setup() in the middle, without finishing! I can do things in the loop, but it is also still running setup code! (The LED is fast flashing as I set it to do in setup(); it doesn't do that in the main code.) Pico boots and setup() reports:

Setup

I type the character i and the Pico replies:

1

as it should. I then type d to let it know we are good to go. The CheckID() routine replies:

Connected

but then the rest of the setup routine doesn't run. What should happen is:

MatLab is Present

Attach Interrupts

Test tones

Setup done at 9185 // msec of runtime so far

Entering Loop

None of that displays, but if I type r (to report parameters as they are set) it correctly answers:

params: ITI 2500, OdorTm 2400, CWTm 2000, LOdor 0, ROdor 0, RLA 0, RLB 0, RRA 0, RRB 0

Weirdly, sometimes it DOES work. Here are 2 tests in a row:

Setup

1

Connected

MatLab is Present

Attach Interrupts

Test tones

Setup done at 8685

Entering Loop

But then I get

Help: ML took too long in CheckID

but it HAD connected and it's not in setup() any more

ISRs aren't working right

Error: Lick overlap: 22342

Error: Lick overlap: 26113

Help: ML took too long in CheckID

But then after it says this

L126543

Now the ISRs are working properly!

L127262

L127805

I reboot the Pico and this time

Setup

1

Connected

Nothing after that. ISRs aren't working, i.e. the main loop doesn't handle ISRs, and yet if I type r I get

params: ITI 2500, OdorTm 2400, CWTm 2000, LOdor 0, ROdor 0, RLA 0, RLB 0, RRA 0, RRB 0

So it jumped into the main loop1

I hate to bother you with such a complicated program, and it's so intermittent that I don't know if I can duplicate it in a short program.

OMG!! I just swapped which was loop and loop1 to make most of the USB stuff happen in loop and it seems to be working! I'm still sending this out as a warning about using Serial.print in loop1. I don't know if it's been documented before now, though I did know (you wrote to me) that the USB interrupts are handled by processor 0. It didn't seem to matter for the first few weeks on this project. Until it did.

I've attached the program for entertainment purposes. Had to rename it .txt as GitHub doesn't support .INO files. (It supports enough others; can we get them to add INO for the Arduino crowd? They don't all know the trick.) OdorChoice5.ino.txt

earlephilhower commented 2 months ago

Sorry, there's a lot going on in your INO so I can't really dig into it to figure out what you were seeing.

The Serial port is generally protected by a cross-core mutex which makes it safe to use from either core, assuming that the core which owns the mutex can make forward progress. With FreeRTOS it should also be protected from any number of tasks (but the testing there is much less extensive).

While there may be a deadlock possibility if one core stalls forever while holding the mutex, resulting in no operation of either core, there is no way it could actually cause a program flow change unilaterally (like skipping some of your setup() code) unless there's some if condition that's returning false and the setup() function itself has implemented an early exit (e.g. if (Serial.read() < 0) { return; } or something). It's not that subtle or obtuse, thankfully.
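
To make the early-exit point concrete, here is a minimal sketch in plain C++. Everything in it is hypothetical: setupSketch stands in for a sketch's setup(), idOk stands in for the result of a check such as CheckID(), and out collects the lines that would have been printed. It is not taken from the attached INO.

```cpp
#include <string>
#include <vector>

// Hypothetical stand-in for setup(). `idOk` plays the role of a check
// like CheckID(); `out` records what would have been printed.
void setupSketch(bool idOk, std::vector<std::string> &out) {
    out.push_back("Setup");
    if (!idOk) {
        return;  // explicit early exit: nothing below this line runs
    }
    out.push_back("Attach Interrupts");
    out.push_back("Setup done");
}
```

If the later messages never appear, an explicit return like this one (or a loop that never finishes) is the first thing to look for in the real setup().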

It is technically illegal to write to the USB serial port from an interrupt. If your ISR is doing that, please use a different way to get logging (writing to an in-mem log that the main loop() dumps every cycle, etc). ISRs should never block, but USB often blocks. Buffers could be full, the PC could be in the middle of a polling operation, the other core might have the CoreMutex, etc.
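
As a rough sketch of the in-memory log idea: the ISR pushes small fixed-size records into a ring buffer and never blocks, and loop() drains the buffer and does the actual printing. The names LogEvent and IsrLog are made up for illustration and are not part of arduino-pico. The code assumes exactly one writer (the ISR) and one reader (loop()).

```cpp
#include <atomic>
#include <cstddef>
#include <cstdint>

// Hypothetical event record: what happened and when.
struct LogEvent {
    uint8_t  code;       // e.g. 'L' for a lick
    uint32_t timestamp;  // e.g. a millisecond count captured in the ISR
};

// Single-producer (ISR) / single-consumer (loop) ring buffer.
// N must be a power of two so the index mask works.
template <size_t N>
class IsrLog {
    static_assert(N >= 2 && (N & (N - 1)) == 0, "N must be a power of two");

public:
    // Call from the ISR. Never blocks; drops the event if the buffer is full.
    bool push(const LogEvent &e) {
        const uint32_t head = head_.load(std::memory_order_relaxed);
        const uint32_t tail = tail_.load(std::memory_order_acquire);
        if (head - tail == N) {  // full
            dropped_.store(dropped_.load(std::memory_order_relaxed) + 1,
                           std::memory_order_relaxed);
            return false;
        }
        buf_[head & (N - 1)] = e;
        head_.store(head + 1, std::memory_order_release);  // publish the slot
        return true;
    }

    // Call from loop(). Returns false when nothing is pending.
    bool pop(LogEvent &out) {
        const uint32_t tail = tail_.load(std::memory_order_relaxed);
        const uint32_t head = head_.load(std::memory_order_acquire);
        if (head == tail) {  // empty
            return false;
        }
        out = buf_[tail & (N - 1)];
        tail_.store(tail + 1, std::memory_order_release);  // free the slot
        return true;
    }

    // Number of events discarded because the buffer was full.
    uint32_t dropped() const { return dropped_.load(std::memory_order_relaxed); }

private:
    LogEvent buf_[N];
    std::atomic<uint32_t> head_{0};     // written only by the producer
    std::atomic<uint32_t> tail_{0};     // written only by the consumer
    std::atomic<uint32_t> dropped_{0};  // written only by the producer
};
```

In a sketch, the ISR would do something like isrLog.push({'L', millis()}) and loop() would run while (isrLog.pop(ev)) { Serial.print(ev.timestamp); } once per cycle. Dropping events when the buffer is full is a deliberate choice here so that the ISR never waits.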

FWIW, the beginning USB Serial output sometimes gets dropped by the OS itself. Remember that core0 runs USB and a chip reset essentially unplugs/replugs in the serial port as far as the PC host is concerned. Sometimes it comes up fast, sometimes it takes my Linux box a few seconds to reattach the /dev/ttyACM.
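
Since the reattach time varies, one option is to poll for the port with a timeout instead of a fixed delay. A minimal sketch follows; waitUntilReady is a made-up helper, not an arduino-pico API, and the readiness check and clock are passed in as parameters so the logic can be tried off-target.

```cpp
#include <cstdint>

// Polls `ready` until it returns true or `timeoutMs` milliseconds have
// passed according to `nowMs`. Returns true if `ready` succeeded in time.
template <typename ReadyFn, typename ClockFn>
bool waitUntilReady(ReadyFn ready, ClockFn nowMs, uint32_t timeoutMs) {
    const uint32_t start = static_cast<uint32_t>(nowMs());
    while (!ready()) {
        // Unsigned subtraction stays correct across counter wraparound.
        const uint32_t elapsed = static_cast<uint32_t>(nowMs()) - start;
        if (elapsed >= timeoutMs) {
            return false;  // gave up: carry on without the host
        }
    }
    return true;
}
```

On the Pico this could be called from setup() as waitUntilReady([] { return bool(Serial); }, millis, 5000), assuming a 5 second budget is acceptable. Output printed before the host has reattached may still be lost, so anything important is better sent after this returns true.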

Gavin-Perry commented 2 months ago

Thanks for the help. I don't print out of ISRs, I know better than that. bool CheckID() does return true or false, but that shouldn't pop it out of setup. I'm familiar with the time it takes the OS to have a USB port ready, so I put in delays. Swapping the loops seems to have helped, but I'm still getting a lock-up somewhere. I'll see if I can make a smaller program that replicates the problem. I know this app is huge, as I've been working on it a long time. It all worked fine until recently, and I didn't save enough different versions to know what changed specifically.