Closed nigelb closed 3 years ago
Thanks for the report. This sounds like a pretty big issue. Testing now.
I'm at 3300 right now on the latest release candidate with no hang... interesting. I used your exact sketch. I am switching now to straight 2.0.6.
Count: 3300 3303302
Count: 3301 3304303
Count: 3302 3305304
Count: 3303 3306305
I am not getting this failure on v2.0.6
Count: 2741 2743742
Count: 2742 2744743
Count: 2743 2745744
Count: 2744 2746745
Let's try to find the difference between what I am doing and what you are doing.
Board: Sparkfun RedBoard Artemis ATP Arduino Version: 1.8.13 Sparkfun Apollo3 Boards version: 2.0.6 Operating System: Win10 Terminal Program: Arduino Serial Monitor
Is it possible you are having a problem with your serial monitor?
I am running this sketch unedited
#define BAUD 115200 // any number, common choices: 9600, 115200, 230400, 921600
#define CONFIG SERIAL_8N1 // a config value from HardwareSerial.h (defaults to SERIAL_8N1)
int count = 0;
void setup() {
Serial.begin(BAUD); // set the baud rate with the begin() method
Serial.println("\n\nApollo3 - Serial");
}
void loop() {
Serial.print("Count: ");
Serial.print(count++);
Serial.print(" ");
Serial.println(millis());
delay(1000);
}
Were you having problems with a sketch that was doing something else or more? I know the first few revisions of v2 had a problem where, after a certain number of BLE messages, the heap would fill and it would stop sending. Anything like that?
FYI: Just tried the same. No issues, it keeps counting...
Board: Sparkfun RedBoard Artemis ATP Arduino Version: 1.8.13 Sparkfun Apollo3 Boards version: 2.0.6 Operating System: Ubuntu 20.04 Terminal Program: Arduino Serial Monitor Sketch: as was posted by nigelb (copy/paste/compile/run)
Count: 1754 1755756
Count: 1755 1756757
Count: 1756 1757758
Count: 1757 1758759
regards, Paul
Update: Next day and I am still running
Count: 44615 44659616
Count: 44616 44660617
Count: 44617 44661618
Count: 44618 44662619
Count: 44619 44663620
@nigelb I am still very interested in this, as it was reproducible between you and a friend. Let me know if you can figure out what we are doing differently.
Hi All,
In my original post I did not mention that my friend and I are running Windows 10 and are using the Arduino Serial Monitor.
I just tested this issue on my Ubuntu PC:
Board: Sparkfun RedBoard Artemis ATP Arduino Version: 1.8.13 Sparkfun Apollo3 Boards version: 2.0.6 Operating System: Ubuntu 20.04 Terminal Program: Arduino Serial Monitor
and after compiling and flashing the ATP it did not hang:
.
.
.
Count: 1759 1760760
Count: 1760 1761761
Count: 1761 1762762
Count: 1762 1763763
Count: 1763 1764764
Count: 1764 1765765
Surprisingly, the MD5 sums of the firmware compiled on my Win10 and Linux machines matched.
Windows:
$ md5sum.exe Modified_Serial_2.ino.bin
0f61028df18d8832479b2c49fcb8068a *Modified_Serial.ino.bin
Linux:
$ md5sum Modified_Serial_2.ino.bin
0f61028df18d8832479b2c49fcb8068a Modified_Serial.ino.bin
So I copied the firmware compiled on my Win10 machine over to my Linux machine and flashed it onto my ATP board:
$ /home/user/.arduino15/packages/SparkFun/hardware/apollo3/2.0.6/tools/uploaders/svl/dist/linux/svl /dev/ttyUSB0 -f /home/user/Modified_Serial.ino.bin -b 921600 -v
Artemis SVL Bootloader
Script version 1.7
Phase: Setup
Cleared startup blip
Got SVL Bootloader Version: 5
Sending 'enter bootloader' command
Phase: Bootload
have 118944 bytes to send in 59 frames
Sending frame #1, length: 2048
Sending frame #2, length: 2048
Sending frame #3, length: 2048
Sending frame #4, length: 2048
Sending frame #5, length: 2048
Sending frame #6, length: 2048
Sending frame #7, length: 2048
Sending frame #8, length: 2048
Sending frame #9, length: 2048
Sending frame #10, length: 2048
Sending frame #11, length: 2048
Sending frame #12, length: 2048
Sending frame #13, length: 2048
Sending frame #14, length: 2048
Sending frame #15, length: 2048
Sending frame #16, length: 2048
Sending frame #17, length: 2048
Sending frame #18, length: 2048
Sending frame #19, length: 2048
Sending frame #20, length: 2048
Sending frame #21, length: 2048
Sending frame #22, length: 2048
Sending frame #23, length: 2048
Sending frame #24, length: 2048
Sending frame #25, length: 2048
Sending frame #26, length: 2048
Sending frame #27, length: 2048
Sending frame #28, length: 2048
Sending frame #29, length: 2048
Sending frame #30, length: 2048
Sending frame #31, length: 2048
Sending frame #32, length: 2048
Sending frame #33, length: 2048
Sending frame #34, length: 2048
Sending frame #35, length: 2048
Sending frame #36, length: 2048
Sending frame #37, length: 2048
Sending frame #38, length: 2048
Sending frame #39, length: 2048
Sending frame #40, length: 2048
Sending frame #41, length: 2048
Sending frame #42, length: 2048
Sending frame #43, length: 2048
Sending frame #44, length: 2048
Sending frame #45, length: 2048
Sending frame #46, length: 2048
Sending frame #47, length: 2048
Sending frame #48, length: 2048
Sending frame #49, length: 2048
Sending frame #50, length: 2048
Sending frame #51, length: 2048
Sending frame #52, length: 2048
Sending frame #53, length: 2048
Sending frame #54, length: 2048
Sending frame #55, length: 2048
Sending frame #56, length: 2048
Sending frame #57, length: 2048
Sending frame #58, length: 2048
Sending frame #59, length: 160
Upload complete
Nominal bootload bps: 44413.86
:tada: The hanging issue no longer occurs:
.
.
.
Count: 1628 1629629
Count: 1629 1630630
Count: 1630 1631631
Count: 1631 1632632
Count: 1632 1633633
At this point, on the Win10 machine, I closed Arduino and deleted the C:\Users\user\AppData\Local\Arduino15\ directory.
Then, after reinstalling Sparkfun Apollo3 Boards version 2.0.6 and re-flashing my ATP board:
.
.
.
Count: 1500 1501501
Count: 1501 1502502
Count: 1502 1503503
Count: 1503 1504504
The hanging issue no longer occurs. I will confirm that this fixes the issue for my friend as well.
Reopened after a report from @nseidle that he was getting the same issue.
He sees a failure after ~23 minutes. His original example was a BLE project that would quit after 23 minutes, but he is able to reproduce the problem with this sketch:
int count = 0;
void setup() {
Serial.begin(115200); // set the baud rate with the begin() method
Serial.println("\n\nApollo3 - Serial");
}
void loop() {
Serial.print("Count: ");
Serial.print(count++);
Serial.print(" ");
Serial.println(micros());
delay(10);
}
I am trying to reproduce the problem on my end.
I'll do the same on 2.1.0.
Thanks @paulvha, any luck? I should add that he was running with: Windows 10 Arduino v1.8.13 v2.1.0 core Artemis Nano
I used the same platform and version and am not getting the problem. This is an interesting problem.
It is now close to one hour... still running on the ATP, but from the start it has been making the Ubuntu system slow. Maybe this is not an Artemis issue, but the serial monitor on Nate's system running out of buffers... I'll keep it running.
regards Paul
From: Wenn0101 Sent: Monday, 24 May 2021 20:38 To: sparkfun/Arduino_Apollo3 CC: paulvha; Mention Subject: Re: [sparkfun/Arduino_Apollo3] Artemis ATP firmware compiled with Arduino_Apollo3 versions 2.x.x Hangs after about 1430 seconds (clock time) (#388)
I think it's unlikely to be a serial issue. The original observation from Nate was a BLE sketch (that was not connected over serial) that simply stopped showing up and stopped blinking a heartbeat LED after 23 minutes.
I am also unable to reproduce the problem. I am trying to figure out why this problem seems to happen to only some users. I find it interesting that nigel was able to fix the problem on his end by re-installing, and that the compiled binaries appeared to be the same between working and non-working builds. Perhaps there is a problem with the tools, but I can't see how it would cause this issue. Mostly just typing my thoughts right now.
More than 2.5 hours and counting... the computer is still terribly slow, but the ATP is still running. I will let it run for the night... maybe tell Nate to get a fast computer with a real OS, as I still think this is a buffer issue 🙂. The overhead on BLE (having studied that in detail) is huge. A buffer overrun can easily happen.
regards Paul
It has been running for more than 12 hours without a problem.
Count: 3665488 919823695
Count: 3665489 919835698
Count: 3665490 919847697
Count: 3665491 919859695
Count: 3665492 919871695
Count: 3665493 919883698
Count: 3665494 919895697
Count: 3665495 919907695
Count: 3665496 919919695
Count: 3665497 919931698
Count: 3665498 919943697
Count: 3665499 919955695
Count: 3665500 919967695
Count: 3665501 919979698
Switching it off now.
regards, Paul
Thanks for the help, Paul! This testing is so valuable in helping me narrow down who this affects.
-Kyle
Ok, here is what I have found.
The problem appeared to be mostly board specific: a failing board would typically fail regardless of the computer it was being used on, and a non-failing board would work on computers originally thought to be suspect.
Interesting information that led to the resolution: the stimer will roll over in 2^32 / 3 MHz, which is about 1431.66 seconds. The core does not enable the stimer overflow interrupt, so it has no proper handling for it. This suggests that, for some reason, the stimer rollover interrupt is firing anyway (only on the affected boards), causing the undefined behavior.
Explicitly disabling this interrupt in the us_ticker setup should resolve this issue on affected boards.
am_hal_stimer_int_disable(AM_HAL_STIMER_INT_OVERFLOW);
I have confirmed this on the 1 board I have that would fail, waiting on Nate to confirm that the problem is resolved on his boards.
Interesting... could it be a leftover from the SVL bootloader, where it is set and used to autodetect the baud rate? It does perform a disable before starting the loaded app, but maybe that is not happening on all boards. Timing? I load the sketches on the ATP board with the ASB uploader.
I was thinking maybe an older version of the SVL bootloader doesn't disable it before loading the app. I did notice all of the "problem" boards are older. I haven't looked into it yet, but I like the reasoning that it could be the bootloader. Either way I think the fix is to have the app explicitly disable it, to be safe.
look at this post that I saw today... https://forum.sparkfun.com/viewtopic.php?f=168&t=5139. They had the same issue. regards Paul
@Wenn0101 and @paulvha my friend actually could not fix the issue by re-installing. When I think back I believe I tried both the SVL and ASB bootloaders and managed to break the SVL bootloader somehow. I had to "Burn Bootloader" to fix the issue. In the same test iteration I re-installed the "Sparkfun Apollo3 Boards" platform. After this everything was working. My friend is going to try updating the bootloader and see if this solves his problem. We both purchased our boards in around 2019.
This should be fixed as of v2.1.1
Board: Sparkfun RedBoard Artemis ATP Arduino Version: 1.8.13 Sparkfun Apollo3 Boards version 2.0.6, 2.0.3, and I assume the versions in between. This issue does not occur in version 1.2.1.
With the following sketch (derived from this example):
After uploading and running it hangs at step 1430 (or, occasionally 1431):
It seems to happen after a certain amount of time running because if I change the delay to 100 we get this:
I have tested this on two different ATP boards and had a friend try on his as well. This issue occurred in all cases where the sketch was compiled with the V2 series of Arduino_Apollo3.
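The time-based explanation can be cross-checked against the step count at which the sketch hangs. The sketch below assumes the 32-bit, 3 MHz stimer rollover identified in the maintainer's analysis, and a loop period of about 1001 ms, which is read off the logs in this thread (Count 3300 at 3303302 ms). The names `kRolloverMs` and `iterations_before_rollover` are hypothetical helpers, not part of any library.

```cpp
#include <cstdint>

// Assumption from the thread: 32-bit STIMER at 3 MHz, i.e. 3000 ticks per ms.
// Integer division gives 1431655 ms until the counter wraps.
constexpr std::uint64_t kRolloverMs = (std::uint64_t{1} << 32) / 3000;

// Whole loop iterations that fit before the counter wraps, for a loop that
// takes loop_period_ms per pass (delay plus printing overhead).
constexpr std::uint64_t iterations_before_rollover(std::uint64_t loop_period_ms) {
    return kRolloverMs / loop_period_ms;
}

// delay(1000) plus about 1 ms of overhead, as seen in the logs: 1430 steps.
static_assert(iterations_before_rollover(1001) == 1430,
              "matches the usual hang at step 1430");
// An ideal 1000 ms loop with no overhead would reach 1431.
static_assert(iterations_before_rollover(1000) == 1431,
              "matches the occasional hang at step 1431");
```

Under these assumptions the hang at step 1430 (occasionally 1431) is what a fixed wall-clock limit of about 1431.66 s would produce. The same helper can estimate the step count for other delays, as long as the real per-iteration time is measured rather than guessed.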