Closed runger1101001 closed 1 year ago
Hi @runger1101001 , that's very strange (given that the same code is working on the Minima). We use AGT0 as millis() timer, so it's not uncommon to see its isr being called. Would you mind sharing a minimal reproducible sketch that triggers the issue? Thanks a lot!
Ciao @runger1101001, please share the code that produces this error and I will be happy to check it out. I am awake for about another hour, so if you are quick, I will look at it tonight.
Over the past few weeks I have seen scattered messages in various forums dealing with the R4, where interrupt routines of various sorts are crashing, and all of them have Serial.print() statements in them. So just share the code here, and I will check it out.
Hey, thank you kindly for the quick response.
To prepare a proper example will take me a little bit.
The quick version is this:
#include "Arduino.h"
#include "SimpleFOC.h"

// 3-PWM driver: phase pins D3, D4, D12, enable pin D8
BLDCDriver3PWM driver = BLDCDriver3PWM(D3, D4, D12, D8);
// Motor with 7 pole pairs
BLDCMotor motor = BLDCMotor(7);

void setup() {
  Serial.begin(115200);
  SimpleFOCDebug::enable();
  delay(10000);

  Serial.println("Driver init...");
  driver.voltage_power_supply = 10.0f;
  driver.voltage_limit = 5.0f;
  if (driver.init() == 0)
    Serial.println("Driver init failed.");

  motor.linkDriver(&driver);
  motor.voltage_limit = 2.4f;
  motor.torque_controller = TorqueControlType::voltage;
  motor.controller = MotionControlType::velocity_openloop;
  motor.init();
  Serial.println("Init complete.");
}

void loop() {
  //Serial.println("Running...");
  delayMicroseconds(100);
  motor.move(10); // 10 rad/s
}
I can't seem to find the origin of SIMPLEFOC_DEBUG: I cannot find its definition. Which file is SIMPLEFOC_DEBUG() located in?
I can package this in a bit better example over the weekend.
The renesas-specific code can be found here: https://github.com/simplefoc/Arduino-FOC/tree/dev/src/drivers/hardware_specific/renesas
Thanks for the information about the AGT. The timer code in question explicitly excludes AGT timers, so there should be no conflict there.
I wonder if the AGT interrupt is actually the problem... perhaps it's just confusing me because I'm trying to use the timers and the AGT interrupt routine is on the stack, but actually they're not linked.
I wonder about the top entry on the stack
0x000000c0
??
??:0
Could this be a problem of jumping to a bad address? I was reading about bugs related to memory corruption or missing return statements in functions. Perhaps it has more to do with something like that?
I can't seem to find the origin of SIMPLEFOC_DEBUG: I cannot find its definition. Which file is SIMPLEFOC_DEBUG() located in?
It's here: https://github.com/simplefoc/Arduino-FOC/blob/dev/src/communication/SimpleFOCDebug.h
Its purpose is to let us disable all the debug statements via the preprocessor to save memory. We're very tight on memory on some MCUs like the original UNO.
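As a rough illustration of that idea (this is not the actual SimpleFOC implementation), a debug macro that can be compiled out might look like the sketch below. The names `DEMO_DEBUG` and `DEMO_DEBUG_DISABLE` are made up for this example, and a `std::ostringstream` stands in for `Serial` so the sketch can be tried on a PC.

```cpp
#include <sstream>
#include <string>

// DEMO_DEBUG and DEMO_DEBUG_DISABLE are hypothetical names for this sketch,
// not the macros SimpleFOC actually defines.
#ifndef DEMO_DEBUG_DISABLE
  // Enabled: forward the message to the given stream, much like println.
  #define DEMO_DEBUG(stream, msg) do { (stream) << (msg) << '\n'; } while (0)
#else
  // Disabled: expands to an empty statement, so the message is never used.
  #define DEMO_DEBUG(stream, msg) do { } while (0)
#endif

// Collects the debug output of a pretend init sequence.
std::string demo_init_log() {
    std::ostringstream out;
    DEMO_DEBUG(out, "Driver init...");
    DEMO_DEBUG(out, "Init complete.");
    return out.str();
}
```

Building the same file with `-DDEMO_DEBUG_DISABLE` would make `demo_init_log()` return an empty string, which is the memory-saving effect described above.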
It also occurs to me that if the entire debug call chain is not 100% re-entrant, then you are at risk of re-entering a function in that chain, which then messes up the call stack and everything else, leaving you with crashed code: not always, but often.
I cannot tell whether the code is re-entrant all the way through. It looks OK, but I can only say that if non-reentrant calls are executed from any interrupt routine, then you will see crashes.
I have just looked into SimpleFOCDebug.cpp and it appeared fine, but you may have to check the entire call tree.
I think you would do well to check that part. I may be wrong, though. Feel free to write again.
Question: Does it ALWAYS crash? Or does it only crash sometimes, and periodically and unpredictably?
Question: Does it crash from an interrupt routine, or without any interrupt routine being invoked?
Question: Have you tried establishing WHERE it crashes in your code? If so, does it always crash from the same place?
Question: Have you tried isolating interrupt algorithms (by blocking interrupts at certain parts of the code)?
Question: The print happening in the debug section: does it call Serial.print, or where does the output of the print end up?
ON INTERRUPTS calling non-reentrant code. Just an example into details of what can go wrong:
Could you try to remove all debug printing, and all printing as such, from the interrupted code? (Comment it out, and mark the comments by starting the line with // RRRR, so that you do not confuse these comments with other comments when you want to reinstate them.)
So, if you have an algorithm which is called both from the normal loop code and ALSO from any interrupt routine, then do not print from it. I have lately seen lots and lots of people who call non-reentrant C++ code from more than one point. That always fails.
The problem is roughly this:
The usual call, from the tree stemming from loop(), executes the function, e.g. ffffff().
At some point within ffffff(), some non-reentrant variables are set; for instance, a counter of characters to print is reset. The printing starts, and the counter counts up to, say, 17 of the maybe 25 characters to print.
Then the interrupt fires, e.g. from the timer. It stops the original printing and pushes all the processor registers onto the stack, but it does not preserve variables. So at the moment the interrupt calls ffffff(), the counter has reached 17. The interrupt function starts and calls ffffff(); the counter is reset to 0 and starts counting again. The interrupt routine is maybe printing a longer string, say 40 characters where the one printed from the loop is 25. The algorithm runs to the end, and the last character, at index 39, is printed.
The interrupt routine returns, the registers are restored, and the original instance continues. Now, however, the counter has reached the end of the 40-character string (index 39), which is way beyond the last character of the original string at index 24. So the printing continues until a 0 (zero) is met. That can be way into other memory, and whether this crashes is entirely up to coincidence. Hope it is useful?
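The scenario above can be simulated on a PC with a minimal sketch like the one below. Everything in it is hypothetical: `print_shared`, `print_local` and `run_scenario` are names invented for the illustration, the "interrupt" is just a callback invoked after a given number of characters, and instead of actually reading past the end of the string the sketch stops and sets an `overran` flag.

```cpp
#include <cstddef>
#include <cstring>
#include <functional>
#include <string>

struct PrintResult {
    std::string printed;  // characters this call emitted
    bool overran;         // cursor ended up beyond this call's own string
};

// Shared cursor: every call, from loop() or from an interrupt, uses this one.
static std::size_t g_cursor = 0;

// Non-reentrant: the cursor lives outside the function.
// `fake_interrupt` is a stand-in that runs once `interrupt_at` characters are out.
PrintResult print_shared(const char* text, std::size_t interrupt_at,
                         const std::function<void()>& fake_interrupt) {
    const std::size_t len = std::strlen(text);
    PrintResult result{"", false};
    g_cursor = 0;
    for (;;) {
        if (g_cursor > len) {       // real code would now read past the buffer
            result.overran = true;
            break;
        }
        if (text[g_cursor] == '\0') break;
        result.printed += text[g_cursor];
        ++g_cursor;
        if (fake_interrupt && g_cursor == interrupt_at) fake_interrupt();
    }
    return result;
}

// Reentrant variant: the cursor is a local, so each call has its own.
PrintResult print_local(const char* text, std::size_t interrupt_at,
                        const std::function<void()>& fake_interrupt) {
    PrintResult result{"", false};
    for (std::size_t cursor = 0; text[cursor] != '\0'; ) {
        result.printed += text[cursor];
        ++cursor;
        if (fake_interrupt && cursor == interrupt_at) fake_interrupt();
    }
    return result;
}

// 25 characters from loop(), interrupted after 17 by a 40-character print.
PrintResult run_scenario(bool shared_cursor) {
    const std::string loop_text(25, 'a');
    const std::string isr_text(40, 'b');
    auto isr = [&] {
        if (shared_cursor) print_shared(isr_text.c_str(), 0, nullptr);
        else               print_local(isr_text.c_str(), 0, nullptr);
    };
    return shared_cursor ? print_shared(loop_text.c_str(), 17, isr)
                         : print_local(loop_text.c_str(), 17, isr);
}
```

With the shared cursor, the outer call resumes with a cursor that points beyond its own 25-character string; with a local cursor, both calls complete normally. This only models the mechanism and says nothing about whether it is the cause of the crash reported here.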
@runger1101001 any news - have you got things to work? Or identified where the issue may be?
Thanks a lot for all the feedback! Sorry I'm not quite as fast as you are. :-)
Regarding the re-entrant code: Actually my Renesas PWM driver is not using any interrupts. While we have some code using interrupts in our sensor drivers, this is not being used in the example. So there should be no interrupts in play from my side.
Of course it is a very valid point you make regarding the reentrant code, and using interrupts can be very tricky, so if possible I structure our drivers to avoid them (by using other facilities of the MCUs).
Could you try to remove all debug printing and all printing as such, from the interrupted code?
I'm in the process of doing that, and will report on my findings.
One thought that occurred to me is that there could be a mismatch between the FSP library and the headers I'm compiling against in the core. It's a bit hard to follow that build process, but it seems to me FSP is compiled from a slightly customised branch of v4.0.0. The current version however is v4.6.0, including many bug fixes, some of which affect the timer code.
So last night I was trying to re-compile the framework with FSP v4.6.0 to see if this helps, but it's slow going getting that set up.
Timers are interruptible by nature???
Aren't you using timers?
What is it you are building with PWM and such stuff?
Question: Does it ALWAYS crash? Or does it only crash sometimes, and periodically and unpredictably?
Always, but not exactly at the same time. Sometimes a few more characters are output by the serial, sometimes less. When I don't change the program, it always seems to crash in the same way. Changing the program can move the moment the crash occurs by a little.
Question: Does it crash from an interrupt routine, or without any interrupt routine being invoked?
No interrupts involved from my side. Only indirectly, if caused by use of Serial, or delayMicroseconds()
Question: Have you tried establishing WHERE it crashes in your code? If so - does it always crash from the same place?
No, it does not. It's very strange in that way.
Question: Have you tried isolating interrupt-algorithms (by blocking interrupts at certain parts of the code?
No, but I will try it.
Question: The print happening in the debug section - does it call the Serial. print - or where does the output of the print end?
Yes, it's an abstraction using the Stream class. But it gets initialised using the standard Serial object. The SIMPLEFOC_DEBUG is a macro that wraps println so that it can be compiled away, and so it automatically applies the FlashStringHelper. I'm disabling the debug output to see if this is somehow the cause and will report back.
So the answer is: when I remove all the debug output, and also when I additionally comment out the delayMicroseconds(), the crash still occurs in both cases.
The simplified test program is now:
#include "Arduino.h"
#include "SimpleFOC.h"

// 3-PWM driver: phase pins D3, D4, D12, enable pin D8
BLDCDriver3PWM driver = BLDCDriver3PWM(D3, D4, D12, D8);

void setup() {
  Serial.begin(115200);
  delay(10000);

  driver.voltage_power_supply = 10.0f;
  driver.voltage_limit = 10.0f;
  if (driver.init() == 0)
    Serial.println("Driver init failed.");

  driver.enable();
  delay(1);
  // Phase voltages in volts for phases A, B, C
  driver.setPwm(2.5f, 5.0f, 7.5f);
}

void loop() {
  //Serial.println("Running...");
  //delayMicroseconds(100);
}
Try commenting out all Serial.print() statements within the critical section where things crash.
Surely there is a point in your code which is never reached, isn't there?
So:
Create a global, volatile int tracker which you set to various values as your code progresses.
Like: some statements; tracker=7; more statements; tracker=8; and so on.
Then comment out all debug code. Comment out all serial print statements.
Then you may, just may, be lucky enough that your code simply works. If it does, your code was being killed by multiple executions of non-reentrant code.
After the critical section, switch off interrupts and print out the integer, and you will know how far your code got. Then switch interrupts on again.
Let me hear how that goes.
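A minimal host-side sketch of the breadcrumb idea is shown below. The function `run_with_checkpoints` and the use of a step returning `false` as a stand-in for "the code never came back" are assumptions made for the illustration. On a board that hard-faults, the last value would more likely have to be read with a debugger or from a fault handler, which this sketch does not show.

```cpp
#include <cstddef>
#include <functional>
#include <vector>

// Global breadcrumb; volatile so the compiler does not optimise the writes away.
volatile int tracker = 0;

// Runs the steps in order and records a checkpoint number before each one.
// A step returning false is a host-side stand-in for a crash inside that step.
int run_with_checkpoints(const std::vector<std::function<bool()>>& steps) {
    tracker = 0;
    for (std::size_t i = 0; i < steps.size(); ++i) {
        tracker = static_cast<int>(i) + 1;  // like "tracker=7; ... tracker=8;"
        if (!steps[i]()) break;             // execution stops inside this step
    }
    return tracker;                         // last checkpoint that was reached
}
```

If the third of four steps fails, the returned value is 3, which points at the step that was running when things went wrong.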
On blocking interrupts: only block in the main code. My proposal is to disable interrupts before the serial prints and re-enable them afterwards.
My guess is that this could solve it.
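One way to sketch that proposal is a small RAII guard, shown here with host-side stand-ins so it can be tried on a PC. `demo_noInterrupts()` and `demo_interrupts()` are placeholders for Arduino's `noInterrupts()` and `interrupts()`, and `g_log` stands in for the serial port; all names are invented for the example. Whether `Serial` output actually completes while interrupts are disabled depends on the core, so that part is an assumption to verify on the target.

```cpp
#include <string>

// Host-side stand-ins for Arduino's noInterrupts()/interrupts().
// The depth counter exists only so the behaviour can be checked on a PC.
static int g_irq_off_depth = 0;
static void demo_noInterrupts() { ++g_irq_off_depth; }
static void demo_interrupts()   { --g_irq_off_depth; }

// RAII guard: interrupts are "off" for exactly the lifetime of the object.
class InterruptGuard {
public:
    InterruptGuard()  { demo_noInterrupts(); }
    ~InterruptGuard() { demo_interrupts(); }
    InterruptGuard(const InterruptGuard&) = delete;
    InterruptGuard& operator=(const InterruptGuard&) = delete;
};

static std::string g_log;             // stands in for the serial port
static int g_depth_during_print = 0;  // records the guard state while printing

void guarded_print(const std::string& msg) {
    InterruptGuard guard;                    // disable before the print
    g_depth_during_print = g_irq_off_depth;  // 1 while the guard is alive
    g_log += msg;
    g_log += '\n';
}                                            // re-enabled when guard is destroyed
```

The guard makes it hard to forget the re-enable step, including on early returns from the guarded function.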
Timers are interruptible by nature??? Aren't you using timers? What is it you are building with PWM and such stuff?
The MCU's timers can be run with or without interrupts. I'm currently not using any interrupts.
I use the MCU's timers to generate PWM signals for motor control. Our library, SimpleFOC, is all about controlling brushless motors using what is called "field oriented control".
Anyway, to run a BLDC motor you have to generate coordinated PWM outputs for the motor's 3 phases, and this is what the timers are used for. They get used in "output mode", so to speak, for producing PWM, and interrupts are not needed. The timers just run, and the duty cycles are updated by writing to the compare registers.
Why don't we use analogWrite(), you may ask? Because we need very specific and coordinated control over the PWM functions, which is not available via analogWrite(), or even via FspTimer. Our library has hardware-specific PWM code for many MCU types. We try to support all the MCUs used by Arduino's boards, since we're an Arduino library.
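To make the "duty cycles are updated by writing to the compare registers" part concrete, here is a minimal arithmetic sketch. It is not the SimpleFOC driver code: the function names are invented, the period of 1000 counts is an arbitrary assumption, and a simple up-counting timer is assumed (center-aligned PWM would need different arithmetic). It uses the values from the test program above, a 10 V supply with phase voltages of 2.5 V, 5.0 V and 7.5 V.

```cpp
#include <algorithm>
#include <cstdint>

// Converts a phase voltage into a duty cycle limited to [0, 1].
float voltage_to_duty(float phase_voltage, float supply_voltage) {
    const float duty = phase_voltage / supply_voltage;
    return std::min(1.0f, std::max(0.0f, duty));
}

// Converts a duty cycle into a compare value for a timer that counts
// from 0 up to period_counts (hypothetical value, rounded to nearest).
std::uint32_t duty_to_compare(float duty, std::uint32_t period_counts) {
    return static_cast<std::uint32_t>(duty * static_cast<float>(period_counts) + 0.5f);
}
```

For example, `duty_to_compare(voltage_to_duty(2.5f, 10.0f), 1000)` gives 250. On the real hardware the result would be written to the timer's compare register, and no interrupt is involved in that step.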
Hi all. This is getting a bit noisy and off topic. Please keep the discussion here very tightly focused on @runger1101001's report as it relates to bug fixes or enhancements to the code hosted in this repository. Any other discussion must be made elsewhere. You can do that on the Arduino Forum, https://github.com/simplefoc/Arduino-FOC, etc., but not here.
The project developers and maintainers must monitor all activity in this repository and hundreds of other repositories. We must also review all the discussion in each issue or pull request while investigating or reviewing them. That becomes incredibly time consuming if there is a lot of off-topic/tangential/rambling discussion in the threads. That means we spend our time on that instead of doing productive work.
@Fashion-Corp I really appreciate that you are interested in contributing to the Arduino project, but this is not the way to do it. I invite you to participate in Arduino Forum. That is a perfect place for the sort of discussion you have been making in this repository. Your contributions over on the forum would be very valuable and welcome.
OK, I get that. I understand that the timers run directly and continuously, and that you therefore get some signalling from them.
However:
How do you then translate the signalling to the outputs, if not via interrupts?
You can of course do it in real time, that is, you loop, and when the timer value is right (zero count) you set the outputs.
If that is the case (polled timing), then you ought to be able to use the millis() function directly, with some modulo calculations for each phase?
But rather than guessing, let me understand that part too, maybe with reference to pieces of code?
I went through your copy of the code, but I think I would need a little guidance on the code structure to be able to help with where the bug may be. For example: where do you read the timers, and where do you set each of the 3 phases (outputs)?
As for BLDC motors, I had my share of fun with them some years ago. I was looking for my sketch but have not found it yet.
On 3-phase circuits for motors in general, though, I recall that the phasing and duty cycle are important. Something about homogeneity, i.e. that the phases are the same.
So I guess you are up to something similar?
Sincerely David
I have a confusing issue I'm trying to track down involving the timers on Renesas. This is for a motor control library, and I'm trying to initialise the timers on the MCU in very specific ways.
Renesas support has already been of assistance with some details, and it is now working as expected on the UNO R4 Minima, but not on the R4 WiFi.
UNO R4 WiFi, Arduino IDE 2.2.1, Renesas UNO Board Support 1.0.4
The crash occurs a short time after starting the timers:
etc... which decodes to:
It's quite unclear to me what is causing the crash.
The agt_int_isr is strange, because I'm using only GPT timers, not AGT. It does always show up in the stack traces though. Also, it seems the stack traces always contain printlns (to Serial).
How can we narrow down the cause? What's the difference between Minima and WiFi that causes the error to happen only on the R4 WiFi?