Closed phil-pilgrim closed 4 years ago
Have you tried making _t volatile? I suspect that the optimizer sees that you're not really doing anything with _t and optimizes it away.
The optimizer obviously missed the fact that I do use _t at the end of the loop. (It's used nowhere else.) Moreover, if I do this:
int _t = CNT;
pulse_out(0, _t + 100);
and observe the output on a 'scope, the pulse is always 100usec wide, so _t is always zero.
int _t = CNT; pulse_out(0, _t + 100);
and observe the output on a 'scope, the pulse is always 100usec wide, so _t is always zero.
Have you checked with BlocklyProp's Code button to ensure that your c code in the custom block is not getting morphed somehow?
Yes. It doesn't get changed in any way.
I can't explain what's happening to your project, but in my tests, it works. Here's what I tried:
I added a terminal block at the top of code (just to get the terminal initialized and open) then created these blocks, including a custom block, and I can see that _t gets CNT in my case and appears to also be operated on by my incrementor.
Here's the code: The statements below the pause(1000);
are exactly as I entered them into the "main code" area of my custom block.
/* SERIAL_TERMINAL USED */
// ------ Libraries and Definitions ------
#include "simpletools.h"
// ------ Main Program ------
int main() {
term_cmd(CLS);
while (1) {
pause(1000);
int _t = CNT;
_t += 100;
print("%d\r", _t);
}
}
As I was trying to say...
This is what the optimizer thinks:
forever // while(1)
{
declare local variable _t
assign _t to CNT
do something else
while CNT - _t < immediate value do nothing
}
When the optimizer reaches the "while CNT..." instruction, it thinks: Hey, wait a minute, local variable _t is initialized to CNT and nothing was done with it since initialization. So the expression (CNT - _t) must be 0. It removes local variable _t and replaces CNT - _t with 0.
Again, try declaring _t as: volatile _t = CNT. I bet that way the optimizer won't touch it.
(BTW I haven't looked at the propeller header files lately... is CNT declared as volatile? It really should be. If it isn't, someone should change it and that might fix the problem too. Sorry I don't have time to do this myself at the moment).
Thanks @jacgoudsmit that helped be better understand exactly what you meant, but I don't think the optimizer is at play here as I don't seem to experience the same issue.
I've tried changing my code to all kinds of variations, including this one:
int _t = CNT;
for (int i = 0; i < 10; i++) {
pause(10);
}
print("%d\r", CNT - _t);
And I never get 0, or any other unreasonable value. The above code, by the way, outputs an endless series of 8010160, indicating the prior for loop took nearly that many cycles. Without the for loop, I get 240 in that case.
I can't explain what's happening to your project, but in my tests, it works. Here's what I tried:
while (1) { pause(1000); int _t = CNT; _t += 100; print("%d\r", _t); }
The optimizer probably removes local variable _t in this code as well. I bet if you would change the code so that it doesn't declare _t and uses print("%d\r", CNT + 100); it will generate the exact same assembly code.
void run_servos() { int _20ms = CLKFREQ / 50; while (1) { int _t = CNT; for (int _i = 0; _i <= 2; _i++) { ... } while (((CNT) - _t) < _20ms) {} } }
When run, _t never gets set to CNT; it stays at zero -- unless I comment out the entire for loop. However, when I declare _t as a global, it gets assigned as it should. As a consequence, I've become leery of using any local variables in my code.
@phil-pilgrim I think the key here is "unless I comment out the entire for loop". I'm betting that commenting out the for loop but keeping it's contents (which we can't see here) will also result in the same problem. Take a closer look at what your for loop is doing, and share with us too if you'd like.
Commenting out the for loop did not result in the same problem: _t was properly assigned with the value from CNT. BTW, _t is not mentioned in the for loop, so could not have gotten changed there. The only thing that worked with the for loop intact was making _t global. So I did the same to the other local variables, just to avoid further problems.
I did not keep the troublesome code, but here's the code that works. The only difference is that the scalar variables below the arrays used to be local.
#define MY_MAXSPD 500
#define MY_MINSPD 20
int _error[3];
int _perror[3];
int _ramp[3];
int _servo[3] = {17, 15, 14};
int _feedback[3] = {16, 11, 10};
int _targpos[3] = {-1, -1, -1};
int _speed[3] = {0, 0, 0};
int _servospd[3]= {0, 0, 0};
int _currspd[3] = {0, 0, 0};
int _currpos[3];
int _prevpos[3];
int _force[3];
int _ierror[3];
int _t;
int _t1;
int _t2;
int _pos;
int _err;
int _ierr;
int _i;
int _targspd;
void run_servo(int i, int pos, int spd) {
if (i >= 0 && i <= 2) {
_targpos[i] = -1;
if (pos >= 0) {
_ramp[i] = 0;
_servospd[i] = 0;
_prevpos[i] = _currpos[i];
_speed[i] = constrainInt(spd, MY_MINSPD, MY_MAXSPD);
_targpos[i] = constrainInt(pos, 10, 990);
pause(1000);
while (_error[i] > 3) {pulse_out(1, _currspd[i]);}
}
}
}
void run_servos() {
int _20ms = CLKFREQ / 50;
while (1) {
_t = CNT;
for (_i = 0; _i <= 2; _i++) {
_t1 = (pulse_in(_feedback[_i], 1));
_t2 = (pulse_in(_feedback[_i], 0));
_pos = constrainInt(( ((1000 * _t1) / (_t1 + _t2)) - 29) * 1000 / 941, 0, 999);
if (abs(_pos - _prevpos[_i]) > 500) {
if (_pos < 500) {
_pos += 1000;
} else {
_pos -= 1000;
}
}
if (_targpos[_i] > 0) {
_err = abs(_targpos[_i] - _pos);
_ierr = _error[_i] + _err;
if (_err < 3) {
pulse_out(_servo[_i], 1500);
_ierr = 0;
} else {
_currspd[_i] = abs(_pos - _prevpos[_i]) << 3;
_targspd = constrainInt(_err, MY_MINSPD, MY_MAXSPD);
if (_targspd > _ramp[_i]) _targspd = _ramp[_i];
if (_targspd > _speed[_i]) _targspd = _speed[_i];
if (_targspd > _currspd[_i] || _err >= _perror[_i]) {
if (_servospd[_i] < MY_MAXSPD) _servospd[_i] += 1;
} else if (_targspd < _currspd[_i] || _err < _perror[_i]) {
if (_servospd[_i] > MY_MINSPD) _servospd[_i]-= 1;
}
if (_targpos[_i] > _pos) {
pulse_out(_servo[_i], 1500 - _servospd[_i]);
} else {
pulse_out(_servo[_i], 1500 + _servospd[_i]);
}
pulse_out(0, _targspd);
}
_perror[_i] = _error[_i];
_error[_i] = _err;
_ierror[_i] = _ierr;
_prevpos[_i] = _currpos[_i];
_currpos[_i] = _pos;
}
if (_ramp[_i] < MY_MAXSPD) _ramp[_i]++;
}
while (((CNT) - _t) < _20ms) {}
}
}
The optimizer probably removes local variable _t in this code as well. I bet if you would change the code so that it doesn't declare _t and uses print("%d\r", CNT + 100); it will generate the exact same assembly code.
I'll bet what's happening is that the optimizer thinks that CNT is a variable that remains static, rather than something that's contantly incrementing on its own. So when it sees _t = CNT, then later, CNT - _t, it deletes the assignment, changes CNT - _t to zero, and optimizes out the while loop.
Edit: Sorry, Jac, you said the same thing above, and I missed it.
Of course, this would be bad behavior.
OTOH, that doesn't explain why it worked without the for loop or, for that matter, with _t declared as a global.
Well, I tried declaring _t as volatile, but the compiler returned a warning that the type defaulted to int. So I left _t as a global and returned the other newly-minted globals back to locals. Program works fine now.
Well, I tried declaring _t as volatile, but the compiler returned a warning that the type defaulted to int. So I left _t as a global and returned the other newly-minted globals back to locals. Program works fine now.
That warning message just tells you you forgot to declare the type. You probably wrote "volatile _t = CNT;" instead of "volatile int _t = CNT;"
===Jac
Yup, that's what I did! (I'm not very C-worthy.)
I compiled this C code
#include <propeller.h>
void run_servos() {
int _20ms = CLKFREQ / 50;
while (1) {
int _t = CNT;
for (int _i = 0; _i <= 2; _i++) {
dummycall();
}
while (((CNT) - _t) < _20ms) {}
}
}
and got this assembly code.
.text
.balign 4
.global _run_servos
_run_servos
mov __TMP0,#(3<<4)+13
call #__LMM_PUSHM
mvi r7,#__clkfreq
mov r1, #50
rdlong r0, r7
call #__UDIVSI
mov r14, r0
.L3
mov r13, CNT
lcall #_dummycall
lcall #_dummycall
lcall #_dummycall
jmp #__LMM_FCACHE_LOAD
long .L6-.L5
.L5
.L2
mov r7, CNT
sub r7, r13
cmp r7, r14 wz,wc
IF_B jmp #__LMM_FCACHE_START+(.L2-.L5)
jmp __LMM_RET
.compress default
.L6
brs #.L3
It looks like the assembly code is doing the right thing. I used -Os for the optimization level.
Does anybody see anything wrong with the generated assembly code?
I'm not in a position to evaluate the asm code produced. But what does -Os signify?
Please remember that we're discussing default behavior based upon Blockly code, wherein optimization levels can't be -- or aren't normally -- specified. In any event, regardless of what the programmer wrote,
while (((CNT) - _t) < _20ms) {}
should never be optimized out under any circumstances, as it appears to have been in my example.
Thanks, Dave!
-Os optimizes for size. I also tried -O1, -O2 and -O3, and they all generated similar code, except that -O1 did not use the FCACHE. The FCACHE copies small loops into cog memory, and executes them directly within the cog instead of using LMM. -O1 uses LMM for the while loop.
I didn't see the while loop optimized out in any of the cases I checked. In your OP you stated that _t never gets set to CNT, and stays at zero. Can you post some C code that shows that happening?
See this post: https://github.com/parallaxinc/propgcc/issues/80#issuecomment-592611287
There seem to be two distinct things happening around the treatment of CNT: the zero assignment, and the while loop being optimized out when _t is a local static variable. In the latter case, that optimization would actually make sense if CNT were also static which, of course, it's not.
I don't see evidence for that. In the code I posted _t is stored in r13, and is set to the value of CNT before entering the loop. This is in the line that reads "mov r13, CNT".
The loop starts at .L2 as follows:
.L2
mov r7, CNT
sub r7, r13
cmp r7, r14 wz,wc
IF_B jmp #__LMM_FCACHE_START+(.L2-.L5)
The current value of CNT is stored in r7, and then the elapsed cycles are computed by subtracting _t (AKA r13) from r7. It jumps back to .L2 (AKA _LMM_FCACHE_START) if the elapsed cycles are less than r14, which has a value of _clkfreq/50.
I wonder if the pairing of cmp with IF_B is correct. Maybe it's backwards. I think CNT is defined as unsigned, but _t and _20ms are both defined as int, which is signed. Maybe this is confusing the conditional. I'll check into that when I have a chance.
I wonder if the pairing of cmp with IF_B is correct.
It looks correct to me. In view of this post:
https://github.com/parallaxinc/propgcc/issues/80#issuecomment-592694831
I wonder if the length of the for loop has any kind of effect. When I commented out that loop, I got the correct behavior. Could r13, say, have gotten reused inside the for loop? That might explain why changing _t to a global also fixed the problem.
Phil, could you post some code that fails in the for loop, and I'll take a look at the generated assembly.
Dave,
Thanks, but so far I've not been able to reconstruct the original code that manifested the problem. I'll keep at it.
In the meantime, have there been any changes made to the compiler since my original post? (I'm using the compiler pointed to by Matt's experimental Solo site.)
I'm not familiar with Matt's experimental Solo site. I use the GCC compiler that is bundled with SimpleIDE that is dated May 24, 2018. I know there have been efforts to update PropGCC with later versions of GCC, but I don't use any of those.
You may want to start a thread on the Parallax forum about the problem you encountered. You'll reach a much larger audience that way, and somebody may be able to help you on the forum. If you can't reproduce the problem it's probably not fair to keep this as an open issue on GitHub.
I agree. I woke up remembering that the problem occurred on solo.parallax.com, not on Matt's site. So I tried a couple experiments there this morning, but couldn't reproduce the issue.
Thank you both for exploring this mystery further. We have not changed anything in Solo that I'm aware of which would affect the results of this, and in my experiments from the original post, I could not reproduce the problem either. For now, I'm closing this issue. Please reopen if more information becomes available to share/explore.
I have C code that gets compiled in a custom block under Blockly Solo that reads like this:
When run, _t never gets set to CNT; it stays at zero -- unless I comment out the entire for loop. However, when I declare _t as a global, it gets assigned as it should. As a consequence, I've become leery of using any local variables in my code.