parallaxinc / propgcc

A Port of GCC to the Parallax Propeller
Other
35 stars 17 forks source link

Local variables not getting assigned properly #80

Closed phil-pilgrim closed 4 years ago

phil-pilgrim commented 4 years ago

I have C code that gets compiled in a custom block under Blockly Solo that reads like this:

void run_servos() {
  int _20ms = CLKFREQ / 50;
  while (1) {
    int _t = CNT;
    for (int _i = 0; _i <= 2; _i++) {
    ...
    }
    while (((CNT) - _t) < _20ms) {}
  }
}

When run, _t never gets set to CNT; it stays at zero -- unless I comment out the entire for loop. However, when I declare _t as a global, it gets assigned as it should. As a consequence, I've become leery of using any local variables in my code.

jacgoudsmit commented 4 years ago

Have you tried making _t volatile? I suspect that the optimizer sees that you're not really doing anything with _t and optimizes it away.

phil-pilgrim commented 4 years ago

The optimizer obviously missed the fact that I do use _t at the end of the loop. (It's used nowhere else.) Moreover, if I do this:

    int _t = CNT;
    pulse_out(0, _t + 100);

and observe the output on a 'scope, the pulse is always 100usec wide, so _t is always zero.

PropGit commented 4 years ago
    int _t = CNT;
    pulse_out(0, _t + 100);

and observe the output on a 'scope, the pulse is always 100usec wide, so _t is always zero.

Have you checked with BlocklyProp's Code button to ensure that your c code in the custom block is not getting morphed somehow?

phil-pilgrim commented 4 years ago

Yes. It doesn't get changed in any way.

PropGit commented 4 years ago

I can't explain what's happening to your project, but in my tests, it works. Here's what I tried:

I added a terminal block at the top of code (just to get the terminal initialized and open) then created these blocks, including a custom block, and I can see that _t gets CNT in my case and appears to also be operated on by my incrementor.

image

Here's the code: The statements below the pause(1000); are exactly as I entered them into the "main code" area of my custom block.

/* SERIAL_TERMINAL USED */

// ------ Libraries and Definitions ------
#include "simpletools.h"

// ------ Main Program ------
int main() {

  term_cmd(CLS);
  while (1) {
    pause(1000);
    int _t = CNT;

    _t += 100;

    print("%d\r", _t);
  }

}
jacgoudsmit commented 4 years ago

As I was trying to say...

This is what the optimizer thinks:

forever // while(1)
{
  declare local variable _t
  assign _t to CNT
  do something else
  while CNT - _t < immediate value do nothing
}

When the optimizer reaches the "while CNT..." instruction, it thinks: Hey, wait a minute, local variable _t is initialized to CNT and nothing was done with it since initialization. So the expression (CNT - _t) must be 0. It removes local variable _t and replaces CNT - _t with 0.

Again, try declaring _t as: volatile _t = CNT. I bet that way the optimizer won't touch it.

(BTW I haven't looked at the propeller header files lately... is CNT declared as volatile? It really should be. If it isn't, someone should change it and that might fix the problem too. Sorry I don't have time to do this myself at the moment).

PropGit commented 4 years ago

Thanks @jacgoudsmit that helped be better understand exactly what you meant, but I don't think the optimizer is at play here as I don't seem to experience the same issue.

I've tried changing my code to all kinds of variations, including this one:

int _t = CNT;

for (int i = 0; i < 10; i++) {
  pause(10);
}

print("%d\r", CNT - _t);

And I never get 0, or any other unreasonable value. The above code, by the way, outputs an endless series of 8010160, indicating the prior for loop took nearly that many cycles. Without the for loop, I get 240 in that case.

jacgoudsmit commented 4 years ago

I can't explain what's happening to your project, but in my tests, it works. Here's what I tried:

  while (1) {
    pause(1000);
    int _t = CNT;

    _t += 100;

    print("%d\r", _t);
  }

The optimizer probably removes local variable _t in this code as well. I bet if you would change the code so that it doesn't declare _t and uses print("%d\r", CNT + 100); it will generate the exact same assembly code.

PropGit commented 4 years ago
void run_servos() {
  int _20ms = CLKFREQ / 50;
  while (1) {
    int _t = CNT;
    for (int _i = 0; _i <= 2; _i++) {
    ...
    }
    while (((CNT) - _t) < _20ms) {}
  }
}

When run, _t never gets set to CNT; it stays at zero -- unless I comment out the entire for loop. However, when I declare _t as a global, it gets assigned as it should. As a consequence, I've become leery of using any local variables in my code.

@phil-pilgrim I think the key here is "unless I comment out the entire for loop". I'm betting that commenting out the for loop but keeping it's contents (which we can't see here) will also result in the same problem. Take a closer look at what your for loop is doing, and share with us too if you'd like.

phil-pilgrim commented 4 years ago

Commenting out the for loop did not result in the same problem: _t was properly assigned with the value from CNT. BTW, _t is not mentioned in the for loop, so could not have gotten changed there. The only thing that worked with the for loop intact was making _t global. So I did the same to the other local variables, just to avoid further problems.

I did not keep the troublesome code, but here's the code that works. The only difference is that the scalar variables below the arrays used to be local.

#define MY_MAXSPD 500
#define MY_MINSPD 20

int _error[3];
int _perror[3];
int _ramp[3];
int _servo[3] = {17, 15, 14};
int _feedback[3] = {16, 11, 10};
int _targpos[3] = {-1, -1, -1};
int _speed[3] = {0, 0, 0};
int _servospd[3]= {0, 0, 0};
int _currspd[3] = {0, 0, 0};
int _currpos[3];
int _prevpos[3];
int _force[3];
int _ierror[3];
int _t;
int _t1;
int _t2;
int _pos;
int _err;
int _ierr;
int _i;
int _targspd;

void run_servo(int i, int pos, int spd) {
  if (i >= 0 && i <= 2) {
    _targpos[i] = -1;
    if (pos >= 0) {
      _ramp[i] = 0;
      _servospd[i] = 0;
      _prevpos[i] = _currpos[i];
      _speed[i] = constrainInt(spd, MY_MINSPD, MY_MAXSPD);
      _targpos[i] = constrainInt(pos, 10, 990);
      pause(1000);
      while (_error[i] > 3) {pulse_out(1, _currspd[i]);}
    }
  }
}

void run_servos() {
  int _20ms = CLKFREQ / 50;
  while (1) {
    _t = CNT;
    for (_i = 0; _i <= 2; _i++) {
      _t1 = (pulse_in(_feedback[_i], 1));
      _t2 = (pulse_in(_feedback[_i], 0));
      _pos = constrainInt(( ((1000 * _t1) / (_t1 + _t2)) - 29) * 1000 / 941, 0, 999);
      if (abs(_pos - _prevpos[_i]) > 500) {
        if (_pos < 500) {
          _pos += 1000;
        } else {
          _pos -= 1000;
        }
      }
      if (_targpos[_i] > 0) {
        _err = abs(_targpos[_i] - _pos);
        _ierr = _error[_i]  + _err;
        if (_err < 3) {
          pulse_out(_servo[_i], 1500);
          _ierr = 0;
        } else {
          _currspd[_i] =  abs(_pos - _prevpos[_i]) << 3;
          _targspd = constrainInt(_err, MY_MINSPD, MY_MAXSPD);
          if (_targspd > _ramp[_i]) _targspd = _ramp[_i];
          if (_targspd > _speed[_i]) _targspd = _speed[_i];
          if (_targspd > _currspd[_i] || _err >= _perror[_i]) {
            if (_servospd[_i] < MY_MAXSPD) _servospd[_i] += 1;
          } else if (_targspd < _currspd[_i] || _err < _perror[_i]) {
            if (_servospd[_i] > MY_MINSPD) _servospd[_i]-= 1;
          }
          if (_targpos[_i] > _pos) {
            pulse_out(_servo[_i], 1500 - _servospd[_i]);
          } else {
            pulse_out(_servo[_i], 1500 + _servospd[_i]);
          }
          pulse_out(0, _targspd);
        }
        _perror[_i] = _error[_i];
        _error[_i] = _err;
        _ierror[_i] = _ierr;
        _prevpos[_i] = _currpos[_i];
        _currpos[_i] = _pos;
      }
      if (_ramp[_i] < MY_MAXSPD) _ramp[_i]++; 
    }
    while (((CNT) - _t) < _20ms) {}
  }
}
phil-pilgrim commented 4 years ago

The optimizer probably removes local variable _t in this code as well. I bet if you would change the code so that it doesn't declare _t and uses print("%d\r", CNT + 100); it will generate the exact same assembly code.

I'll bet what's happening is that the optimizer thinks that CNT is a variable that remains static, rather than something that's contantly incrementing on its own. So when it sees _t = CNT, then later, CNT - _t, it deletes the assignment, changes CNT - _t to zero, and optimizes out the while loop.

Edit: Sorry, Jac, you said the same thing above, and I missed it.

Of course, this would be bad behavior.

OTOH, that doesn't explain why it worked without the for loop or, for that matter, with _t declared as a global.

phil-pilgrim commented 4 years ago

Well, I tried declaring _t as volatile, but the compiler returned a warning that the type defaulted to int. So I left _t as a global and returned the other newly-minted globals back to locals. Program works fine now.

jacgoudsmit commented 4 years ago

Well, I tried declaring _t as volatile, but the compiler returned a warning that the type defaulted to int. So I left _t as a global and returned the other newly-minted globals back to locals. Program works fine now.

That warning message just tells you you forgot to declare the type. You probably wrote "volatile _t = CNT;" instead of "volatile int _t = CNT;"

===Jac

phil-pilgrim commented 4 years ago

Yup, that's what I did! (I'm not very C-worthy.)

davehein commented 4 years ago

I compiled this C code

#include <propeller.h>

void run_servos() {
  int _20ms = CLKFREQ / 50;
  while (1) {
    int _t = CNT;
    for (int _i = 0; _i <= 2; _i++) {
      dummycall();
    }
    while (((CNT) - _t) < _20ms) {}
  }
}

and got this assembly code.

    .text
    .balign 4
    .global _run_servos
_run_servos
    mov __TMP0,#(3<<4)+13
    call    #__LMM_PUSHM
    mvi r7,#__clkfreq
    mov r1, #50
    rdlong  r0, r7
    call    #__UDIVSI
    mov r14, r0
.L3
    mov r13, CNT
    lcall   #_dummycall
    lcall   #_dummycall
    lcall   #_dummycall
    jmp #__LMM_FCACHE_LOAD
    long    .L6-.L5
.L5
.L2
    mov r7, CNT
    sub r7, r13
    cmp r7, r14 wz,wc
    IF_B    jmp #__LMM_FCACHE_START+(.L2-.L5)
    jmp __LMM_RET
    .compress default
.L6
    brs #.L3

It looks like the assembly code is doing the right thing. I used -Os for the optimization level.

Does anybody see anything wrong with the generated assembly code?

phil-pilgrim commented 4 years ago

I'm not in a position to evaluate the asm code produced. But what does -Os signify?

Please remember that we're discussing default behavior based upon Blockly code, wherein optimization levels can't be -- or aren't normally -- specified. In any event, regardless of what the programmer wrote,

while (((CNT) - _t) < _20ms) {}

should never be optimized out under any circumstances, as it appears to have been in my example.

Thanks, Dave!

davehein commented 4 years ago

-Os optimizes for size. I also tried -O1, -O2 and -O3, and they all generated similar code, except that -O1 did not use the FCACHE. The FCACHE copies small loops into cog memory, and executes them directly within the cog instead of using LMM. -O1 uses LMM for the while loop.

I didn't see the while loop optimized out in any of the cases I checked. In your OP you stated that _t never gets set to CNT, and stays at zero. Can you post some C code that shows that happening?

phil-pilgrim commented 4 years ago

See this post: https://github.com/parallaxinc/propgcc/issues/80#issuecomment-592611287

There seem to be two distinct things happening around the treatment of CNT: the zero assignment, and the while loop being optimized out when _t is a local static variable. In the latter case, that optimization would actually make sense if CNT were also static which, of course, it's not.

davehein commented 4 years ago

I don't see evidence for that. In the code I posted _t is stored in r13, and is set to the value of CNT before entering the loop. This is in the line that reads "mov r13, CNT".

The loop starts at .L2 as follows:

.L2
    mov r7, CNT
    sub r7, r13
    cmp r7, r14 wz,wc
    IF_B    jmp #__LMM_FCACHE_START+(.L2-.L5)

The current value of CNT is stored in r7, and then the elapsed cycles are computed by subtracting _t (AKA r13) from r7. It jumps back to .L2 (AKA _LMM_FCACHE_START) if the elapsed cycles are less than r14, which has a value of _clkfreq/50.

I wonder if the pairing of cmp with IF_B is correct. Maybe it's backwards. I think CNT is defined as unsigned, but _t and _20ms are both defined as int, which is signed. Maybe this is confusing the conditional. I'll check into that when I have a chance.

phil-pilgrim commented 4 years ago

I wonder if the pairing of cmp with IF_B is correct.

It looks correct to me. In view of this post:

https://github.com/parallaxinc/propgcc/issues/80#issuecomment-592694831

I wonder if the length of the for loop has any kind of effect. When I commented out that loop, I got the correct behavior. Could r13, say, have gotten reused inside the for loop? That might explain why changing _t to a global also fixed the problem.

davehein commented 4 years ago

Phil, could you post some code that fails in the for loop, and I'll take a look at the generated assembly.

phil-pilgrim commented 4 years ago

Dave,

Thanks, but so far I've not been able to reconstruct the original code that manifested the problem. I'll keep at it.

In the meantime, have there been any changes made to the compiler since my original post? (I'm using the compiler pointed to by Matt's experimental Solo site.)

davehein commented 4 years ago

I'm not familiar with Matt's experimental Solo site. I use the GCC compiler that is bundled with SimpleIDE that is dated May 24, 2018. I know there have been efforts to update PropGCC with later versions of GCC, but I don't use any of those.

You may want to start a thread on the Parallax forum about the problem you encountered. You'll reach a much larger audience that way, and somebody may be able to help you on the forum. If you can't reproduce the problem it's probably not fair to keep this as an open issue on GitHub.

phil-pilgrim commented 4 years ago

I agree. I woke up remembering that the problem occurred on solo.parallax.com, not on Matt's site. So I tried a couple experiments there this morning, but couldn't reproduce the issue.

PropGit commented 4 years ago

Thank you both for exploring this mystery further. We have not changed anything in Solo that I'm aware of which would affect the results of this, and in my experiments from the original post, I could not reproduce the problem either. For now, I'm closing this issue. Please reopen if more information becomes available to share/explore.