bebbo / gcc

Bebbo's gcc-6-branch for m68k-amigaos
GNU General Public License v2.0
33 stars 11 forks source link

Weird interaction of register parameters with stack management #14

Closed mheyer32 closed 6 years ago

mheyer32 commented 6 years ago

GCC tries to eliminate RTS (via early stack cleanup and JMP) when a function call is the last thing a function does. This does not work well with function calls that take register parameters as the early stack cleanup overwrites the registers needed for the function call.

(compiled with CFLAGS = -mcrt=clib2 -v -fbbb=Vabcefimprs -Ofast -fbaserel -fstrength-reduce -m68030 -m68881 -fomit-frame-pointer -Werror -Wimplicit -Wstrict-prototypes )

extern void I_MarkRect(REG(d0, int left), REG(d1, int top), REG(d2, int width), REG(d3, int height));

void V_DrawBlock(int x, int y, int scrn, int width, int height, byte* src)
{
    byte* dest;
    int height0;

    DEBUGPRINT(("V_DrawBlock %d, %d , %d, %d, %d, %p\n", x, y, scrn, width, height, src));

#ifdef RANGECHECK
    if (x < 0 || x + width > SCREENWIDTH || y < 0 || y + height > SCREENHEIGHT || (unsigned)scrn > 4) {
        I_Error("Bad V_DrawBlock");
    }
#endif

    height0 = height;
    dest = screens[scrn] + y * SCREENWIDTH + x;

    while (height--) {
        memcpy(dest, src, width);
        src += width;
        dest += SCREENWIDTH;
    }

    DEBUGPRINT(("V_DrawBlock height0 %d\n", height0));

    if (scrn == 0)
        I_MarkRect(x, y, width, height0); // I see wrong values passed into this function

}
00034d10 <_V_DrawBlock>:
   34d10:   598f            subql #4,%sp
   34d12:   48e7 3f36       moveml %d2-%d7/%a2-%a3/%a5-%a6,%sp@-
   34d16:   2e2f 0030       movel %sp@(48),%d7
   34d1a:   2f6f 0034 0028  movel %sp@(52),%sp@(40)
   34d20:   2a6f 0038       moveal %sp@(56),%a5               
   34d24:   242f 003c       movel %sp@(60),%d2                 
   34d28:   262f 0040       movel %sp@(64),%d3                 
   34d2c:   2a2f 0044       movel %sp@(68),%d5
   34d30:   2f05            movel %d5,%sp@-
   34d32:   2f03            movel %d3,%sp@-
   34d34:   2f02            movel %d2,%sp@-
   34d36:   2f0d            movel %a5,%sp@-
   34d38:   2f2f 0038       movel %sp@(56),%sp@-
   34d3c:   2f07            movel %d7,%sp@-
   34d3e:   4879 0003 4cc0  pea 34cc0 <_V_DrawPatchInDirectFlipped+0x214>
   34d44:   47f9 0003 cc20  lea 3cc20 <_printf>,%a3
   34d4a:   4e93            jsr %a3@
   34d4c:   2079 0000 674c  moveal 674c <___iob>,%a0
   34d52:   2f28 0004       movel %a0@(4),%sp@-
   34d56:   4df9 0003 d50c  lea 3d50c <_fflush>,%a6
   34d5c:   4e96            jsr %a6@
   34d5e:   4fef 0020       lea %sp@(32),%sp
   34d62:   4a87            tstl %d7
   34d64:   6d24            blts 34d8a <_V_DrawBlock+0x7a>
   34d66:   2839 0000 72b4  movel 72b4 <_SCREENWIDTH>,%d4
   34d6c:   2007            movel %d7,%d0
   34d6e:   d082            addl %d2,%d0
   34d70:   b880            cmpl %d0,%d4
   34d72:   6d16            blts 34d8a <_V_DrawBlock+0x7a>
   34d74:   4aaf 0028       tstl %sp@(40)
   34d78:   6d10            blts 34d8a <_V_DrawBlock+0x7a>
   34d7a:   202f 0028       movel %sp@(40),%d0
   34d7e:   d083            addl %d3,%d0
   34d80:   b0b9 0000 72b0  cmpl 72b0 <_SCREENHEIGHT>,%d0
   34d86:   6f00 0082       blew 34e0a <_V_DrawBlock+0xfa>
   34d8a:   4879 0003 4ce5  pea 34ce5 <_V_DrawPatchInDirectFlipped+0x239>
   34d90:   4eb9 0000 25f8  jsr 25f8 <_I_Error>
   34d96:   2839 0000 72b4  movel 72b4 <_SCREENWIDTH>,%d4
   34d9c:   588f            addql #4,%sp
   34d9e:   4c2f            046057
   34da0:   4800            nbcd %d0
   34da2:   0028 d887 d8b0  orib #-121,%a0@(-10064)
   34da8:   ddb0 0000       addl %d6,%a0@(00000000,%d0:w)
   34dac:   5cb8 4a83       addql #6,19075
   34db0:   6720            beqs 34dd2 <_V_DrawBlock+0xc2>
   34db2:   2c03            movel %d3,%d6
   34db4:   45f9 0003 d22c  lea 3d22c <_memcpy>,%a2
   34dba:   2f02            movel %d2,%sp@-
   34dbc:   2f05            movel %d5,%sp@-
   34dbe:   2f04            movel %d4,%sp@-
   34dc0:   4e92            jsr %a2@
   34dc2:   da82            addl %d2,%d5
   34dc4:   d8b9 0000 72b4  addl 72b4 <_SCREENWIDTH>,%d4
   34dca:   5386            subql #1,%d6
   34dcc:   4fef 000c       lea %sp@(12),%sp
   34dd0:   66e8            bnes 34dba <_V_DrawBlock+0xaa>
   34dd2:   2f03            movel %d3,%sp@-
   34dd4:   4879 0003 4cf5  pea 34cf5 <_V_DrawPatchInDirectFlipped+0x249>
   34dda:   4e93            jsr %a3@
   34ddc:   2079 0000 674c  moveal 674c <___iob>,%a0
   34de2:   2f28 0004       movel %a0@(4),%sp@-
   34de6:   4e96            jsr %a6@
   34de8:   4fef 000c       lea %sp@(12),%sp
   34dec:   4a8d            tstl %a5
   34dee:   6708            beqs 34df8 <_V_DrawBlock+0xe8>
   34df0:   4cdf 6cfc       moveml %sp@+,%d2-%d7/%a2-%a3/%a5-%a6
   34df4:   588f            addql #4,%sp
   34df6:   4e75            rts

   34df8:   222f 0028       movel %sp@(40),%d1                                            ; x
   34dfc:   2007            movel %d7,%d0                                                     ; y 
                                (assumed d2 and d3 still hold width/height??)

   // GCC tries to save on RTS's, cleans up its stack, but this seems to override d2/d3 and thus I_MarkRect gets called with wrong values. I think, in this case this optimization cannot be done in this way.
   34dfe:   4cdf 6cfc       moveml %sp@+,%d2-%d7/%a2-%a3/%a5-%a6
   34e02:   588f            addql #4,%sp
   34e04:   4ef9 0000 8868  jmp 8868 <_I_MarkRect>

   34e0a:   7004            moveq #4,%d0
   34e0c:   b08d            cmpl %a5,%d0
   34e0e:   648e            bccs 34d9e <_V_DrawBlock+0x8e>
   34e10:   4879 0003 4ce5  pea 34ce5 <_V_DrawPatchInDirectFlipped+0x239>
   34e16:   4eb9 0000 25f8  jsr 25f8 <_I_Error>
   34e1c:   2839 0000 72b4  movel 72b4 <_SCREENWIDTH>,%d4
   34e22:   588f            addql #4,%sp
   34e24:   6000 ff78       braw 34d9e <_V_DrawBlock+0x8e>
bebbo commented 6 years ago

please test

...
        movem.l #12340,-(sp)
        move.l d0,a3
        move.l d1,a5
        move.l d3,-(sp)
        move.l d2,-(sp)
        move.l d1,-(sp)
        move.l d0,-(sp)
        pea .LC0
        jsr _printf
...
mheyer32 commented 6 years ago

Nope, that didn't do it.

The offender in my example is a regular function doing a tail call into _I_MarkRect (which uses register parameters). Its not about register-parameter functions doing a tail call into a stack-parameter function (or was that broken, too?)

The above function still produce a broken call into I_MarkRect:

00034d14 <_V_DrawBlock>:
   34d14:   598f            subql #4,%sp
   34d16:   48e7 3f36       moveml %d2-%d7/%a2-%a3/%a5-%a6,%sp@-
   34d1a:   2e2f 0030       movel %sp@(48),%d7
   34d1e:   2f6f 0034 0028  movel %sp@(52),%sp@(40)
   34d24:   2a6f 0038       moveal %sp@(56),%a5
   34d28:   242f 003c       movel %sp@(60),%d2
   34d2c:   262f 0040       movel %sp@(64),%d3
   34d30:   2a2f 0044       movel %sp@(68),%d5
   34d34:   2f05            movel %d5,%sp@-
   34d36:   2f03            movel %d3,%sp@-
   34d38:   2f02            movel %d2,%sp@-
   34d3a:   2f0d            movel %a5,%sp@-
   34d3c:   2f2f 0038       movel %sp@(56),%sp@-
   34d40:   2f07            movel %d7,%sp@-
   34d42:   4879 0003 4cc4  pea 34cc4 <_V_DrawPatchInDirectFlipped+0x214>
   34d48:   47f9 0003 cc24  lea 3cc24 <_printf>,%a3
   34d4e:   4e93            jsr %a3@
   34d50:   2079 0000 674c  moveal 674c <___iob>,%a0
   34d56:   2f28 0004       movel %a0@(4),%sp@-
   34d5a:   4df9 0003 d510  lea 3d510 <_fflush>,%a6
   34d60:   4e96            jsr %a6@
   34d62:   4fef 0020       lea %sp@(32),%sp
   34d66:   4a87            tstl %d7
   34d68:   6d24            blts 34d8e <_V_DrawBlock+0x7a>
   34d6a:   2839 0000 72b4  movel 72b4 <_SCREENWIDTH>,%d4
   34d70:   2007            movel %d7,%d0
   34d72:   d082            addl %d2,%d0
   34d74:   b880            cmpl %d0,%d4
   34d76:   6d16            blts 34d8e <_V_DrawBlock+0x7a>
   34d78:   4aaf 0028       tstl %sp@(40)
   34d7c:   6d10            blts 34d8e <_V_DrawBlock+0x7a>
   34d7e:   202f 0028       movel %sp@(40),%d0
   34d82:   d083            addl %d3,%d0
   34d84:   b0b9 0000 72b0  cmpl 72b0 <_SCREENHEIGHT>,%d0
   34d8a:   6f00 0082       blew 34e0e <_V_DrawBlock+0xfa>
   34d8e:   4879 0003 4ce9  pea 34ce9 <_V_DrawPatchInDirectFlipped+0x239>
   34d94:   4eb9 0000 25f8  jsr 25f8 <_I_Error>
   34d9a:   2839 0000 72b4  movel 72b4 <_SCREENWIDTH>,%d4
   34da0:   588f            addql #4,%sp
   34da2:   4c2f            046057
   34da4:   4800            nbcd %d0
   34da6:   0028 d887 d8b0  orib #-121,%a0@(-10064)
   34dac:   ddb0 0000       addl %d6,%a0@(00000000,%d0:w)
   34db0:   5cb8 4a83       addql #6,19075
   34db4:   6720            beqs 34dd6 <_V_DrawBlock+0xc2>
   34db6:   2c03            movel %d3,%d6
   34db8:   45f9 0003 d230  lea 3d230 <_memcpy>,%a2
   34dbe:   2f02            movel %d2,%sp@-
   34dc0:   2f05            movel %d5,%sp@-
   34dc2:   2f04            movel %d4,%sp@-
   34dc4:   4e92            jsr %a2@
   34dc6:   da82            addl %d2,%d5
   34dc8:   d8b9 0000 72b4  addl 72b4 <_SCREENWIDTH>,%d4
   34dce:   5386            subql #1,%d6
   34dd0:   4fef 000c       lea %sp@(12),%sp
   34dd4:   66e8            bnes 34dbe <_V_DrawBlock+0xaa>
   34dd6:   2f03            movel %d3,%sp@-
   34dd8:   4879 0003 4cf9  pea 34cf9 <_V_DrawPatchInDirectFlipped+0x249>
   34dde:   4e93            jsr %a3@
   34de0:   2079 0000 674c  moveal 674c <___iob>,%a0
   34de6:   2f28 0004       movel %a0@(4),%sp@-
   34dea:   4e96            jsr %a6@
   34dec:   4fef 000c       lea %sp@(12),%sp
   34df0:   4a8d            tstl %a5
   34df2:   6708            beqs 34dfc <_V_DrawBlock+0xe8>
   34df4:   4cdf 6cfc       moveml %sp@+,%d2-%d7/%a2-%a3/%a5-%a6
   34df8:   588f            addql #4,%sp
   34dfa:   4e75            rts

// this is broken, restoring d2 and d3 before calling into I_MarkRect destroys its parameters in d2/d3
   34dfc:   222f 0028       movel %sp@(40),%d1
   34e00:   2007            movel %d7,%d0
   34e02:   4cdf 6cfc       moveml %sp@+,%d2-%d7/%a2-%a3/%a5-%a6
   34e06:   588f            addql #4,%sp
   34e08:   4ef9 0000 8868  jmp 8868 <_I_MarkRect>

   34e0e:   7004            moveq #4,%d0
   34e10:   b08d            cmpl %a5,%d0
   34e12:   648e            bccs 34da2 <_V_DrawBlock+0x8e>
   34e14:   4879 0003 4ce9  pea 34ce9 <_V_DrawPatchInDirectFlipped+0x239>
   34e1a:   4eb9 0000 25f8  jsr 25f8 <_I_Error>
   34e20:   2839 0000 72b4  movel 72b4 <_SCREENWIDTH>,%d4
   34e26:   588f            addql #4,%sp
   34e28:   6000 ff78       braw 34da2 <_V_DrawBlock+0x8e>
mheyer32 commented 6 years ago

Maybe find_tail_calls() is the right place to patch?

....
  /* Make sure the tail invocation of this function does not refer
     to local variables.  */
  FOR_EACH_LOCAL_DECL (cfun, idx, var)
    {
      if (TREE_CODE (var) != PARM_DECL
      && auto_var_in_fn_p (var, cfun->decl)
      && (ref_maybe_used_by_stmt_p (call, var)
          || call_may_clobber_ref_p (call, var)))
    return;
    }
mheyer32 commented 6 years ago

I tested the following code and it seems to work. Beware, though, I made a guess here and have no idea if this is correct (the gcc code has horrible formatting...). Also, the rejection should be smarter. We don't need to reject regarg-Functions that only use the scratch registers (d0/d1/a0/a1). But I have no idea how to implement that properly.

With the code below, I get the proper function calls:

   34df2:   4e75            rts

   34df4:   222f 0028       movel %sp@(40),%d1
   34df8:   2007            movel %d7,%d0
   34dfa:   4eb9 0000 8868  jsr 8868 <_I_MarkRect>
   34e00:   4cdf 6cfc       moveml %sp@+,%d2-%d7/%a2-%a3/%a5-%a6
   34e04:   588f            addql #4,%sp
static
bool func_is_using_regparms(const_tree func)
{
#ifdef TARGET_AMIGA
  if (DECL_ARGUMENTS (func))
    {
      tree attrs = TYPE_ATTRIBUTES(TREE_TYPE(func));
      if (attrs)
      {
          if (lookup_attribute ("regparm", attrs))
              return true;
          if (lookup_attribute ("asmregs", attrs))
              return true;
          if (amigaos_regparm > 0 && !lookup_attribute ("stkparm", attrs))
              return true;

      }
    }
#endif
    return false;
}

And then in find_tail_calls() after

  func = gimple_call_fndecl(call); 

insert

  if (func && func_is_using_regparms(func)) {
    return;
  }
bebbo commented 6 years ago

I move the code into a separate function and calling this function as you suggested. I also added the check for scratch regs.

mheyer32 commented 6 years ago

Something does not work with your patch. Now I'm back to

   34e3c:   222f 0028       movel %sp@(40),%d1
   34e40:   2007            movel %d7,%d0
   34e42:   4cdf 6cfc       moveml %sp@+,%d2-%d7/%a2-%a3/%a5-%a6
   34e46:   588f            addql #4,%sp
   34e48:   4ef9 0000 8868  jmp 8868 <_I_MarkRect>
mheyer32 commented 6 years ago

Ha, found the issue. From earlier testing I left REGARGS in the function declaration, which is just kind of redundant, but should not have been a problem I would think...

void REGARGS I_MarkRect(REG(d0, int left), REG(d1, int top), REG(d2, int width), REG(d3, int height));

If I remove REGARGS there, the call starts to work with your latest patch :-D What is the intended interaction between regargs and asm(#xn) ?

bebbo commented 6 years ago

AH!!!!

__regargs (your REGPARMS) means the compiler picks the registers.

The default is 2, so d0, d1, a0, a1 are chosen -> all scratch regs :-)

mheyer32 commented 6 years ago

So regargs trumps asm() ? I would have thought the other way around, i.e. in case of regargs let the compiler only assign a register if there's no asm directive for the parameter in question. (Needs documentation)

I'll test some more and close the issue if it works now. Maybe wait for Toni to confirm.

Anyway, awesome work, Bebbo! I REALLY appreciate your quick responses. I want to see this baby to fly eventually :-)

mheyer32 commented 6 years ago

Yeah, thinking about that... I believe the compiler should always respect the __asm directives for registers. If the programmer is very particular about the register, the compiler must not interfere with it.

Imagine switching to -mregparm (which is basically like adding __regargs to every function), you still want you assembly functions to work (expecting parameters in specific registers).

bebbo commented 6 years ago

I will modify the check, since asm should work together with regargs.

-mregparm already had a lower priority :-)

Thank you for your testing. Without folks like you I could never find all my (plus builtin) bugs.

bebbo commented 6 years ago

And I already did the basic checks:

__regargs int foo(int a, int b __asm("d0")) {

will produce an error since __regargs already picked d0.

x.c: In function ‘foo’:
x.c:1:15: error: two parameters allocated for one register
 __regargs int foo(int a, int b __asm("d0")) {

:-)

mheyer32 commented 6 years ago

Verified. Closing.