bebbo / gcc

Bebbo's gcc-6-branch for m68k-amigaos
GNU General Public License v2.0
33 stars 11 forks source link

Bad code generation #192

Closed mheyer32 closed 1 year ago

mheyer32 commented 1 year ago

I just ran winuaeenforcer with DOpus5 and got a hit that gets triggered by just clicking around and activating windows:


Enforcer Hit! Bad program
Illegal LONG READ from: 000001cd                PC: 404d49f0
Data: 0000fffb 0000000e 00000007 000000fb 400076a4 403e000d 00000107 403e0107
Addr: 000001c9 400076a4 4051c2bc 00000007 000000fc 40007640 40001564 400075cc
Stck: 405ea914 405ea914 403e83a0 80000000 403e83a0 405e4ab4 403e8350 40007688
Stck: 40017774 4051d1ec 40000000 000d000e 00fd0107 0000000c 00f00000 0018000e
Stck: 4051d1ec 4000770c 4051d1ec 405ea7a4 00000000 40018c90 4000770c 00fde860
Stck: 40018c90 4051d1ec 4000770c 011f0000 40007708 404d41f2 403e8350 405e4ab4
Stck: 405e4ac8 40007688 00f00000 405e4ab4 405ea954 400090b0 40017774 00fdad80
----> 00fd0107 - "ROM - shell 46.10 (12.9.2018)" Hunk 0000 Offset 000033cf
----> 00fde860 - "ROM - intuition 40.87 (9.9.2018)" Hunk 0000 Offset 0000a6f4
----> 00fdad80 - "ROM - intuition 40.87 (9.9.2018)" Hunk 0000 Offset 00006c14
----> 00fd8fe2 - "ROM - intuition 40.87 (9.9.2018)" Hunk 0000 Offset 00004e76
----> 00fd8f18 - "ROM - intuition 40.87 (9.9.2018)" Hunk 0000 Offset 00004dac
----> 00fd47e4 - "ROM - intuition 40.87 (9.9.2018)" Hunk 0000 Offset 00000678
----> 00fd9a42 - "ROM - intuition 40.87 (9.9.2018)" Hunk 0000 Offset 000058d6
----> 00fe2d0e - "ROM - intuition 40.87 (9.9.2018)" Hunk 0000 Offset 0000eba2
----> 00fe2cf6 - "ROM - intuition 40.87 (9.9.2018)" Hunk 0000 Offset 0000eb8a
----> 00fe0992 - "ROM - intuition 40.87 (9.9.2018)" Hunk 0000 Offset 0000c826
----> 00fe08a8 - "ROM - intuition 40.87 (9.9.2018)" Hunk 0000 Offset 0000c73c
----> 00fea58a - "ROM - intuition 40.87 (9.9.2018)" Hunk 0000 Offset 0001641e
----> 00feab9c - "ROM - intuition 40.87 (9.9.2018)" Hunk 0000 Offset 00016a30
----> 00fdc790 - "ROM - intuition 40.87 (9.9.2018)" Hunk 0000 Offset 00008624
----> 00fddf38 - "ROM - intuition 40.87 (9.9.2018)" Hunk 0000 Offset 00009dcc
----> 00fdc624 - "ROM - intuition 40.87 (9.9.2018)" Hunk 0000 Offset 000084b8
----> 00fdb398 - "ROM - intuition 40.87 (9.9.2018)" Hunk 0000 Offset 0000722c
----> 00fdc0d0 - "ROM - intuition 40.87 (9.9.2018)" Hunk 0000 Offset 00007f64
----> 00fdc876 - "ROM - intuition 40.87 (9.9.2018)" Hunk 0000 Offset 0000870a
----> 00fdb05c - "ROM - intuition 40.87 (9.9.2018)" Hunk 0000 Offset 00006ef0
----> 00fdaf7c - "ROM - intuition 40.87 (9.9.2018)" Hunk 0000 Offset 00006e10
----> 00fdb380 - "ROM - intuition 40.87 (9.9.2018)" Hunk 0000 Offset 00007214
----> 00fdb30e - "ROM - intuition 40.87 (9.9.2018)" Hunk 0000 Offset 000071a2
----> 00fec4b8 - "ROM - intuition 40.87 (9.9.2018)" Hunk 0000 Offset 0001834c
----> 00fec4c4 - "ROM - intuition 40.87 (9.9.2018)" Hunk 0000 Offset 00018358
----> 00fc77d6 - "ROM - input 45.2 (9.7.2018)" Hunk 0000 Offset 00000a4a
----> 00fc75bc - "ROM - input 45.2 (9.7.2018)" Hunk 0000 Offset 00000830
----> 00f82776 - "ROM - exec 46.45 (18.9.2018)" Hunk 0000 Offset 000026c0
----> 00ff8000 - "ROM - wbtag 39.1 (20.4.1992)" Hunk 0000 Offset 0000477e
----> 00fc2c46 - "ROM - ramdrive 45.3 (14.7.2018)" Hunk 0000 Offset 000005b2
----> 00fc2b5e - "ROM - ramdrive 45.3 (14.7.2018)" Hunk 0000 Offset 000004ca
----> 00fc2aee - "ROM - ramdrive 45.3 (14.7.2018)" Hunk 0000 Offset 0000045a
----> 00fc27a6 - "ROM - ramdrive 45.3 (14.7.2018)" Hunk 0000 Offset 00000112
----> 00fc278a - "ROM - ramdrive 45.3 (14.7.2018)" Hunk 0000 Offset 000000f6
----> 00fc91c4 - "ROM - trackdisk 45.1 (14.7.2018)" Hunk 0000 Offset 0000081c
----> 00fc90aa - "ROM - trackdisk 45.1 (14.7.2018)" Hunk 0000 Offset 00000702
----> 00fc90ae - "ROM - trackdisk 45.1 (14.7.2018)" Hunk 0000 Offset 00000706
----> 00fc9084 - "ROM - trackdisk 45.1 (14.7.2018)" Hunk 0000 Offset 000006dc
----> 00fc1fd0 - "ROM - carddisk 40.1 (12.2.1993)" Hunk 0000 Offset 00000270
----> 00fc1fb0 - "ROM - carddisk 40.1 (12.2.1993)" Hunk 0000 Offset 00000250
----> 00fc1fb4 - "ROM - carddisk 40.1 (12.2.1993)" Hunk 0000 Offset 00000254
----> 00fc1d7a - "ROM - carddisk 40.1 (12.2.1993)" Hunk 0000 Offset 0000001a
404d49dc :   5b88                 SUBA.L #$05,a0
404d49de :   3642                 MOVEA.W d2,a3
404d49e0 :   b7c8                 CMPA.L a0,a3
404d49e2 :   6d08                 BLT.B #$08
404d49e4 :   342d ffbe            MOVE.W (a5,-$0042) == $400075fe [000e],d2
404d49e8 :   5f42                 SUB.W #$07,d2
404d49ea :   3642                 MOVEA.W d2,a3
404d49ec :   307c 01c9            MOVEA.W #$01c9,a0
404d49f0 : * d1e8 0004            ADDA.L (a0,$0004) == $000001cd [f819e000],a0
404d49f4 :   e9d0 00c1            BFEXTU.L #$00c1,(a0)
404d49f8 :   0c41 0009            CMP.W #$0009,d1
404d49fc :   5ec1                 SGT.B d1
404d49fe :   4401                 NEG.B d1
404d4a00 :   c001                 AND.B d1,d0
404d4a02 :   0a00 0001            EOR.B #$01,d0
404d4a06 :   1b40 ffc1            MOVE.B d0,(a5,-$003f) == $40007601 [fd]
404d4a0a :   206d 0014            MOVEA.L (a5,$0014) == $40007654 [40007688],a0
Name: "input.device"

I found this code in dopus5.library in image_class.c, which is partly broken and partly weird:


void image_draw(Class *cl, struct Image *image, BoopsiImageData *data, struct impDraw *draw)

        if (data->data->flags & LIBDF_3DLOOK && box.Height > 9)

   19514:   307c 01c9       movea.w #457,a0           <<< what is this?
   19518:   d1e8 0004       adda.l 4(a0),a0
   1951c:   e9d0 00c1       bfextu (a0),3,1,d0            <<< why use bfextu when testing for a single bit?

https://github.com/mheyer32/dopus5allamigas/blob/master/source/Library/image_class.c#L329

I built DOpus5 via

make -j8 release debug=no

bebbo commented 1 year ago

I guess your source should have braces:

  if ((data->data->flags & LIBDF_3DLOOK) && box.Height > 9)

The bogus code looks like some reloading issue (wrong order)... takes more time to investigate

mheyer32 commented 1 year ago

I guess what should have happened is something like

move.l 4(a0),a0
adda.w #457,a0
bebbo commented 1 year ago

I am able to reproduce that issue

m68k-amigaos-gcc image_class.c -m68020-60 -mtune=68030 -fshort-enums -D__amigaos3__ -D__USE_SYSBASE -D__NO_NET_API -DUSE_64BIT -DCOMPDATE=18.11.2022 -DNDEBUG -Os -fomit-frame-pointer -noixemul -I./../ -Iinit/../ -I/home/stefan/amiga-stuff/dopus5allamigas/source -I/home/stefan/amiga-stuff/dopus5allamigas/source/Include  -Wall -Wno-pointer-sign -Wno-attributes -Wno-int-conversion -Wno-int-to-pointer-cast -Wno-pointer-to-int-cast -S
bebbo commented 1 year ago
(insn 722 721 723 35 (set (subreg:SI (reg:QI 1026) 0)
        (zero_extract:SI (mem:QI (plus:SI (mem/f:SI (plus:SI (reg/v/f:SI 866 [ data ])
                            (const_int 4 [0x4])) [3 data_62(D)->data+0 S4 A16])
                    (const_int 457 [0x1c9])) [7 _802->flags+3 S1 A8])
            (const_int 1 [0x1])
            (const_int 3 [0x3]))) /tmp/compiler-explorer-compiler20221018-3670000-1fe1xki.r29p/example.c:8962 359 {*extzv_bfextu_mem2}
     (expr_list:REG_DEAD (reg/v/f:SI 866 [ data ])
        (nil)))
Reloads for insn # 722
Reload 0: reload_in (SI) = (reg/v/f:SI 866 [ data ])
    ADDR_REGS, RELOAD_FOR_INPUT_ADDRESS (opnum = 1)
    reload_in_reg: (reg/v/f:SI 866 [ data ])
    reload_reg_rtx: (reg:SI 8 a0)
Reload 1: reload_in (SI) = (plus:SI (mem/f:SI (plus:SI (reg/v/f:SI 866 [ data ])
                                                            (const_int 4 [0x4])) [3 data_62(D)->data+0 S4 A16])
                                                    (const_int 457 [0x1c9]))
    ADDR_REGS, RELOAD_FOR_INPUT (opnum = 1), inc by 1
    reload_in_reg: (plus:SI (mem/f:SI (plus:SI (reg/v/f:SI 866 [ data ])
                                                            (const_int 4 [0x4])) [3 data_62(D)->data+0 S4 A16])
                                                    (const_int 457 [0x1c9]))
    reload_reg_rtx: (reg:SI 8 a0)
(insn 1777 721 1779 35 (set (reg:SI 8 a0)
        (mem/f/c:SI (plus:SI (reg/f:SI 13 a5)
                (const_int 16 [0x10])) [61 data+0 S4 A16])) /tmp/compiler-explorer-compiler20221018-3670000-13vuxlv.mx17/example.c:8962 39 {*movsi_m68k}
     (nil))
(insn 1779 1777 1780 35 (set (reg:SI 8 a0)
        (const_int 457 [0x1c9])) /tmp/compiler-explorer-compiler20221018-3670000-13vuxlv.mx17/example.c:8962 39 {*movsi_m68k}
     (nil))
(insn 1780 1779 722 35 (set (reg:SI 8 a0)
        (plus:SI (reg:SI 8 a0)
            (mem/f:SI (plus:SI (reg:SI 8 a0)
                    (const_int 4 [0x4])) [3 data_62(D)->data+0 S4 A16]))) /tmp/compiler-explorer-compiler20221018-3670000-13vuxlv.mx17/example.c:8962 135 {*addsi3_internal}
     (expr_list:REG_EQUIV (plus:SI (mem/f:SI (plus:SI (reg:SI 8 a0)
                    (const_int 4 [0x4])) [3 data_62(D)->data+0 S4 A16])
            (const_int 457 [0x1c9]))
        (nil)))
(insn 722 1780 723 35 (set (reg:SI 0 d0 [orig:1026+-3 ] [1026])
        (zero_extract:SI (mem:QI (reg:SI 8 a0) [7 _802->flags+3 S1 A8])
            (const_int 1 [0x1])
            (const_int 3 [0x3]))) /tmp/compiler-explorer-compiler20221018-3670000-13vuxlv.mx17/example.c:8962 359 {*extzv_bfextu_mem2}
     (nil))
bebbo commented 1 year ago

changed reload type to RELOAD_FOR_OPERAND_ADDRESS =>

    move.w #457,a1
    add.l (4,a0),a1
    bfextu (a1){#3:#1},d0
mheyer32 commented 1 year ago

I just checked, its curious that these particular instructions were the only affected in the whole library!

mheyer32 commented 1 year ago

And whats the deal with the use of bfextu there?

mheyer32 commented 1 year ago

Reopening... the same lines of code still produce an enforcer hit.

I think whats happens is that even if

        if (data->data->flags & LIBDF_3DLOOK && box.Height > 9)
   19514:   327c 01c9       movea.w #457,a1
   19518:   d3e8 0004       adda.l 4(a0),a1
   1951c:   e9d1 00c1       bfextu (a1),3,1,d0

is correct code now, however a0 is no longer pointing to the 'data' as a0 had been written to right above:


        if (y > maxy - 6)
   194fe:   206d ffba       movea.l -70(a5),a0 <<<<
        y = maxy - (box.Height >> 1) + 2;
   19502:   d445            add.w d5,d2
        if (y > maxy - 6)
   19504:   5b88            subq.l #5,a0       <<<<
   19506:   3642            movea.w d2,a3
   19508:   b7c8            cmpa.l a0,a3
bebbo commented 1 year ago

it's a tad more tricky, since the generated address is per se a valid address, but that address kind is not allowed for the insn...

bebbo commented 1 year ago

making progress and getting the order of reloads right

    move.w #457,a1
    move.l (16,a5),a0
    add.l (4,a0),a1
    bfextu (a1){#3:#1},d0
mheyer32 commented 1 year ago

Good thing, I didn't encounter Enforcer hits so far! But it looks like this change had some more undesirable side effects. Almost all binaries of DOpus are now a bit larger. Looking through the disassembly of dopus.library I noticed in particular one thing:

image

(right is the new code) Notice how the same value gets now reloaded from the stack over and over. And I wonder, is this even legal code? Does A7 allow to be accessed at odd offsets?

bebbo commented 1 year ago

That reload should not occur, I'll look into it. Guess it needs some logic to apply that change only if the insn does not support double indirect addressing.

And a7 can be used as any other address registers, thus odd addresses are always ok for 68020+ and with .b ok for all.

bebbo commented 1 year ago

Good thing, I didn't encounter Enforcer hits so far! But it looks like this change had some more undesirable side effects. Almost all binaries of DOpus are now a bit larger. Looking through the disassembly of dopus.library I noticed in particular one thing:

image

(right is the new code) Notice how the same value gets now reloaded from the stack over and over. And I wonder, is this even legal code? Does A7 allow to be accessed at odd offsets?

The left side is generated with -Os, right side with -O2. And I agree: using mem moves is unfortunate.

bebbo commented 1 year ago

It happened that the mem costs went negative !?!? So I disabled that adjustment.

mheyer32 commented 1 year ago

The new code generated in DOpus5 is a mixed bag. It seems to change many places to reload less (and be more aggressive using more registers), but it also introduces others where new reloads gets inserted. On top, the compiler now crashes compiling DoomAttack:

m68k-amigaos-gcc -noixemul -msmall-code -g -ggdb -Werror -Wimplicit -Wstrict-prototypes -D__BIG_ENDIAN__ -DNORMALUNIX -DAMIGA -D__NO_NET_API -include "doomdef.h" -I./plugins/Include -I./ -DNDEBUG -Ofast -ffast-math -fomit-frame-pointer -m68060 -Dversion060 -c g_game.c -o /home/matze/DoomAttack/gnudoom/build//060/g_game.o
(insn 397 1485 399 36 (set (mem:SI (plus:SI (mem/f/c:SI (reg:SI 8 a0) [4 MEM[(struct player_t *)&players][0].mo+0 S4 A16])
                (const_int 104 [0x68])) [7 _608->flags+0 S4 A16])
        (and:SI (mem:SI (plus:SI (mem/f/c:SI (reg:SI 9 a1) [4 MEM[(struct player_t *)&players][0].mo+0 S4 A16])
                    (const_int 104 [0x68])) [7 _608->flags+0 S4 A16])
            (const_int -262145 [0xfffffffffffbffff]))) g_game.c:903 231 {andsi3_internal}
     (nil))
g_game.c: In function 'G_Ticker':
g_game.c:868:1: error: insn does not satisfy its constraints:
 }
 ^
(insn 397 1485 399 36 (set (mem:SI (plus:SI (mem/f/c:SI (reg:SI 8 a0) [4 MEM[(struct player_t *)&players][0].mo+0 S4 A16])
                (const_int 104 [0x68])) [7 _608->flags+0 S4 A16])
        (and:SI (mem:SI (plus:SI (mem/f/c:SI (reg:SI 9 a1) [4 MEM[(struct player_t *)&players][0].mo+0 S4 A16])
                    (const_int 104 [0x68])) [7 _608->flags+0 S4 A16])
            (const_int -262145 [0xfffffffffffbffff]))) g_game.c:903 231 {andsi3_internal}
     (nil))
g_game.c:868:1: internal compiler error: in extract_constrain_insn, at recog.c:2199
Please submit a full bug report,
with preprocessed source if appropriate.
See <http://gcc.gnu.org/bugs.html> for instructions.
make: *** [Makefile:208: /home/matze/DoomAttack/gnudoom/build//060/g_game.o] Error 1
bebbo commented 1 year ago

the forced reload, leads to two reloads resulting in different registers: a0 and a1. There only one reload should be used...

bebbo commented 1 year ago

please test again :-)

mheyer32 commented 1 year ago

Third time is the charm! I think this time all changes in the code are either neutral or an improvement, plus DoomAttack compiles again!

bebbo commented 1 year ago

And whats the deal with the use of bfextu there?

It's the gcc choice to evaluate this with less jumps... ... changing this

mheyer32 commented 1 year ago

I'll check it out tomorrow.

Tonight I gave ScummVM a go for a first time in a long time and it behaves super weird..it starts but exists before it even reaches the main menu. With bgdbserver I can follow it to https://github.com/mheyer32/scummvm-amigaos3/blob/master/base/main.cpp#L426 but I can't single step any further than this and the executable will just exit.

I don't know if this is any related to the reload behavior or if it is an entirely different issue. Sorry for moving the goal posts every time when you make progress :-)

mheyer32 commented 1 year ago

I can definitely see how many of these kind of constructs:

        if (screen || window)
    4d12:   2009            move.l a1,d0
    4d14:   56c0            sne d0
    4d16:   4400            neg.b d0
    4d18:   220a            move.l a2,d1
    4d1a:   56c1            sne d1
    4d1c:   4401            neg.b d1
    4d1e:   8001            or.b d1,d0

get converted into simpler tests, with potentially a beq/bne inbetween:

        if (screen || window)
    4ca6:   2009            move.l a1,d0
    4ca8:   6604            bne.s 4cae 4cae <L_ShowProgressWindow+0x10>
    4caa:   200a            move.l a2,d0
    4cac:   6708            beq.s 4cb6 4cb6 <L_ShowProgressWindow+0x18>

and would you look at that:

    if (data->data->flags & LIBDF_3DLOOK && box.Height > 9)
   19514:   307c 01c9       movea.w #457,a0
   19518:   d1e8 0004       adda.l 4(a0),a0
   1951c:   e9d0 00c1       bfextu (a0),3,1,d0
   19520:   0c41 0009       cmpi.w #9,d1
   19524:   5ec1            sgt d1
   19526:   4401            neg.b d1
   19528:   c001            and.b d1,d0
   1952a:   0a00 0001       eori.b #1,d0
   1952e:   1b40 ffc1       move.b d0,-63(a5)

became :

        if (data->data->flags & LIBDF_3DLOOK && box.Height > 9)
   19324:   7e01            moveq #1,d7
   19326:   0833 0004 0162  btst #4,([4,a3],457)    <<< woah! All address calculation and the bit test stuffed into single instruction
   1932c:   0004 01c9 
   19330:   670a            beq.s 1933c 1933c <image_draw+0x7fa>
   19332:   0c41 0009       cmpi.w #9,d1
   19336:   5fc0            sle d0
   19338:   1e00            move.b d0,d7
   1933a:   4407            neg.b d7
bebbo commented 1 year ago

I just compiled ScummVM with static plugins and the games here (phanatasmagoria, space quest) seem to work

mheyer32 commented 1 year ago

Yeah, I believe this was a pebcak. I might not have had AHI installed on the test config. Sorry for the churn!