bebbo / gcc

Bebbo's gcc-6-branch for m68k-amigaos
GNU General Public License v2.0
31 stars 10 forks source link

-fbbb=e causing issues with -flto #83

Closed mheyer32 closed 5 years ago

mheyer32 commented 5 years ago

I ran into some new issue with ScummVM. This time I was able to track it down do a combination of -fbbb=e and -flto.

I have no smoking gun yet. I looked through the disassembly between the two version (one compiled with-flto and -fbbb=e, the other with -flto and -fbbb=-. The differences seem small. 'e' seems to try to remove redundant assignments (i.e. clearing a register twice or assigning the same value twice) and useless assignments (mostly it removes restoring SP when the function exits through UNLNK anyway.

I'll report when I know more.

bebbo commented 5 years ago

I added a switch to objdump -N

m68k-amigaos-objdump.exe -N --no-show-raw-insn -m68040 -d scummvm >s.txt

maybe this will help with the hunt.

bebbo commented 5 years ago
mheyer32 commented 5 years ago

Thanks, that is useful! so far I've been helping myself with sed to get rid of the hex codes - but it still looked messy and was messing up instructions at times.

bebbo commented 5 years ago

I hacked objdump a tad more - have a look^^

mheyer32 commented 5 years ago

Hehe, I think I just complained about the failed incremental update ;-) clean, clean-prefix, here I come...

bebbo commented 5 years ago

this looks odd:

 __ZN6Common6String11decRefCountEPi.part.15:
    link.w a5,#0
    move.l a3,-(sp)
    move.l a2,-(sp)
    movea.l 8(a5),a3
    movea.l 12(a5),a2
    tst.l a2
    beq.s  _L123
    jsr  __ZN6Common19lockMemoryPoolMutexEv
    movea.l  __ZN6TinselL8g_sgDataE.lto_priv.8575+0x10f08,a0
 _L119:
    tst.l a0
    beq.s  _L125
    move.l 16(a0),(a2)

vs.

 __ZN6Common6String11decRefCountEPi.part.15:
    link.w a5,#0
    move.l a3,-(sp)
    move.l a2,-(sp)
    movea.l 8(a5),a3
    movea.l 12(a5),a2
    tst.l a2
    beq.s  _L123
    jsr  __ZN6Common19lockMemoryPoolMutexEv
 _L119:
    lsr.l d6,d4
    tst.l a0
    beq.s  _L125
    move.l 16(a0),(a2)

=> gah - that's caused by inserting labels at wrong locations... uhg

mheyer32 commented 5 years ago

The new asm output from objdump omits the angle brackets around referenced labels:

bcc.s Common::String::initWithCStr(char const*, unsigned int)+0x38

This makes it harder to sort them out from a diff with a regexp.

mheyer32 commented 5 years ago

You're talking about the weird change to d4, not saved to the stack?

 _L119:
    lsr.l d6,d4

Is this a problem with the disassembler or the compiler placing the label 'into' the movea.l __ZN6TinselL8g_sgDataE.lto_priv.8575+0x10f08,a0 ?

bebbo commented 5 years ago

i

You're talking about the weird change to d4, not saved to the stack?

 _L119:
  lsr.l d6,d4

Is this a problem with the disassembler or the compiler placing the label 'into' the movea.l __ZN6TinselL8g_sgDataE.lto_priv.8575+0x10f08,a0 ?

it's a disassembler issue.

The new asm output from objdump omits the angle brackets around referenced labels:

brackets will be back if you select to demangle the function names.

mheyer32 commented 5 years ago

I spent the day scrolling through a 17mb diff... still no dice. Some removals of SP maintenance looks suspicious, but since it happened in the middle of some functions, I was not able to determine if the optimization was correct or not (couldn't follow the jumps to the end of the functions)

This issue is harder to debug than the previous one, because as soon as I add debug prints etc, it changes the outcome.

brackets will be back if you select to demangle the function names.

I thought I did use -C to demangle the name(?)

bebbo commented 5 years ago

you have to work with 2 sets of object files and a binary search:

  1. A: the ok file
  2. B: the bad files
  3. copy A files to C
  4. overwrite files in C with files from B
  5. link
  6. test, ok: goto 4 and use more files from B, bad, goto 3 and use less files B

until you identified one file. Compare only the asm of this files' A and B version.

I thought I did use -C to demangle the name(?)

Yes you do, and with -C the brackets will be back

mheyer32 commented 5 years ago

Do you have a trick up your sleeve to prevent everything from rebuilding the .o files after copying the project around?

bebbo commented 5 years ago

Do you have a trick up your sleeve to prevent everything from rebuilding the .o files after copying the project around?

invoke the linker without make

mheyer32 commented 5 years ago

Ok, that worked. I have yet to do the bisecting, though. Do you usually 'flatten' the directory tree, i.e. have all .o file in just one directory?

mheyer32 commented 5 years ago

So, the problem is that the good .o files can be compiled into a bad program. This is due to code being generated at link time when -flto is in use. I.e. if I use the good .o files and change the -fbbb=- to-fbbb=e for the linker command line, it'll produce a bad program

How am I going to debug this? I'm open for ideas...

bebbo commented 5 years ago

hm, maybe -save-temps might help to keep the generated asm files. Then combine these to reduce the source to compare... ... maybe also -flto-partition=1to1 or -flto-partition=one or -flto-partition=max is helpful.

bebbo commented 5 years ago

the e option does varios things, maybe I could make these selectable, to find the bad one...

mheyer32 commented 5 years ago

-save-temps does produce the .s files (with labels) and they are different for -fbbb=- and -fbbb=e

Lets see, if I can find something.

mheyer32 commented 5 years ago

This might be something:

ScriptPatcher::processScript() has this line of code:

_isMacSci11 = (g_sci->getPlatform() == Common::kPlatformMacintosh && getSciVersion() >= SCI_VERSION_1_1);

Which gets translated into the good code below:

    .stabs  "engines/sci/engine/script_patches.cpp",132,0,0,.Ltext318
.Ltext318:
    .stabd  68,0,12111
    clr.b d0
.LBB2533:
.LBB2532:
    cmp.b #4,208(a0)      >>>>>> (g_sci->getPlatform() == Common::kPlatformMacintosh
    jne .L831
    clr.l d0                        <<<<<< lucky, we're clearing all of d0
    move.b __ZN3Sci12s_sciVersionE,d0
    move.b #7,d1                  <<<<<<<< we're lucky, d1 gets moveq'd some lines up
    cmp.l d0,d1          >>>>>>>>  getSciVersion() >= SCI_VERSION_1_1 (8),    I think this should have been cmp.b
    slt d0
    neg.b d0
.L831:
    move.b d0,8(a3)
the bad code removes the clr.l d0:

    .stabs  "engines/sci/engine/script_patches.cpp",132,0,0,.Ltext318
.Ltext318:
    .stabd  68,0,12111
    clr.b d0
.LBB2533:
.LBB2532:
    cmp.b #4,208(a0)
    jne .L831
    move.b __ZN3Sci12s_sciVersionE,d0
    move.b #7,d1
    cmp.l d0,d1            <<<<<<< now we're at the mercy, what is in the upper bytes of d0
    slt d0
    neg.b d0
.L831:

I don't know if the bug is

  1. the comparison should have been at byte level (SCI_VERSION_1_1 is declared as enum SciVersion : byte ) or
  2. the clearing of d0 is not redundant and should not have been removed.

I will try to figure out if this is the issue. But it looks rather suspicious.

mheyer32 commented 5 years ago

Turning the SciVersion enum back into a regular enum (mind you, I'm not using -fshort-enums in the example) turns the code into:

    clr.b d0
    cmpi.b #4,208(a0)
    bne.s  <Sci::ScriptPatcher::processScript(unsigned short, Sci::SciSpan<unsigned char>)+0x54>
    moveq #7,d0
    cmp.l  <Sci::s_sciVersion>,d0      <<<< legal
    slt d0
    neg.b d0
    move.b d0,8(a3)

which is working

mheyer32 commented 5 years ago

Recompiling without -flto -fbbb=e -fshort-enums makes the issue go away, but the comparison still happens at .l level. Taking out -flto just happens to reorganize the code a bit, so d0 gets cleared by code prior in the function.

So, I now think, neither -flto nor -fbbb=e are actually to blame... they rather made the problem with short enums visible.

mheyer32 commented 5 years ago

I copied an excerpt into the compiler explorer, but this time the compiler included a andi.l #255, d0 or 'clr.l' before the comparison.... I guess this the clr.l that gets removed in the full version via -fbbb=e

https://franke.ms/cex/z/QNFg0w (updated to include all compiler flags currently set in my ScummVM build)

mheyer32 commented 5 years ago

So I guess the question is still a) why compare a byte sized enum via cmp.l b) if a) is intended, why -fbbb=e thinks it can remove the clr.l as shown below

    .globl  __ZN3Sci13ScriptPatcher13processScriptEtNS_7SciSpanIhEE
__ZN3Sci13ScriptPatcher13processScriptEtNS_7SciSpanIhEE:
    .stabd  46,0,0
    .stabd  68,0,11950
.LFBB64:
    lea (-16,sp),sp
    movem.l a6/a5/a4/a3/a2/d7/d6/d5/d4/d3/d2,-(sp)
    move.l 64(sp),a3
    move.l 72(sp),a5
    move.w 70(sp),46(sp)
    .stabd  68,0,11954
    lea __ZN3Sci5g_sciE,a4
    move.l (a4),a1
    move.l 224(a1),a2
    move.l a2,d0
    subq.l #1,d0
    moveq #74,d1
    cmp.l d0,d1
    jcs .L829
    move.l _CSWTCH.155.lto_priv.6265(d0.l*4),d2
.LBB2503:
    .stabd  68,0,12110
    jeq .L829
.LBB2504:
.LBB2505:
.LBB2506:
    .stabs  "./engines/sci/sci.h",132,0,0,.Ltext317
.Ltext317:
    .stabd  68,0,259
    move.l 220(a1),a0
.LBE2506:
.LBE2505:
.LBE2504:
.LBE2503:
    .stabs  "engines/sci/engine/script_patches.cpp",132,0,0,.Ltext318
.Ltext318:
    .stabd  68,0,12111
    clr.b d0
.LBB2533:
.LBB2532:
    cmp.b #4,208(a0)
    jne .L831

    clr.l d0      <<<<<<<< -fbbb=e removes this clear

    move.b __ZN3Sci12s_sciVersionE,d0
    move.b #7,d1
    cmp.l d0,d1
    slt d0
    neg.b d0
.L831:
    move.b d0,8(a3)
bebbo commented 5 years ago

So I guess the question is still a) why compare a byte sized enum via cmp.l b) if a) is intended, why -fbbb=e thinks it can remove the clr.l as shown below

a) cmp.l is used since the compiler MUST extend the comparison to the longer type and SCI_VERSION_1_1 is of type int b) the fault is to remove the clr.l => the mode (byte/word/long) must be checked to determine if an assignment is superfluous or not. Here the acts like: I see an assignment to d0: move.b SCI_VERSION_1_1,d0 let's discard previous assignments...

... working on it

mheyer32 commented 5 years ago

a) cmp.l is used since the compiler MUST extend the comparison to the longer type and SCI_VERSION_1_1 is of type int

Can you elaborate on that? Why would one need to extend the values if the storage type is an unsigned byte?

https://franke.ms/cex/z/ahpPHn

bebbo commented 5 years ago

a) cmp.l is used since the compiler MUST extend the comparison to the longer type and SCI_VERSION_1_1 is of type int

Can you elaborate on that? Why would one need to extend the values if the storage type is an unsigned byte?

https://franke.ms/cex/z/ahpPHn

I'm only citing stuff I don't understand...

mheyer32 commented 5 years ago

-- If both operands have the same type, then no further conversion is needed.

enum SciVersion : unsigned char {
    SCI_VERSION_NONE,
    SCI_VERSION_0_EARLY, // KQ4 early, LSL2 early, XMAS card 1988
    SCI_VERSION_0_LATE, // KQ4, LSL2, LSL3, SQ3 etc
    SCI_VERSION_01, // KQ1 and multilingual games (S.old.*)
    SCI_VERSION_1_EGA_ONLY, // SCI 1 EGA with parser (i.e. QFG2 only)
    SCI_VERSION_1_EARLY, // KQ5 floppy, SQ4 floppy, XMAS card 1990, Fairy tales, Jones floppy
    SCI_VERSION_1_MIDDLE, // LSL1, Jones CD
    SCI_VERSION_1_LATE, // Dr. Brain 1, EcoQuest 1, Longbow, PQ3, SQ1, LSL5, KQ5 CD
    SCI_VERSION_1_1, // Dr. Brain 2, EcoQuest 1 CD, EcoQuest 2, KQ6, QFG3, SQ4CD, XMAS 1992 and many more
};

SciVersion s_sciVersion;

/**
 * Convenience function to obtain the active SCI version.
 */
static inline SciVersion getSciVersion() {
    extern SciVersion s_sciVersion;
    return s_sciVersion;
}

int square(int num) 
{
    isNewVersion = getSciVersion() >= SCI_VERSION_1_1;
}

getSciVersion() returns SciVersion, and SCI_VERSION_1_1 is of type SciVersion; thus they have the same type, no?

bebbo commented 5 years ago

enum constants are of type int

comparison between SciVersion (1 byte) and int (4 bytes) -> extend SciVersion to int for comparison...

bebbo commented 5 years ago

with gcc-6.5 (amiga or elf version): static inline char getSciVersion() results into a byte compare static SciVersion getSciVersion() results into a long compare

with gcc-8.2 or gcc-9 (elf version) both yield a byte compare

=> gcc-6 is broken!?

bebbo commented 5 years ago

(guess I need a gcc-9 amiga version...)

bebbo commented 5 years ago

I still can't create the same code you posted, but I guess this issue is related to

    clr.b d0
...
    clr.l d0                        <<<<<< lucky, we're clearing all of d0

I added a check for mode size == 4 -> clr.b will no longer mark d0 having value 0, since int values don't have a mode...

mheyer32 commented 5 years ago

I will update with the parameters/changes to build ScummVM tonight.

mheyer32 commented 5 years ago

with gcc-6.5 (amiga or elf version): static inline char getSciVersion() results into a byte compare static SciVersion getSciVersion() results into a long compare

with gcc-8.2 or gcc-9 (elf version) both yield a byte compare

=> gcc-6 is broken!?

I'm by no means a language lawyer, but I think "enum constants are of type int" is C, while in C++ they are distinct types.

mheyer32 commented 5 years ago

I think you fixed it:

This is with -Os -flto -fshort-enums -fbbb=e

 <Sci::ScriptPatcher::processScript(unsigned short, Sci::SciSpan<unsigned char>)>:
    lea -16(sp),sp
    movem.l d2-d7/a2-a6,-(sp)
    movea.l 64(sp),a3
    movea.l 72(sp),a5
    move.w 70(sp),46(sp)
    lea  <Sci::g_sci>,a2
    movea.l (a2),a1
    move.b 220(a1),d0
    move.b d0,d3
    subq.b #1,d0
    cmpi.b #74,d0
    bhi.w  <Sci::ScriptPatcher::processScript(unsigned short, Sci::SciSpan<unsigned char>)+0x1cc>
    andi.l #255,d0
    move.l (998012,d0.l*4),d2
    beq.w  <Sci::ScriptPatcher::processScript(unsigned short, Sci::SciSpan<unsigned char>)+0x1cc>
    movea.l 216(a1),a0
    clr.b d0
    cmpi.b #4,205(a0)
    bne.s  <Sci::ScriptPatcher::processScript(unsigned short, Sci::SciSpan<unsigned char>)+0x5e>
    clr.l d0
    move.b  <Sci::s_sciVersion>,d0
    moveq #7,d1
    cmp.l d0,d1
    slt d0
    neg.b d0
    move.b d0,8(a3)
mheyer32 commented 5 years ago

Thanks again! Closing!