bebbo / gcc

Bebbo's gcc-6-branch for m68k-amigaos
GNU General Public License v2.0
33 stars 11 forks source link

More broken code generation #194

Closed mheyer32 closed 1 year ago

mheyer32 commented 1 year ago

Looking into Enforcer hits with: https://github.com/mheyer32/ROTT

This time, a piece of code is suspicious in rt_actor.c, function "CheckLine" caused read access at random(?) memory locations

   1aef4:   0280 0000 03ff  andi.l #1023,d0
                        } else if (maskobjlist[value & 0x3ff]->flags & MW_WEAPONBLOCKING) {
   1aefa:   3030 0d12 000e  move.w ([0,a0,d0.l*4],14),d0        <<<<<<< a0 doesn't seem to be assigned
   1af00:   0240 0041       andi.w #65,d0
   1af04:   6600 ff14       bne.w 1ae1a 1ae1a <CheckLine+0x1f0>

a0 seems to be never actually be assigned to the maskobjlist array. Also, which register is being used seems random. when compiling the whole executable (instead of just the single file) and disassembled that, I got:

   225ec:   0280 0000 03ff  andi.l #1023,d0
                        } else if (maskobjlist[value & 0x3ff]->flags & MW_WEAPONBLOCKING) {
   225f2:   3031 0d12 000e  move.w ([0,a1,d0.l*4],14),d0 M= <<< whats in a1?
   225f8:   0240 0041       andi.w #65,d0
   225fc:   6670            bne.s 2266e 2266e <StopWind+0x1ac6>

The file in question can be built and disassembled via:

make  obj/060/rott/rt_actor.o -B
m68k-amigaos-objdump -d -C -S obj/060/rott/rt_actor.o > rt_actor.asm

what is impressive, though, is how the compiler boiled down the two conditions into just one test:


                        if (maskobjlist[value & 0x3ff]->flags & MW_SHOOTABLE) {
#if (0)
                            SoftError("\nfailed from shootable mask");
#endif
                            return false;
                        } else if (maskobjlist[value & 0x3ff]->flags & MW_WEAPONBLOCKING) {
#if (0)
                            SoftError("\nfailed from block mask");
#endif
                            return false;
                        }

by just checking against 0x41  instead of 0x01 and 0x40 separately!
mheyer32 commented 1 year ago

I tested devel1 last night and it seemed to have fixed that particular issue. I noticed way less "read from invalid memory locations" in ROTT. I also recompiled DOpus5 and could not find regressions there... sometimes instructions got a bit reshuffled, but that has no ill effect. Th upshot: the latest toolchain produces slightly smaller (up to a few hundred bytes) output files, likely attributed to ypur latest work on libnix and/or the startup code.

bebbo commented 1 year ago

I tested devel1 last night and it seemed to have fixed that particular issue. I noticed way less "read from invalid memory locations" in ROTT. I also recompiled DOpus5 and could not find regressions there... sometimes instructions got a bit reshuffled, but that has no ill effect. Th upshot: the latest toolchain produces slightly smaller (up to a few hundred bytes) output files, likely attributed to your latest work on libnix and/or the startup code.

Thanks for testing. Unfortunately the devel1 changes do not pass the gcc testsuite checks... ... if nothing helps, I'll disable my fancy double indirect addressing stuff...

mheyer32 commented 1 year ago

... if nothing helps, I'll disable my fancy double indirect addressing stuff...

That would be a real pity :-(

bebbo commented 1 year ago

... if nothing helps, I'll disable my fancy double indirect addressing stuff...

That would be a real pity :-(

Some people claim, that double indirect doesn't provide any gain...

mheyer32 commented 1 year ago

It looks like all the issue I hit before are gone now. I'll close this issue. How confident are you in the double-indirect reload code now? Is there a fbbb option to disable it (in case I run into another issue and quickly want o triage)?

bebbo commented 1 year ago

It looks like all the issue I hit before are gone now. I'll close this issue. How confident are you in the double-indirect reload code now? Is there a fbbb option to disable it (in case I run into another issue and quickly want o triage)?

I don't run life critical applications in WinUAE^^ ... ... but I could add a '-fdouble-indirect' option which would be disable using '-fno-double-indirect' (or similar...)

bebbo commented 1 year ago

btw: the new default branch for gcc is now amiga6. I'll update the old gcc-6-branch for a while, but sometimes I forget that^^

mheyer32 commented 1 year ago

btw: the new default branch for gcc is now amiga6. I'll update the old gcc-6-branch for a while, but sometimes I forget that^^

Is this reflected in the make update script? Or would that require a clean new checkout of the toolchain?

bebbo commented 1 year ago
make branch branch=<branch> mod=<module>        switch the module to the given branch

=>

make branch branch=amiga6 mod=gcc
mheyer32 commented 1 year ago

Awesome!

mheyer32 commented 1 year ago

So, I did switch to the amiga6 branch with the commandline above. That seems to affect not only the GCC project, but also others(?). For instance, seems, only now it actually switched me over to the OS3.2 NDK... I then recompiled the toolchain and Dopus5 and compared old and new disassembly.

Some observations:

  1. The compiler now takes better advantage of SHORT parameters. Prior it would often treat them as if they were 32bits parameters with 16bit contents, leading to what looks like a lot of unnecessary clr.l calls.
  2. The compiler is now more hesitant to use double-indirect addressing - I don't know if thats bad or good and why the compiler makes that decision - maybe in this case because DOpus is compiler via -Os?
  3. Overall, DOpus5 is again a little bit smaller... that is always a good thing.
mheyer32 commented 1 year ago

An example excerpt showing 1) and 2) image

bebbo commented 1 year ago

only now it actually switched me over to the OS3.2 NDK...

That's a different change I made in the last days since the web site for NDK3.9 is down. That's a bad experience for new users. And NDK3.9 seems to be ok. Some programs will need changes, due to parameter type fixes, e.g. FindToolType in icon.library.

You can still (try to) use NDK3.9 by adding NDK=3.9 to the make parameters.

bebbo commented 1 year ago

And since you like optimizations, try adding -D__CONSTLIBBASEDECL__=const to avoid a6 reloads. Only the files where libraries are loaded must not use this. If all is done via libnix`s auto-open feature, then you are fine.

mheyer32 commented 1 year ago

Another way to work around the base pointer reload is to declare a local variable of the same name as shown here: https://github.com/mheyer32/camd_tools/blob/main/musplay.c#L38

bebbo commented 1 year ago

You may try out -fno-double-indirect