32-bit alignment for ints

mikrosk commented 1 year ago

This is a follow-up of https://github.com/freemint/m68k-atari-mint-gcc/issues/18#issuecomment-1690584149 ... as we have seen, some projects may depend on this feature, like https://github.com/dxx-rebirth/dxx-rebirth/issues/724.

Accidentally, just a few days ago a totally identical problem has appeared in m68k-elf-gcc: https://lists.debian.org/debian-68k/2023/08/msg00000.html.

There are some arguments for and against, I don't know. IMHO pros outweigh cons. If agreed, this would be an ideal candidate for ABI change for the new elf toolchain.

th-otto commented 1 year ago

I don't think that this would be a good idea. I've already shown that Vincent's change from STRUCTURE_SIZE_BOUNDARY to PCC_BITFIELDTYPE_MATTERS introduces unforseeable ABI changes (and i'm still suggesting to revert that). Using a compiler that uses one setting, and a mintlib that was compiled by the other, breaks without any obvious reason. I'm quite sure same would happen to EmuTOS and the mint kernel.

Also, the suggestion on the debian ML to introduce a new tuple for that type of platform is IMHO totally nonsense. For example, Neither linux nor windows targets need a different tuple only because they changed time_t from 32 to 64bit. Doing that would generate myriads of platform tuples for every change.

vinriviere commented 1 year ago

About all those structural gcc settings (alignment, ABI...), we should really:

list the different parameters (there aren't much)
see how they are set on various m68k targets (preferably alive: embedded, Linux...)
determine what is really important for our target (i.e. who returns structs by value in the real world?)
make choices for a good reason, for present and future.

And I completely agree: the mintelf toolchain is an ideal opportunity to change our GCC settings. To something better, as we decide.

BTW:

we could have different toolchains with different settings (if that matters). Anyone is free to configure GCC for his own needs.
maybe we could add new command-line arguments to dynamically change the behaviour
but generally, simpler is better.

AAARGS. Sorry, i accidently edited your comment instead of mine. Tried to revert it, hopefully got it right.

vinriviere commented 1 year ago

About EmuTOS:

it's important to return pointers to d0 (not a0) because it's what assembler code expects.
variables alignment on 2 produces more compact code, which is a good thing for 68000
default alignment on 4 shouldn't hurt, except additional memory usage
current code compiles and work well with both m68k-atari-mint and m68k-elf, so those settings don't seem to be much important for that project. But as EmuTOS is self-contained, that's certainly a special case.

th-otto commented 1 year ago

who returns structs by value in the real world?

Most c++ code does that.

Anyone is free to configure GCC for his own needs.

Theoretically, yes. But there aren't that many people that would be able to do that, or understand the consequences. And of course you would be on your own when you need 3rd party libraries. We wouldn't do ppl any favor if we tell them: "here are the sources, here are the patches, and here is a sample script that works in my environment", and then expecting them to built their own toolchain. We shouldn't divide our small community even more.

th-otto commented 1 year ago

it's important to return pointers to d0 (not a0) because it's what assembler code expects.

Actually, that is the default in m68k.h: https://github.com/freemint/m68k-atari-mint-gcc/blob/b2d961e7342b5ba4e57adfa81cb189b738d10901/gcc/config/m68k/m68k.h#L502-L503

but changed for linux/svr4 abi:

https://github.com/freemint/m68k-atari-mint-gcc/blob/b2d961e7342b5ba4e57adfa81cb189b738d10901/gcc/config/m68k/linux.h#L156-L157

Furthermore, additional code is generated in the callee that actually returns pointer values in both registers:

https://github.com/freemint/m68k-atari-mint-gcc/blob/b2d961e7342b5ba4e57adfa81cb189b738d10901/gcc/config/m68k/m68k.c#L5311-L5326

vinriviere commented 1 year ago

Most c++ code does that.

Hum.. True.

About third-party libraries: well, I never used any. Each time I needed one (i.e. readline, mpfr...) I compiled it and I shipped it with my cross-tools binaries. Main idea was to provide standalone cross-tools, so anyone can easily build their own software, without caring about libraries. So whatever settings are used, I don't really mind, as I always rebuild all the libs when GCC significantly changes. And that's the same for any user of my binaries, if they don't use extra libraries in binary form.

Do really people take libraries in binary form from somewhere and use them on different toolchain? I don't know. But to avoid trouble, I wouldn't recommend that.

Originally, I rebuilt all the libs because I wanted to build everything from sources, and specifically with GCC 4 (which was brand new, at that time). Then I wanted to build and provide ColdFire-only binaries, to be usable with EmuTOS for ColdFire.

I don't have definite answers. I can only remember that some choices have important consequences. Some don't.

vinriviere commented 1 year ago

it's important to return pointers to d0 (not a0) because it's what assembler code expects.

Actually, that is the default in m68k.h:

Yes. Initially, on GCC 4, I used that Linux ABI to return pointers. Then I quickly reverted that, because it was incompatible with EmuTOS assembler files. The register used to return pointers is an important parameter, because pointers are often returned in manually-written assembler sources, and the C/ASM interface must match.

About other parameters... well, we have to make a list to clearly see where they are really important.

th-otto commented 1 year ago

I'm quite sure that people that use our toolchains don't rebuild libraries, not even mintlib or gemlib. They are just happy to get a working environment.

As to the consequences about ABI changes: i do understand that some of your changes (like leading underscores or not) were done on purpose, and others (like using m68kemb.h and getting a different default ABI) aren't. And arguable, some ABI changes (like returning float values in FPU registers) may actually give a few performance gains. Question is whether that really outweights the problems that go with it.

mikrosk commented 1 year ago

Using a compiler that uses one setting, and a mintlib that was compiled by the other, breaks without any obvious reason. I'm quite sure same would happen to EmuTOS and the mint kernel.

That's why mintelf is so attractive now. We will (and must) state that that you simply can't mix old and new toolchain. So this danger is gone, <prefix> is different, too so this shouldn't happen even by accident.

I agree that reverting back to fp0 / a0 would be perhaps a bad idea (plus returning stuff always in d0 does have user advantage) but about the rest... I wouldn't care so much. As I mentioned earlier, for me a smaller patch = better patch because we don't depend on someone else (like Thorsten or Vincent) to know all the compiler magic if something changes upstream.

So far I'm aware of:

struct alignment affecting std:array and gdb (fixed in mintelf, but with Thorsten being suspicious about that ;-))
struct alignment of 32-bit ints (so far still being at 2 bytes, upstream at least m68k-linux seems to be doing the same but breaking some not-so-forgiving real world applications)
leading underscores (kept as in old a.out, differs from upstream)
--with-default-libstdcxx-abi=gcc4-compatible .. according to https://gcc.gnu.org/onlinedocs/libstdc++/manual/configure.html it doesn't seem to be necesarry, esp. for a new toolchain... I thought I have been using it, too but nope, I have not.

th-otto commented 1 year ago

So this danger is gone, <prefix> is different, too so this shouldn't happen even by accident.

But you don't understand the consequences. Changes in structure layout are of course a concern. Some of them are passed as-is to the kernel. Do you really want to go through all of it and check where that might be a problem? Same will happen inside the kernel, when calling driver functions etc.

I wouldn't care so much

But why for example changing structure return register to a0 instead of a1? That does not give any performance gain, and introduces an incompatibility for no good reason (yes, i know, that was not vincents intention, but it has to be reverted).

* struct alignment affecting std:array and gdb (fixed in mintelf, but with Thorsten being suspicious about that ;-))

That would be the only argument, that we may encounter code that does not compile because of this. But it can easily be fixed in gdb (i already wrote that the patch now handles both cases). And i still think that code that assumes alignof(sometype) == sizeof(sometype) is broken. They are two different things.

And better get compile time errors, pointing you directly to the problem, than hunting down such alignment mismatches at runtime. Those will be hard to find, even using gdb ;)

upstream at least m68k-linux seems to be doing the same

... but at the cost that new libraries will only work with newer kernels, and vice versa. Something that we simply cannot do.

* leading underscores (kept as in old a.out, differs from upstream)

Yes but that is an API change, not an ABI change. But i agree that keeping the underscores is something that helps switching to that newer toolchain. Just means i'll have to recompile a few hundred libs sigh

* `--with-default-libstdcxx-abi=gcc4-compatible` ..

To be honest, i'm not entirely sure what will be affected by this. But this will only affect c++ code, and only the standard c++ library; since that is already compiler-version dependant, i don't see much problems with this, and already changed that in my scripts, too.

mikrosk commented 1 year ago

Some of them are passed as-is to the kernel.

Do we really have 32-bit code passed to kernel with expectation of being 16-bit aligned? AFAIK everything kernel-related is compiled with -mshort and that will be always 16-bit aligned.

But why for example changing structure return register to a0 instead of a1

Indeed, this smells but on the other hand, I refuse to accept that the only solution is to change alignment. Those looks like totally separate things to me, even if the old state can be achieved with the original alignment. So yes, let's bring back a0 but not via messing with alignment rules.

code that assumes alignof(sometype) == sizeof(sometype) is broken

The assert was to ensure binary compatibility with some ancient structs. So while the 16-bit alignment sounds like something which could be handled on the app's side (with padding), this one hardly, sizeof(std:array) is suddenly 'wrong' (padded with unexpected zeroes) and it is the compiler to blame, not the app.

th-otto commented 1 year ago

Think of structures like: struct { short a; long b; };

Currently, those will have the same layout, regardless whether compiled with mshort or not. With the new compiler, they will be different. Even worse, they will also be different in the kernel, since "b" will be aligned at 32bit, even with -mshort.

and it is the compiler to blame, not the app.

No, that's definitely wrong. Padding is always implementation dependent. Think of long double on x64, where the format is the same as for m68k (except endianness of course), but sizeof(long double) is 16.

The only portable way to define structures that require a specific memory layout is to use only arrays of chars.

vinriviere commented 1 year ago

I'm quite sure that people that use our toolchains don't rebuild libraries, not even mintlib or gemlib. They are just happy to get a working environment.

Exactly. We agree.

As to the consequences about ABI changes: i do understand that some of your changes (like leading underscores or not) were done on purpose, and others (like using m68kemb.h and getting a different default ABI) aren't. And arguable, some ABI changes (like returning float values in FPU registers) may actually give a few performance gains. Question is whether that really outweights the problems that go with it.

Again, well summarized.

To clarify, here is how we arrived to the current situation with the new mintelf toolchain:

I wanted to https://github.com/freemint/m68k-atari-mint-binutils-gdb/issues/1. First, because a.out is now obsolete in official binutils/gcc. And second, because I'm not comfortable with distributing the obsolete GCC 4.6.4 on my website. My initial goal was to provide up-to-date tools, not obsolete stuff.
I was sure that it was possible to easily design a new hybrid PRG/ELF format. Basically, by adding a PRG header and a relocation table to an existing ELF executable. And fixing the offsets accordingly.
So I used the m68k-elf toolchain to build EMUCON2.ELF, and I wrote a small program to convert it to PRG/ELF. It worked very quickly, so that was encouraging.
Then I patched binutils's ld in order to produce the very same bytes. This was much longer. I struggled to convince the ELF linker code to move the ELF File Header a little farther, but it finally worked. So I was happy when ld finally produced a PRG/ELF byte-to-byte identical to the output of my temporary program.
Then I took Thorsten's GCC 13 patches, and got an immediately working GCC 13 for my new mintelf binutils. Thanks again @th-otto for that, because this showed me immediately that it was possible to continue that way.
As I wanted a GCC patch as minimal as possible, I stripped most of Thorsten's evolutions (mainly, MinGW support), kept essential changes (mainly libstdc++), and configured the core settings just like m68k-elf. Again, I wanted to get exact byte-to-byte match to my initial testcase. So configuring the new target as m68k-elf was mandatory for that step.
After that I added that targ_underscore setting in order to compile any software (starting from EmuTOS) without a single modification. I did compile FreeMiNT that way, even if at that time I didn't test it because I hadn't the required setup with me.

So what? This ended up to the current experimental situation of the new m68k-atari-mintelf toolchain. At some time, this had to become public. So I did. Next step was gdb, and together we have had great success.

Now that PRG/ELF is debugged and well understood, we have to choose the settings we want for official configuration. That will be a choice, with good reasons to do so.

Again, as long as people will be able to build their favorite software with the new toolchain, without having to massively update their sources (i.e. leading underscore or register prefix), I will be happy.

If the FreeMiNT kernel relies on some alignment or ABI or whatever, specially between -mshort and -mlong programs, then that's a poor and fragile API. Anyway, if that exists, that must be fixed, or at least documented. We can immediately start a wiki page for that.

mikrosk commented 1 year ago

"b" will be aligned at 32bit, even with -mshort

If this is how it works and it is unavoidable then I agree, 32-bit alignment is no way for us. I was under impression that you can have 16-bit alignments with -mshort.

@vinriviere well summarised, no arguments from me there.

th-otto commented 1 year ago

If this is how it works and it is unavoidable then I agree, 32-bit alignment is no way for us

Yes, it does. I think your confusing two different things here, alignment of int and structure padding. Alignment of int can already be forced by the existing compiler, and the discussion on debian ML is only whether that should be the default:

$ cat foo.c
#include <stddef.h>

struct s {
        short a;
        long b;
};

long o = offsetof(struct s, b);

$ m68k-atari-mint-gcc -S -o -   ./foo.c | grep long
        .long   2
$ m68k-atari-mint-gcc -S -o - -malign-int  ./foo.c | grep long
        .long   4
$ m68k-atari-mint-gcc -S -o -  -mshort ./foo.c | grep long
        .long  2
$ m68k-atari-mint-gcc -S -o - -malign-int -mshort ./foo.c | grep long
        .long  4

vinriviere commented 1 year ago

Think of structures like: struct { short a; long b; };

So, what's the point? Do we need a "short" padding between a and b?

As a 68000 owner, I will say: no! That would be a waste of space. And even the PRG header 0x601a + sizeof_text would not be possible with such padding. Unless the evil pragma pack is used. Or arrays of chars, as Thorsten (and binutils) proposed. Also, at least GEM structures are full of short and long mixtures like that. That would probably cause much trouble to add padding that way.

As a ColdFire owner, I would say: yes, do that! Because that's faster. But that's a nonsense. If that struct was designed for speed, it would not use a short before a long. New code caring about speed in FastRAM should not align long members on non-multiple of 4.

What matters: give the programmer the ability to align longs on multiple of 4 bytes, if he wants. But don't force that.

In the current experimental m68k-atari-mintelf toolchain, I carefully checked that .align 4 does the right thing. And this is honored by TOS 4 and FreeMiNT (not TOS 1-2-3). For EmuTOS it depends (default CPU?).

So regarding to the example above, I would be in favor to keep the structure short/long packed. If someone wants to align the long, then manually add a short filler just before. Of course I will be happy to change my mind if someone provides a good argument to do the contrary (align longs on 32-bit).

And once again, if that really matters, the best solution would probably be to add a nonstandard option to change that alignment setting when compiling (rather than at gcc configure time).

vinriviere commented 1 year ago

Now I'm also confused. I completely gorgot about -malign-int.

Looking at @th-otto's latest post, my opinion is:

Use -mno-align-int by default (align long on 16-bit only)
But the developer is free to use -malign-int instead where appropriate (align long on 32-bit)

th-otto commented 1 year ago

That would probably cause much trouble to add padding that way.

Not only that. You can always add padding if you are concerned about speed, but to remove compiler-inserted padding, you would have to declare all such structures as packed. And then expect a lot of warnings that you are doing unaligned accesses to packed structures.

And once again, if that really matters, the best solution would probably be to add a nonstandard option to change that alignment setting when compiling (rather than at gcc configure time).

But such an option already exists: -malign-int. IIRC there is even an some warning option, when using that produces a different layout than the compilers default.

So conclusion: making that a default would even cause much more trouble than the std::array issue.

I think the only motivation for the debian folks is to more closely match what is used on other platforms. But that is of not much concern to us (unless someone wants to backport emutos & mint to x86)

mikrosk commented 1 year ago

I have also forgotten about -malign-int , I'm totally fine with that solution.

I'll test and report back, maybe there's something else related to aligning worth discussing.

th-otto commented 1 year ago

But don't forget that you can't use that option with mintlib, or with any other library that you don't compile yourself in the project.

vinriviere commented 1 year ago

There's a lot of noise here, so let's make a summary about the initial issue: 32-bit alignment for ints. Most of the information here has already been said.

@th-otto has provided a nice testcase: https://github.com/freemint/m68k-atari-mint-gcc/issues/20#issuecomment-1695767920

struct s {
        short a;
        long b;
}

We can see that using -mshort or not has no effect. This option is unrelated.

Question is: Should gcc add a "short" padding between a and b, so "long b" is a aligned on a 4-byte boundary?

Nowadays, for modern processors, "natural" alignment is to align types on a multiple of their sizes. For example, on x86, if I compile the testcase with gcc -m32 I get .long 4 (so: padding). While traditionally, m68k compilers don't add any padding in that case. Because long alignment on even addresses is enough with a 16-bit bus. No need for 32-bit alignement. However it matters with a 32-bit bus (i.e. 68060 or ColdFire) in TT-RAM.

So when no care is taken to order or pad struct contents, m68k structs are generally shorter than x86 structs. This is what happens in https://github.com/dxx-rebirth/dxx-rebirth/issues/724. Even if m68k is different that x86 in this regard, that's perfectly legal, regarding to C standard. Such struct differences are unimportant in memory. However they are critical when writing structs to files, in order to read them elsewhere. So relying on some particular layout for structs, like dxx-rebirth, is a program flaw. If padding is important, then it must be manually added with char fillers. Then the result must be checked with asserts, as it is already done.

Inside gcc, this alignment is defined by the BIGGEST_ALIGNMENT parameter: https://github.com/freemint/m68k-atari-mint-gcc/blob/f2ad8b025485fa3703447ed0d0c7e27d406df338/gcc/config/m68k/m68k.h#L301-L305

By default, it is 16-bit (no filler), but if -malign-int is used, it becomes 32-bit (short filler).

I think that it would be problematic to change that default. If we set BIGGEST_ALIGNMENT to 32 unconditionally, we will have the same behaviour as x86. But I guess that would be a headache for many TOS/GEM structures mixing shorts and longs, they would become impossible to represent with gcc structs. On the other hand, as already said, it's always possible to manually do the opposite: add extra padding with explicit fillers.

So my advice is to keep BIGGEST_ALIGNMENT as is, set to 16 by default.

I have also forgotten about -malign-int , I'm totally fine with that solution.

Fine. So is it OK to close this issue, without further action?

I'll test and report back, maybe there's something else related to aligning worth discussing.

In order to keep things tidy, please open other issues for problems unrelated to field alignment inside structs.

th-otto commented 1 year ago

Well summarized. And i think too, this can be closed.

mikrosk commented 1 year ago

But don't forget that you can't use that option with mintlib, or with any other library that you don't compile yourself in the project.

Yes, that's a good reminder. It somewhat limits its usage, or at least complicates it a bit.

Anyway. As a proof of concept I wanted to give it a try. Without -malign-int:

[mikro@pc dxx-rebirth.git]$ m68k-atari-mintelf-g++ -c -g -O2 -ftabstop=4 -Wall -Werror=extra -Werror=format=2 -Werror=missing-braces -Werror=missing-include-dirs -Werror=uninitialized -Werror=undef -Werror=pointer-arith -Werror=cast-qual -Werror=missing-declarations -Werror=vla -m68020-60 -fomit-frame-pointer -D__STDC_LIMIT_MACROS -Wno-error=format-truncation -funsigned-char -std=gnu++20 -Werror=unused -Werror=useless-cast -Wimplicit-fallthrough=5 -Wno-dangling-reference -fvisibility=hidden -Wduplicated-branches -Wduplicated-cond -Wsuggest-attribute=noreturn -Wsuggest-final-types -Wsuggest-override -Wlogical-op -Wold-style-cast -Wredundant-decls -Wno-sign-compare -DPHYSFS_DEPRECATED= -DDXX_USE_SHAREPATH=1 -DNDEBUG -DRELEASE -DDXX_BUILD_DESCENT_II -D__STDC_FORMAT_MACROS -Icommon/include -Icommon/main -I. -Ibuild -I/home/mikro/gnu-tools/m68000/m68k-atari-mintelf/sys-root/usr/include -I/home/mikro/gnu-tools/m68000/m68k-atari-mintelf/sys-root/usr/include/SDL -Id2x-rebirth/main similar/main/scores.cpp -o build/similar/main/.d2x-rebirth.scores.o
similar/main/scores.cpp:110:34: error: static assertion failed: high score size wrong
  110 | static_assert(sizeof(all_scores) == 336, "high score size wrong");
      |               ~~~~~~~~~~~~~~~~~~~^~~~~~
similar/main/scores.cpp:110:34: note: the comparison reduces to '(314 == 336)'

with -malign-int:

[mikro@pc dxx-rebirth.git]$ m68k-atari-mintelf-g++ -c -g -O2 -ftabstop=4 -Wall -Werror=extra -Werror=format=2 -Werror=missing-braces -Werror=missing-include-dirs -Werror=uninitialized -Werror=undef -Werror=pointer-arith -Werror=cast-qual -Werror=missing-declarations -Werror=vla -m68020-60 -fomit-frame-pointer -malign-int -D__STDC_LIMIT_MACROS -Wno-error=format-truncation -funsigned-char -std=gnu++20 -Werror=unused -Werror=useless-cast -Wimplicit-fallthrough=5 -Wno-dangling-reference -fvisibility=hidden -Wduplicated-branches -Wduplicated-cond -Wsuggest-attribute=noreturn -Wsuggest-final-types -Wsuggest-override -Wlogical-op -Wold-style-cast -Wredundant-decls -Wno-sign-compare -DPHYSFS_DEPRECATED= -DDXX_USE_SHAREPATH=1 -DNDEBUG -DRELEASE -DDXX_BUILD_DESCENT_II -D__STDC_FORMAT_MACROS -Icommon/include -Icommon/main -I. -Ibuild -I/home/mikro/gnu-tools/m68000/m68k-atari-mintelf/sys-root/usr/include -I/home/mikro/gnu-tools/m68000/m68k-atari-mintelf/sys-root/usr/include/SDL -Id2x-rebirth/main similar/main/scores.cpp -o build/similar/main/.d2x-rebirth.scores.o
during RTL pass: expand
similar/main/scores.cpp: In function 'void d2x::{anonymous}::scores_read(all_scores*)':
similar/main/scores.cpp:159:1: internal compiler error: Segmentation fault
  159 | }
      | ^
0x160b1c7 internal_error(char const*, ...)
        ???:0
0xa67312 assign_temp(tree_node*, int, int)
        ???:0
0xa21b78 emit_push_insn(rtx_def*, machine_mode, tree_node*, rtx_def*, unsigned int, int, rtx_def*, poly_int<1u, long>, rtx_def*, rtx_def*, int, rtx_def*, bool)
        ???:0
0x8fbc59 emit_library_call_value_1(int, rtx_def*, rtx_def*, libcall_type, machine_mode, int, std::pair<rtx_def*, machine_mode>*)
        ???:0
0x9f1341 finish_eh_generation()
        ???:0
Please submit a full bug report, with preprocessed source (by using -freport-bug).
Please include the complete backtrace with any bug report.
See <https://github.com/freemint/m68k-atari-mint-gcc/issues> for instructions.

I'm not so keen to close this as solved. ;-)

vinriviere commented 1 year ago

Good, you have a nice testcase to open a bug on GCC Bugzilla.

mikrosk commented 1 year ago

... except that it compiles fine on m68k-elf-g++ 13.2.0.

Attached preprocessed cpp file, just to be sure I'm not compiling mine incorrectly.

Compile as

m68k-elf-g++ -c -g -O2 -ftabstop=4 -Wall -Werror=extra -Werror=format=2 -Werror=missing-braces -Werror=missing-include-dirs -Werror=uninitialized -Werror=undef -Werror=pointer-arith -Werror=cast-qual -Werror=missing-declarations -Werror=vla -m68020-60 -fomit-frame-pointer -malign-int -Wno-error=format-truncation -funsigned-char -std=gnu++20 -Werror=unused -Werror=useless-cast -Wimplicit-fallthrough=5 -Wno-dangling-reference -fvisibility=hidden -Wduplicated-branches -Wduplicated-cond -Wsuggest-attribute=noreturn -Wsuggest-final-types -Wsuggest-override -Wlogical-op -Wold-style-cast -Wredundant-decls -Wno-sign-compare d2x-rebirth.scores.cpp -o d2x-rebirth.scores.o

(replace m68k-elf-g++ with m68k-atari-mintelf-g++ to see the crash)

d2x-rebirth.scores.cpp.zip

th-otto commented 1 year ago

Can you maybe try to reduce it to a smaller testcase that still gives the ICE?

vinriviere commented 1 year ago

I've just had a quick look.

The command line can be reduced to: m68k-atari-mintelf-g++ -c -malign-int -std=gnu++20 d2x-rebirth.scores.cpp

The line causing the ICE is inside assign_builtin_placeholder_scores(): for (auto &&[idx, stat] : enumerate(scores.stats))

I'm not familiar with that strange syntax, and I don't plan to investigate more. This looks like a specific case appearing when using -malign-int and specific C++ features. As this doesn't seem to be a general issue with -malign-int, I would propose to open a new issue for that specific case.

th-otto commented 1 year ago

I don't care about much about the length of the command line, more about the ~3MB source file ;)

th-otto commented 1 year ago

Hm, i don't get that ICE. But some static assertions (with or without -malign-int):

In file included from common/main/inferno.h:32,
                 from common/main/polyobj.h:36,
                 from common/main/fwd-object.h:19,
                 from common/main/fwd-game.h:25,
                 from common/main/game.h:28,
                 from similar/main/scores.cpp:40:
common/main/player-callsign.h:80:34: error: static assertion failed: callsign_t too big
common/main/player-callsign.h:80:34: note: the comparison reduces to '(10 == 9)'
In file included from similar/main/scores.cpp:45:
common/main/player.h:173:33: error: static assertion failed: wrong size player_rw
common/main/player.h:173:33: note: the comparison reduces to '(143 == 142)'

mikrosk commented 1 year ago

@th-otto it seems that you don't have that std::array fix applied (sorry can't remember it's exact name now). What makes actually an interesting point, it seems that -malign-int interferes with that somehow so it crashes in mine (where applied) and doesn't on yours (where isn't).

Those asserts are the very reason why I find that fix so helpful for me.

th-otto commented 1 year ago

Yes, i currently use the old setting again (STRUCTURE_SIZE_BOUNDARY = 16)

it crashes in mine (where applied) and doesn't on yours (where isn't).

A good indication why we should not change that setting.

mikrosk commented 1 year ago

A good indication why we should not change that setting.

Well, that setting is active for the m68k-elf target, too, isn't it? So to me it looks like a good test case to verify that our other settings are valid / supported.

mikrosk commented 1 year ago

Ok, I've stripped it down to ~200 lines of code. The offending enumerate is a custom class which is now included in the cpp file. Compile as m68k-atari-mintelf-g++ -c -malign-int -std=gnu++20 scores.cpp -o scores.o:

#include <algorithm>
#include <ranges>
#include <iterator>
#include <tuple>
#include <type_traits>

namespace ranges {

using std::ranges::range;
using std::ranges::borrowed_range;
using std::ranges::subrange;
using std::ranges::find;
using std::ranges::find_if;

}

namespace d_enumerate {

namespace detail {

/* In the general case, `value_type` is a tuple of the index and the
 * result of the underlying sequence.
 *
 * In the special case that `result_type` is a tuple, construct a
 * modified tuple instead of using a tuple of (index, inner tuple).
 * This allows simpler structured bindings.
 */
template <typename index_type, typename result_type>
struct adjust_iterator_dereference_type : std::false_type
{
    using value_type = std::tuple<index_type, result_type>;
};

template <typename index_type, typename... T>
struct adjust_iterator_dereference_type<index_type, std::tuple<T...>> : std::true_type
{
    using value_type = std::tuple<index_type, T...>;
};

/* Retrieve the member typedef `index_type` if one exists.  Otherwise,
 * use `std::size_t` as a fallback.
 */
std::size_t array_index_type(...);

/* Add, then remove, a reference to the `index_type`.  If `index_type`
 * is an integer or enum type, this produces `index_type` again.  If
 * `index_type` is `void`, then this overload refers to
 * `std::remove_reference<void &>::type`, which is ill-formed, and the
 * overload is removed by SFINAE.  This allows target types to use
 * `using index_type = void` to decline to specify an `index_type`,
 * which may be convenient in templates that conditionally define an
 * `index_type` based on their template parameters.
 */
template <typename T>
typename std::remove_reference<typename T::index_type &>::type array_index_type(T *);

}

}

template <typename sentinel_type>
class enumerated_sentinel
{
public:
    sentinel_type m_sentinel;
    constexpr enumerated_sentinel() = default;
    constexpr enumerated_sentinel(sentinel_type &&iter) :
        m_sentinel(std::move(iter))
    {
    }
};

template <typename range_index_type, typename range_iterator_type, typename sentinel_type, typename adjust_iterator_dereference_type>
class enumerated_iterator
{
    range_iterator_type m_iter;
    range_index_type m_idx;
public:
    using index_type = range_index_type;
    using iterator_category = std::forward_iterator_tag;
    using value_type = typename adjust_iterator_dereference_type::value_type;
    using difference_type = std::ptrdiff_t;
    using pointer = value_type *;
    using reference = value_type &;
    constexpr enumerated_iterator(range_iterator_type &&iter, const index_type idx) :
        m_iter(std::move(iter)), m_idx(idx)
    {
    }
    value_type operator*() const
    {
        if constexpr (adjust_iterator_dereference_type::value)
            return std::tuple_cat(std::tuple<index_type>(m_idx), *m_iter);
        else
            return {m_idx, *m_iter};
    }
    enumerated_iterator &operator++()
    {
        ++ m_iter;
        if constexpr (std::is_enum<index_type>::value)
            /* An enum type is not required to have operator++()
             * defined, and may deliberately omit it to avoid allowing
             * the value to be incremented in the typical use case.  For
             * this situation, an increment is desired, so emulate
             * operator++() by casting to the underlying integer
             * type, adding 1, and casting back.
             */
            m_idx = static_cast<index_type>(1u + static_cast<typename std::underlying_type<index_type>::type>(m_idx));
        else
            ++ m_idx;
        return *this;
    }
    enumerated_iterator operator++(int)
    {
        auto result = *this;
        ++ * this;
        return result;
    }
    constexpr bool operator==(const enumerated_sentinel<sentinel_type> &i) const
    {
        return m_iter == i.m_sentinel;
    }
};

template <typename range_iterator_type, typename range_sentinel_type, typename range_index_type>
class enumerate : ranges::subrange<range_iterator_type, range_sentinel_type>
{
    using base_type = ranges::subrange<range_iterator_type, range_sentinel_type>;
    using iterator_dereference_type = decltype(*std::declval<range_iterator_type>());
    using enumerated_iterator_type = enumerated_iterator<
        range_index_type,
        range_iterator_type,
        range_sentinel_type,
        d_enumerate::detail::adjust_iterator_dereference_type<range_index_type, typename std::remove_cv<iterator_dereference_type>::type>>;
    const range_index_type m_idx;
public:
    using index_type = range_index_type;
    template <typename range_type>
        /* Block using `enumerate` on an ephemeral range, since the storage
         * owned by the range must exist until the `enumerate` object is
         * fully consumed.  If `range_type &&` is an ephemeral range, then its
         * storage may cease to exist after this constructor returns.
         */
        requires(ranges::borrowed_range<range_type>)
        enumerate(range_type &&t, const index_type i = index_type{}) :
            base_type(std::forward<range_type>(t)), m_idx(i)
    {
        static_assert(std::is_rvalue_reference<range_type &&>::value || !std::is_rvalue_reference<iterator_dereference_type>::value, "lvalue range must not produce rvalue reference enumerated_value");
    }
    enumerated_iterator_type begin() const
    {
        return {this->base_type::begin(), m_idx};
    }
    enumerated_sentinel<range_sentinel_type> end() const
    {
        return {this->base_type::end()};
    }
};

template <typename range_iterator_type, typename range_sentinel_type, typename index_type>
inline constexpr bool std::ranges::enable_borrowed_range<enumerate<range_iterator_type, range_sentinel_type, index_type>> = true;

template <typename range_type, typename index_type = decltype(d_enumerate::detail::array_index_type(static_cast<typename std::remove_reference<range_type>::type *>(nullptr)))>
requires(ranges::borrowed_range<range_type>)
enumerate(range_type &&r, index_type start = {/* value ignored for deduction guide */}) -> enumerate</* range_iterator_type = */ decltype(std::ranges::begin(std::declval<range_type &>())), /* range_sentinel_type = */ decltype(std::ranges::end(std::declval<range_type &>())), index_type>;

#define COOL_MESSAGE_LEN    50
constexpr std::integral_constant<unsigned, 10> MAX_HIGH_SCORES{};

struct stats_info
{
    int     score;
    char   starting_level;
    char   ending_level;
    char   diff_level;
    short   kill_ratio;     // 0-100
    short   hostage_ratio;  // 
    int     seconds;        // How long it took in seconds...
};

struct all_scores
{
    char            signature[3];           // DHS
    char           version;             // version
    char            cool_saying[COOL_MESSAGE_LEN];
    stats_info  stats[MAX_HIGH_SCORES];
};

static void assign_builtin_placeholder_scores(all_scores &scores)
{
    for (auto &&[idx, stat] : enumerate(scores.stats))
        stat.score = (10 - idx) * 1000;
}

Still applies that m68k-elf compiles this just fine.

th-otto commented 1 year ago

If m68k-elf compiles, my toolchain (with the old setting) compiles, but vincents toolchain gives ICE, what does it tell you?

Edit: i don't get an ICE with the above sample, neither with vincents nor with my toolchain

mikrosk commented 1 year ago

Thorsten, you're still missing the point: mint-elf-gcc compiles is just fine and this gcc has STRUCTURE_SIZE_BOUNDARY set 16.

So arguing that your toolchain setting is better because it doesn't crash is just wrong. There is something else going on than just (not) setting this variable to 16.

EDIT: @vinriviere was able to crash it so I'd argue that your "Vincent's" toolchain is not compiled properly on your side. Or maybe actually is but then you have to tell us your secret. ;)

th-otto commented 1 year ago

mint-elf-gcc compiles is just fine and this gcc has STRUCTURE_SIZE_BOUNDARY set 16.

What makes you think so? It does not define that at all, having the same effect as defining it to 8.

th-otto commented 1 year ago

Or maybe actually is but then you have to tell us your secret. ;)

I didn't change the sources. That is, i took a git archive from https://github.com/freemint/m68k-atari-mint-gcc/tree/gnu/releases/gcc-13 without applying any further patches in the script, then compiled that on linux. The hosts gcc is gcc-13.2.1 here.

mikrosk commented 1 year ago

Erm, yes, I meant STRUCTURE_SIZE_BOUNDARY set 8, i.e. the default which me and Vincent are in favour to keep while you prefer to set it 16 as in the a.out build.

vinriviere commented 1 year ago

If m68k-elf compiles, my toolchain (with the old setting) compiles, but vincents toolchain gives ICE, what does it tell you?

It means that my experimental toolchain isn't mature enough, and needs to be polished.

vinriviere commented 1 year ago

Ok, I've stripped it down to ~200 lines of code.

About that bug (which would definitely deserve its own issue): I've just built my gcc Ubuntu package with debug symbols to see the backtrace when a crash occurs.

How to use:

vi /etc/apt/sources.list.d/vriviere-ubuntu-mintelf-lunar.list

deb https://ppa.launchpadcontent.net/vriviere/mintelf/ubuntu/ lunar main

Add main/debug at the end of that line, so it looks like that:

deb https://ppa.launchpadcontent.net/vriviere/mintelf/ubuntu/ lunar main main/debug

sudo apt update
sudo apt install gcc-m68k-atari-mintelf-dbgsym

Then:

$ m68k-atari-mintelf-g++ -c -malign-int -std=gnu++20 scores.cpp -o scores.o
during RTL pass: expand
In file included from /usr/m68k-atari-mintelf/include/c++/13.2.0/bits/uses_allocator_args.h:38,
                 from /usr/m68k-atari-mintelf/include/c++/13.2.0/bits/memory_resource.h:41,
                 from /usr/m68k-atari-mintelf/include/c++/13.2.0/string:58,
                 from /usr/m68k-atari-mintelf/include/c++/13.2.0/bits/locale_classes.h:40,
                 from /usr/m68k-atari-mintelf/include/c++/13.2.0/bits/ios_base.h:41,
                 from /usr/m68k-atari-mintelf/include/c++/13.2.0/streambuf:43,
                 from /usr/m68k-atari-mintelf/include/c++/13.2.0/bits/streambuf_iterator.h:35,
                 from /usr/m68k-atari-mintelf/include/c++/13.2.0/iterator:66,
                 from /usr/m68k-atari-mintelf/include/c++/13.2.0/ranges:43,
                 from scores.cpp:2:
/usr/m68k-atari-mintelf/include/c++/13.2.0/tuple: In constructor 'constexpr std::tuple<_T1, _T2>::tuple(const _T1&, const _T2&) [with bool _Dummy = true; typename std::enable_if<std::_TupleConstraints<_Dummy, _T1, _T2>::__is_implicitly_constructible<const _T1&, const _T2&>(), bool>::type <anonymous> = true; _T1 = long unsigned int; _T2 = stats_info&]':
/usr/m68k-atari-mintelf/include/c++/13.2.0/tuple:1326:36: internal compiler error: Segmentation fault
 1326 |         : _Inherited(__a1, __a2) { }
      |                                    ^
0xd15523 crash_signal
    ../../gcc-13.2.0/gcc/toplev.cc:314
0x7fbd9963c4af ???
    ./signal/../sysdeps/unix/sysv/linux/x86_64/libc_sigaction.c:0
0xa2d186 assign_temp(tree_node*, int, int)
    ../../gcc-13.2.0/gcc/function.cc:976
0x9e6c8f emit_push_insn(rtx_def*, machine_mode, tree_node*, rtx_def*, unsigned int, int, rtx_def*, poly_int<1u, long>, rtx_def*, rtx_def*, int, rtx_def*, bool)
    ../../gcc-13.2.0/gcc/expr.cc:4925
0x8be619 emit_library_call_value_1(int, rtx_def*, rtx_def*, libcall_type, machine_mode, int, std::pair<rtx_def*, machine_mode>*)
    ../../gcc-13.2.0/gcc/calls.cc:4571
0x9b117c emit_library_call(rtx_def*, libcall_type, machine_mode, rtx_def*, machine_mode)
    ../../gcc-13.2.0/gcc/rtl.h:4337
0x9b117c sjlj_emit_function_enter
    ../../gcc-13.2.0/gcc/except.cc:1212
0x9b66a1 sjlj_build_landing_pads
    ../../gcc-13.2.0/gcc/except.cc:1491
0x9b66a1 finish_eh_generation()
    ../../gcc-13.2.0/gcc/except.cc:1520
0x8d73fc execute
    ../../gcc-13.2.0/gcc/cfgexpand.cc:6950
Please submit a full bug report, with preprocessed source (by using -freport-bug).
Please include the complete backtrace with any bug report.
See <https://github.com/freemint/m68k-atari-mint-gcc/issues> for instructions.

vinriviere commented 1 year ago

The bug may or may not be similar to that one. No idea. [9 Regression] internal compiler error: in assign_temp, at function.c:982

vinriviere commented 1 year ago

A more reduced testcase:

#include <tuple>

void f()
{
    int a, b;
    std::tie(a, b);
}

th-otto commented 1 year ago

Ok, did some retesting. Maybe i did something wrong last time, but in certain cases i also get ICE with my toolchain. So to summarize, i did the tests again with the large file (d2x-large.cpp), the reduced example (d2x-small.cpp) and the the minimal example from above (tuple.cpp). And here is what i get, First Vincents toolchain (configured without STRUCTURE_SIZE_BOUNDARY, and PCC_BITFIELD_TYPE_MATTERS)

$ /opt/cross-mintelf/bin/m68k-atari-mintelf-g++ -std=gnu++20  -c d2x-large.cpp 
similar/main/scores.cpp:110:34: error: static assertion failed: high score size wrong
similar/main/scores.cpp:110:34: note: the comparison reduces to '(314 == 336)'

$ /opt/cross-mintelf/bin/m68k-atari-mintelf-g++ -std=gnu++20 -malign-int  -c d2x-large.cpp 
during RTL pass: expand
common/main/gamefont.h: In member function 'float base_font_scale_proportion::operator*(float) const':
common/main/gamefont.h:75:14: internal compiler error: Segmentation fault

$ /opt/cross-mintelf/bin/m68k-atari-mintelf-g++ -std=gnu++20  -c d2x-small.cpp
(no errors, no crash)

$ /opt/cross-mintelf/bin/m68k-atari-mintelf-g++ -std=gnu++20 -malign-int -c d2x-small.cpp 
during RTL pass: expand
/opt/cross-mintelf/m68k-atari-mintelf/include/c++/13.2.0/tuple: In constructor 'constexpr std::tuple<_T1, _T2>::tuple(const _T1&, const _T2&) [with bool _Dummy = true; typename std::enable_if<std::_TupleConstraints<_Dummy, _T1, _T2>::__is_implicitly_constructible<const _T1&, const _T2&>(), bool>::type <anonymous> = true; _T1 = long unsigned int; _T2 = stats_info&]':
/opt/cross-mintelf/m68k-atari-mintelf/include/c++/13.2.0/tuple:1326:36: internal compiler error: Segmentation fault
 1326 |         : _Inherited(__a1, __a2) { }

$ /opt/cross-mintelf/bin/m68k-atari-mintelf-g++ -std=gnu++20  -c tuple.cpp
(no errors, no crash)

$ /opt/cross-mintelf/bin/m68k-atari-mintelf-g++ -std=gnu++20 -malign-int -c tuple.cpp 
during RTL pass: expand
/opt/cross-mintelf/m68k-atari-mintelf/include/c++/13.2.0/tuple: In constructor 'constexpr std::tuple<_T1, _T2>::tuple(const _T1&, const _T2&) [with bool _Dummy = true; typename std::enable_if<std::_TupleConstraints<_Dummy, _T1, _T2>::__is_implicitly_constructible<const _T1&, const _T2&>(), bool>::type <anonymous> = true; _T1 = int&; _T2 = int&]':
/opt/cross-mintelf/m68k-atari-mintelf/include/c++/13.2.0/tuple:1326:36: internal compiler error: Segmentation fault
 1326 |         : _Inherited(__a1, __a2) { }

Then same again, with my toolchain (configured with STRUCTURE_SIZE_BOUNDARY=16)

$ m68k-atari-mintelf-g++ -std=gnu++20 -c d2x-large.cpp 
common/main/player-callsign.h:80:34: error: static assertion failed: callsign_t too big
common/main/player-callsign.h:80:34: note: the comparison reduces to '(10 == 9)'
common/main/player.h:173:33: error: static assertion failed: wrong size player_rw
common/main/player.h:173:33: note: the comparison reduces to '(143 == 142)'
similar/main/scores.cpp:110:34: error: static assertion failed: high score size wrong
similar/main/scores.cpp:110:34: note: the comparison reduces to '(314 == 336)'

$ m68k-atari-mintelf-g++ -std=gnu++20 -malign-int -c d2x-large.cpp 
common/main/player-callsign.h:80:34: error: static assertion failed: callsign_t too big
common/main/player-callsign.h:80:34: note: the comparison reduces to '(10 == 9)'
common/main/player.h:173:33: error: static assertion failed: wrong size player_rw
common/main/player.h:173:33: note: the comparison reduces to '(143 == 142)'

$ m68k-atari-mintelf-g++ -std=gnu++20 -c d2x-small.cpp
(no errors, no crash)

$ m68k-atari-mintelf-g++ -std=gnu++20 -malign-int -c d2x-small.cpp 
during RTL pass: expand
/usr/m68k-atari-mintelf/sys-root/usr/include/c++/13/tuple: In constructor 'constexpr std::tuple<_T1, _T2>::tuple(const _T1&, const _T2&) [with bool _Dummy = true; typename std::enable_if<std::_TupleConstraints<_Dummy, _T1, _T2>::__is_implicitly_constructible<const _T1&, const _T2&>(), bool>::type <anonymous> = true; _T1 = long unsigned int; _T2 = stats_info&]':
/usr/m68k-atari-mintelf/sys-root/usr/include/c++/13/tuple:1326:36: internal compiler error: Segmentation fault
 1326 |         : _Inherited(__a1, __a2) { }

$ m68k-atari-mintelf-g++ -std=gnu++20  -c tuple.cpp
(no errors, no crash)

$ m68k-atari-mintelf-g++ -std=gnu++20 -malign-int -c tuple.cpp 
during RTL pass: expand
/usr/m68k-atari-mintelf/sys-root/usr/include/c++/13/tuple: In constructor 'constexpr std::tuple<_T1, _T2>::tuple(const _T1&, const _T2&) [with bool _Dummy = true; typename std::enable_if<std::_TupleConstraints<_Dummy, _T1, _T2>::__is_implicitly_constructible<const _T1&, const _T2&>(), bool>::type <anonymous> = true; _T1 = int&; _T2 = int&]':
/usr/m68k-atari-mintelf/sys-root/usr/include/c++/13/tuple:1326:36: internal compiler error: Segmentation fault
 1326 |         : _Inherited(__a1, __a2) { }

So to summarize:

some of the assertions are gone with vincents toolchain, but atleast one remains, unless you compile with -malign-int, but that gives ICE.
the ICE i get with -malign-int -c d2x-large.cpp using vincents toolchain seems to be a different one from the others that result from <tuple>
I have no idea why using -malign-int on the large file gives ICE with Vincents toolchain, but does not with mine. At the same time, both give ICE on the smaller example, and on tuple.cpp
the ICE from <tuple> can also be triggered by older toolchains, back to gcc-7.5.0. g++-4.6.4 does not have that header file.
using -malign-int is imho not an option in c++ code, since the libstdc++ was compiled without it

mikrosk commented 1 year ago

the ICE i get with -malign-int -c d2x-large.cpp using vincents toolchain seems to be a different one from the others that result from

It is actually the same cause -- one function calls another, I could easily get this one when I was reducing the code size but still not being finished. So I would count this one as the same.

using -malign-int is imho not an option in c++ code, since the libstdc++ was compiled without it

This is a very good guess, is there a way to quicky verify it? Using -malign-int for gcc CFLAGS during ./configure ?

th-otto commented 1 year ago

Don't know what to want to verify with it. That would be even more troublesome, even for testing purposes only, since you would end up with a libstdc++ that was compiled different than the compilers defaults. You also have to be careful that -malign-int is really only used when compiling the targets libstdc++, not for anything else. Dunno if that is possible at all.

vinriviere commented 1 year ago

I've reduced the dxx-rebirth bug to the bare minimum, and opened the https://github.com/freemint/m68k-atari-mint-gcc/issues/22 issue for it. Please continue the discussion about that specific bug there. Spoiler: it was already present in our old GCC 4.6.4, so that's not specifically caused by the mintelf toolchain.

mikrosk commented 1 year ago

Closing with the outcome that -mshort doesn't enforce the original (16-bit) alignment, therefore default 32-bit alignment is unusable for our target.

-malign-int is a way to do it explicitly but it has gotchas too (one has to be super-careful with linked libraries: they must be either compiled with -malign-int, too or not using any structs as parameters when calling each other).

vinriviere commented 1 year ago

BTW, a -malign-int multilib would make sense (maybe), if we often encounter poorly written software incompatible with m68k-specific struct alignment. That would help giving a quick try, before adding clean struct padding.

mikrosk commented 1 year ago

Hmm... not a bad idea but the rest still stays: you couldn't use gemlib (and perhaps also mintlib, cflib etc) as is as the alignments inside kernel structs would be wrong.

We couldn't even compile the kernel/xaaes with -malign-int so they are using the same padding: other TOS/GEM apps would be passing 16-bit aligned structs as before and therefore wouldn't work.

freemint / m68k-atari-mint-gcc

32-bit alignment for ints #20