freemint / m68k-atari-mint-gcc

Fork of GNU's gcc with support for the m68k-atari-mint target
https://github.com/freemint/m68k-atari-mint-gcc/wiki

Possible problems with new mintelf toolchain #18

Closed th-otto closed 9 months ago

th-otto commented 1 year ago

In your configuration, you use m68kelf.h as the target-machine-specific file. Among other things, this redefines M68K_STRUCT_VALUE_REGNUM to a0, which introduces an incompatibility (it is not redefined again in mint.h). In particular, it is incompatible with libgcc.a, which assumes a1. The same goes for STATIC_CHAIN_REGNUM.

There might be other traps. E.g. m68kelf.h defines ASM_OUTPUT_BEFORE_CASE_LABEL using the swbeg directive, which is only valid for svr4.

th-otto commented 1 year ago

An option would be to build a different crt.o for the different multilibs.

Yes, that needs to be done, even if the code is identical. I'm already working on this.

NB: .cpu 68k doesn't seem to hurt the old a.out gas, so we could add it to crt0.S unconditionally.

Maybe, but i think it's cleaner to have multiple crt0.o files.

Another problem that i just encountered:

ld: BFD (GNU Binutils for MiNT ELF 20230818) 2.41.0 assertion fail ../binutils-2.41/bfd/elf32-atariprg.c:142
collect2: fatal error: ld terminated with signal 11 [Segmentation fault], core dumped

That is from https://github.com/freemint/m68k-atari-mint-binutils-gdb/blob/368ff0c03fb7cbd072349ccc261b47f3bc9c2e6a/bfd/elf32-atariprg.c#L142 and happened because some a.out object files were accidentally lying around. Of course it's ok to abort the link in this case, but maybe with a nicer message ;)

th-otto commented 1 year ago

Hm, am i doing something wrong?

$ m68k-atari-mintelf-objdump -f lib020/crt0.o
architecture: m68k, flags 0x00000011:

Shouldn't that be m68k:m68020?

vinriviere commented 1 year ago

Well, I don't know. I only tested 68000 and ColdFire. And as @mikrosk said that -m68020-60 was broken, I'm lost. BTW, for more accurate information, use "m68k-atari-mintelf-readelf -h crt0.o". Note that ELF has a single flag covering 68000 to 68060; this is certainly the explanation. readelf reports 68000, "file" reports 68020, but I think that's the same thing.

mikrosk commented 1 year ago

And as @mikrosk said that -m68020-60 was broken

It was broken for compilers producing non-68000 code by default. So your and Thorsten's builds are always fine.

th-otto commented 1 year ago

I don't think that it has to do with the default cpu. The same happens when i invoke as directly. Also tested with the m68k-linux-gnu toolchain, which behaves the same.

I have changed mintlib to produce different crt0.o now, and successfully compiled it with an ld which had the hack removed.

Still some work to be done for gdb to properly support double. First off, i still have no idea where those feature tests about "org.gnu.gdb.m68k.core" etc. come from. Secondly, gdb's guess of where float values are returned is still wrong. We could change gcc to spit out a .gnu_attribute for that, i'm currently testing this. Or we could just change m68k-tdep.c accordingly.

th-otto commented 1 year ago

Implemented both now in my toolchain, and seems to work:

[Screenshot: Screenshot_20230826_100936]

However the output from readelf is a bit confusing (foo.o was compiled for -m68020-60)

$ m68k-atari-mintelf-readelf -A -S foo.o
Attribute Section: gnu
File Attributes
  Tag_GNU_M68K_ABI_FP: soft float

The output is from https://github.com/freemint/m68k-atari-mint-binutils-gdb/blob/368ff0c03fb7cbd072349ccc261b47f3bc9c2e6a/binutils/readelf.c#L17217-L17228

But the value in the attribute is correct (2), and what gdb would expect.

th-otto commented 1 year ago

Looks like my backports of a.out & stabs aren't that bad:

[Screenshot: Screenshot_20230826_192445]

So this is an executable compiled by my a.out toolchain that still produces a.out-mintprg format. The gdb is one from a mintelf toolchain, with Vincent's patches and a bfd library that supports both a.out & elf. Result:

[Screenshot: Screenshot_20230826_192131]

et voilà ;)

th-otto commented 1 year ago

Just tried to write the debug info into a separate file. Normally, this can be done with something like

m68k-atari-mintelf-objcopy --only-keep-debug a.out a.out.debug

However i get an error:

m68k-atari-mintelf-objcopy: a.out.debug: DATA segment start address 0x0001a5a6 must be 0xffffffe4 to match its file offset
m68k-atari-mintelf-objcopy: a.out.debug: bad value

Guess there is something still wrong when bfd opens a prg-elf file for reading?

Edit: following patch seems to fix that:

--- a/bfd/elf32-atariprg.c
+++ b/bfd/elf32-atariprg.c
@@ -572,7 +572,7 @@ fix_phdrs (bfd *abfd)
     }

   /* Check that VMA of DATA segment matches its file offset.  */
-  if (phdr_data->p_vaddr != phdr_data->p_offset - FILE_OFFSET_TEXT)
+  if (phdr_data->p_offset != 0 && phdr_data->p_vaddr != phdr_data->p_offset - FILE_OFFSET_TEXT)
     {
       _bfd_error_handler ("%pB: DATA segment start address " ADR_F " must be " ADR_F " to match its file offset",
                          abfd, (adr_t) phdr_data->p_vaddr,

Edit2: after that fix, and when you add a debug link to the original file (also using objcopy), reading debug symbols from that separate file seems to work:

Reading symbols from a.out...
Reading symbols from a.out.debug...

Time to enable debug builds in mintlib etc again ;)

vinriviere commented 1 year ago

Guess there is something still wrong when bfd opens a prg-elf file for reading?

No, PRG/ELF works fine for reading. Proofs are objdump/strip/gdb. Your error message mentions the output file "a.out.debug". This is because PRG/ELF is only suitable to store standard executables. You must specify plain ELF as output type to store the debug-info only, like this:

m68k-atari-mintelf-objcopy --only-keep-debug a.out a.out.debug -O elf32-m68k

th-otto commented 1 year ago

Yes, that works, too. But with the above fix, objcopy works, and gdb is also happily reading that file. At least, if that should not be supported, the error message should indicate that.

BTW, objcopy --only-keep-debug does not seem to remove the section headers for .text etc. That's strange, but is also the case for the host tools and not a bug of the mintelf target.

vinriviere commented 1 year ago

I have changed mintlib to produce different crt0.o now, and successfully compiled it with an ld which had the hack removed.

Excellent job, @th-otto 😃 Now we could add conditional compilation in crt0.S based on the CPU, if there is some need for that. I've updated my Ubuntu binaries with the new MiNTLib, and also removed the linker hack allowing to mix different CPU types. I can still build ColdFire executables, as expected. And the map file shows that the correct crt0.o is used.

Does this have any effect on the gdb/float issue?

th-otto commented 1 year ago

Yes, it does. bfd now no longer thinks that m68k and coldfire are compatible ;) And as you can see in one of the screenshots above, the 96bit double format is now used.

Telling gdb where doubles are returned is another thing though (in the default configuration it thinks they are returned in FP registers, but they are returned in d0/d1 in our target). I'm using the attached patch for that (it also selects the cfv4e as default coldfire target). I just don't know yet how to test whether that worked as expected ;) There is some code that checks that setting, but i have no idea how you can test that in gdb. You can only test the values of variables, but for that it does not matter where the return value came from.

m68k-tdep.patch.txt

vinriviere commented 1 year ago

Yes, it does. bfd now no longer thinks that m68k and coldfire are compatible

Good.

You can only test the values of variables, but for that it does not matter where the return value came from.

Exactly. I wondered the same thing when I wanted to test the struct return ABI. Where does GCC use that information? That's puzzling.

th-otto commented 1 year ago

It is used all over m68k-tdep.cc (look for pointer_result_regnum/struct_value_regnum)

vinriviere commented 1 year ago

After that, window looks much better:

Indeed @th-otto: I've just tried your terminfo fix, and it works beautifully on the st52 console 😃 I notice that the upper screen is still messed after stepping over a printf(), it requires ^L. But I noticed that Cygwin's gdb behaves just the same. So that's not a MiNT issue.

I haven't tried colors yet.

[Screenshot]

th-otto commented 1 year ago

Yes, the scrolling when the child prints something also happens on linux. But that seems to be normal since the tui mode does not have a separate panel for console output.

Haven't got colors working yet in tui mode, even when trying to convince it that tw100 has 256 colors like xterm.

What remains a mint issue, though, is that i still get that "stairs effect" in certain places. Not nice, but i guess this is still some problem in mint's tty driver emulation, and/or because toswin runs on a pipe rather than a pseudo tty like on linux.

mikrosk commented 1 year ago

You can try ConHolio for a test, it aims to emulate 256c xterm out of the box.

th-otto commented 1 year ago

With the default setting (TERM=linux) it does not look better:

[Screenshot: Screenshot_20230827_201630]

When trying to set TERM=xterm-256color, it hangs on startup.

Is there a way to get a larger terminal than 25x80?

Edit: RTFM

vinriviere commented 1 year ago

Haven't got colors working yet in tui mode, even when trying to convince it that tw100 has 256 colors like xterm.

Remember that I have disabled the colors by default in gdb to avoid trouble. Use set style enabled on at gdb startup. It seems that it's time to revert that defensive patch 😄

This is TosWin2 in tw52 mode: [Screenshot: tw52]

th-otto commented 1 year ago

Oh, must have missed that patch. Looks ok to me.

But conholio really seems to be a good alternative. Seems to be much faster than toswin2, and much more responsive. Any chance to get at the sources?

Edit: also disabled the hack in ada-lang.c. Any idea why this was needed? I don't get any message when starting up.

Edit2: the command window is colored, but the source code does not use syntax highlighting. However i think that is done by a python script, so maybe better leave that off on our poor machines ;)

th-otto commented 1 year ago

BTW, i'm currently trying to get gdbserver working. It already compiles and runs without crashing, there is only one small problem: setting breakpoints does not seem to work yet ;)

[Screenshot: Screenshot_20230828_072010]

th-otto commented 1 year ago

Source highlighting could also be provided by the GNU Source-highlight library, unfortunately that in turn needs some boost libraries :(

mikrosk commented 1 year ago

But conholio really seems to be a good alternative. Seems to be much faster than toswin2, and much more responsive. Any chance to get at the sources?

@shoggoth77 seems like praise/question to you ;-)

the source code does not use syntax highlighting. However i think that is done by a python script, so maybe better leave that off on our poor machines ;)

For syntax highlighted debugging I use cgdb which seems to be in C++.

th-otto commented 1 year ago

I don't think that this will work on mint. It uses pseudo-ttys to communicate with gdb. On linux, there is also not much difference between cgdb and gdb --tui

th-otto commented 1 year ago

Just as a reminder: we also still have to figure out why gas turns a jbsr to a label which is defined in the same source, but also global, into a bsr.l or jsr. That defeats the purpose of using that pseudo instruction to optimize the code, as it currently does the opposite.

vinriviere commented 1 year ago

Just as a reminder: we also still have to figure out why gas turns a jbsr to a label which is defined in the same source, but also global, into a bsr.l or jsr.

Ha! I found it. Fixed. https://github.com/freemint/m68k-atari-mint-binutils-gdb/commit/4e3b0bc011152a5e94c53e800d457c6a58390776

I guess that now, FreeMiNT's assembler files don't need manual fixing on bra.s anymore. Not tested.

th-otto commented 1 year ago

Nice find. Without the patch, EXTERN_FORCE_RELOC is defined here:

https://github.com/freemint/m68k-atari-mint-binutils-gdb/blob/4e3b0bc011152a5e94c53e800d457c6a58390776/gas/config/tc-m68k.h#L87-L88

So it might have worked for the bare elf target already, but not for mintelf.

th-otto commented 1 year ago

I'm still not entirely sure, but i think gdb needs to know where a return value originates from when the function throws an exception. In that case, references might have to be updated etc.

BTW, our gcc predefines __GCC_HAVE_DWARF2_CFI_ASM only when you compile with -g. The host gcc on my x86_64 box predefines it always.

mikrosk commented 1 year ago

Without the patch, EXTERN_FORCE_RELOC is defined here

Shouldn't we add mint / mintelf there rather than redefining it on our own?

th-otto commented 1 year ago

No, i think redefining it in target specific header files is the right thing to do. That's what they are for.

th-otto commented 1 year ago

Ha! I found it. Fixed.

There must be another setting involved with that. The old (a.out) assembler optimizes code like https://github.com/emutos/emutos/blob/7c65e47c33bcce4f4a5f29664bf3522718414542/bios/startup.S#L488 into pc-relative addressing. The elf assembler does not do that. In emutos, that makes a difference of ~300 bytes in the 192k roms.

Maybe that is because for elf, the label "zero" is placed into a different section (.rodata)?

th-otto commented 1 year ago

Another strangeness: for the 192k roms (when compiled with the a.out toolchain), the end of bios/startup.o is:

[00fc014c] 2050                      movea.l    (a0),a0
[00fc014e] 66de                      bne.s      $00FC012E
[00fc0150] 4e75                      rts
[00fc0152] 0000                      dc.w       $0000
[00fc0154] 0000                      dc.w       $0000
[00fc0156] 0000                      dc.w       $0000

The first two 0000 are from the zero variable. But the last one is just padding, despite SUBALIGN(2) being used in the linker script. Could be that this is a result from the a.out object file already being padded to a multiple of 4?

th-otto commented 1 year ago

Ok, both assumptions seem to be true. In an attempt to be able to compare the objects generated i hacked the definitions a bit, defining SECTION_RODATA to .text also for elf, and changing the SUBALIGN(2) to SUBALIGN(4) to match a.out (and also the fill value to 0). Unfortunately, that only helped for the first few files. In later C-code, constant sections (and also string constants like "PATH") are placed into the .text section by a.out, but into .rodata by the elf compiler, which makes them appear in different order in the output. And also as a side-effect, the assembler will not be able to optimize references to such labels to pc-relative.

Fixing that would require to add an option to allow specifying the name of the .rodata section. From what i've seen, only the arc port seems to have something like this already: https://github.com/freemint/m68k-atari-mint-gcc/blob/2363dea5cbe65984e61c845272b1e24941ff4893/gcc/config/arc/arc.h#L713-L714 but only in older versions.

th-otto commented 1 year ago

Did a quick test to implement such an option, and it seems to work. You would have to specify -mrodata=.text (the new option) and also -fno-merge-constants (because otherwise string constants are put into separate sections).

Without the SUBALIGN hack that would mean you'll save about 130 bytes of code, and about 98 bytes of bss just by using the elf toolchain. Using -flto saves another ~4.8k of code.

But now i encounter another strange difference between a.out and elf, which means the binaries are still not identical. When using the a.out toolchain, i get:

$ m68k-atari-mint-nm -n ./obj/floppy.o | grep ' [cCbBdD] '
00001148 b _cur_dev
0000114c b _loopcount_fdc
00001150 b _loopcount_toggle
00001154 b _deselect_time
00001158 b _finfo
00001184 b _motor_on

cur_dev is a short, and loopcount_fdc is an unsigned long. For some strange reason, loopcount_fdc is aligned to a 4-byte address. That does not have to do with structure alignment, since both are just independent local variables, or with a.out object file alignment, because they are both from the same source. Do you also get that result, or did i somehow mess up my compiler during the tests?

Edit: even more strange:

$ m68k-atari-mint-nm -n ./obj/gem_rsc.o | grep ' [cCbBdD] '
000016f4 b _msg_str
000017c4 b _msg_but

msg_str is an array of 5*41 = 205 chars. msg_but is an array of 3*21 = 63 chars. Still, msg_but seems to be aligned on the next 4-byte boundary? In the elf toolchain i get

00000844 b msg_str
00000911 b msg_but

Edit2: i'm getting a bit desperate. In the following example:

char msg_str[5][40+1];
char msg_but[3][20+1];

char *test(void)
{
    return msg_str[0];
}

char *test2(void)
{
    return msg_but[0];
}

with the a.out toolchain, i get :

000000dd B _msg_but
00000010 B _msg_str
00000000 T _test
00000008 T _test2

So msg_but was not aligned. But when i declare the arrays as static, i get:

000000e0 b _msg_but
00000010 b _msg_str
00000000 T _test
00000008 T _test2

and msg_but is aligned to 4-byte boundary. WTF? What am i missing here?

vinriviere commented 1 year ago

Maybe that is because for elf, the label "zero" is placed into a different section (.rodata)?

Yes.

th-otto commented 1 year ago

and msg_but is aligned to 4-byte boundary. WTF? What am i missing here?

Damn, i guess i found the reason. -fno-common is only used for public symbols (https://github.com/freemint/m68k-atari-mint-gcc/blob/2363dea5cbe65984e61c845272b1e24941ff4893/gcc/c-decl.c#L4042-L4047). When using static variables, ".lcomm" (for local common) is still used. Common symbols are put internally into separate sections (even for a.out). When linking, those sections are then padded to the 4-byte alignment of a.out object files. That will make it difficult to verify whether code compiled by the elf toolchain is identical to the one compiled by a.out.

vinriviere commented 1 year ago

BTW, about .lcomm, I added a specific configuration to gas 2.41 in order to avoid irrelevant big alignments:

https://github.com/freemint/m68k-atari-mint-binutils-gdb/blob/b066005ad2c9dcb04568ea132c224f4b0ffd3d15/gas/config/te-mintelf.h#L24-L35

And also, a specific patch to ELF: https://github.com/freemint/m68k-atari-mint-binutils-gdb/commit/72e768942f31a9156df8b5896294f2ff7783486d#diff-4ee42d9ac4bd0e257587637cc045f06d59d94f900db925800381a889fd9f9548L8899-R8942

In particular, there is a manual use of .lcomm in the MiNTLib, which caused the whole BSS segment to be unexpectedly aligned to 16 bytes (not bits). The above TC_IMPLICIT_LCOMM_ALIGNMENT configuration fixes that. https://github.com/freemint/mintlib/blob/master/unix/vfork.S#L13

.lcomm has the side effect of requesting a big alignment for big buffers. As we don't need/want that for our mint code, I think that we should replace it with a simple .ds.b in the above MiNTLib source.

However, I wasn't aware that .lcomm was also used for the COMMON section.

th-otto commented 1 year ago

We can change that in assembler files of course, but the compiler emits that nevertheless for static, uninitialized variables.

The padding in this case is not from the .lcomm (i have that patch too in binutils), but because of the general padding to 4 bytes in a.out object files. Since that is part of the section size, it cannot even be removed again by the linker using SUBALIGN.

At least for emutos, this isn't even a big problem. The difference is maybe a few bytes in the bss size. And i have a version now where even that size is identical. Still, there are single variables inside that reside at different addresses, which generates lots of differences when you compare the binary output.

vinriviere commented 1 year ago

Your error message mentions the output file "a.out.debug". This is because PRG/ELF is only suitable to store standard executables. You must specify plain ELF as output type to store the debug-info only, like this:

m68k-atari-mintelf-objcopy --only-keep-debug a.out a.out.debug -O elf32-m68k

@th-otto: I've just patched objcopy to select automatically the correct output format in that case, without extra option: https://github.com/freemint/m68k-atari-mint-binutils-gdb/commit/0a96c0deb7143834f322c3f1fa388517d7fd3025

Now you can use the standard syntax. The debug file will automagically be created in plain ELF format.

m68k-atari-mintelf-objcopy --only-keep-debug a.out a.out.debug

And I've rebuilt my Ubuntu binutils.

th-otto commented 1 year ago

Thanks. I can't even remember what i did to make gdb load our format (or even find the message where i mentioned this), but that is certainly the way users would expect it to work.

Edit: found it above in the comments. Stupid github was hiding some messages by default...

mikrosk commented 9 months ago

Is there something left to discuss from the issues mentioned? Most of them were solved or put into separate issues (thanks Vincent). Otherwise I suggest to close this one, too.

th-otto commented 9 months ago

Whether to use PCC_BITFIELD_TYPE_MATTERS or STRUCTURE_SIZE_BOUNDARY still has to be decided. But we can discuss that in the just opened issue.

All other issues are already fixed i think, or at least we agreed about the settings.

vinriviere commented 9 months ago

Whether to use PCC_BITFIELD_TYPE_MATTERS or STRUCTURE_SIZE_BOUNDARY still has to be decided. But we can discuss that in the just opened issue.

Yes, this was the purpose. And sorry for the delay, I was busy with other things. But I still plan to make additional tests very soon to completely understand the bitfields issue in https://github.com/freemint/m68k-atari-mint-gcc/issues/35. Then we will have all the elements to decide.

All other issues are already fixed i think, or at least we agreed about the settings.

Yes 😃 After that bitfield question, everything will be decided regarding the mintelf ABI. I don't plan to add anything to GCC, current version seems to be good enough. I will rebuild the packages I generally provide with my cross-tools. Then we will have to finish gdb support (already good IIRC, except gdbserver still to be done). And that will be enough for me.

Regarding this issue, maybe we could rename it to General discussion about the mintelf toolchain and keep it open? Because we will always have something to say. Of course it is better to open separate issues, but when problems aren't clearly identified yet, we have to discuss somewhere. The MiNT Mailing List could be another option.

Otherwise, this issue could be closed. I don't mind.

mikrosk commented 9 months ago

Ok then, I'm going to close it. I'm not a fan of issues with long threads, so as you say - if there is something to discuss, we just open a new issue or post to ML for broader discussion.