tkchia / gcc-ia16

Fork of Lambertsen & Jenner (& al.)'s IA-16 (Intel 16-bit x86) port of GNU compilers ― added far pointers & more • use https://github.com/tkchia/build-ia16 to build • Ubuntu binaries at https://launchpad.net/%7Etkchia/+archive/ubuntu/build-ia16/ • DJGPP/MS-DOS binaries at https://gitlab.com/tkchia/build-ia16/-/releases • mirror of https://gitlab.com/tkchia/gcc-ia16
GNU General Public License v2.0

Is there any way to avoid the linker offsetting the VMA by the size of the header in .EXE files? #82

Open andrewbird opened 2 years ago

andrewbird commented 2 years ago

I was looking at the FreeDOS kernel compilation with gcc-ia16 last week and noticed that it uses the same style of linker script as gcc-ia16 uses to produce a .EXE file. After some discussion near the end of this PR https://github.com/dosemu2/dosemu2/pull/1540 it seems that the VMA addresses in the .MAP file are offset by the size of the .EXE header. This means that when loading the .MAP file into Dosemu's dosdebug the origin is not the expected 0x0060, but 0x0055, since the header is 0xb paragraphs long. This is not such a problem with the FreeDOS kernel, as we just have to know that the Watcom .MAP file is loaded at 0x0060 and the GCC ia16 .MAP file is to be loaded at 0x0055; whilst not intuitive, it's not a great problem. However, when debugging applications it's nice to be able to load the map file using the CS register as the origin, but in the GCC ia16 case we have to subtract 0xb from CS. Is there any way for GCC-ia16 to generate the .EXE header outside of the linker script, or otherwise avoid offsetting the VMA addresses in the linker script?
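
For anyone following along, the arithmetic above can be sketched in a few lines of Python. This is only an illustration: `map_origin` and `symbol_linear_address` are hypothetical helper names, and the sketch assumes the header size is counted in 16-byte paragraphs, which is what the 0x0060 versus 0x0055 figures suggest.

```python
PARAGRAPH = 16  # bytes per real-mode paragraph


def map_origin(load_segment, header_paragraphs):
    """Segment to hand to the debugger so header-offset VMAs line up."""
    return (load_segment - header_paragraphs) & 0xFFFF


def symbol_linear_address(origin_segment, map_vma):
    """Linear address of a map symbol once the origin segment is known."""
    return origin_segment * PARAGRAPH + map_vma


origin = map_origin(0x0060, 0xB)
print(hex(origin))  # 0x55
# A symbol whose VMA is 0xb0 (the first byte after an 0xb-paragraph header)
# then resolves to the start of segment 0x0060.
print(hex(symbol_linear_address(origin, 0xB0)))  # 0x600
```

The same subtraction is what has to be done by hand on CS when debugging an application.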

Thank you!

tkchia commented 2 years ago

Hello @andrewbird,

Well, yeah — the whole handling of segments in the gcc-ia16 toolchain is rather hacky, even if it happens to work.

I think the whole thing can really use a huge overhaul, but the problem is, the path from here to there will be messy. Recently, for the ELKS target platform, I had modified the toolchain to

It should be possible to do something similar for the MS-DOS target — transition to the segelf scheme, and write a separate elf2mz (?) program to generate MZ binaries. But we will most probably need buy-in from at least the FreeDOS kernel and FreeCOM projects, since their linker scripts currently rely on the "old" linker internals. Another wrinkle is that — as far as I know — Anvin has for some reason not yet implemented the segelf scheme in nasm. (Support has been added into ia16-elf-as, though.)

Let me know your thoughts.

Thank you!

andrewbird commented 2 years ago

A most interesting read! I'm happy to help, where able, with build and testing. Regarding the FreeDOS kernel and FreeCOM, there's not really much happening apart from that being done by @perditionc and of course @bartoldeman. I don't know of any GCC ia16 kernel build being used outside of the GitHub Actions test for buildability and my own Dosemu2 testing suite. Is it possible though to prove this out on standalone .EXE binaries first without breaking what's being used for the kernel and freecom? Thank you!

PerditionC commented 2 years ago

I have no issue with updating the kernel and command.com build as needed, provided there is guidance and/or help with what needs to be done.

andrewbird commented 2 years ago

Looking at https://github.com/jbruchon/elks/blob/4cf198434b9/elks/tools/elf2elks/elf2elks.c I think I could probably write something similar to produce an MZ executable, though I'd be starting from a zero understanding of the segelf scheme.

Anvin has for some reason not yet implemented the segelf scheme in nasm. (Support has been added into ia16-elf-as, though.)

NASM seems to be the really big one, primarily because you've already done what's required to the ia16 tools. Without it, though, I doubt this scheme can be very useful for building the FreeDOS kernel, FreeCOM, or many of the FreeDOS tools. I'm not sure what the next step for integration into NASM should be, as their mailing list and bug trackers seem to be very low traffic. I did see your issue https://bugzilla.nasm.us/show_bug.cgi?id=3392533 but unfortunately the conversation seems to have stalled. There is some sporadic unrelated commit activity at https://github.com/netwide-assembler/nasm/commits/master so people are still working on the project. Perhaps developer time is very limited at the moment.

Thank you!

andrewbird commented 2 years ago

@hpax I wonder if you've made any progress towards the SEGELF support in nasm, as we could really use it now?

Thanks, Andrew.

andrewbird commented 2 years ago

Just so we have the link to hand I found that Stas was enquiring on the NASM forum recently as well. https://forum.nasm.us/index.php?topic=2747.0

andrewbird commented 2 years ago

@tkchia whilst we are waiting for a response from Peter on the SEGELF support in NASM, I was wondering if it was possible to switch to .exe production for the kernel using the elf_i386_msdos_mz scheme? What was missing from that, were some fields in the header just incorrect (and maybe we could correct them with a post-process with info from the .map file), or something more fundamental?

andrewbird commented 2 years ago

I was looking at the NASM code. To cooperate with your segelf support do you think that the only required changes to NASM will be in the ELF output stage? I found this test program within https://bugzilla.nasm.us/show_bug.cgi?id=3392694 written by @ecm-pushbx. I have zero experience with NASM development, but do you think a new segelf target is the best option or perhaps you favour modifying the existing elf backend?

ecm-pushbx commented 2 years ago

Hi, my username on here is actually @ecm-pushbx but yes I made that test case to lay out what NASM needs. I believe the segelf extensions were supposed to be added to the elf format.

andrewbird commented 2 years ago

Hi @ecm-pushbx, yes I now realise your proper username is not what I typed. I guess my lapse is a consequence of GitHub automatically filling it in for me most of the time. Thanks for creating the test case, it'll certainly be most useful for me. I'll have a stab at modifying the ELF output, not sure how far I'll get with my limited knowledge, but let's see.

Thank you!

andrewbird commented 2 years ago

@tkchia I thought I'd do a few experiments around writing the MZ EXE header differently. Before I start I want to create a simple test with ia16-elf-gcc and understand the current output. Here's my little attempt https://github.com/andrewbird/test-exe. When I compile/link it and examine the header I'm seeing something unexpected (to me anyway), that is the stack segment @ 0x10548. Here's the header in the .MAP file

.msdos_mz_hdr   0x0000000000000000       0x20                                   
                0x0000000000000000                __msdos_mz_hdr_start = .      
                0x0000000000000000        0x2 SHORT 0x5a4d                      
                0x0000000000000002        0x2 SHORT 0x20 ((LOADADDR (.data) + SIZEOF (.data)) % 0x200)
                0x0000000000000004        0x2 SHORT 0x30 (((LOADADDR (.data) + SIZEOF (.data)) + 0x1ff) / 0x200)
                0x0000000000000006        0x2 SHORT 0x1 __msdos_mz_rels         
                0x0000000000000008        0x2 SHORT 0x2 __msdos_mz_hdr_paras    
                0x000000000000000a        0x2 SHORT 0xf68 (((0x10000 - SIZEOF (.data)) - ADDR (.data)) / 0x10)
                0x000000000000000c        0x2 SHORT 0xf68 DEFINED (__msdos_handle_v1)?0xffff:(((0x10000 - SIZEOF (.data)) - ADDR (.data)) / 0x10)
                0x000000000000000e        0x2 SHORT 0x10548 ((((LOADADDR (.data) / 0x10) - __msdos_mz_hdr_paras) - (ADDR (.data) / 0x10)) + 0x10000)
                0x0000000000000010        0x2 SHORT 0x0                         
                0x0000000000000012        0x2 SHORT 0x0                         
                0x0000000000000014        0x2 SHORT 0x20 _start                 
                0x0000000000000016        0x2 SHORT 0xfffe ((((LOADADDR (.text) / 0x10) - __msdos_mz_hdr_paras) - (ADDR (.text) / 0x10)) + 0x10000)
                0x0000000000000018        0x2 SHORT 0x1c (__msdos_mz_rel_start - __msdos_mz_hdr_start)
                0x000000000000001a        0x2 SHORT 0x0                         
 *(.msdos_mz_hdr .msdos_mz_hdr.*)                                               
                0x000000000000001c                __msdos_mz_rel_start = .      
 *(.msdos_mz_reloc .msdos_mz_reloc.*)                                           
 .msdos_mz_reloc.0                                                              
                0x000000000000001c        0x4 /usr/lib/x86_64-linux-gnu/gcc/ia16-elf/6.3.0/../../../../../ia16-elf/lib/libdos-s.a(dos-msmmabort.o)
                0x0000000000000020                __msdos_mz_rel_end = .        
                0x0000000000000001                __msdos_mz_rels = ((. - __msdos_mz_rel_start) / 0x4)
                0x0000000000000020                . = DEFINED (__msdos_handle_v1)?ALIGN (0x200):.
                0x0000000000000002                __msdos_mz_hdr_paras = (((. - __msdos_mz_hdr_start) + 0xf) / 0x10)
                0x0000000000000020                . = ALIGN (0x10)              
                0x0000000000000001                ASSERT ((((__msdos_mz_rel_end - __msdos_mz_rel_start) % 0x4) == 0x0), Error: MZ relocations are not 4-byte aligned)
                0x0000000000000001                ASSERT ((__msdos_mz_rels <= 0xffff), Error: too many MZ relocations)

How can this be, as the field width is only 16 bits?

tkchia commented 2 years ago

Hello @andrewbird,

The value of 0x10548 will be truncated to 0x0548 in the final output. This should correspond to the relative paragraph offset of the .data segment in the output (which should be the same as the initial stack segment; the startup code later sets %ds from %ss).
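
A small Python sketch of that truncation, for reference. `mz_segment_field` is a hypothetical name that just mirrors the linker-script expression shown in the map file; the `.text` figures (LMA and VMA both 0x20, header of 2 paragraphs) are assumptions read off the map listing above, and the `+ 0x10000` term presumably exists to keep a negative intermediate result non-negative before the low 16 bits are kept.

```python
def mz_segment_field(load_addr, vma, hdr_paras):
    """Evaluate the linker-script expression, then keep the low 16 bits."""
    value = (load_addr // 0x10) - hdr_paras - (vma // 0x10) + 0x10000
    return value, value & 0xFFFF


# Initial CS: .text assumed to have LMA == VMA == 0x20, header is 2 paragraphs.
full, stored = mz_segment_field(0x20, 0x20, 2)
print(hex(full), hex(stored))  # 0xfffe 0xfffe

# Initial SS: the map file shows 0x10548, but only 16 bits reach the header.
print(hex(0x10548 & 0xFFFF))  # 0x548
```

With CS stored as 0xfffe (that is, minus two paragraphs) and IP as 0x20, the entry point works out to the first byte after the 0x20-byte header.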

Thank you!

tkchia commented 2 years ago

Hello @andrewbird,

(Incidentally, in case you are curious, __msdos_handle_v1 is a symbol that is defined if ia16-elf-gcc was asked (-mmsdos-handle-v1) to output an executable that fails gracefully on MS-DOS 1.x, rather than crash. This option will change the layout of the MZ header, among other things. My linker script source file comments have some further discussion on this.)

Thank you!

andrewbird commented 2 years ago

Hi @tkchia, Ahh, I'd seen that the value was truncated in the actual header, but I hadn't realised it was intentional. Here's the output from my little header printer script.

$ ./prnhdr.py 
test-std.exe: MZ header OK!
  Bytes in last page:                 0x0020
  Number of pages (inc last):         0x0030
  Number of relocation entries:       0x0001
  Header size (paragraphs):           0x0002
  Min. Memory allocated (paragraphs): 0x0f68
  Max. Memory allocated (paragraphs): 0x0f68
  Initial Stack Segment:              0x0548
  Initial Stack Pointer:              0x0000
  Checksum (0 for none):              0x0000
  Initial Instruction Pointer:        0x0020
  Initial Code Segment:               0xfffe
  Offset of relocation table:         0x001c
  Overlay number:                     0x0000
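
For readers without prnhdr.py to hand, a header dump along these lines can be sketched with Python's `struct` module. This is not the actual script; `parse_mz_header`, `mz_file_size` and the field names are made up for the illustration, and the sample bytes are simply rebuilt from the values printed above.

```python
import struct

# The 14 little-endian words that make up the fixed part of an MZ header.
MZ_FIELDS = (
    "magic", "bytes_in_last_page", "pages", "relocs", "header_paras",
    "min_alloc", "max_alloc", "ss", "sp", "checksum", "ip", "cs",
    "reloc_offset", "overlay",
)


def parse_mz_header(data):
    """Decode the fixed MZ header fields from the start of a file image."""
    values = struct.unpack_from("<14H", data, 0)
    header = dict(zip(MZ_FIELDS, values))
    if header["magic"] != 0x5A4D:  # the bytes "MZ"
        raise ValueError("not an MZ executable")
    return header


def mz_file_size(header):
    """File size implied by the page count and the last-page byte count."""
    pages, last = header["pages"], header["bytes_in_last_page"]
    return pages * 512 if last == 0 else (pages - 1) * 512 + last


# Rebuild the header of test-std.exe from the values printed above.
sample = struct.pack("<14H", 0x5A4D, 0x0020, 0x0030, 0x0001, 0x0002,
                     0x0F68, 0x0F68, 0x0548, 0x0000, 0x0000,
                     0x0020, 0xFFFE, 0x001C, 0x0000)
hdr = parse_mz_header(sample)
print(hex(hdr["ss"]), hex(hdr["cs"]), hex(mz_file_size(hdr)))
```

On a real file the input would come from something like `open("test-std.exe", "rb").read()`. The fixed fields occupy 0x1c bytes, which matches the relocation table offset in the dump.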

So, moving on to creating an elf2mz program, what gcc/ld options should I use to create a suitable input elf file?

Thank you!

tkchia commented 2 years ago

Hello @andrewbird,

For now I guess you can try to force the output format to elf32-i386 instead of binary, by using a -Wl,--oformat=elf32-i386 option. This should yield an ELF file which will include the MZ header (as program data), the various ELF section headers and program headers, etc. You can try to poke around the ELF output to decide what to do next.

Thank you!

tkchia commented 2 years ago

Note that — as I mentioned previously — as of now, the linking of .exe files still uses the LMA ≠ VMA representation scheme, not Anvin's segelf.

Thank you!

andrewbird commented 2 years ago

Hello @tkchia,

For now I guess you can try to force the output format to elf32-i386 instead of binary, by using a -Wl,--oformat=elf32-i386 option. This should yield an ELF file which will include the MZ header (as program data), the various ELF section headers and program headers, etc. You can try to poke around the ELF output to decide what to do next.

Thanks, that's exactly what I need!

Note that — as I mentioned previously — as of now, the linking of .exe files still uses the LMA ≠ VMA representation scheme, not Anvin's segelf.

I had been thinking that I might have to turn that on with -msegelf, but then there's the problem with no nasm support. If something is achievable as is, then I'll probably stick with that.

Thank you!

ecm-pushbx commented 2 years ago

Hello @andrewbird,

(Incidentally, in case you are curious, __msdos_handle_v1 is a symbol that is defined if ia16-elf-gcc was asked (-mmsdos-handle-v1) to output an executable that fails gracefully on MS-DOS 1.x, rather than crash. This option will change the layout of the MZ header, among other things. My linker script source file comments have some further discussion on this.)

Thank you!

Huh, didn't know that MS-DOS 1.xx had any support for MZ executables.

tkchia commented 2 years ago

Hello @ecm-pushbx,

Huh, didn't know that MS-DOS 1.xx had any support for MZ executables.

The code for handling MZ files was in command.com, rather than the kernel. And yes, the support was pretty crappy. :neutral_face:

Thank you!

lpsantil commented 2 years ago

At least now MS DOS 1.25 could be fixed like how you fixed GW-BASIC? :D

https://github.com/microsoft/MS-DOS

tkchia commented 2 years ago

Hello @andrewbird,

For now I guess you can try to force the output format to elf32-i386 instead of binary, by using a -Wl,--oformat=elf32-i386 option.

Another tip: if in addition you say -Wl,-r, the linker (ia16-elf-ld) will not try to resolve relocations, but will instead leave them around as ELF relocations in the output file. You can then study them with ia16-elf-objdump -D -r ... (e.g.) or ia16-elf-readelf -r ...

Hello @lpsantil,

At least now MS DOS 1.25 could be fixed like how you fixed GW-BASIC? :D

Well, not sure there is much point in doing that though. :no_mouth:

Thank you!

andrewbird commented 2 years ago

Hello @tkchia, Thanks for the info, I'm sure it will be most useful. I'm currently thinking of doing the calculations in the linker script just as you do now. But instead of writing the header from there, I hope to set some new private variables with the values that I can read in elf2mz and so write out the header. That way I hope to keep the logic in the linker script instead of splitting it between linker script and elf2mz. I may have to start again with elf2mz at some point, as for now I switched to -msegelf purely because the linker script was easier for me to understand! I know we are waiting for NASM support to make elf2mz really useful, but I figured it would be interesting (for me at least) to see how far I could go with this.

Thank you!

andrewbird commented 2 years ago

Hello @tkchia, I switched back to the lma != vma method rather than -msegelf. Can you tell me which linker script is in operation when I issue this command, please?

ia16-elf-gcc -Wall -mcmodel=small -Os -o $@ $< -li86 -Wl,-Map=test-std.map

I tried /usr/ia16-elf/lib/dos-exe-small.ld but I'm not seeing the same sizes / offsets when using the two commands

ia16-elf-gcc -Wall -mcmodel=small -Os -o  test-new.o -c $<
ia16-elf-gcc -o test-new.elf test-new.o -T test-new.ld -li86 -Wl,-Map=test-new.map -Wl,--oformat=elf32-i386

Thank you!

andrewbird commented 2 years ago

Hello, @tkchia, Sorry to keep spamming you! I have now figured out that I am using the right linker script for a base.

Thank you!

andrewbird commented 2 years ago

Hello @tkchia,

I don't understand why the functions being linked in are different according to which output is used e.g.

std compile/link in one operation
.text          0x0000000000000244     0x114e /usr/lib/x86_64-linux-gnu/gcc/ia16-elf/6.3.0/../../../../../ia16-elf/lib/libc.a(lib_a-vfiprintf.o)
               0x0000000000000244                _vfiprintf_r                  
               0x000000000000137a                vfiprintf                     

and

compile, then link
.text          0x0000000000000221     0x2278 /usr/lib/x86_64-linux-gnu/gcc/ia16-elf/6.3.0/../../../../../ia16-elf/lib/libc.a(lib_a-vfprintf.o)
               0x0000000000000221                _vfprintf_r                   
               0x0000000000002481                vfprintf                      

My latest attempt is here, in case anyone can spot my mistake https://github.com/andrewbird/test-exe/tree/vma-ne-lma

At present it seems to me that my separate compile then link operations are not equivalent to the single operation. Until they are, my efforts to compare the header values are worthless.

Thank you!

tkchia commented 2 years ago

Hello @andrewbird,

OK — the problem (?) lies in the -T option you are using in the second link.

gcc-ia16 (together with newlib-ia16 and binutils-ia16) has some special hacks for automatically detecting whether the program needs a floating-point-capable stdio or can do with a non-floating-point stdio (-mnewlib-autofloat-stdio).

This feature partly relies on gcc-ia16 inserting an additional linker script (actually, via an -lastdio option — where the .a file is a script). This works, but is obviously kind of messy. :neutral_face: At the moment, if you explicitly specify a linker script via -T, then gcc-ia16 will assume that you want to use just that linker script, and will forgo the special hacks.

I guess one way to get around this is to also explicitly specify a -T option in the first link (!). Maybe try something like

... -T "`ia16-elf-gcc --print-file-name=dos-mssl.ld`" ...

where the backquoted part will output the path to the default small model linker script.

Thank you!

andrewbird commented 2 years ago

Hello @tkchia,

That really helped, so now the output is looking similar.

Thank you!

andrewbird commented 2 years ago

Hello @tkchia,

I got a little further. Can you tell me how the segment value of the relocations is generated, as I seem to be missing it.

ia16-elf-gcc -Wall -mcmodel=small -Os -o test.o -c test.c
ia16-elf-gcc -Wall -mcmodel=small -o test-std.exe test.o -T "`ia16-elf-gcc --print-file-name=dos-mssl.ld`" -li86 -Wl,-Map=test-std.map
gcc -o elf2mz elf2mz.c -lelf
ia16-elf-gcc -Wall -mcmodel=small -o test-new.elf test.o -T test-new.ld -li86 -Wl,-Map=test-new.map -Wl,--oformat=elf32-i386
./elf2mz -i test-new.elf -o test-new.exe  # options not parsed yet
./elf2mz: ELF section 0x1 -> text section
./elf2mz:   virt. addr. 0, size 0x99e0, file offset 0x1000
./elf2mz: ELF section 0xc5 -> data section
./elf2mz:   virt. addr. 0, size 0xa90, file offset 0xb000
./elf2mz: ELF section 0xc6 -> msdos_mz_tail section
./elf2mz:   virt. addr. 0xa90, size 0x10, file offset 0xba90
./elf2mz: ELF section 0xc7 -> BSS section
./elf2mz:   virt. addr. 0xaa0, size 0xc3be, file offset 0xbaa0
./elf2mz: ELF section 0xc8 -> symtab section
./elf2mz:   virt. addr. 0, size 0x1ec0, file offset 0xbaa0
./elf2mz: 0 text reloc(s)., 0 far text reloc(s)., 0 data reloc(s).
./elf2mz: created temporary file `./JrDl13'
./prnhdr.py
test-std.exe: MZ header OK!
  Bytes in last page:                 0x00b0
  Number of pages (inc last):         0x0053
  Number of relocation entries:       0x0001
  Header size (paragraphs):           0x0002
  Min. Memory allocated (paragraphs): 0x0f53
  Max. Memory allocated (paragraphs): 0x0f53
  Initial Stack Segment:              0x099c
  Initial Stack Pointer:              0x0000
  Checksum (0 for none):              0x0000
  Initial Instruction Pointer:        0x0020
  Initial Code Segment:               0xfffe
  Offset of relocation table:         0x001c
  Overlay number:                     0x0000
Relocations:
  fffe:98e0
test-new.exe: MZ header OK!
  Bytes in last page:                 0x0070
  Number of pages (inc last):         0x0053
  Number of relocation entries:       0x0001
  Header size (paragraphs):           0x0002
  Min. Memory allocated (paragraphs): 0x0f57
  Max. Memory allocated (paragraphs): 0x0f57
  Initial Stack Segment:              0x099c
  Initial Stack Pointer:              0x0000
  Checksum (0 for none):              0x0000
  Initial Instruction Pointer:        0x0000
  Initial Code Segment:               0xfffe
  Offset of relocation table:         0x001c
  Overlay number:                     0x0000
Relocations:
  0000:98c0
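
To compare the two relocation entries, it may help to decode them and work out where each one points inside the load module. The Python sketch below is only an aid to reading the dump: `decode_relocs` and `load_module_offset` are hypothetical names, and treating the segment word as a signed 16-bit quantity is an assumption that reflects the 16-bit wraparound when a loader adds the load segment to it.

```python
import struct


def decode_relocs(table):
    """Split a raw MZ relocation table into (segment, offset) pairs.

    Each entry is 4 bytes: a 16-bit offset followed by a 16-bit segment.
    """
    return [(seg, off) for off, seg in struct.iter_unpack("<HH", table)]


def load_module_offset(seg, off):
    """Byte offset within the load module, reading seg as signed 16-bit."""
    if seg >= 0x8000:
        seg -= 0x10000
    return seg * 16 + off


std = decode_relocs(struct.pack("<HH", 0x98E0, 0xFFFE))  # fffe:98e0
new = decode_relocs(struct.pack("<HH", 0x98C0, 0x0000))  # 0000:98c0
print([hex(load_module_offset(s, o)) for s, o in std + new])
```

Under that reading both entries resolve to offset 0x98c0, so the two files appear to differ by the 0x20 bytes of header folded into the addresses of test-std.exe. Whether the entry in test-new.exe is what the loader needs is a separate question.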

Thank you!

tkchia commented 2 years ago

Hello @andrewbird,

Look at bfd_i386_elf_get_paragraph_distance (...) in bfd/elf32-i386.c in binutils-ia16.

Thank you!

andrewbird commented 2 years ago

Hello @tkchia, So it looks like my relocation segment became zero because I deleted the mz header section. I'll try just emptying it and see if it's happy with that. Thank you!

andrewbird commented 2 years ago

Hello @tkchia, I added the mz header section back in, but when empty it was removed by the linker anyway. If I add a test short then it does get the section included in the output elf, but the relocation segment value was still missing. I'm wondering if the offset part of the relocation is correct anyway, as it's now not measured from the beginning of the .exe header, but from the start of the text section. I'd hoped I could reuse the tweaks you've added into the linker without any modifications, but I'm not so sure now.

It's late here, I'll have another look tomorrow.

Thank you!

tkchia commented 2 years ago

Hello @andrewbird,

It looks like you changed the start address of the csegvma memory block from 0x00020 to 0x00000?

Thank you!

andrewbird commented 2 years ago

Hello @tkchia, Yes I figured that since the linker was not responsible for writing the header the addresses should be zero based. In fact that's my whole reason for this experiment, for the symbols in the .map file to not be offset by the mz header size. I can see now that it breaks the offsets in the relocations because they are calculated by the linker, so I may have to think again...

Thank you!

andrewbird commented 2 years ago

Hello @tkchia, I'm not sure what I was doing last night, but adding the .msdos_mz_hdr section back in does add the segment values into the relocations. However they are not the .text segment value, presumably because of where I placed the .msdos_mz_hdr section, as I didn't want it in the executable. So moving forward, should I be linking with -Wl,-r and producing the relocations myself in elf2mz? Is that what you do when making elks executables? Is there anything else I'd need to do myself if linking with -r? Is it worth spending time on this vma != lma scheme, or should I switch (again) to segelf? I suppose the FreeDOS kernel can't use it without NASM support for segelf. Ultimately I'd like this to be useful, not just some interesting experiment for me.

Thank you!

tkchia commented 2 years ago

Hello @andrewbird,

Well, generally, I think the approach I am using for ELKS will be the way to go. The old way is rather prone to breakage.

In particular, one major thing I did for -melks was to switch from binutils's default BFD linker to the gold linker.


Anyway, you can see the command lines of the various GCC passes if you pass a -v option to ia16-elf-gcc -melks.

To really tweak the GCC passes' command lines, you will probably — at some point — need to edit the spec strings in gcc/config/ia16/elf.h in the gcc-ia16 source tree. (The spec string macros and spec string syntax are documented under info gccint and info gcc.)

Thank you!

tkchia commented 2 years ago

Hello @andrewbird,

I suppose the FreeDOS kernel can't use it without NASM support for segelf.

Hmm... this is still an issue. But I suspect we can try to work around the lack of segelf support at some point (even if it means post-processing Microsoft OMF files into ELF, or something similar).

Thank you!

andrewbird commented 2 years ago

Hello @tkchia, I came back to this for a little more experimentation. I flipped back again to the -msegelf scheme in light of your suggestion that the lack of NASM support might be worked around.

I tried switching to the gold linker by adding -fuse-ld=gold, but ended up with a linker internal error and no output.

When compiling / linking (bfd linker) my test program I'm using the -msegelf switch, but I don't see any relocations in the output .elf unless I add -Wl,--emit-relocs or -r. Even with -Wl,--emit-relocs I'm still seeing vma != lma style relocations

000098c0 00005250 R_386_OZSEG16 00000e50 abort_stack

Do the libraries have to be made differently to see segelf style relocation?

test-new.o: test.c                                                              
        ia16-elf-gcc -Wall -mcmodel=small -msegelf -Os -o $@ -c $<              

test-new.exe: test-new.o elf2mz                                                 
        ia16-elf-gcc -Wall -mcmodel=small -msegelf -msegment-relocation-stuff -o test-new.elf $< -T test-new.ld -li86 -Wl,-Map=test-new.map -Wl,--oformat=elf32-i386 -Wl,--emit-relocs
        ./elf2mz -i test-new.elf -o $@  # options not parsed yet                

I have so far been trying to do this without building gcc, binutils and libi86 myself, just modifying the linker script and adding switches to the compile / link command lines of the PPA obtained tools. Is that too hopeful?

Thank you!

tkchia commented 2 years ago

Hello @andrewbird,

Do the libraries have to be made differently to see segelf style relocation?

Yes, unfortunately. Currently only the libraries for the ELKS target have been pre-compiled to use the segelf scheme.

I guess in the meantime you can probably try to use the libgcc portion of the ELKS libraries for tests, as these are quite platform-independent and should also work with MS-DOS. (libgcc contains internal support routines used by GCC-generated code.) For the main libc portion though, you will probably need to rebuild newlib-ia16, or whip up something on your own. There should be no need at all to rebuild binutils-ia16.

Thank you!

tkchia commented 2 years ago

Hello @andrewbird,

Anyway, which Ubuntu distribution are you currently using? Maybe I will see if I can work out something in the PPA that can ease the experimentation on your end. I happen to be also working on revamping the DPMI stuff (I hope to move it to segelf and get it to work with an existing DOS extender).

Thank you!

andrewbird commented 2 years ago

Hello @tkchia,

Anyway, which Ubuntu distribution are you currently using?

I'm using Ubuntu Focal (x86_64).

Maybe I will see if I can work out something in the PPA that can ease the experimentation on your end.

That would be great, as I'm floundering a little. I'm only looking for enough support to compile/link a very minimal small model program like Hello World!, and I think to be useful I'd need to see at least one msdos relocation required. Currently with the vma!=lma scheme, I'm seeing that relocation via abort(), but I'm sure it could be anything for my testing.

I happen to be also working on revamping the DPMI stuff (I hope to move it to segelf and get it to work with an existing DOS extender).

I don't want to delay you too much with my woes.

Thank you!

andrewbird commented 2 years ago

Hello @tkchia, Following on from your suggestion to look at ELKS, when running this command

ia16-elf-gcc -Wall -mcmodel=small -melks -o test-new.elf $< -Wl,-Map=test-new.map

I find the output is

test-new.elf: Linux-8086 executable, A_EXEC

So I presume after linking the elf2elks program was run. If so, is it possible to prevent this so I can look at the ELF output from ld.gold? In this case would the linker script used be /usr/ia16-elf/lib/elkslibc/elks-small.ld?

Thank you!

tkchia commented 2 years ago

Hello @andrewbird,

If so, is it possible to prevent this so I can look at the ELF output from ld.gold?

There is a (currently still undocumented and unofficial) -mno-post-link option which will do this.
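
A quick way to confirm what kind of file the link step left behind is to look at its first few bytes. The Python sketch below only distinguishes ELF and MZ by their well-known magic numbers; `container_kind` is a hypothetical helper, and anything else (including the ELKS a.out format) is reported as unknown rather than guessed at.

```python
def container_kind(data):
    """Classify a file image by its leading magic bytes (ELF or MZ only)."""
    if data[:4] == b"\x7fELF":
        return "ELF"
    if data[:2] == b"MZ":
        return "MZ"
    return "unknown"


print(container_kind(b"\x7fELF\x01\x01\x01\x00"))  # ELF
print(container_kind(b"MZ\x20\x00"))               # MZ
```

In practice the argument would be the first bytes of the output file, e.g. `open("test-new.elf", "rb").read(4)`.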

In this case would the linker script used be /usr/ia16-elf/lib/elkslibc/elks-small.ld?

That should be the case. Again, you can try passing -v to ia16-elf-gcc. This will show which exact linker script is being used.

Thank you!

andrewbird commented 2 years ago

Hello @tkchia,

There is a (currently still undocumented and unofficial) -mno-post-link option which will do this.

That worked a treat, thanks!

That should be the case.

Yep, it was.

Again, you can try passing -v to ia16-elf-gcc.

Sorry I keep forgetting this!

Thank you

tkchia commented 2 years ago

Hello @andrewbird,

I had been working on getting gcc-ia16 's DPMI mode (now available through -mdosx) to use the segelf ABI, and also to use Binutils's gold linker.

Right now, ia16-elf-gcc -mdosx does not really support segments beyond the near text and near data segments. But if you add the options

i.e. ia16-elf-gcc -mdosx -msegment-relocation-stuff -mno-post-link then you can examine the relocations in the intermediate ELF output. (You can also pass -v and -Wl,-Map=..., as before.)

Thank you!

tkchia commented 2 years ago

Hello @andrewbird,

By the way, if -msegment-relocation-stuff is enabled, then there are a number of ways you can force the compiler to emit segment relocations. E.g. if you say

int __far x = 1;
int __far *y = &x;

then gcc-ia16 will create a far variable x outside the near data segment, and also create a pointer variable y to point to x at startup — and the value of y will have to involve a segment relocation.
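
To make "segment relocation" concrete, here is a simplified Python sketch of what an MZ-style loader does with each relocation entry: it adds the segment at which the program was loaded to the 16-bit word the entry points at. The names and the toy image are hypothetical, and the sketch only handles entries with non-negative segment values.

```python
import struct


def apply_mz_relocs(image, relocs, start_seg):
    """Patch a load image the way a DOS-style loader would (simplified).

    relocs is a list of (segment, offset) pairs relative to the image start;
    the 16-bit word at each location has start_seg added to it.
    """
    patched = bytearray(image)
    for seg, off in relocs:
        pos = seg * 16 + off
        (word,) = struct.unpack_from("<H", patched, pos)
        struct.pack_into("<H", patched, pos, (word + start_seg) & 0xFFFF)
    return bytes(patched)


# Toy image: a far pointer stored at byte 4 (offset word 0x0010, then
# segment word 0x0002). The relocation entry targets the segment word.
image = bytes(4) + struct.pack("<HH", 0x0010, 0x0002) + bytes(8)
out = apply_mz_relocs(image, [(0, 6)], start_seg=0x1000)
print(hex(struct.unpack_from("<H", out, 6)[0]))  # 0x1002
```

In the example from the post, the initial value of y plays the role of the far pointer here: its offset half is fixed at link time, while its segment half is only known once the program has been loaded.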

You can also create far functions, and take their addresses or directly call them. Something like

__far void
foo (void)
{
  ...
}

void __far (*p_foo) (void) = foo;

will create foo () as a far function which resides in the default text segment, but returns via retfw to a far address. You can also say

__attribute__ ((far_section)) __far void
bar (void)
{
  ...
}

which will create bar () as a far function which will be placed outside the default text segment, etc.

Thank you!

andrewbird commented 2 years ago

Hello @tkchia, I'm on holiday at the moment so I won't get the chance to try this out straight away, however it will be one of the first things I want to do when I get back.

Thank you!

tkchia commented 2 years ago

@andrewbird : well, have a nice holiday, and stay safe! :slightly_smiling_face:

hpax commented 2 years ago

I just noticed there have been some discussions here... and there was some reference to "my" segelf scheme. I guess I wasn't quite clear where the segelf discussion ever landed, and haven't exactly had much time to consider any of this, but if there is now some consensus about what The Right Thing is it would definitely make me more motivated to find the time.

asiekierka commented 2 years ago

@hpax It is implemented in gcc-ia16, I've been using it to good effect as part of my efforts to target the WonderSwan (an 80186-based game console from the late 90s, of all things), and I believe @tkchia maintains DOS/ELKS support for it for a more "classic" use case. I don't think any major issues have come up with it? But it would be best to wait for TK to answer.

hpax commented 2 years ago

Well, the reason I'm asking is that I believe there has been more than one proposal, and wanted to make sure we were consistent about which one was "mine".