alexfru / SmallerC

Simple C compiler
BSD 2-Clause "Simplified" License

lfanew format #51

Closed by stsp 5 months ago

stsp commented 5 months ago

Currently I have to run this tool, https://codeberg.org/tkchia/lfanew, on smallerc-generated binaries. It adds the e_lfanew field at offset 0x3c. As a bonus, the overlay information block is enlarged and spans from 0x1c to 0x3c. I use that space because I have a lot of overlay data.
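As an illustration of the layout described above, here is a minimal Python sketch that reads the two areas mentioned: the reserved bytes from 0x1c up to 0x3c and the 32-bit e_lfanew field at 0x3c. The function name and the synthetic header are hypothetical, and the sketch assumes little-endian fields and a header of at least 64 bytes; it is not taken from the lfanew tool or from SmallerC.

```python
import struct

MZ_MAGIC = b"MZ"
RESERVED_START = 0x1C   # start of the enlarged overlay information block
E_LFANEW_OFFSET = 0x3C  # 32-bit little-endian e_lfanew field


def read_lfanew_header(image: bytes):
    """Return (overlay_info, e_lfanew) from an MZ header of at least 64 bytes."""
    if len(image) < E_LFANEW_OFFSET + 4 or image[:2] != MZ_MAGIC:
        raise ValueError("not an MZ image with room for e_lfanew")
    overlay_info = image[RESERVED_START:E_LFANEW_OFFSET]
    (e_lfanew,) = struct.unpack_from("<I", image, E_LFANEW_OFFSET)
    return overlay_info, e_lfanew


# Synthetic 64-byte header, built here only for illustration.
header = bytearray(64)
header[0:2] = MZ_MAGIC
header[RESERVED_START:RESERVED_START + 5] = b"hello"
struct.pack_into("<I", header, E_LFANEW_OFFSET, 0x820)

info, lfanew = read_lfanew_header(bytes(header))
print(len(info), hex(lfanew))
```

The reserved area between the two offsets is 0x20 bytes long, which is the space the request is about.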

Would it be possible to support the lfanew format natively? This would give people more overlay space, and you could use the e_lfanew field to point to the next section.

alexfru commented 5 months ago

If I get it right, you want a bigger MZ header in DOS .EXEs, as big as in PE .EXEs?

stsp commented 5 months ago

Correct, I use up all of its overlay space. But you can use the e_lfanew pointer to locate your own overlay, if there is one. I am not very familiar with smallerc's exe structure, but at least gcc-ia16 (by the same author as the lfanew tool) uses a patched causeway extender that relies on the e_lfanew field if it is there, and if it's not there it finds the overlay by other techniques.

I'll attach an example binary that I produce with smallerc+lfanew. You can see that it has the large header and that the overlay info space is fully used. comcom64.exe.gz

stsp commented 5 months ago

Btw, let me take the chance to congratulate you on writing such an amazing tool. :) Of course gcc-ia16 is very advanced. But its build time and the number of deps make it completely unsuitable for inclusion in any distro. And with smallerc, I only asked, and it was immediately included in Debian. Just because it's that simple to build!

The only big downside I've found is the lack of 64bit pointers support in smallerc. The FreeDOS crowd also wants far pointers. If this is done, you can even try to kick gcc-ia16's ass. :)

alexfru commented 5 months ago

I have a few questions:

  1. From that comcom64.exe binary you shared it looks like you're trying to build parts of DJGPP and your app into an ELF instead of a COFF and start that ELF using SmallerC's DPMI stub + your ELF loader? Why all of this, why not just use DJGPP? And you're still gonna use a VM, if not for DJGPP then surely for your comcom64.exe?
  2. If you use a Windows/PE .EXE produced by SmallerC, will that work out of the box with the lfanew tool?
  3. If you're using this with DPMI only, you can link with a modified dpstub.exe, would this help? (e.g. smlrcc -dosp hw.c -stub lfadpstb.exe -o hwdplfa.exe, where lfadpstb.exe is dpstub.exe with an enlarged MZ header (I simply took the first 0x820 bytes of your comcom64.exe as lfadpstb.exe and it worked))
  4. I'm not sure what you mean by 64-bit pointers. Do you mean far pointers in 32-bit protected mode? Far pointers in real mode? 64-bit mode? The huge memory model's pointers can address memory up to the 1MB boundary (UMA is OK, but not HMA). They're just 20-bit physical addresses that occupy 32 bits and get converted to seg16:ofs16 at runtime. This should be enough for most uses in real mode. Can you just use this?
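The conversion mentioned in item 4 can be sketched in Python as follows. This is only an illustration of real-mode address arithmetic: the function names are hypothetical, and the particular split (segment = address / 16, offset = address % 16) is an assumption, since the thread does not say which normalisation SmallerC's runtime uses.

```python
ONE_MB = 1 << 20  # real-mode physical addresses are 20 bits wide


def phys_to_seg_ofs(phys: int):
    """Split a 20-bit physical address into a (segment, offset) pair."""
    if not 0 <= phys < ONE_MB:
        raise ValueError("address does not fit in 20 bits")
    # Assumed normalisation: smallest possible offset (0..15).
    return phys >> 4, phys & 0xF


def seg_ofs_to_phys(seg: int, ofs: int) -> int:
    """Real-mode address formula: segment * 16 + offset."""
    return (seg << 4) + ofs


print([hex(v) for v in phys_to_seg_ofs(0xB8000)])
print(hex(seg_ofs_to_phys(0xFFFF, 0x0010)))
```

The second call shows why the HMA is out of reach for such pointers: FFFF:0010 is the first HMA byte and maps to 0x100000, which no longer fits in 20 bits.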
stsp commented 5 months ago

Why all of this, why not just use DJGPP?

DJGPP only produces 32bit code. Besides this, distros do not pick it up as it's very difficult to build. But it's not like I don't use djgpp at all: while I normally avoid it, comcom64 can be built with djgpp from the very same source tree (in which case it becomes comcom32). On top of all this, I can debug comcom64 with the host's gdb, while debugging djgpp apps is probably more difficult.

And you're still gonna use a VM

I don't know who runs DOS w/o a VM today. :)

If you use a Windows/PE .EXE produced by SmallerC, will that work out of the box with the lfanew tool?

I never tried that. And even if I wanted to... would I need Windows for that experiment? How can I run a PE?

If you're using this with DPMI only, you can link with a modified dpstub.exe, would this help?

I think this would help, as I would only need to modify the stub with lfanew. But what is lfadpstb.exe? I guess this is something we would get in the course of solving this ticket, rather than something already available in smallerc?

I'm not sure what you mean by 64-bit pointers.

Sorry, I mistyped. 64-bit TYPES. AKA "long long".

alexfru commented 5 months ago

You can probably simply feed dpstub.exe into lfanew tool to get lfadpstb.exe?

64-bit types are hard to support in 32-bit mode just like 32-bit types on a 16-bit CPU. There's pretty much just one internal type, the size of a CPU register: whatever fits into it (char, short, int, long, void*, float) is available and whatever doesn't isn't.
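To make the difficulty concrete, here is a small Python sketch of what a compiler has to emit for a single 64-bit addition when only 32-bit registers exist: two additions plus explicit carry handling. The function name and the (low, high) pair representation are hypothetical and are not how SmallerC or gcc-ia16 implement anything.

```python
MASK32 = 0xFFFFFFFF  # everything is truncated to one 32-bit "register"


def add64(a_lo: int, a_hi: int, b_lo: int, b_hi: int):
    """Add two 64-bit values given as (low, high) 32-bit halves."""
    lo = (a_lo + b_lo) & MASK32
    carry = 1 if lo < a_lo else 0  # low half wrapped around, so carry out
    hi = (a_hi + b_hi + carry) & MASK32
    return lo, hi


lo, hi = add64(0xFFFFFFFF, 0x00000000, 0x00000001, 0x00000000)
print(hex(hi), hex(lo))
```

Every other operation (multiplication, division, shifts, comparisons, argument passing) needs a similar expansion, which is what a single-register-type design avoids.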

stsp commented 5 months ago

You can probably simply feed dpstub.exe into lfanew tool to get lfadpstb.exe?

Yes, that's what I do. I have that modified stub, which I then use for linking. But this is an unnecessary tool, an unnecessary dep. Also, why don't you think the e_lfanew pointer can be used by smallerc's stub itself to locate the next part of the image?

64-bit types are hard to support in 32-bit mode just like 32-bit types on a 16-bit CPU.

I understand, but gcc-ia16 does that, and even in 16bit modes.

stsp commented 5 months ago

Ah, no, that's not what I do. I feed the resulting program to lfanew, but why does that matter?

stsp commented 5 months ago

OK so basically yes: if smallerc provides just a stub with an enlarged header, that would be enough. But doesn't this mean that smlrl needs to support that natively just to produce such a stub?

stsp commented 5 months ago

I mean, running lfanew on smallerc's side doesn't help, as then it's still a redundant dep. It would be better if smlrl could do that natively.

stsp commented 5 months ago

So to be absolutely clear: the goal is to get rid of the lfanew tool. Whether it is applied to a stub or to the program doesn't matter, as in both cases the lfanew dep is still there.

alexfru commented 5 months ago

SmallerC isn't gcc. It's very simple and therefore in many ways limited. The limitations are partially balanced out by the size, portability and a few interesting options. There's no intent to build more on top of this limited foundation. It's awkward to extend.

So, if you don't use the lfanew tool, you still need to use something else, perhaps something a bit simpler? I mean, how are you going to set and use .e_lfanew in the header?

If I make dpstub.exe use .e_lfanew for itself, then how is your stuff gonna work together with dpstub.exe?

stsp commented 5 months ago

SmallerC isn't gcc.

That's fine with me as I've found a way to use it anyway. For me it replaced gcc-ia16 (at the cost of some large porting efforts, but I like the result), but for others it hasn't.

how are you going to set and use .e_lfanew in the header?

I don't need it. I only need the empty space, where I store the pointers and overlay names. In the case of gcc-ia16, the e_lfanew field was used by its own stub, not by me, to locate the protected-mode section. Your exe surely has a protected-mode section as well, and your real-mode stub looks it up too?

If I make dpstub.exe use .e_lfanew for itself, then how is your stuff gonna work together with dpstub.exe?

Because for my stuff it doesn't matter where exactly the pointers and names are put. I put them before e_lfanew, and e_lfanew is used for the stub's own needs.

alexfru commented 5 months ago

Got it. I can extend the MZ header by a fixed size, just enough to include the .e_lfanew field (and the empty relocation/overlay space before it, where you can store your bits of data).

dpstub.exe knows where it ends, it reads its own MZ header to figure out where the rest begins. The DPMI exe is a simple concatenation of three things: dpstub.exe, the stub info (16 dwords: the first is "DP$!", then stack size, heap min size, heap max size, and the rest all zeroes), and a 32-bit a.out. dpstub.exe rarely needs to change.
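The three-part layout described above can be sketched in Python like this. Several details are assumptions made for the illustration: that the stub's end is computed from the standard MZ page-count fields (e_cp at offset 4, e_cblp at offset 2), that the dwords are little-endian, and that "DP$!" is stored as those four bytes in that order. The function names and the synthetic image are hypothetical.

```python
import struct

STUB_INFO_SIZE = 16 * 4  # the stub info block is 16 dwords


def mz_image_size(image: bytes) -> int:
    """File size recorded in the MZ header: e_cp 512-byte pages, e_cblp bytes in the last."""
    magic, e_cblp, e_cp = struct.unpack_from("<2sHH", image, 0)
    if magic != b"MZ":
        raise ValueError("missing MZ signature")
    return e_cp * 512 if e_cblp == 0 else (e_cp - 1) * 512 + e_cblp


def split_dpmi_exe(image: bytes):
    """Return (stub_end, (stack, heap_min, heap_max), a.out bytes)."""
    stub_end = mz_image_size(image)
    fields = struct.unpack_from("<4s15I", image, stub_end)
    if fields[0] != b"DP$!":
        raise ValueError("stub info signature not found")
    stack_size, heap_min, heap_max = fields[1:4]
    return stub_end, (stack_size, heap_min, heap_max), image[stub_end + STUB_INFO_SIZE:]


# Synthetic image for illustration only: a zero-filled "stub" with valid size fields.
stub = bytearray(0x820)
stub[0:2] = b"MZ"
struct.pack_into("<HH", stub, 2, len(stub) % 512, (len(stub) + 511) // 512)
info = struct.pack("<4s15I", b"DP$!", 0x8000, 0x10000, 0x100000, *([0] * 12))
aout = b"fake a.out payload"

stub_end, sizes, payload = split_dpmi_exe(bytes(stub) + info + aout)
print(hex(stub_end), [hex(s) for s in sizes], payload)
```

The stack and heap values in the synthetic block are arbitrary placeholders, not SmallerC defaults.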

stsp commented 5 months ago

dpstub.exe knows where it ends, it reads its own MZ header to figure out where the rest begins.

This is what gcc-ia16's stub does IIRC if the exe header is too small to include e_lfanew. But you don't need such variety, and AFAICS just reading e_lfanew is easier than calculating the exe size from other header fields. It works both ways, though, and for me only the empty space is important. I detect the lfanew header by checking that the "header size" word (offset 8) is 4.
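A minimal Python sketch of the check described above, together with the "works both ways" fallback: read e_lfanew when the header-size word at offset 8 is 4 (4 paragraphs, i.e. 64 bytes), otherwise derive the size from the page-count fields. The function names and the synthetic stubs are hypothetical; this is not the code of gcc-ia16's stub or of comcom64.

```python
import struct


def has_lfanew_header(image: bytes) -> bool:
    """True if the MZ 'header size' word (offset 8, in 16-byte paragraphs) is 4."""
    if len(image) < 0x40 or image[:2] != b"MZ":
        return False
    (e_cparhdr,) = struct.unpack_from("<H", image, 8)
    return e_cparhdr == 4


def payload_offset(image: bytes) -> int:
    """Locate what follows the stub: via e_lfanew if present, else via the size fields."""
    if has_lfanew_header(image):
        (e_lfanew,) = struct.unpack_from("<I", image, 0x3C)
        return e_lfanew
    if image[:2] != b"MZ":
        raise ValueError("missing MZ signature")
    e_cblp, e_cp = struct.unpack_from("<HH", image, 2)
    return e_cp * 512 if e_cblp == 0 else (e_cp - 1) * 512 + e_cblp


def make_stub(size: int, header_paragraphs: int, e_lfanew: int = 0) -> bytes:
    """Build a zero-filled fake stub with just the header fields used above."""
    stub = bytearray(size)
    stub[0:2] = b"MZ"
    struct.pack_into("<HH", stub, 2, size % 512, (size + 511) // 512)
    struct.pack_into("<H", stub, 8, header_paragraphs)
    if header_paragraphs >= 4:
        struct.pack_into("<I", stub, 0x3C, e_lfanew)
    return bytes(stub)


big = make_stub(0x820, 4, e_lfanew=0x820)  # 64-byte header with e_lfanew
small = make_stub(0x610, 2)                # 32-byte header, no e_lfanew
print(has_lfanew_header(big), hex(payload_offset(big)))
print(has_lfanew_header(small), hex(payload_offset(small)))
```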

alexfru commented 5 months ago

Can you try out the branch with the change? If it works for you, I'll get it into master.

stsp commented 5 months ago

No, it doesn't work. But it's my fault, as I misinformed you. :( I said I do not use 0x3c, but that turns out to be wrong. lfanew aligns the stub size to 16 bytes (by zero-padding) and puts the resulting size at 0x3c. And it is my code that uses that value, not the dpmi stub of gcc-ia16, as I wrongly claimed previously.
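The padding step described above can be sketched in Python as follows. This covers only the two effects named in the comment (zero-pad to a 16-byte boundary, store the padded size at 0x3c) and assumes the input already has a header of at least 64 bytes; the real lfanew tool also enlarges the header, which is not modelled here, and whether it adjusts any other header fields is not stated in the thread. The function name is hypothetical.

```python
import struct

E_LFANEW_OFFSET = 0x3C


def pad_and_set_lfanew(stub: bytes) -> bytes:
    """Zero-pad the stub to a multiple of 16 bytes and record that size at 0x3c."""
    if len(stub) < E_LFANEW_OFFSET + 4 or stub[:2] != b"MZ":
        raise ValueError("expected an MZ stub with a header of at least 64 bytes")
    padded_len = (len(stub) + 15) & ~15           # round up to a 16-byte boundary
    out = bytearray(stub) + bytes(padded_len - len(stub))
    struct.pack_into("<I", out, E_LFANEW_OFFSET, padded_len)
    return bytes(out)


# Synthetic stub whose size is deliberately not a multiple of 16.
raw = bytearray(0x825)
raw[0:2] = b"MZ"
padded = pad_and_set_lfanew(bytes(raw))
(stored,) = struct.unpack_from("<I", padded, E_LFANEW_OFFSET)
print(hex(len(padded)), hex(stored))
```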

But it would be trivial to amend this misunderstanding, right?

alexfru commented 5 months ago

If you're going to store something in the reserved space, you can set the value at 0x3c just as well.

I don't think alignment is important for your purpose (fread() or its equivalent wouldn't care).

Can you just take the 64-byte MZ header and alter its contents to your taste? It's much less work than what the general-purpose lfanew does. All you need to do is get the file size, write it together with whatever other data you have into the header and append your ELF64.

About the only thing you need to check is "MZ" and the header size and fail if it's not "MZ" or the size isn't 64 bytes (or any other multiple of 16 greater than 64).
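The suggestion in the last three paragraphs can be sketched in Python as below: check "MZ" and the header size, write the stub's file size at 0x3c along with the caller's own data in the reserved area, then append the payload. The function name, the choice of 0x1c..0x3c as the data area (taken from earlier in this thread), and the synthetic inputs are assumptions for illustration, not stsp's actual one-line change.

```python
import struct

RESERVED_START = 0x1C
E_LFANEW_OFFSET = 0x3C


def attach_payload(stub: bytes, overlay_info: bytes, payload: bytes) -> bytes:
    """Patch a stub with a header of 64+ bytes in place and append the payload."""
    if stub[:2] != b"MZ":
        raise ValueError("missing MZ signature")
    (e_cparhdr,) = struct.unpack_from("<H", stub, 8)
    header_size = e_cparhdr * 16  # always a multiple of 16 by construction
    if header_size < 64 or len(stub) < header_size:
        raise ValueError("MZ header is smaller than 64 bytes")
    if len(overlay_info) > E_LFANEW_OFFSET - RESERVED_START:
        raise ValueError("overlay info does not fit in the reserved area")
    out = bytearray(stub)
    out[RESERVED_START:RESERVED_START + len(overlay_info)] = overlay_info
    struct.pack_into("<I", out, E_LFANEW_OFFSET, len(stub))  # payload starts here
    return bytes(out) + payload


# Synthetic inputs for illustration only.
stub = bytearray(0x820)
stub[0:2] = b"MZ"
struct.pack_into("<H", stub, 8, 4)         # header size: 4 paragraphs = 64 bytes
elf = b"\x7fELF" + b"fake ELF64 body"

exe = attach_payload(bytes(stub), b"names+ptrs", elf)
(start,) = struct.unpack_from("<I", exe, E_LFANEW_OFFSET)
print(hex(start), exe[start:start + 4])
```

No alignment is applied here, in line with the remark that a plain file read does not need it.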

I'm not quite willing to invest my time into rather obscure requests especially when you already have a working solution.

stsp commented 5 months ago

But shouldn't the e_lfanew field always point to the end of the stub? Anyway, you are right, I can just as well do that myself. I'll see how it goes.

stsp commented 5 months ago

I added the needed code (basically 1 line). So as soon as you commit this change, I'll drop lfanew.

stsp commented 5 months ago

And of course it failed to build on LP because of this: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=94391 Let's start adding work-arounds...

alexfru commented 5 months ago

And looks like this is done.