Open DiegoMagdaleno opened 2 years ago
As of 3 Feb 2021, it is now possible to run a statically linked Mach-O binary, given we have the following:
What this means is:
int main() {
    return 0;
}
This won't link, as LLD will complain that the _start symbol is missing.
So if we were to implement our own:
void start(void) __asm__("start"); // Workaround Mach-O prefixing its symbols with a _
int main() {
    return 0;
}
void start(void) {
    int ret;
    ret = main(); // main's result is stored, but nothing ever exits
}
This will make the Linux kernel go nuts. Sure, we are doing the basic startup routine (we are calling main after all, no?), but how does anything know when to stop executing? Correct, it doesn't, so execution just carries on to whatever comes next. It looks something like this:
|Owned memory| |Other block|
Since we don't exit, the next instruction for our program is in the other block. Of course the Linux kernel doesn't like this, so it kills us (how responsible).
So how do we fix this? That's right, we need to tell the kernel we are finished. There is a very simple way to do this: implement an exit function, which might look like:
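/* uint64 and SYS_exit are assumed to come from the project's own headers. */
_Noreturn void exit(int code)
{
    /* Loop so the function can never return, as _Noreturn promises. */
    for (;;) {
        __asm__ volatile("mov %0, %%rax\n\t" /* rax = syscall number */
                         "mov %1, %%rdi\n\t" /* rdi = exit code */
                         "syscall\n\t"
                         :
                         : "r" ((uint64) SYS_exit),
                           "r" ((uint64) code)
                         /* the syscall instruction itself overwrites rcx and r11 */
                         : "%rax", "%rdi", "%rcx", "%r11");
    }
}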
The code above first loads the SYS_exit syscall number into rax and the exit code into rdi (the clobber list tells the compiler that both registers get overwritten), then executes the syscall instruction, which hands control to the kernel. The surrounding loop only exists so that the function can never return.
This is how we are able to call exit(ret), where ret is the value our main function returned.
Since we are properly exiting now, the kernel won't complain that we are accessing memory that isn't ours, because surprise, surprise, all the memory we are accessing is indeed ours.
And this is how we are able to execute Mach-O binaries now.
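The register setup described above can also be left to the compiler through inline-assembly constraints. The following is only a sketch, assuming x86-64 Linux and GCC or Clang; raw_syscall1 is a hypothetical helper name, not something from Utopia, and on other targets the sketch simply falls back to libc's syscall() so that it still builds.

```c
#define _GNU_SOURCE       /* makes unistd.h declare syscall() for the fallback path */
#include <sys/syscall.h>  /* SYS_getpid, SYS_exit: Linux syscall numbers */
#include <unistd.h>       /* getpid(), syscall() */

/* Hypothetical helper (not from Utopia): invoke a Linux syscall that takes
 * one argument and return whatever the kernel returns. */
static inline long raw_syscall1(long number, long arg0)
{
#if defined(__x86_64__) && defined(__linux__)
    long result;
    __asm__ volatile("syscall"
                     : "=a"(result)             /* the return value comes back in rax */
                     : "a"(number), "D"(arg0)   /* rax = syscall number, rdi = first argument */
                     : "rcx", "r11", "memory"); /* syscall overwrites rcx and r11 */
    return result;
#else
    /* Assumption: any other target just goes through libc. */
    return syscall(number, arg0);
#endif
}
```

With a helper like this, exit(code) boils down to raw_syscall1(SYS_exit, code) followed by an endless loop, and the same helper can be tried safely on a normal Linux system with a harmless call such as raw_syscall1(SYS_getpid, 0).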
A small note here!
A lot of the bizarre stuff that comes from porting Mach-O to Linux (without a translation layer) is that we must distinguish which code is Mach-O or ELF specific and which code is Darwin or Linux specific. For example:
We use Linux syscalls in our binary above; however, we also work around Mach-O quirks (the underscore is one of them). So we must really think: is this not working because of the format, or is it not working because of the OS?
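One way to keep the two questions apart in code is to ask the compiler about each of them separately. This is a small sketch, not Utopia code: __USER_LABEL_PREFIX__ is a macro predefined by GCC and Clang that holds the symbol prefix of the object format ("_" on Mach-O targets such as macOS, empty on ELF Linux), while __linux__ and __APPLE__ describe the OS. The function names are made up for illustration, and what a Linux-targeting Mach-O toolchain reports is an assumption that would need checking.

```c
#include <stddef.h>
#include <stdio.h>

/* Two-step stringification so the macro is expanded before it is quoted. */
#define STRINGIFY_(x) #x
#define STRINGIFY(x) STRINGIFY_(x)

/* Format question: what does the toolchain put in front of every C symbol? */
static const char *c_symbol_prefix(void)
{
    return STRINGIFY(__USER_LABEL_PREFIX__);
}

/* OS question: which kernel's syscall ABI is this code being built for? */
static const char *target_os(void)
{
#if defined(__linux__)
    return "linux";
#elif defined(__APPLE__)
    return "darwin";
#else
    return "unknown";
#endif
}

/* Build the name a C symbol gets in the object file, e.g. "start" or "_start".
 * Returns 0 on success, -1 if the buffer is too small. */
static int object_symbol_name(const char *c_name, char *out, size_t out_size)
{
    int n = snprintf(out, out_size, "%s%s", c_symbol_prefix(), c_name);
    return (n >= 0 && (size_t) n < out_size) ? 0 : -1;
}
```

If the prefix comes back as "_", a missing symbol is likely a format issue; if the syscall numbers are wrong, it is an OS issue.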
Nice work!
Thank you! It means a lot, coming from someone with an amazing project such as Airyx!
Utopia is going to be equally amazing :) I love what you're doing with it. I'm still undecided whether airyxOS will adopt Mach-O as the default binary format. It makes some aspects easier but others harder.
Well, Mach-O provides a lot of benefits (especially for what you're doing, since you're aiming for compatibility). Still, if anything in Utopia ends up being useful for Airyx (as in, some implementations or research), that would be great! ELF, like everything, has its quirks, but so does Mach-O; it's kind of deciding what you want to compromise on.
And also, thank you a lot, you're truly one of the people that inspired me to make my own OS, and I'm happy you like it :).
Hope one day I can develop my little hobby OS on an Airyx OS powered computer
About
Mach-O (Mach Object) is a replacement for the a.out format, originally used in the Mach operating system, and later it was picked up by Apple for use in XNU.
I really like the Mach-O format and some features it offers, like:
And while the question might be "why not implement those features in ELF?": at the end of the day, there is no reason not to try. However, Utopia is my little "Utopia" and I really like the Mach-O format, especially the way dynamic libraries can have a version embedded in themselves (I don't like the /lib/libwhatever.so.1.2.2 scheme).
So this issue tries to track what we need to get a Mach-O binary running natively in Utopia!
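For context on the embedded version: in Apple's Mach-O layout a dylib records its current and compatibility versions as 32-bit numbers packed as X.Y.Z with 16 bits for X and 8 bits each for Y and Z. A minimal sketch of that packing follows; the struct and function names are made up for illustration and are not from any Mach-O header or from Utopia.

```c
#include <stdint.h>

/* Hypothetical holder for a decoded version (illustration only). */
struct dylib_version {
    unsigned major;
    unsigned minor;
    unsigned patch;
};

/* Unpack an X.Y.Z version stored as xxxx.yy.zz in a 32-bit word. */
static struct dylib_version decode_dylib_version(uint32_t packed)
{
    struct dylib_version v;
    v.major = packed >> 16;          /* upper 16 bits */
    v.minor = (packed >> 8) & 0xffu; /* next 8 bits */
    v.patch = packed & 0xffu;        /* lowest 8 bits */
    return v;
}

/* Pack X.Y.Z back into the 32-bit form; out-of-range parts are truncated. */
static uint32_t encode_dylib_version(unsigned major, unsigned minor, unsigned patch)
{
    return ((uint32_t) (major & 0xffffu) << 16)
         | ((uint32_t) (minor & 0xffu) << 8)
         | (uint32_t) (patch & 0xffu);
}
```

So a library at version 1.2.2 carries 0x00010202 inside the file itself, instead of relying on a name like libwhatever.so.1.2.2.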
Definitions
Information
This issue does NOT try to make Utopia binary compatible with macOS, mainly because:
Utopia wants to produce Mach-Os that target the Linux ABI.
Tasks and research
In ELF (unsure about other formats), the way most operating systems handle dynamic linking and friends is by having the kernel be capable of loading a static ELF binary (one that doesn't link any other libraries at runtime).
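On the ELF side, the difference between a static binary and one that needs a runtime linker is visible in the program headers: a dynamically linked file carries a PT_INTERP entry naming its linker, and a static one has none. The sketch below looks for that entry in a 64-bit ELF image already held in memory. It assumes a Linux-style elf.h is available and that the image has the host's byte order; elf64_interpreter is a hypothetical helper name, not a kernel or Utopia function.

```c
#include <elf.h>     /* Elf64_Ehdr, Elf64_Phdr, PT_INTERP */
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: return the linker path requested by a 64-bit ELF
 * image, or NULL if the image is static or malformed. */
static const char *elf64_interpreter(const unsigned char *image, size_t size)
{
    Elf64_Ehdr ehdr;
    if (size < sizeof ehdr)
        return NULL;
    memcpy(&ehdr, image, sizeof ehdr);

    /* Check the magic bytes and that this is a 64-bit file. */
    if (memcmp(ehdr.e_ident, ELFMAG, SELFMAG) != 0 ||
        ehdr.e_ident[EI_CLASS] != ELFCLASS64)
        return NULL;
    if (ehdr.e_phentsize != sizeof(Elf64_Phdr) || ehdr.e_phoff > size)
        return NULL;

    for (size_t i = 0; i < ehdr.e_phnum; i++) {
        Elf64_Phdr phdr;
        size_t off = (size_t) ehdr.e_phoff + i * sizeof phdr;
        if (off > size || size - off < sizeof phdr)
            return NULL;
        memcpy(&phdr, image + off, sizeof phdr);

        if (phdr.p_type != PT_INTERP)
            continue;

        /* The path must lie inside the image and be NUL-terminated. */
        if (phdr.p_filesz == 0 || phdr.p_offset > size ||
            size - phdr.p_offset < phdr.p_filesz)
            return NULL;
        if (image[phdr.p_offset + phdr.p_filesz - 1] != '\0')
            return NULL;
        return (const char *) (image + phdr.p_offset);
    }
    return NULL; /* no PT_INTERP: a static binary */
}
```

A NULL result for a valid image is the case the kernel can run on its own; anything else has to be handed to the named linker first.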
Now, when the user executes a dynamically linked binary (let's call this one foo), the operating system will look for the linker path, which is commonly declared in the ELF program headers (PT_INTERP). Once this is done, the kernel will open said loader, let's say /Core/Binaries/linker. The linker will then take care of relocating the symbols and everything else the binary needs to run, and finally it will call the start function, which really just calls main.
Unsure if this would be the case for Mach-O, because no matter what I try, lld doesn't want to give me a binary that doesn't request /usr/lib/dyld, which is the macOS location of the dynamic linker. However, thanks to @mszoek I know that dyld is a special binary whose Mach-O header has the file type MH_DYLINKER. This seems to be done with a "special mode"; more research is needed about this.
This is already possible at the Utopia kernel level, but in userland we don't have any loader yet. We should write our own loader and parser; the current structure divides the codebase into two:
MachOKit: Utopia's Mach-O parsing library
mdlk: While this might be a bad name and might change in the future, it stands for Meme Dynamic LinKer; the name was chosen after @spencer005 called the idea of using Mach-O a "meme".
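To make the dyld discussion above concrete, here is a rough sketch of the kind of check a Mach-O parser needs: reading the 64-bit header, testing for the MH_DYLINKER file type, and finding the linker path a binary requests through its LC_LOAD_DYLINKER load command. The constants and field layouts follow Apple's published mach-o/loader.h and are redeclared because that header does not exist on Linux. The function names are hypothetical and are not MachOKit's API; the sketch also assumes the image has the host's byte order.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Values as defined in Apple's <mach-o/loader.h>. */
#define MH_MAGIC_64      0xfeedfacfu
#define MH_EXECUTE       0x2u
#define MH_DYLINKER      0x7u
#define LC_LOAD_DYLINKER 0xeu

struct mach_header_64 {
    uint32_t magic;
    int32_t  cputype;
    int32_t  cpusubtype;
    uint32_t filetype;   /* MH_EXECUTE, MH_DYLINKER, ... */
    uint32_t ncmds;      /* number of load commands after the header */
    uint32_t sizeofcmds;
    uint32_t flags;
    uint32_t reserved;
};

struct load_command {
    uint32_t cmd;
    uint32_t cmdsize;
};

/* Same layout as Apple's dylinker_command; name_offset stands in for the
 * lc_str offset field and is counted from the start of the command. */
struct dylinker_command {
    uint32_t cmd;
    uint32_t cmdsize;
    uint32_t name_offset;
};

/* Copy out and validate the header. Returns 0 on success, -1 otherwise. */
static int macho64_read_header(const unsigned char *image, size_t size,
                               struct mach_header_64 *out)
{
    if (size < sizeof *out)
        return -1;
    memcpy(out, image, sizeof *out);
    return out->magic == MH_MAGIC_64 ? 0 : -1;
}

/* True if the image is itself a dynamic linker, like dyld. */
static int macho64_is_dylinker(const unsigned char *image, size_t size)
{
    struct mach_header_64 hdr;
    return macho64_read_header(image, size, &hdr) == 0 &&
           hdr.filetype == MH_DYLINKER;
}

/* Return the linker path the image asks for, or NULL if there is none. */
static const char *macho64_requested_dylinker(const unsigned char *image,
                                              size_t size)
{
    struct mach_header_64 hdr;
    if (macho64_read_header(image, size, &hdr) != 0)
        return NULL;

    size_t off = sizeof hdr; /* load commands start right after the header */
    for (uint32_t i = 0; i < hdr.ncmds; i++) {
        struct load_command lc;
        if (size - off < sizeof lc)
            return NULL;
        memcpy(&lc, image + off, sizeof lc);
        if (lc.cmdsize < sizeof lc || lc.cmdsize > size - off)
            return NULL;

        if (lc.cmd == LC_LOAD_DYLINKER) {
            struct dylinker_command dc;
            if (lc.cmdsize < sizeof dc)
                return NULL;
            memcpy(&dc, image + off, sizeof dc);
            if (dc.name_offset < sizeof dc || dc.name_offset >= lc.cmdsize)
                return NULL;
            /* The path must be NUL-terminated inside the command. */
            const unsigned char *name = image + off + dc.name_offset;
            if (memchr(name, '\0', lc.cmdsize - dc.name_offset) == NULL)
                return NULL;
            return (const char *) name;
        }
        off += lc.cmdsize;
    }
    return NULL;
}
```

A loader could use the first check to recognize its own linker binary and the second to see whether a program asks for /usr/lib/dyld or for some other path.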
[ ] Let's get libSystem to fully work and link with Mach-O
While I am happy to announce that, as of 30 Jan 2022, we are able to build (but not link!) libSystem, we have a long way to go before we know if it really works.
Right now all compilers assume that Mach-O == macOS/iOS/Darwin, which isn't true anymore: there is Utopia too! (As if it was relevant.) We do know some quirks that Mach-O has. As @ facekapow noted (not mentioning him, because I spammed him a lot on Discord already), Mach-O expects, well, the linkers expect, stuff like:
___stack_chk_fail
to exist when targeting Mach-O. I am up for implementing those in libSystem; however, some Darwin-specific behavior (macOS version, for example) is something I want to drop.
Resources
Of course, we have plenty of resources, to name a few: