windelbouwman / ppci

A compiler for ARM, X86, MSP430, xtensa and more implemented in pure Python
https://ppci.readthedocs.io/en/latest/
BSD 2-Clause "Simplified" License

WIP format.elf.file: ElfFile: Allow to save relocatable files. #61

Closed pfalcon closed 4 years ago

pfalcon commented 4 years ago

With these changes, a file which can be dumped by readelf is written, but of course there's junk inside.

pfalcon commented 4 years ago

So, considering why I use ppci less than I'd want, I have to admit to myself there're 2 reasons:

  1. Discovery that IR (well, IL) is quite mouthful: https://github.com/windelbouwman/ppci-mirror/issues/43
  2. NIH "world from scratch" approach. No, it's absolutely great, for as long as somebody did the "boring" parts. Otherwise it makes the project spread thin and self-deadlock on progress in one area to enable playing with other areas. For example, I'm rather disinterested in stuff like linking, and wish I could play with compiling some Python and maybe with optimizations and regalloc. And now I need to wait for a guy to come around who has waited his whole life to write a linker.

It doesn't have to be like that. And the biggest culprit is PPCI's own "object file format". There should be a way to convert that to an ELF .o file. That automagically solves the problem of linking with existing sources and libraries, dynamic linking, etc. So, today I sat for half an hour and saw what it would take to implement it, given that currently PPCI supports only converting to an ELF executable. With some patching around, I was able to make it at least write out a file which could be dumped with readelf --all. But the insides are mostly wrong. The main issue is that writing out ET_EXEC concentrates on program segments ("images" in PPCI parlance), while for ET_REL there are none, and it should concentrate on writing out sections. And of course, ET_REL output should include relocations, in a suitable format, put into an appropriate section.
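
To make the ET_EXEC versus ET_REL difference concrete, here is a minimal sketch of an ELF64 file header for a relocatable object, written with plain `struct` rather than ppci's own writer. The function name `relocatable_elf64_header` and the example offsets are hypothetical; the field layout and constants follow the ELF specification. The point is that a relocatable file carries no program headers at all, only a section header table.

```python
import struct

# Values from the ELF specification.
ET_REL = 1      # relocatable object file (.o)
EM_X86_64 = 62  # x86-64 machine type


def relocatable_elf64_header(e_shoff, e_shnum, e_shstrndx):
    """Pack a little-endian ELF64 header for an ET_REL file (64 bytes)."""
    # Magic, 64-bit class, little-endian, ELF version 1, then zero padding.
    e_ident = b"\x7fELF" + bytes([2, 1, 1]) + bytes(9)
    return e_ident + struct.pack(
        "<HHIQQQIHHHHHH",
        ET_REL, EM_X86_64, 1,
        0,        # e_entry: a .o file has no entry point
        0,        # e_phoff: no program header table
        e_shoff,  # file offset of the section header table
        0,        # e_flags
        64,       # e_ehsize
        0,        # e_phentsize
        0,        # e_phnum: no segments ("images")
        64,       # e_shentsize
        e_shnum, e_shstrndx,
    )


# Hypothetical layout: 7 section headers at offset 0x200, names in section 6.
header = relocatable_elf64_header(e_shoff=0x200, e_shnum=7, e_shstrndx=6)
```

The sections themselves (`.text`, `.symtab`, `.strtab`, `.rela.text`, ...) and their headers still have to be written after this header; the sketch only covers the part that distinguishes the two file types.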

I decided to just dump whatever I had and call for your consideration, @windelbouwman. If you agree with the "risk analysis" above and agree that ELF .o is the missing link, I imagine that for you, as the author of the code, it won't be too much effort to add proper ET_REL save support. If not, I'll continue this homework as time permits.

windelbouwman commented 4 years ago

Hi! Absolutely agree on supporting spitting out ELF object files, as well as windows PE object files, so that they can be included in visual studio builds.

There are two use cases here:

  1. generating these object files from ppci to other tools
  2. loading and linking these object files, as generated from other tools in ppci

I would like to focus on option 1 for now, since most parts are already there for the case of ELF. One thing that might be missing as well is a command line switch to choose the object format you want to use. This command line switch should be present on most utilities generating object files (not executables).

windelbouwman commented 4 years ago

Btw, the linker might seem like a boring task, but after looking into it, it actually is more interesting than you might think :)

windelbouwman commented 4 years ago

I'll prepare an example for this

windelbouwman commented 4 years ago

Okay, I got a very basic sample working (see last commit), but the elf file code is a mess, I'll refactor it now.

pfalcon commented 4 years ago

I would like to focus on option 1 for now

Thanks!

One thing that might be missing as well is a command line switch to choose the object format you want to use

Yep. Switches may need some elaboration in general, e.g. https://github.com/windelbouwman/ppci-mirror/issues/38

Btw, the linker might seem like a boring task, but after looking into it, it actually is more interesting than you might think :)

I fully agree, it can be "boring" only as a blocker to other tasks someone may want to work on. I'm sure there's someone around who may be interested in working on linking, there's no irony here, we just need to find that person (which reminds me that it may be a good idea to cut a release, as it's a good opportunity for more PR). (And sure, some other day or year I might be interested in working on a linker, it's just that I've been on SSA and other things for the 7th year, and I'll crack them first :-D ).

Okay, I got a very basic sample working (see last commit), but the elf file code is a mess, I'll refactor it now.

Thanks! Will give it a try soon.

windelbouwman commented 4 years ago

Note that external functions cannot be imported, since this must use the PLT, which I have not figured out yet.

pfalcon commented 4 years ago

Note that external functions cannot be imported, since this must use the PLT, which I have not figured out yet.

Well, either I miss your point, or I wouldn't think that the PLT is involved in any way. The PLT is a boring(tm) implementation detail of ELF shared libraries, and the whole reason we want to generate relocatable ELF .o files is to offload such stuff to an existing linker.

But you're right that external symbols aren't exported properly in generated ELF .o file currently, my quick patch re: that is:

--- a/ppci/format/elf/writer.py
+++ b/ppci/format/elf/writer.py
@@ -274,10 +275,14 @@ class ElfWriter:
             entry = self.header_types.SymbolTableEntry()
             entry.st_name = self.string_table.get_name(symbol.name)
             entry.st_info = (int(st_bind) << 4) | int(st_type)
-            entry.st_shndx = self.section_numbers[symbol.section]
-            entry.st_value = (
-                symbol.value + self.obj.get_section(symbol.section).address
-            )
+            if symbol.section is None:
+                entry.st_shndx = 0
+                entry.st_value = 0
+            else:
+                entry.st_shndx = self.section_numbers[symbol.section]
+                entry.st_value = (
+                    symbol.value + self.obj.get_section(symbol.section).address
+                )
             entry.write(self.f)

         symbol_table_index_first_global = len(local_symbols) + 1

That still leads to:

./a.out: Symbol `printf' causes overflow in R_X86_64_PC32 relocation
Segmentation fault (core dumped)

at runtime (or more specifically, at dynamic-linking time, from ld.so).
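
For reference, the `symbol.section is None` branch in the patch above corresponds to an undefined-symbol entry in `.symtab`. The following is a minimal sketch of such an `Elf64_Sym` record using plain `struct`; the helper name `undefined_symbol` and the string-table offset are hypothetical, while the layout and constants come from the ELF specification.

```python
import struct

# Values from the ELF specification.
SHN_UNDEF = 0   # "no section": the symbol is defined elsewhere
STB_GLOBAL = 1  # global binding
STT_NOTYPE = 0  # type not specified


def undefined_symbol(name_offset):
    """Pack an Elf64_Sym (24 bytes) for an external, undefined symbol."""
    st_info = (STB_GLOBAL << 4) | STT_NOTYPE
    return struct.pack(
        "<IBBHQQ",
        name_offset,  # st_name: offset into the string table
        st_info,
        0,            # st_other: default visibility
        SHN_UNDEF,    # st_shndx
        0,            # st_value: unknown until link time
        0,            # st_size
    )


# Hypothetical: the name of the symbol starts at offset 1 in .strtab.
entry = undefined_symbol(name_offset=1)
```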

pfalcon commented 4 years ago

However, gcc -static main.c call.o works as expected. So yeah, R_X86_64_PC32 is way too optimistic a relocation type to use. You can use that only for functions defined in the same module. For any external (i.e. with just a prototype) function, you should assume that it may reside anywhere in the address space.
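
A rough illustration of why the relocation can overflow: R_X86_64_PC32 stores S + A - P (symbol address plus addend minus the address of the patched field) in a signed 32-bit slot, so the target has to lie within about 2 GiB of the call site. The sketch below only checks that range; the function name `fits_pc32` and all addresses are made up for illustration.

```python
def fits_pc32(symbol_addr, addend, place):
    """Return True if S + A - P fits into a signed 32-bit field."""
    value = symbol_addr + addend - place
    return -(1 << 31) <= value < (1 << 31)


# Hypothetical addresses: a call site in the executable, a function in the
# same image, and a function in a shared library mapped far away.
call_site = 0x401005
same_image = 0x402000
shared_lib = 0x7FFFF7E4A000

near_ok = fits_pc32(same_image, -4, call_site)
far_ok = fits_pc32(shared_lib, -4, call_site)
```

With these made-up numbers the first check passes and the second fails, which matches the observation that the statically linked build works while the dynamically linked one does not.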

pfalcon commented 4 years ago

Well, yeah, PLT is somehow involved after all, in gcc-compiled object:

Relocation section '.rela.text' at offset 0x218 contains 2 entries:
  Offset          Info           Type           Sym. Value    Sym. Name + Addend
000000000007  000500000002 R_X86_64_PC32     0000000000000000 .rodata - 4
00000000000c  000b00000004 R_X86_64_PLT32    0000000000000000 puts - 4
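
The Info column in this dump can be decoded by hand. As a small sketch (the helper name `split_r_info64` is hypothetical; the numeric relocation types are from the x86-64 psABI), the upper 32 bits of an ELF64 `r_info` hold the symbol table index and the lower 32 bits hold the relocation type:

```python
# Relocation type numbers from the x86-64 psABI.
RELOC_NAMES = {2: "R_X86_64_PC32", 4: "R_X86_64_PLT32"}


def split_r_info64(r_info):
    """Split an ELF64 r_info value into (symbol index, relocation type)."""
    return r_info >> 32, r_info & 0xFFFFFFFF


# The two Info values from the readelf output above.
decoded = [split_r_info64(info) for info in (0x000500000002, 0x000B00000004)]
names = [RELOC_NAMES[r_type] for _, r_type in decoded]
```

Here the first entry refers to symbol 5 (the `.rodata` section symbol) and the second to symbol 11 (`puts`).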

windelbouwman commented 4 years ago

Yeah, I noticed as well that gcc emits a PLT relocation for external functions. Note that with printf (the first function you would naturally try to use), there are vararg parameter passing issues, since the vararg parameter passing is implemented not on stack, but in a separate memory slab. So, a better function to try out would be putchar to get this to work.

windelbouwman commented 4 years ago

Note, that this blog series contains many interesting details. https://www.airs.com/blog/archives/38

pfalcon commented 4 years ago

Well, yeah, PLT is somehow involved after all, in gcc-compiled object:

A quick-hack to address that:

--- a/ppci/format/elf/writer.py
+++ b/ppci/format/elf/writer.py
@@ -263,7 +263,8 @@ class ElfWriter:
         self.f.write(bytes(symtab_entsize))

         for nr, symbol in enumerate(local_symbols + global_symbols, 1):
-            self.symbol_id_map[symbol.id] = nr
+            print(symbol)
+            self.symbol_id_map[symbol.id] = (nr, symbol)

             if symbol.binding == ir.Binding.GLOBAL:
                 st_bind = SymbolTableBinding.GLOBAL
@@ -318,8 +323,9 @@ class ElfWriter:

             for rel in reloc_groups[section_name]:
                 assert rel.section == section_name
-                r_sym = self.symbol_id_map[rel.symbol_id]
-                r_type = self.get_reloc_type(rel.reloc_type)
+                r_sym, sym = self.symbol_id_map[rel.symbol_id]
+                print("!", rel.section, rel.symbol_id, r_sym, sym)
+                r_type = self.get_reloc_type("_external" if sym.section is None else rel.reloc_type)
                 if self.elf_file.bits == 64:
                     r_info = (r_sym << 32) + r_type
                 else:
@@ -385,6 +391,7 @@ class ElfWriter:
             "abs64": R_X86_64_64,
             "abs32": R_X86_64_32,
             "absaddr64": R_X86_64_64,
+            "_external": R_X86_64_PLT32,
         }
         return elf_reloc_mapping[reloc_type]

$ gcc main.c call.o
...
$ ./a.out
Hello from the other side!

@windelbouwman, hopefully, I was in time for these quick hacks to be useful ;-). Handing over to you now for proper implementation.
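
Outside of ppci, the idea behind the quick hack can be sketched as follows: pick R_X86_64_PLT32 when the call target is undefined in this object, R_X86_64_PC32 otherwise, and pack the result as an `Elf64_Rela` record. The helper name `call_rela` and the default addend of -4 (taken from the gcc dump earlier in the thread) are assumptions for illustration, not ppci's actual API.

```python
import struct

# Relocation type numbers from the x86-64 psABI.
R_X86_64_PC32 = 2
R_X86_64_PLT32 = 4


def call_rela(r_offset, symbol_index, symbol_is_defined, addend=-4):
    """Pack an Elf64_Rela (24 bytes) for a call instruction operand."""
    # Undefined targets may live anywhere, so let the linker route them
    # through the PLT if needed.
    r_type = R_X86_64_PC32 if symbol_is_defined else R_X86_64_PLT32
    r_info = (symbol_index << 32) | r_type
    return struct.pack("<QQq", r_offset, r_info, addend)


# Mirrors the `puts` entry from the gcc-compiled object: offset 0xc, symbol 11.
rela = call_rela(r_offset=0xC, symbol_index=11, symbol_is_defined=False)
```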

pfalcon commented 4 years ago

Note that with printf (the first function you would naturally try to use), there are vararg parameter passing issues, since the vararg parameter passing is implemented not on stack, but in a separate memory slab.

Are you sure? In C, function prototypes are optional. In particular, there's no rule that "a vararg function should be declared before being called". That means that, from the caller's perspective, calling a vararg func is no different from calling any other function. For example, for x86_64, the first few args will be passed in regs, only later ones on the stack, etc. So, the headache of implementing va_list and friends lies on the side of the actual function. That's why va_list for x86_64 is such a bloat - because it apparently spills registers to a memory area within va_list (I actually started to read up on that last week for unrelated reasons, but didn't finish, so all the above is intuitive understanding ;-) ).

windelbouwman commented 4 years ago

Well, yeah, PLT is somehow involved after all, in gcc-compiled object:

$ ./a.out
Hello from the other side!

@windelbouwman, hopefully, I was in time for these quick hacks to be useful ;-). Handing over to you now for proper implementation.

Nicely done! I discovered the same: when calling an external (undefined) function, the relocation should be of PLT type. Pretty sweet that this works out!

windelbouwman commented 4 years ago

Okay, fixed that issue, this is going well! At least, there is some example working, not sure if this is 100% foolproof.

pfalcon commented 4 years ago

Okay, fixed that issue, this is going well! At least, there is some example working, not sure if this is 100% foolproof.

Great, thanks! Seems to work for me too, will play a bit more and prepare a sample to include in upstream.

Closing this otherwise.