moosichu / zar

An attempt to write an archiver using zig
MIT License

Object parsing and symbol table writing #22

Open iddev5 opened 3 years ago

iddev5 commented 3 years ago

There is a POC on the ranlib branch, for ELF files (GNU format only) right now. It uses the object parsing code from zld directly, with minimal changes so that it compiles without unneeded files, but it maintains compatibility with upstream zld.
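For readers unfamiliar with the GNU format mentioned above, here is a minimal sketch of the symbol table payload it uses: a big-endian 32-bit symbol count, one big-endian 32-bit member-header offset per symbol, then the NUL-terminated symbol names. The function name `build_gnu_symbol_table` and the input shape are hypothetical and are not taken from zar or zld; the sketch also leaves out the archive member header, alignment padding, and the adjustment of offsets for the size of the table itself.

```python
import struct


def build_gnu_symbol_table(symbols):
    """Build the payload of a GNU-style archive symbol table.

    `symbols` is a list of (name, member_header_offset) pairs; this input
    shape is an assumption made for illustration only.
    """
    # 4-byte big-endian count of symbols.
    count = struct.pack(">I", len(symbols))
    # One 4-byte big-endian offset per symbol, pointing at the header of
    # the archive member that defines it.
    offsets = b"".join(struct.pack(">I", off) for _, off in symbols)
    # Symbol names, each terminated by a NUL byte, in the same order.
    names = b"".join(name.encode("ascii") + b"\x00" for name, _ in symbols)
    return count + offsets + names


table = build_gnu_symbol_table([("foo", 100), ("bar", 200)])
```

In a real archive the offsets can only be finalised once the size of the symbol table member is known, since that member comes first in the file.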

iddev5 commented 3 years ago

Pretty much any change beyond this point would require some help from Jakub. I will ask him soon. Anyways, valid symbol tables are generated for simple Linux (ELF) and Mac (Mach-O) archives. More testing is needed on somewhat larger projects.

iddev5 commented 3 years ago

FreeBSD seems to follow the GNU format completely; tested with FreeBSD 12 on QEMU.

iddev5 commented 3 years ago

This PR is practically complete. There are of course places for optimization, improvements and edge cases, but they can be handled later as they are found or needed.

The reading and writing parts also handle symbols in different ways. In reading, symbols are stored at the archive level; in writing, symbols are added to the file itself. It's debatable which is better, and it needs testing.

moosichu commented 3 years ago

Can you please make a PR? Happy to help with resolving differences in how symbols are handled.

iddev5 commented 3 years ago

COFF symbols are being worked on in the coff-sym branch.

Also submitted the COFF parser as a PR to zld.

moosichu commented 3 years ago

Just realised that this is missing bitcode files (the primary use case for llvm-ar), is this something that you would like me to look at?

iddev5 commented 3 years ago

Bitcode files?

Anyways, feel free to do that. I'm a bit busy these days, so I can't say when I can contribute back at full capacity.

moosichu commented 3 years ago

OK, I might be a tad more bullish about touching more of the code base to get features completed until then, if I have the time this coming week. Hope all is going well with you!

iddev5 commented 3 years ago

Just updated the coff-sym branch and rebased. Should be okay to merge now.

Anyways, I am not sure what you mean by bitcode files?

moosichu commented 3 years ago

Oh sorry, LLVM IR basically (an encoding of it): https://llvm.org/docs/BitCodeFormat.html

It’s an important use case so we will need to support it.
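As a rough illustration of what telling bitcode apart from a native object involves, the sketch below checks the leading magic bytes. The raw bitcode magic (`BC` followed by `0xC0 0xDE`) and the wrapper magic (`0x0B17C0DE`, stored little-endian) come from the LLVM bitcode format document linked above; the function name `classify_member` and its return labels are hypothetical and are not part of zar or zld.

```python
RAW_BITCODE_MAGIC = b"BC\xc0\xde"         # 'B' 'C' 0xC0 0xDE
BITCODE_WRAPPER_MAGIC = b"\xde\xc0\x17\x0b"  # 0x0B17C0DE, little-endian
ELF_MAGIC = b"\x7fELF"


def classify_member(data: bytes) -> str:
    """Guess the kind of an archive member from its first four bytes.

    Returns "bitcode", "elf", or "unknown". Only a small subset of
    formats is covered; this is a sketch, not a complete detector.
    """
    head = data[:4]
    if head in (RAW_BITCODE_MAGIC, BITCODE_WRAPPER_MAGIC):
        return "bitcode"
    if head == ELF_MAGIC:
        return "elf"
    return "unknown"


kind = classify_member(b"BC\xc0\xde" + b"\x00" * 16)
```

Detecting the file kind is only the first step; extracting symbol names from bitcode still requires parsing the bitstream itself.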

iddev5 commented 3 years ago

Right, that makes sense. To my knowledge, LLVM IR was stored in native object formats, and although I haven't tested it, I expect those files to have a symbol table, in which case the current implementation would work. But since you mentioned this link, and after briefly checking it, there seems to be a separate format. I need to do more research on how it works and then make an object parser for it.

moosichu commented 3 years ago

Cheers, let me know if you do get onto this, because otherwise I might do it. It might also be worth liaising with Jakub if you do, because the linker needs bitcode parsing as well and it hasn't been implemented there yet either, so the work could be transferable.

iddev5 commented 3 years ago

Yeah, that's a great idea, though I am unsure where to integrate it (of course Jakub would know).

On the other hand, I have just started some work on a sorted symbol table: writing/sorted-table

iddev5 commented 3 years ago

@moosichu I am a bit lost right now. It seems that native Mach-O archives (the ones in test/data/test5) are generated quite differently, in that they are in non-deterministic mode by default. Did you use any additional flags, or is my assumption correct?

If so, I wonder how to deal with it. Maybe create a new ArchiveType darwin, which is basically bsd + U flag + sorted table and other minor differences, and then make it the default type on macOS?

Also, it seems that the sorted table implementation works fine. I have enabled it by default for all types and OSes, as it is harmless.
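To make the idea of a sorted table concrete, here is a minimal sketch that orders symbol entries by name before they are written out. The thread does not state the exact ordering rule used on the writing/sorted-table branch, so the bytewise comparison of names below is an assumption, and `sort_symbol_entries` is a hypothetical helper rather than zar's actual code.

```python
def sort_symbol_entries(entries):
    """Return (name, member_offset) pairs ordered by symbol name.

    Assumption: names are compared bytewise. Python's sort is stable, so
    entries that share a name keep their original relative order.
    """
    return sorted(entries, key=lambda entry: entry[0].encode("utf-8"))


ordered = sort_symbol_entries([("_main", 300), ("_bar", 120), ("_foo", 60)])
```

A sorted table lets a consumer binary-search for a symbol, while a consumer that scans linearly still finds every entry, which fits the observation that enabling it everywhere is harmless.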

moosichu commented 3 years ago

It should only be considered as a test case for parsing. It's a native archive, i.e. one generated by the native OS archiver on Darwin, not LLVM ar.

iddev5 commented 3 years ago

Oh well, that's great then. I will wrap everything up soon.