Open iddev5 opened 3 years ago
Pretty much any change beyond this point would require some help from Jakub. I will ask him soon. Anyway, valid symbol tables are generated for simple Linux (ELF) and macOS (Mach-O) archives. This needs more testing on somewhat larger projects.
FreeBSD seems to follow the GNU format completely; tested with FreeBSD 12 on QEMU.
This PR is practically complete. There are of course places for optimization, improvements and edge cases, but they can be addressed later as they are found or needed.
The reading and writing parts also handle symbols differently. In reading, symbols live at the archive level; in writing, symbols are added to the file itself. It's debatable which is better, and this needs testing.
Can you please make a PR? Happy to help with resolving differences in how symbols are handled.
COFF symbols are being worked on in the coff-sym branch.
Also submitted the COFF parser as a PR to zld.
Just realised that this is missing bitcode files (the primary use case for llvm-ar). Is this something that you would like me to look at?
Bitcode files?
Anyway, feel free to do that. I'm a bit busy these days and can't say when I can contribute back at full capacity.
OK, I might be a tad more bullish about touching more of the code base to get features completed until then, if I have the time this coming week. Hope all is going well with you!
Just updated the coff-sym branch and rebased. Should be okay to merge now.
Anyway, I am not sure what you mean by bitcode files?
Oh sorry, LLVM IR basically (an encoding of it): https://llvm.org/docs/BitCodeFormat.html
It’s an important use case so we will need to support it.
Right, that makes sense. To my knowledge, LLVM IR was stored in native object formats, and even though I haven't tested it, I would expect those to have a symbol table, in which case the current implementations would work. But since you mentioned this link, and after briefly checking it, there seems to be another format. I need to do more research on how it works and then make an object parser for it.
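For reference, a minimal sketch of how an archiver could tell a bitcode member apart from a native object before choosing a parser. The magic values are the ones given in the linked bitcode format document (raw bitcode starts with 'BC' 0xC0DE, and the optional wrapper header starts with the little-endian 32-bit value 0x0B17C0DE). The function name `is_llvm_bitcode` is made up for illustration and is not part of this project or zld.

```python
import struct

# Raw bitcode files start with the bytes 'B', 'C', 0xC0, 0xDE.
RAW_BITCODE_MAGIC = b"BC\xc0\xde"
# Wrapped bitcode starts with this 32-bit little-endian magic instead.
WRAPPER_MAGIC = 0x0B17C0DE


def is_llvm_bitcode(data: bytes) -> bool:
    """Return True if `data` looks like raw or wrapped LLVM bitcode.

    Hypothetical helper: it only inspects the first four bytes and does
    not validate anything past the magic number.
    """
    if len(data) < 4:
        return False
    if data[:4] == RAW_BITCODE_MAGIC:
        return True
    (magic,) = struct.unpack_from("<I", data, 0)
    return magic == WRAPPER_MAGIC
```

This only answers "is it bitcode?"; extracting the symbols still requires an actual bitcode reader, which is the part that would need the research mentioned above.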
Cheers, let me know if you do get onto this, because otherwise I might do it. It also might be worth liaising with Jakub if you do, because the linker needs bitcode parsing as well and it hasn't been implemented there yet either, so the work could be transferable.
Yeah, that's a great idea, though I am unsure where to integrate it (of course Jakub would know).
On the other hand, I just now started some work on a sorted symbol table: writing/sorted-table
@moosichu I am a bit lost right now. It seems that native Mach-O archives (the ones in test/data/test5) are generated quite a bit differently, in that they are in non-deterministic mode by default. Did you use any additional flags, or is my assumption correct?
If so, I wonder how to deal with it. Maybe create a new ArchiveType darwin, which is basically bsd + U flag + sorted table and other minor differences, and then have it as the default type on macOS?
Also, it seems that the sorted table implementation works fine. I have enabled it by default on all types and OSes, as it is harmless.
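To make "sorted table" concrete, here is a rough sketch of the idea only, not the code on the writing/sorted-table branch: the symbol entries are ordered by symbol name before the table is written. The `(name, member)` tuple layout and the function name are assumptions made for this example.

```python
from typing import Iterable, List, Tuple

# Hypothetical entry layout: (symbol name, name of the member defining it).
Symbol = Tuple[bytes, str]


def sorted_symbol_table(symbols: Iterable[Symbol]) -> List[Symbol]:
    """Order symbol entries bytewise by symbol name.

    sorted() is stable, so entries sharing a name keep the order in
    which their members were added to the archive.
    """
    return sorted(symbols, key=lambda entry: entry[0])
```

For example, `sorted_symbol_table([(b"_main", "a.o"), (b"_add", "b.o")])` puts the `_add` entry first.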
It should only be considered as a test case for parsing. It's a native archive, i.e. one generated by the native OS archiver on Darwin, not LLVM ar.
Oh well that's great then. I will wrap up everything soon then.
There is a POC on the ranlib branch for ELF files (GNU format only) right now. It uses the object parsing code from zld directly, with minimal changes so that it compiles without unneeded files, while maintaining compatibility with upstream zld.
[x] ELF symbols
[x] MachO symbols
[x] PE/COFF symbols (currently blocked by missing upstream support from zig/zld)
[x] GNU & Thin format
[X] GNU64 format
[x] BSD format
[x] Darwin (BSD64) format (llvm doesn't seem to support this format)
[ ] Microsoft ECOFF extended sorted symbol table (DELAYED)
[x] FreeBSD style archives with GNU symbol table but BSD string format (but does this even exist?) (not needed, see below)
[ ] LLVM bitcode object file symbols
[x] Support sorted symbol table: BSD and Darwin formats (available for all)
[ ] Investigate consequences of not having a symbol table in darwin (32 and 64)
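As a rough illustration of how the formats in the checklist differ on disk, the sketch below guesses the symbol table flavour from the name of the first archive member. It assumes the usual conventions: an 8-byte global magic, 60-byte member headers with a 16-byte name field, `/` for the GNU table, `/SYM64/` for GNU64, `__.SYMDEF` (optionally ` SORTED`) for BSD, `__.SYMDEF_64` for Darwin 64-bit, and `#1/<len>` for BSD-style long names stored after the header. The function names are hypothetical and this is not the parser used in this PR.

```python
from typing import Optional

AR_MAGIC = b"!<arch>\n"
THIN_MAGIC = b"!<thin>\n"
HEADER_SIZE = 60  # fixed size of every ar member header
NAME_SIZE = 16    # the name field is the first 16 bytes of the header


def first_member_name(data: bytes) -> Optional[bytes]:
    """Return the first member's name, or None if the archive has no members."""
    if data[:8] not in (AR_MAGIC, THIN_MAGIC):
        raise ValueError("not an ar archive")
    header = data[8:8 + HEADER_SIZE]
    if len(header) < HEADER_SIZE:
        return None
    name = header[:NAME_SIZE].rstrip(b" ")
    if name.startswith(b"#1/"):
        # BSD long name: the real name follows the header, NUL padded.
        length = int(name[3:].decode("ascii"))
        start = 8 + HEADER_SIZE
        name = data[start:start + length].rstrip(b"\x00")
    return name


def guess_symbol_table_kind(data: bytes) -> str:
    """Classify the archive by the name of its first member."""
    name = first_member_name(data)
    if name is None:
        return "empty"
    if name == b"/":
        return "gnu"
    if name == b"/SYM64/":
        return "gnu64"
    if name.startswith(b"__.SYMDEF_64"):
        return "darwin64"
    if name.startswith(b"__.SYMDEF"):
        return "bsd"
    return "none"  # first member is a regular file: no symbol table
```

A real reader would go on to parse the table contents; this only covers the dispatch step, which is where the format list above matters most.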