Open vit9696 opened 4 years ago
PE files with symtab stripped and present
After looking at the code I discovered that the section is not actually anonymous, as /4 is just a way to specify an offset into the string table that follows the symbol table. However, different tools may strip this information upon deployment, which results in an unnamed section and in llvm-readobj being unable to parse such a file, failing with an "Invalid data was encountered while parsing the file" error.
Even if we assume the tools are not working correctly, I believe llvm-readobj should still work fine with such files, and CodeView section still needs to be updated.
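To illustrate the /4 point, here is a minimal Python sketch of how a long section name is resolved. It assumes the layout from the PE/COFF specification: an 8-byte name field that either holds the name directly or holds "/" followed by a decimal offset into the string table, whose first 4 bytes are its own size. The function names are hypothetical, and the "//" base64 form is not handled.

```python
import struct


def resolve_section_name(raw_name: bytes, string_table: bytes) -> str:
    """Resolve the 8-byte Name field of a COFF section header.

    string_table is the whole COFF string table, including its leading
    4-byte little-endian size field, so offsets index directly into it.
    """
    if len(raw_name) != 8:
        raise ValueError("section header name field must be 8 bytes")
    name = raw_name.rstrip(b"\x00")
    if not name.startswith(b"/"):
        # Short name stored inline in the header.
        return name.decode("ascii")
    # Long name: "/<decimal offset>" into the string table.
    offset = int(name[1:].decode("ascii"))
    if offset < 4 or offset >= len(string_table):
        # This is the situation described above: the table was stripped.
        raise ValueError(f"name offset {offset} is outside the string table")
    end = string_table.find(b"\x00", offset)
    if end == -1:
        raise ValueError("unterminated string table entry")
    return string_table[offset:end].decode("ascii")


def make_string_table(*names: str) -> bytes:
    """Build a string table for testing: size field plus NUL-terminated names."""
    body = b"".join(n.encode("ascii") + b"\x00" for n in names)
    return struct.pack("<I", len(body) + 4) + body
```

With an empty string table the lookup raises instead of returning a name, which models the failure seen once the table has been stripped.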
I attached a couple of sample files for reference:
James, I checked GNU objcopy, and it matches llvm-objcopy behaviour.
I guess the same request could be filed for GNU objcopy as well, but we do not have a particular need for it, as we do not try to generate PE files directly with the GNU toolchain.
I'm not sufficiently knowledgeable about COFF to be able to look at this in any detail, but I did have one question about point 1:
What does GNU objcopy do about the section name for .gnu_debuglink?
Some points I noted from adding COFF gnu-debuglink support to LLDB:
--build-id, which is enabled by default in LLD but disabled by default in GNU ld.bfd.
Personally, gnu-debuglink has been working in a satisfactory manner for Krita on Windows (mingw-w64). I don't know the EFI target, but if some external tools are mangling the files in a way which makes the .gnu_debuglink section unreadable, can you instead use the build ID to look up the matching debug file?
Extended Description
After stripping DWARF debug information one can use --add-gnu-debuglink to link the resulting file with the original file containing the debug information:
$ cp file.dll file.debug
$ llvm-objcopy --strip-unneeded file.dll
$ llvm-objcopy --add-gnu-debuglink=$(pwd)/file.debug file.dll
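For reference, the data that ends up in the debuglink section can be sketched in Python. This assumes the payload format described in the GDB documentation on separate debug files: the debug file name, NUL-terminated and padded to a 4-byte boundary, followed by a CRC32 of the debug file contents. The function name is hypothetical, little-endian output is assumed (as for PE targets), and the use of zlib.crc32 as the matching checksum is an assumption of this sketch.

```python
import struct
import zlib


def build_debuglink_payload(debug_file_name: str, debug_file_data: bytes) -> bytes:
    """Build the contents of a .gnu_debuglink-style section (sketch)."""
    # File name with a terminating NUL byte.
    name = debug_file_name.encode("utf-8") + b"\x00"
    # Zero padding so that the checksum starts on a 4-byte boundary.
    padding = b"\x00" * (-len(name) % 4)
    # CRC32 over the full contents of the separate debug file.
    crc = zlib.crc32(debug_file_data) & 0xFFFFFFFF
    return name + padding + struct.pack("<I", crc)
```

A debugger that finds this section looks for a file with the stored name and accepts it only if the checksum matches.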
When working with PE/COFF files this functionality is unfortunately very limited: https://github.com/llvm/llvm-project/blob/f69eba07726a9fe084812aa224309d62c4bdd2e4/llvm/tools/llvm-objcopy/COFF/COFFObjcopy.cpp#L84-L90
.gnu_debuglink is 14 bytes long, while PE/COFF has 8 bytes maximum for the section name. However, even if we cannot use this name, any other unique value will work just fine (e.g. .dbglink or .debug).
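The length constraint is easy to check. The following Python sketch uses a hypothetical helper that stores a name inline in the 8-byte field and rejects anything longer, which is the case that would otherwise require a string table entry:

```python
COFF_SHORT_NAME_MAX = 8  # size of the Name field in a COFF section header


def encode_short_section_name(name: str) -> bytes:
    """Encode a section name inline, NUL-padded to 8 bytes (sketch)."""
    raw = name.encode("ascii")
    if len(raw) > COFF_SHORT_NAME_MAX:
        raise ValueError(
            f"{name!r} is {len(raw)} bytes; it needs a string table entry"
        )
    return raw.ljust(COFF_SHORT_NAME_MAX, b"\x00")
```

Both .dbglink (8 bytes) and .debug (6 bytes) fit inline, while .gnu_debuglink (14 bytes) does not.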
For DWARF builds, llvm-objcopy does not add a CodeView entry, but one is likely desired, as several tools (e.g. for UEFI firmware debugging) rely on at least some kind of CodeView entry being present.
In MinGW mode LLD already generates a dummy PDB 7.0 entry: https://github.com/llvm/llvm-project/blob/8620bb9534342176ac739e2a587e4cecf437310c/lld/COFF/Writer.cpp#L1823-L1831
For non-LLVM projects, such as the EDK II GenFw utility used for building UEFI firmware PE files from ELFs, it is common to add a PDB 2.0 (NB10) entry: https://github.com/tianocore/edk2/blob/b219e2c/MdePkg/Include/IndustryStandard/PeImage.h#L614
Perhaps this can be adopted in llvm-objcopy as well.
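As a rough illustration of the two CodeView record kinds mentioned above, here is a Python sketch that packs an NB10 (PDB 2.0) and an RSDS (PDB 7.0) record. The field layout follows the commonly documented structures (NB10: signature, offset, timestamp, age, path; RSDS: signature, GUID, age, path) and should be treated as an assumption rather than what LLD or GenFw emit byte for byte. The function names and default values are hypothetical.

```python
import struct
import uuid


def build_nb10_record(pdb_path: str, timestamp: int = 0, age: int = 0) -> bytes:
    """Pack a PDB 2.0 style CodeView record (assumed layout)."""
    # Signature "NB10", then offset (0), timestamp and age as 32-bit LE values.
    header = struct.pack("<4sIII", b"NB10", 0, timestamp, age)
    # NUL-terminated path of the debug file follows the fixed header.
    return header + pdb_path.encode("utf-8") + b"\x00"


def build_rsds_record(pdb_path: str, guid: uuid.UUID, age: int = 1) -> bytes:
    """Pack a PDB 7.0 style CodeView record (assumed layout)."""
    # Signature "RSDS", a 16-byte GUID in Windows byte order, then a 32-bit age.
    header = b"RSDS" + guid.bytes_le + struct.pack("<I", age)
    return header + pdb_path.encode("utf-8") + b"\x00"
```

Either record would be referenced from a debug directory entry of type CodeView; writing that directory entry is outside the scope of this sketch.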
I believe GNU objcopy for ELF also strips the path, but for convenience we could add an option to keep it, e.g. --add-gnu-debuglink=/path/to/filename,/path/to/embedded/filename.