llvm / llvm-project

The LLVM Project is a collection of modular and reusable compiler and toolchain technologies.
http://llvm.org

Improve llvm-dwp performance #82861

Open petrhosek opened 8 months ago

petrhosek commented 8 months ago

llvm-dwp memory usage (and probably runtime) isn't great compared to binutils dwp. The problem is that the existing object reading/writing abstractions aren't a great fit.

llvm-dwp uses MCObjectStreamer which buffers the whole file on a per-section basis. When the input files aren't compressed, we should be able to just keep track of which part of the input files we're going to write to the output file and then directly copy from input to output, never buffering.
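To make the "track ranges, then copy directly" idea concrete, here is a minimal sketch using only the C++ standard library. It is not llvm-dwp code and does not use any LLVM API: `CopyRange`, `copyRanges`, and `demoCopy` are hypothetical names, and plain `std::istream`/`std::ostream` stand in for the input and output files. It assumes uncompressed inputs and that the ranges are already listed in output order.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <istream>
#include <ostream>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical record: one byte range of an input that belongs in the output.
struct CopyRange {
  std::istream *Input;  // source stream (stands in for an opened input file)
  std::uint64_t Offset; // where the section contribution starts in the input
  std::uint64_t Size;   // number of bytes to copy
};

// Copies each range to Out in order through one fixed-size scratch buffer,
// so peak memory is bounded by ChunkSize rather than by section size.
// Returns the number of bytes written; stops early if an input is truncated.
std::uint64_t copyRanges(const std::vector<CopyRange> &Ranges,
                         std::ostream &Out, std::size_t ChunkSize = 64 * 1024) {
  assert(ChunkSize > 0 && "chunk size must be non-zero");
  std::vector<char> Scratch(ChunkSize);
  std::uint64_t Written = 0;
  for (const CopyRange &R : Ranges) {
    R.Input->clear(); // reset any EOF state left by an earlier range
    R.Input->seekg(static_cast<std::streamoff>(R.Offset));
    std::uint64_t Remaining = R.Size;
    while (Remaining > 0) {
      const std::size_t N = static_cast<std::size_t>(
          std::min<std::uint64_t>(Remaining, Scratch.size()));
      R.Input->read(Scratch.data(), static_cast<std::streamsize>(N));
      const std::streamsize Got = R.Input->gcount();
      if (Got <= 0)
        return Written; // truncated input: report what was copied so far
      Out.write(Scratch.data(), Got);
      Written += static_cast<std::uint64_t>(Got);
      Remaining -= static_cast<std::uint64_t>(Got);
    }
  }
  return Written;
}

// Usage sketch: in-memory streams stand in for two input files.
std::string demoCopy() {
  std::istringstream A("0123456789"), B("abcdefghij");
  std::ostringstream Out;
  copyRanges({{&A, 2, 3}, {&B, 5, 4}, {&A, 8, 2}}, Out, /*ChunkSize=*/2);
  return Out.str();
}
```

A real implementation would presumably also need to write the object file headers and index sections around these copies, and could use memory-mapped inputs instead of a scratch buffer; the sketch only shows the bounded-memory copy loop.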

MCObjectStreamer was introduced circa 2010, back when LLVM emitted assembly and relied on an external assembler. Its design made it easy to support both assembly output and direct object file emission, but it doesn't work at a lower layer... MCObjectStreamer also carries abstraction overhead for fixups, relaxation, etc., and the design is difficult to change to support streaming output as a secondary goal without hurting the current primary use case (generating a relocatable object file). It'd be great for LLVM to have a lower-level object writing API, usable under MCObjectStreamer and maybe even LLD, ORC JIT, etc.

An alternative would be to use DWARFLinker, which has a whole different set of tradeoffs. It could still stream output, though, since it's designed to work in a single pass.

petrhosek commented 8 months ago

Whichever approach we decide on, it should support compressed input and output, which may affect the design, especially if we decide to use the zlib or zstd streaming APIs.

dwblaikie commented 7 months ago

Thanks for filing this!