llvm / llvm-project


wasm32: Slowdown with large amount of relocations (?) #37998

Closed alexcrichton closed 6 years ago

alexcrichton commented 6 years ago
Bugzilla Link: 38650
Resolution: FIXED
Resolved on: Aug 22, 2018 10:52
Version: unspecified
OS: All
CC: @rui314, @sbc100

Extended Description

When working with some wasm tools earlier today I realized that for one of our compilations in rust-lang/rust it takes LLD around 4 seconds to finish. I was quite surprised at this runtime because typically our linker invocations are quite speedy (under 1s most of the time, even non-incremental).

A gist [1] contains a tarball with a reproduction script and all the necessary files, and there's also a direct link [2] to the tarball itself.

I do realize that this is quite a large test case: there's ~100MB of input files generating ~20MB of output. Still, I suspect there may be some low-hanging fruit that would improve link time here!

sbc100 commented 6 years ago

Should be fixed in rL340428

sbc100 commented 6 years ago

This is a known issue. There was an old patch out to remove the N*M algorithm that is currently in there: https://reviews.llvm.org/D42176

I was planning on making a fix to the format to avoid this issue but I haven't gotten around to it yet. I'll see if I can land that patch in the interim.
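For readers unfamiliar with the pattern, here is a minimal sketch of what an N*M relocation copy can look like, next to one possible way to avoid it. It is not taken from D42176 or rL340428. `Reloc`, `Chunk`, `relocsForChunkLinear` and `relocsForChunkSorted` are hypothetical names, not LLD's real types or functions, and the second function assumes the section's relocations are already sorted by offset.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical stand-ins for illustration; these are not LLD's real types.
struct Reloc {
  uint32_t Offset; // byte offset within the enclosing section
};

struct Chunk {
  uint32_t Start; // first byte of the chunk within the section
  uint32_t Size;  // chunk length in bytes
};

// N*M shape: each of the N chunks scans all M relocations of the section.
std::vector<Reloc> relocsForChunkLinear(const Chunk &C,
                                        const std::vector<Reloc> &SectionRelocs) {
  assert(C.Start + C.Size >= C.Start && "chunk range must not wrap");
  std::vector<Reloc> Out;
  for (const Reloc &R : SectionRelocs)
    if (R.Offset >= C.Start && R.Offset < C.Start + C.Size)
      Out.push_back(R);
  return Out;
}

// Assumption: SectionRelocs is sorted by Offset. Two binary searches locate
// the chunk's range, so a chunk costs O(log M) plus the relocations it owns.
std::vector<Reloc> relocsForChunkSorted(const Chunk &C,
                                        const std::vector<Reloc> &SectionRelocs) {
  assert(C.Start + C.Size >= C.Start && "chunk range must not wrap");
  auto ByOffset = [](const Reloc &R, uint32_t Off) { return R.Offset < Off; };
  auto First = std::lower_bound(SectionRelocs.begin(), SectionRelocs.end(),
                                C.Start, ByOffset);
  auto Last = std::lower_bound(First, SectionRelocs.end(),
                               C.Start + C.Size, ByOffset);
  return std::vector<Reloc>(First, Last);
}
```

Both functions return the same relocations for sorted input. With tens of thousands of chunks, as in the test case above, only the first one revisits the whole relocation list for every chunk.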

rui314 commented 6 years ago

Here is a gprof result. It shows that copyRelocations is extremely slow.

Each sample counts as 0.01 seconds.
  %   cumulative   self              self     total
 time   seconds   seconds    calls  s/call   s/call  name
97.79      4.87     4.87    53361    0.00     0.00  lld::wasm::InputChunk::copyRelocations(llvm::object::WasmSection const&)
 0.60      4.90     0.03   201290    0.00     0.00  bool llvm::DenseMapBase<llvm::DenseMap<llvm::CachedHashStringRef, lld::wasm::Symbol, llvm::DenseMapInfo, llvm::detail::DenseMapPair<llvm::CachedHashStringRef, lld::wasm::Symbol> >, llvm::CachedHashStringRef, lld::wasm::Symbol, llvm::DenseMapInfo, llvm::detail::DenseMapPair<llvm::CachedHashStringRef, lld::wasm::Symbol> >::LookupBucketFor(llvm::CachedHashStringRef const&, llvm::detail::DenseMapPair<llvm::CachedHashStringRef, lld::wasm::Symbol> const&) const
 0.40      4.92     0.02     1604    0.00     0.00  llvm::object::WasmObjectFile::parseRelocSection(llvm::StringRef, llvm::object::WasmObjectFile::ReadContext&)
 0.40      4.94     0.02      254    0.00     0.00  llvm::object::WasmObjectFile::parseCodeSection(llvm::object::WasmObjectFile::ReadContext&)
 0.20      4.95     0.01   103851    0.00     0.00  _ZN4llvm7hashing6detail23hash_combine_range_implIKcEENSt9enable_ifIXsr16is_hashable_dataIT_EE5valueENS_9hash_codeEE4typeEPS5S9
 0.20      4.96     0.01    56429    0.00     0.00  lld::wasm::InputFunction::getInputSectionOffset() const
 0.20      4.97     0.01      257    0.00     0.00  llvm::object::WasmObjectFile::parseLinkingSectionSymtab(llvm::object::WasmObjectFile::ReadContext&)
 0.20      4.98     0.01       12    0.00     0.00  llvm::DenseMap<llvm::CachedHashStringRef, lld::wasm::Symbol, llvm::DenseMapInfo, llvm::detail::DenseMapPair<llvm::CachedHashStringRef, lld::wasm::Symbol> >::grow(unsigned int)
 0.00      4.98     0.00   158111    0.00     0.00  lld::wasm::ObjFile::calcNewValue(llvm::wasm::WasmRelocation const&) const

Essentially, when you are handling relocations in lld, you should be very careful not to waste time on each relocation, because there may be tens of millions of relocations, and literally every microsecond counts. Sam, do you have time to look at this?