ObjectLinkingLayer::findSymbol is O(n)

yuyichao commented 8 years ago


Bugzilla Link	27188
Resolution	FIXED
Resolved on	May 21, 2019 14:30
Version	trunk
OS	Linux
CC	@lhames,@weliveindetail

Extended Description

And therefore emitting n modules to run is O(n^2).

Ref https://github.com/JuliaLang/julia/issues/15619 also https://github.com/JuliaLang/julia/pull/15749#issuecomment-204827156

lhames commented 5 years ago

Hi Stefan,

Yep -- This is fixed as of ORCv2.

Before closing this ticket, one interesting piece of information would be how Orc v2 handles this. With m = #modules and n = #symbols (each), is the asymptotic complexity of symbol lookup still O(m*n)?

No. JITDylibs (new in ORCv2) contain hash maps of symbol names to addresses/compiler-closures. Lookup for a single symbol is now O(d), where d = #JITDylibs.

weliveindetail commented 5 years ago

This affects Orc v1 and it's unlikely to change. It may even be considered "expected behavior" right?

Before closing this ticket, one interesting piece of information would be how Orc v2 handles this. With m = #modules and n = #symbols (each), is the asymptotic complexity of symbol lookup still O(m*n)?

yuyichao commented 8 years ago

The solution is to cache symbol lookups in your JIT class. This can be done by holding a StringTable member in your class and searching this first, before falling back to lookups in the JIT.

Yes, we are doing that to work around this problem. Document the behavior and make it easier to do this correctly would be better. (Especially since AFAICT the tutorial example uses a similar naive approach).

lhames commented 8 years ago

Hi Yichao,

This is expected at the moment, but needs to be better documented, and there is room to improve infrastructure support for fast symbol lookups.

The reason that this is currently O(n) (and that we walk the objects from first added to last, rather than the other way around) is to match the usual lookup order of static linkers. The reason we don't pre-construct the symbol table by taking the first definition of a symbol as it's added is that definitions are finalized lazily the first time their address is requested (via getAddress), and we don't want to pre-build all those finalizers when most will end up being redundant.

The solution is to cache symbol lookups in your JIT class. This can be done by holding a StringTable member in your class and searching this first, before falling back to lookups in the JIT.

We can improve this further by adding a getSymbolTable method to the ObjectLinkingLayer: this would allow the entire symbol table for an object file to be added to the cached table when the module is emitted.

Cheers, Lang.

yuyichao commented 8 years ago

FWIW, it seems that inserting to the beginning instead of the end should be better? Assuming the modules are usually used in a similar order they are emitted?

llvm / llvm-project

ObjectLinkingLayer::findSymbol is O(n) #27562

Extended Description