apache / arrow

Apache Arrow is a multi-language toolbox for accelerated data interchange and in-memory processing
https://arrow.apache.org/
Apache License 2.0

[C++][Gandiva] Constructing LLVM module with only necessary functions for better performance #40024

Open niyue opened 7 months ago

niyue commented 7 months ago

Description

This enhancement request proposes speeding up the construction of the LLVM module by examining which functions the Gandiva expressions actually use and skipping the work that is not needed for them.

When constructing an LLVM module for the given expressions, Gandiva performs the following tasks:
1) Instantiate a new Engine, which internally constructs a new LLVM module.
2) Add many C functions, and their pointers, that may be called by the expression to the LLVM module.

Some of the operations in the above process are not trivial, and they make it slower than it needs to be:
1) For each C function added to the LLVM module, the function's pointer is ultimately added and defined in the LLVM module's JITDylib via jit_dylib.define(llvm::orc::absoluteSymbols({{mangle(name), symbol}})). This is not a cheap operation, and each LLVM module adds many C functions (143 such usages so far in the codebase), which makes constructing the LLVM module slow when the cache is not hit.
2) Loading LLVM bitcode calls llvm::Linker::linkModules to copy the bitcode's module into the Engine's LLVM module, and this is an expensive operation.

Proposal

To speed up the above process, the key observations are:
1) Typically, besides the internally used C functions, only a very small number of C functions are used in most expressions, so we don't have to map all 143 functions every time (it is very rare for users to come up with expressions that call 100+ functions at the same time).
2) Typically, besides the internally used IR functions, only a very small number of IR functions are used in most expressions, so we could avoid loading the LLVM bitcode and linking it into the LLVM module if those functions are not used at all (for example, when all the functions used in the expressions are C functions).
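The second observation can be illustrated with a small check. This is only a sketch under assumptions, not Gandiva's actual code: `NeedsBitcodeLinking` and both sets are hypothetical names, and the check covers only the non-internal IR functions (the internally used ones would always have to be available).

```cpp
#include <string>
#include <unordered_set>

// Hypothetical helper: returns true only when at least one function used by
// the expressions is implemented as a (non-internal) IR function, i.e. when
// the corresponding bitcode actually has to be loaded and linked.
bool NeedsBitcodeLinking(const std::unordered_set<std::string>& used_functions,
                         const std::unordered_set<std::string>& ir_functions) {
  for (const auto& name : used_functions) {
    if (ir_functions.count(name) > 0) {
      return true;
    }
  }
  // Every used function is a C function, so the expensive link can be skipped.
  return false;
}
```

If this returns false, the call to llvm::Linker::linkModules for that bitcode could be skipped for the module being built.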

The proposal to improve this part is:
1) Parse the expressions and keep track of the functions used in them.
2) When adding/mapping a C function, add it unconditionally if it is an internally used function; otherwise, check the set of used functions obtained above to see whether it really needs to be defined in the LLVM module.
3) Split LLVM bitcode into two parts:

This kind of processing will avoid the expensive operations mentioned above, hence achieving better performance in some cases.
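Steps 1 and 2 of the proposal could look roughly like the sketch below. It is self-contained and does not use Gandiva or LLVM types: `ExprNode`, `CFunction`, `CollectUsedFunctions`, and `SelectFunctionsToMap` are all hypothetical names introduced for illustration, and the real expression tree and function registry will differ.

```cpp
#include <memory>
#include <string>
#include <unordered_set>
#include <vector>

// Hypothetical expression node: a function call with child expressions.
// A leaf (field or literal) is modeled with an empty function name.
struct ExprNode {
  std::string function_name;
  std::vector<std::shared_ptr<ExprNode>> children;
};

// Step 1: walk the expression tree and record every function it calls.
void CollectUsedFunctions(const ExprNode& node,
                          std::unordered_set<std::string>* used) {
  if (!node.function_name.empty()) {
    used->insert(node.function_name);
  }
  for (const auto& child : node.children) {
    CollectUsedFunctions(*child, used);
  }
}

// Hypothetical description of one C function that could be mapped.
struct CFunction {
  std::string name;
  bool internal;  // assumed flag: needed regardless of the expression
};

// Step 2: keep internal functions unconditionally, and keep the others only
// if the expressions actually use them. Only the returned functions would
// then be defined in the JITDylib.
std::vector<std::string> SelectFunctionsToMap(
    const std::vector<CFunction>& all_functions,
    const std::unordered_set<std::string>& used) {
  std::vector<std::string> selected;
  for (const auto& fn : all_functions) {
    if (fn.internal || used.count(fn.name) > 0) {
      selected.push_back(fn.name);
    }
  }
  return selected;
}
```

With this shape, the number of jit_dylib.define calls scales with the functions an expression uses plus the internal ones, rather than with every C function in the codebase.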

Component(s)

C++ - Gandiva

kou commented 7 months ago

We register functions lazy, right? It makes sense.

niyue commented 7 months ago

We register functions lazy, right?

Not exactly. The idea is to skip defining functions that are not used in the expressions at all.