siboehm / lleaves

Compiler for LightGBM gradient-boosted trees, based on LLVM. Speeds up prediction by ≥10x.
https://lleaves.readthedocs.io/en/latest/
MIT License

How can we reduce the size of the compiled file? #30

Closed fuyw closed 1 year ago

fuyw commented 1 year ago

Hello Simon,

I tried to compile a LightGBM model using both LLeaves and TreeLite.

I found that the .so file compiled by LLeaves is ~80% larger than the file compiled by TreeLite.

I want to ask whether it is possible to reduce the size of the .so file compiled by LLeaves?

siboehm commented 1 year ago

Before I think about ways to make the binary smaller, may I ask why you want a smaller .so to begin with? This is not something I optimized for at all.

fuyw commented 1 year ago

Sure Simon, I totally understand that this may be a feature that we usually don't care about.

In my application, I generate the models on one machine and need to upload them to a production machine. Sometimes the network is bad, so I would appreciate a smaller model file.

I am just writing to ask whether there are any directions I might try to reduce the size of the compiled model.

siboehm commented 1 year ago

Ok, that makes sense. In general you should expect that optimizing for code size will slow down your predictions. Here's one thing you could try: lleaves constructs an LLVM PassManager here, which runs the optimization passes:
https://github.com/siboehm/lleaves/blob/359f1dc9e51171ce8495ba5a8eb74a524913bc66/lleaves/compiler/tree_compiler.py#L31-L33
You can add an extra line to set the size_level; the docs are here:
https://llvmlite.readthedocs.io/en/stable/user-guide/binding/optimization-passes.html#llvmlite.binding.PassManagerBuilder.size_level

That should bias the optimization passes towards generating smaller code, though I don't know how big the effect will be. You can test it, and if it works well I'll think of a way to add a flag for size optimization to lleaves.
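
As a rough sketch (not the verbatim lleaves source; variable names and the chosen level are illustrative), the patched setup could look like this:

```python
import llvmlite.binding as llvm

# Illustrative sketch: build the PassManagerBuilder the way
# tree_compiler.py does, but also set size_level so the optimization
# passes favor smaller code over raw prediction speed.
llvm.initialize()  # lleaves performs the LLVM initialization elsewhere

pmb = llvm.create_pass_manager_builder()
pmb.opt_level = 3
pmb.size_level = 2  # 0 = optimize for speed, 1/2 = increasingly favor size
pm = llvm.ModulePassManager()
pmb.populate(pm)
# pm.run(module_ref) then applies the size-biased passes to the module
```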

siboehm commented 1 year ago

I assume you're already exploiting the usual techniques for speeding up the upload, like storing the model as a compressed binary and decompressing it before loading it into memory.
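
For illustration, a minimal round-trip with Python's standard library (file names are placeholders, and this is independent of lleaves itself):

```python
import gzip
import shutil

# Placeholder file names: compress the compiled model before uploading ...
with open("model.so", "rb") as src, gzip.open("model.so.gz", "wb") as dst:
    shutil.copyfileobj(src, dst)

# ... and decompress it on the production machine before loading the model.
with gzip.open("model.so.gz", "rb") as src, open("model.so", "wb") as dst:
    shutil.copyfileobj(src, dst)
```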

fuyw commented 1 year ago

Thank you so much.

siboehm commented 1 year ago

How did you end up resolving your problem? Did any of my proposals work, or did you switch to different software?


fuyw commented 1 year ago

I found another way to reduce the size of the compiled file. The file produced by lleaves is an object file of type REL (relocatable file).

I then used clang++ -shared to convert it into a dynamic shared object, which reduced the size by 40-50%.
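
For reference, a rough sketch of that workflow (paths are placeholders; the compile(cache=...) call follows the documented lleaves API, and the clang++ step is the conversion described above):

```python
import subprocess

import lleaves

# Placeholder paths: compile the LightGBM model with lleaves and cache
# the resulting relocatable object file on disk.
llgbm = lleaves.Model(model_file="LightGBM_model.txt")
llgbm.compile(cache="model.o")

# Re-link the relocatable object into a dynamic shared object; in the
# case reported above this shrank the file by 40-50%.
subprocess.run(["clang++", "-shared", "model.o", "-o", "model.so"], check=True)
```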