siboehm / lleaves

Compiler for LightGBM gradient-boosted trees, based on LLVM. Speeds up prediction by ≥10x.
https://lleaves.readthedocs.io/en/latest/
MIT License

Add support to pass model definition as a str #66

Closed mark-thm closed 6 months ago

mark-thm commented 6 months ago

Presently, lleaves only supports passing model definitions as files, but in our use-case we store the model definitions in S3, which requires us to copy the definition to the local machine before loading and compiling. Rather than require this additional copy, this PR updates the code to support either passing a model_file, or passing the string content of that file directly as model_str.

If there's concern about the size of the string in memory, or there's a desire to accept broader stream formats, I'd be happy to update the PR to take a StringIO at the top level instead of a str.

siboehm commented 6 months ago

Thx Mark! I've now gotten this request for string inputs from multiple people, so maybe it's time to add it. Thanks for implementing it. Previously I've always told people to just use temp files in a context manager. Would that solution work for you?

I'm not too worried about mem demand: the compiler itself already uses enormous amounts of mem, and model.txt files are a few MB at most.

mark-thm commented 6 months ago

I'm currently just writing to a temp file in a context manager -- if you'd like to keep the interface simple I can stick with that. Figured this was a small-ish change so I'd throw it up but no big deal either way.

siboehm commented 6 months ago

Oh ok, if it's not a dealbreaker then I'd rather keep the interface small and add a note to the docs instead. Just keeps the codebase smaller and makes it easier for me :)
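
For reference, the temp-file workaround discussed in this thread could look roughly like the sketch below. The helper name `model_str_as_file` is hypothetical and not part of lleaves. It only assumes that the model definition is already in memory as a `str` (for example after fetching it from S3) and that lleaves is given a file path through `model_file`, as described above.

```python
import os
import tempfile
from contextlib import contextmanager


@contextmanager
def model_str_as_file(model_str: str):
    """Hypothetical helper: expose an in-memory model definition as a temp file path."""
    # delete=False lets another reader reopen the file by name on every platform;
    # cleanup is handled explicitly in the finally block.
    tmp = tempfile.NamedTemporaryFile(
        mode="w", suffix=".txt", delete=False, encoding="utf-8"
    )
    try:
        tmp.write(model_str)
        tmp.close()  # flush to disk before the path is handed out
        yield tmp.name
    finally:
        tmp.close()  # harmless if the file is already closed
        os.unlink(tmp.name)


# Usage sketch (requires lleaves to be installed):
#
#   import lleaves
#   with model_str_as_file(model_str) as path:
#       model = lleaves.Model(model_file=path)
#       model.compile()
#   # the temp file is gone here; the compiled model stays in memory
```

Keeping `compile()` inside the `with` block is the conservative choice, since it guarantees the file still exists whenever the library reads it.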