siboehm / lleaves

Compiler for LightGBM gradient-boosted trees, based on LLVM. Speeds up prediction by ≥10x.
https://lleaves.readthedocs.io/en/latest/
MIT License
343 stars, 29 forks

Accept boosters as model inputs? #8

Closed by lbittarello 2 years ago

lbittarello commented 2 years ago

lleaves.Model currently requires the path to a model file. I was wondering if it'd make sense to also accept a Booster. We could call model_to_string and save the result to a temporary file, or just work with the string representation directly. It'd make users' lives (a little) easier.

siboehm commented 2 years ago

I can see how this would make using lleaves slightly easier, but it's a single line for the user (saving the output of Booster.model_to_string() to a tmpfile and passing that to lleaves) and it just complicates the API. We could write a code example of how to save to a tmpfile, but I don't see a big reason to put this functionality into lleaves.

lorey commented 2 years ago

Also, there's io.StringIO('foo'), which gives you a file handle from a string and so avoids the tmp file: https://docs.python.org/3/library/io.html#text-i-o

lbittarello commented 2 years ago

io.StringIO('foo') won't work: TypeError: stat: path should be string, bytes, os.PathLike or integer, not StringIO.
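The error message suggests that the argument ends up in a path-based call such as os.stat, which only accepts path-like values. Where exactly lleaves makes that call is an assumption here; the sketch below only reproduces the same class of error with the standard library, using a hypothetical helper named accepts_as_path.

```python
import io
import os
from pathlib import Path


def accepts_as_path(obj):
    """Return True if os.stat treats obj as a path-like value (hypothetical helper)."""
    try:
        os.stat(obj)
    except TypeError:
        # Raised for objects that are not str, bytes, os.PathLike or int.
        return False
    except OSError:
        # The type is fine; the file simply does not exist.
        return True
    return True


print(accepts_as_path(io.StringIO("foo")))      # file-like object, not a path
print(accepts_as_path(Path("no-such-file.txt")))  # valid path type, missing file
```

So a file-like object only helps with APIs that read from a handle; an API that expects a path needs a real file on disk.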

We're currently doing:

import tempfile
from pathlib import Path

import lleaves

path = Path(tempfile.mkdtemp(), "booster.txt")

with open(path, "w+") as f:
    f.write(self.booster_.model_to_string())

lleaves_model_ = lleaves.Model(path)

which is not awful, but it isn't as practical as it could be, especially since lleaves should only really need the string representation (I think), so saving to disk is just wasteful. 🤷
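One caveat with the snippet above: tempfile.mkdtemp() never removes the directory it creates, so each call leaves a file behind. A minimal sketch of a variant that cleans up after itself is below. The helper name model_string_as_file is hypothetical, and the usage assumes the model file may be read again during compile(), so both calls stay inside the with block.

```python
import tempfile
from contextlib import contextmanager
from pathlib import Path


@contextmanager
def model_string_as_file(model_string, filename="booster.txt"):
    """Yield the path of a temporary file holding model_string.

    The enclosing temporary directory is deleted when the block exits.
    """
    with tempfile.TemporaryDirectory() as tmpdir:
        path = Path(tmpdir) / filename
        path.write_text(model_string)
        yield path


# Intended usage (needs lightgbm and lleaves installed, so left as comments):
#
# with model_string_as_file(booster.model_to_string()) as path:
#     lleaves_model = lleaves.Model(str(path))
#     lleaves_model.compile()
```

This still writes to disk, so it does not address the underlying request; it only keeps the workaround short and avoids leaking temporary directories.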