Return predictions for raw_score=True

jtilly commented 3 years ago

We often fit GBMs with base margins (LightGBM calls them init_score). These base margins also have to be supplied at prediction time, but LightGBM's predict method offers no convenient way to pass them. So in practice we call predict with raw_score=True, add the applicable base margin, and then apply the inverse link function.

Currently, when we compile a model with lleaves, the link function gets hard-wired into it, so we first need to undo that, then add the base margin, and then apply the inverse link function again.
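To make the two workflows concrete, here is a minimal sketch in plain Python. It assumes a log link (as with a Poisson objective), which is my assumption and not something stated above; the lists are stand-ins for real model output, and the helper names are hypothetical.

```python
import math

def predict_with_margin(raw_scores, base_margins):
    """Add the base margin on the link scale, then apply the inverse log link."""
    return [math.exp(r + m) for r, m in zip(raw_scores, base_margins)]

def predict_from_linked(linked_preds, base_margins):
    """Workaround when the link is hard-wired: undo it, add the margin, reapply it."""
    return [math.exp(math.log(p) + m) for p, m in zip(linked_preds, base_margins)]

raw = [0.2, -1.0, 1.5]      # stand-in for predict(X, raw_score=True)
margin = [0.1, 0.0, -0.3]   # stand-in for the per-row base margin (init_score)
linked = [math.exp(r) for r in raw]  # what a model with the link baked in returns

direct = predict_with_margin(raw, margin)
roundtrip = predict_from_linked(linked, margin)
```

Both paths give the same result up to floating-point error; the second one just pays for an extra log and exp per row.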

Two options:

1. Always compile the model without the link function, add a raw_score argument to predict, and apply the inverse link function in Python. I don't think there's a massive performance penalty for that.
2. Add an option to compile models without the link function and leave it to the user to deal with it.

siboehm commented 3 years ago

A hacky fix for now is to edit the model.txt, replacing objective=<your objective function> (top block) with objective=regression. Then lleaves won't add any link function and you'll get back raw scores.
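A rough sketch of that edit as a script, in case it helps. The function name is hypothetical, and the sample text is an abbreviated illustration rather than a complete model file. It only touches the first line that starts with objective=, so other mentions of the objective further down are left alone.

```python
def force_raw_objective(model_text: str) -> str:
    """Replace the first 'objective=...' line with 'objective=regression'."""
    lines = model_text.splitlines(keepends=True)
    for i, line in enumerate(lines):
        if line.startswith("objective="):
            # Preserve whatever line ending the file already uses.
            ending = line[len(line.rstrip("\r\n")):]
            lines[i] = "objective=regression" + ending
            break  # only the header entry in the top block is changed
    return "".join(lines)

# Abbreviated, illustrative model text (not a full LightGBM model file).
sample = (
    "tree\n"
    "version=v3\n"
    "objective=poisson\n"
    "feature_names=a b\n"
    "\n"
    "parameters:\n"
    "[objective: poisson]\n"
)
patched = force_raw_objective(sample)
```

To apply it to a real file, read model.txt into a string, pass it through the function, and write the result to a new path so the original model stays untouched.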

Considering the options:

1. Implementing raw_score would be nice since it brings lleaves closer to the LightGBM interface. I probably won't implement it in Python, since that carries a severe performance hit for small batches, but it could be a flag on the LLVM function. That would add an extra branch, but the branch should be well predicted, so I expect no performance hit (to be tested).
2. Just adding a compile flag would be the easiest solution, and I guess it's fine to burden the user with making this decision upfront.

This is mainly an API question, not an implementation one. I'll think about it for a few days and implement something.

siboehm commented 3 years ago

For now I'll add raw_score as a compilation parameter. I'll probably make it a runtime parameter at some point, but I didn't want to break the binary interface.
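With raw_score as a compilation parameter, the base-margin workflow from the original request might look roughly like the sketch below. It assumes a log link and that model_path points to a saved LightGBM model file; the wrapper function and its argument names are hypothetical, and the imports are done lazily so the snippet can be loaded without lleaves installed.

```python
def predict_with_base_margin(model_path, X, base_margin):
    """Compile without the link, add the margin on the raw scale, then apply exp."""
    import numpy as np
    import lleaves  # lazy import: only needed when the function is actually called

    model = lleaves.Model(model_file=model_path)
    model.compile(raw_score=True)  # compile-time flag described in this thread
    raw = model.predict(X)         # raw scores, no inverse link applied
    # Assumption: log link, so the inverse link is exp.
    return np.exp(raw + np.asarray(base_margin))
```

Because the flag is fixed at compile time, a caller who needs both raw and transformed predictions would have to compile the model twice or apply the inverse link in their own code.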

siboehm / lleaves

Return predictions for raw_score=True #7