BayesWitnesses / m2cgen

Transform ML models into a native code (Java, C, Python, Go, JavaScript, Visual Basic, C#, R, PowerShell, PHP, Dart, Haskell, Ruby, F#, Rust) with zero dependencies
MIT License
2.82k stars, 241 forks

Support for LightGBM Booster and XGBoost Booster #99

Open chris-smith-zocdoc opened 5 years ago

chris-smith-zocdoc commented 5 years ago

We're training our LightGBM model outside of Python (in Spark), so we need to load it from a model file before passing it to m2cgen. I don't believe LightGBM can load a model file directly into LGBMRegressor, though; it must be loaded into lgb.Booster.

It would be nice if m2cgen supported lgb.Booster.

Example

import lightgbm as lgb
import m2cgen as m2c

model = lgb.Booster(model_file='model.txt')

# This fails: m2cgen does not accept a raw Booster.
# m2c.export_to_java(model)

# This works but is awkward: wrap the Booster in a scikit-learn estimator.
from lightgbm.sklearn import LGBMRegressor
r = LGBMRegressor()
r._Booster = model

code = m2c.export_to_java(r)

izeigerman commented 5 years ago

Hey @chris-smith-zocdoc, thanks for reporting this!

I think support for Booster objects is worth adding to m2cgen. As part of this effort, I'd also suggest adding direct Booster instance support for XGBoost models.

Btw, a PR is very welcome if you're up for it :)

chris-smith-zocdoc commented 5 years ago

I can give it a shot. Can you point me to the files that would need to be changed?

izeigerman commented 5 years ago

Thanks, @chris-smith-zocdoc! You can begin with the following lines:

https://github.com/BayesWitnesses/m2cgen/blob/master/m2cgen/assemblers/boosting.py#L139 - for LightGBM
https://github.com/BayesWitnesses/m2cgen/blob/master/m2cgen/assemblers/boosting.py#L86 - for XGBoost

This is where we access the underlying Booster instances from the scikit-learn compatible wrappers. I believe we can check what's being passed to us, a wrapper or a Booster instance, and if it's a wrapper, retrieve the underlying Booster instance from it.
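
A rough sketch of that check, for illustration only. `resolve_booster` is a hypothetical helper, not an existing m2cgen function. It assumes XGBoost's scikit-learn wrappers expose `get_booster()` and that LightGBM's wrappers keep the fitted model in `_Booster` (the same attribute the workaround earlier in this thread assigns to). It is duck-typed, so it runs without either library installed.

```python
def resolve_booster(model):
    """Return the underlying Booster for a wrapper, or the model itself.

    Hypothetical helper for illustration; not part of m2cgen.
    """
    # XGBoost scikit-learn wrappers expose get_booster().
    get_booster = getattr(model, "get_booster", None)
    if callable(get_booster):
        return get_booster()

    # LightGBM scikit-learn wrappers keep the fitted Booster in _Booster.
    # It is None on an unfitted wrapper, in which case we fall through.
    booster = getattr(model, "_Booster", None)
    if booster is not None:
        return booster

    # Otherwise assume we were handed a Booster instance directly.
    return model
```

With something like this, an assembler could call `resolve_booster(model)` before reading the model dump, so a Booster loaded from a file and a fitted wrapper would follow the same path.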

yuanjie-ai commented 4 years ago

The classifier doesn't work.

alexeymaksakov-tomtom commented 3 years ago

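Also, with LGBMRegressor I see an issue with the comparison operation type:

import lightgbm as lgb
import m2cgen as m2c

# data_path is assumed to be defined elsewhere
model_name = data_path + "LightGBM_model1.txt"
model = lgb.Booster(model_file=model_name)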

from lightgbm.sklearn import LGBMRegressor
r = LGBMRegressor()
r._Booster = model

code = m2c.export_to_java(r)

results in

  File "../venv/lib/python3.8/site-packages/m2cgen/assemblers/boosting.py", line 318, in _assemble_tree
    assert op == ast.CompOpType.LTE, "Unexpected comparison op"
AssertionError: Unexpected comparison op

I debugged it and saw an EQ operation coming from the model.

alexeymaksakov-tomtom commented 3 years ago

EQ operations seem to appear only if categorical_feature was specified in the training parameters.

alexeymaksakov-tomtom commented 3 years ago

Also, sadly, ranking objectives are not supported:

  File "../venv/lib/python3.8/site-packages/m2cgen/assemblers/boosting.py", line 293, in _single_convert_output
    raise ValueError(
ValueError: Unsupported objective function 'lambdarank'

StrikerRUS commented 3 years ago

@alexeymaksakov-tomtom

> EQ operations seem to appear only if categorical_feature was specified in the training parameters.

Yeah, you are right. Categorical features are not supported yet, unfortunately. See #102.
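
Until then, one way to spot the problem before exporting might be to scan the model dump for categorical splits. The sketch below is an illustration only: `has_categorical_splits` is a hypothetical helper, not part of m2cgen or LightGBM. It relies on the dict returned by `Booster.dump_model()` listing trees under `tree_info`, with each internal node carrying a `decision_type` that is `==` for categorical splits.

```python
def has_categorical_splits(model_dump):
    """Return True if any tree in a LightGBM model dump uses an '==' split.

    `model_dump` is expected to be the dict from lightgbm.Booster.dump_model().
    Hypothetical helper for illustration; not part of m2cgen or LightGBM.
    """
    # Walk the trees iteratively to avoid recursion limits on deep trees.
    stack = [tree["tree_structure"] for tree in model_dump.get("tree_info", [])]
    while stack:
        node = stack.pop()
        if "decision_type" not in node:
            continue  # Leaf node: there is no split to inspect.
        if node["decision_type"] == "==":
            return True
        stack.append(node["left_child"])
        stack.append(node["right_child"])
    return False


# Usage, assuming `model` is an lgb.Booster loaded as earlier in this thread:
# if has_categorical_splits(model.dump_model()):
#     raise ValueError("model uses categorical splits; export will fail")
```

Running this check first gives a clearer message than the assertion error raised inside the assembler.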

tangdiforx commented 2 years ago

When using m2cgen v0.9.0 to convert a pickled model to JavaScript, we get the error below:

packages/m2cgen/assemblers/__init__.py", line 141, in get_assembler_cls
    raise NotImplementedError(f"Model '{model_name}' is not supported")
NotImplementedError: Model 'xgboost_Booster' is not supported

So, the XGBoost Booster is not supported yet?

StrikerRUS commented 2 years ago

@tangdiforx

> So, the XGBoost Booster is not supported yet?

Unfortunately no, the Booster class is not supported yet.

tangdiforx commented 2 years ago

Got it, and thanks for your reply. Do you plan to do it?

StrikerRUS commented 2 years ago

Yeah, I do, but unfortunately without any ETA.

mirecl commented 1 year ago

@StrikerRUS, are there plans to implement this functionality in 2023? 🙂