BayesWitnesses / m2cgen

Transform ML models into a native code (Java, C, Python, Go, JavaScript, Visual Basic, C#, R, PowerShell, PHP, Dart, Haskell, Ruby, F#, Rust) with zero dependencies
MIT License
2.82k stars, 241 forks

Unable to export code from XGBoost 1.7.5 models #581

Open GuidoBartoli opened 1 year ago

GuidoBartoli commented 1 year ago

I'm using the following code to generate Python code from an XGBoost model (bst is a previously trained XGBoost Booster object):

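import os

import m2cgen as m2c
import xgboost as xgb

# Round-trip the trained Booster through a file to load it into the sklearn wrapper
temp_file = "temp.json"
bst.save_model(temp_file)
skclf = xgb.XGBClassifier()
skclf.load_model(temp_file)
os.remove(temp_file)
skclf.n_classes_ = skclf.classes_ = classes  # classes is defined elsewhere
code = m2c.export_to_python(skclf)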

This worked fine with XGBoost 1.7.1, but when I updated to 1.7.5, I received the following error:

Traceback (most recent call last):
  File "/home/bartoli/Projects/aikit/boost.py", line 516, in <module>
    code = m2c.export_to_python(clf)
  File "/home/bartoli/miniconda3/envs/ai3/lib/python3.9/site-packages/m2cgen/exporters.py", line 57, in export_to_python
    return _export(model, interpreter)
  File "/home/bartoli/miniconda3/envs/ai3/lib/python3.9/site-packages/m2cgen/exporters.py", line 459, in _export
    model_ast = assembler_cls(model).assemble()
  File "/home/bartoli/miniconda3/envs/ai3/lib/python3.9/site-packages/m2cgen/assemblers/boosting.py", line 214, in assemble
    return self.assembler.assemble()
  File "/home/bartoli/miniconda3/envs/ai3/lib/python3.9/site-packages/m2cgen/assemblers/boosting.py", line 34, in assemble
    return self._assemble_bin_class_output(self._all_estimator_params)
  File "/home/bartoli/miniconda3/envs/ai3/lib/python3.9/site-packages/m2cgen/assemblers/boosting.py", line 80, in _assemble_bin_class_output
    base_score = -math.log(1.0 / self._base_score - 1.0)
TypeError: unsupported operand type(s) for /: 'float' and 'NoneType'

Does m2cgen support the latest XGBoost version, or is there something I need to tweak inside the model to make it work like before?

Thanks

DonnieFy commented 1 year ago

I'm hitting the same error.

BennyH26 commented 1 year ago

Encountering the same error. This is a breaking change for my pipeline. Any ideas here?

GuidoBartoli commented 1 year ago

> Encountering the same error. This is a breaking change for my pipeline. Any ideas here?

Yep, it is breaking for me too. I hope for an update from the developers.

GuidoBartoli commented 1 year ago

Looking at m2cgen/assemblers/boosting.py (lines 78-80), I think the problem is that self._base_score is None for some reason in models from the latest XGBoost version. Since None != 0.0 is true, the check if self._base_score != 0.0 passes, and the next instruction, base_score = -math.log(1.0 / self._base_score - 1.0), fails when it tries to divide by None.
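
The failure can be reproduced without m2cgen or XGBoost. The sketch below only mimics the two statements quoted above; assemble_base_score is a hypothetical stand-in, not the actual m2cgen source.

```python
import math


def assemble_base_score(base_score):
    """Hypothetical stand-in for the check described above (not m2cgen's code)."""
    if base_score != 0.0:  # None != 0.0 evaluates to True, so None gets through
        # Logit transform; raises TypeError when base_score is None
        return -math.log(1.0 / base_score - 1.0)
    return base_score


try:
    assemble_base_score(None)
except TypeError as exc:
    # Same kind of error as in the traceback: float divided by NoneType
    print("TypeError:", exc)
```

With base_score set to 0 the branch is skipped entirely, and with 0.5 the logit evaluates to zero.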

Maybe this model field has been renamed or removed. I will check it out and update this issue.

GuidoBartoli commented 1 year ago

After debugging the m2cgen code, I may have found a workaround for my case (exporting a native Booster to C and Python code), but I do not know if it works for other cases.

If the base_score parameter of the classifier is forced to 0 (the default value in the latest version is None), the base_score != 0.0 check no longer triggers the failing conversion, and both Python and C code are generated.

import os

import m2cgen as m2c
import xgboost as xgb

# booster is already trained
temp_file = "temp.ubj"
booster.save_model(temp_file)
xgbclf = xgb.XGBClassifier()
xgbclf.load_model(temp_file)
os.remove(temp_file)
xgbclf.base_score = 0  # workaround: skips the failing base_score conversion
c_code = m2c.export_to_c(xgbclf)
py_code = m2c.export_to_python(xgbclf)
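
One caveat I have not verified against real models: forcing base_score to 0 drops the logit offset from the exported code, which is harmless only if the model's real base_score is 0.5, because -log(1/0.5 - 1) = 0. Below is a minimal sketch for checking this for a binary classifier. It assumes that the JSON string returned by Booster.save_config() stores the value under learner -> learner_model_param -> base_score; that layout is an assumption and may differ between XGBoost versions. dropped_offset is a hypothetical helper name.

```python
import json
import math


def dropped_offset(config_json: str) -> float:
    """Return the margin offset that is lost when base_score is forced to 0.

    Assumes the layout learner -> learner_model_param -> base_score,
    with the value stored as a string such as "5E-1".
    """
    config = json.loads(config_json)
    base_score = float(config["learner"]["learner_model_param"]["base_score"])
    # Same logit expression as the failing line in the traceback
    return -math.log(1.0 / base_score - 1.0)


# Hand-written example config; with a real model, pass booster.save_config()
example = json.dumps({"learner": {"learner_model_param": {"base_score": "5E-1"}}})
print(abs(dropped_offset(example)) < 1e-12)
```

If the returned offset is not close to zero, the exported code will probably disagree with the original model's predictions.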

wyitong commented 10 months ago

This works for me. Thanks!