microsoft / hummingbird

Hummingbird compiles trained ML models into tensor computation for faster inference.
MIT License
3.32k stars 274 forks

SKLearn Model - Post transform ApplyBasePredictionPostTransform #782

Open XavierGeerinck opened 5 days ago

XavierGeerinck commented 5 days ago

Hi All!

When trying to convert a scikit-learn model to ONNX, I get the error below. Any idea what I can do to still get it converted?

  File "/.pyenv/versions/3.11.8/lib/python3.11/site-packages/hummingbird/ml/operator_converters/_gbdt_commons.py", line 167, in convert_gbdt_common
    raise NotImplementedError("Post transform {} not implemeneted yet".format(extra_config[constants.POST_TRANSFORM]))
NotImplementedError: Post transform <hummingbird.ml.operator_converters._tree_commons.ApplyBasePredictionPostTransform object at 0x367896b10> not implemeneted yet

I printed the extra_config generated just before the POST_TRANSFORM check and got:

{
    'n_features': 42, 
    'test_input': (array([[0.96146009]]), array([[0.55694969]]), ..., array([[0.38754932]]), array([[0.35920703]])), 
    'container': True, 
    'n_threads': 16, 
    'n_inputs': 42, 
    'input_names': ['F0', 'F1', 'F2', ..., 'F41'], 
    'base_prediction': Parameter containing: tensor([0.5000])
}

This is how I call it:

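import joblib
import numpy as np
import pandas as pd
from hummingbird.ml import convert

# column_names holds the 42 feature names (defined elsewhere)
dummy_input = pd.DataFrame([np.random.rand(42).tolist()], columns=column_names)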

# Load the MultiOutputRegressor model
model = joblib.load("Model_One.pkl")

# Use hummingbird to convert the model to ONNX
# note: the XGBRegressor model requires us to provide dummy input
# see example: https://github.com/microsoft/hummingbird/blob/main/notebooks/XGB-example.ipynb
model_onnx = convert(
    model,
    "onnx",
    dummy_input,
)

# Save the model
model_onnx.save("model.onnx")

ksaur commented 3 days ago

Hi @XavierGeerinck! What post transform were you trying to use, if any? Was it one of the ones shown here (sigmoid/softmax/tweedie), or something else? Are you able to share some of your SKL model code?
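
If sharing the model code is not possible, a short summary of the estimator types may already help. Below is a minimal sketch, assuming the loaded object follows the scikit-learn convention of exposing fitted sub-models in `estimators_`. The helper name `summarize_estimators` is hypothetical and not part of Hummingbird or scikit-learn.

```python
def summarize_estimators(model):
    """Return the class name of a model and of any fitted sub-estimators."""
    summary = {"wrapper": type(model).__name__}
    # Assumption: meta-estimators such as MultiOutputRegressor keep their
    # fitted sub-models in `estimators_`; objects without it are reported
    # by their own class name only.
    sub_estimators = getattr(model, "estimators_", None)
    if sub_estimators is not None:
        summary["estimators"] = [type(est).__name__ for est in sub_estimators]
    return summary
```

Calling it on the result of `joblib.load("Model_One.pkl")` and pasting the printed dictionary here would show which estimator types are involved without exposing any training data.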

We may have a bug, we may not have implemented something, or it may be an ONNX versioning issue.

Either way, we need to fix our error message there, as "object at 0x367896b10" isn't too helpful for debugging! :D
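
One way the message could be made more readable is to report the class name instead of the default object repr. This is only a sketch under assumptions, not the actual Hummingbird fix: the class below is a hypothetical stand-in for the real `ApplyBasePredictionPostTransform`, and `describe_post_transform` is an illustrative helper name.

```python
class ApplyBasePredictionPostTransform:
    """Hypothetical stand-in for the Hummingbird class of the same name."""

    def __init__(self, base_prediction):
        self.base_prediction = base_prediction


def describe_post_transform(post_transform):
    """Return a readable label for a post transform (string or object)."""
    # String constants are already readable; keep them unchanged.
    if isinstance(post_transform, str):
        return post_transform
    # For objects, the class name is more useful than "<... object at 0x...>".
    return type(post_transform).__name__


def unsupported_post_transform_error(post_transform):
    """Build the exception with a readable post transform label."""
    return NotImplementedError(
        "Post transform {} not implemented yet".format(
            describe_post_transform(post_transform)
        )
    )


message = str(unsupported_post_transform_error(ApplyBasePredictionPostTransform([0.5])))
```

With this, the error would name the class directly rather than a memory address, which makes reports like the one above easier to triage.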