microsoft / hummingbird

Hummingbird compiles trained ML models into tensor computation for faster inference.
MIT License

Random forest in LightGBM #237

Open arfangeta opened 3 years ago

arfangeta commented 3 years ago

I want to clarify: does Hummingbird currently not support random forest in LightGBM? Is support planned?

When I convert this model from LightGBM to ONNX, I get an error:

lgb.LGBMClassifier(boosting_type='rf', n_estimators=128, max_depth=5, subsample=0.3, bagging_freq=1)

  File "/venv/lib/python3.6/site-packages/hummingbird/ml/_parse.py", line 242, in
    this_operator.inputs = [scope.variables[in_] for in_ in input_names]
KeyError: 'col_index'

interesaaat commented 3 years ago

Thanks @arfangeta for reporting this. Yes, this is supposed to work. I will look at it.

interesaaat commented 3 years ago

Hi @arfangeta, how are you calling the converter? I added a test for your code above and it works (#238).

arfangeta commented 3 years ago

import lightgbm as lgb
import numpy as np
from onnxmltools import convert_lightgbm
from onnxconverter_common.data_types import FloatTensorType
from hummingbird.ml import convert
import onnxruntime as ort

if __name__ == '__main__':
    X = np.random.rand(100, 200)
    X = np.array(X, dtype=np.float32)
    y = np.random.randint(2, size=100)
    model = lgb.LGBMClassifier(boosting_type='rf', n_estimators=128, max_depth=5, bagging_freq = 1, subsample=0.3)
    model.fit(X, y)

    initial_types = [("input", FloatTensorType([X.shape[0], X.shape[1]]))]  # Define the inputs for the ONNX
    onnx_ml_model = convert_lightgbm(model, initial_types=initial_types, target_opset=9)
    onnx_model = convert(onnx_ml_model, "onnx", X)

interesaaat commented 3 years ago

I see, you were converting to ONNX. This should work too; let me look at why it is returning the error.

interesaaat commented 3 years ago

I am able to reproduce this error. I am working on a fix.

interesaaat commented 3 years ago

OK, in the end the error on master was different. I have a branch here showing my progress. Unfortunately I am blocked on onnxruntime.capi.onnxruntime_pybind11_state.NotImplemented: [ONNXRuntimeError] : 9 : NOT_IMPLEMENTED : Could not find an implementation for the node ArgMax_56:ArgMax(11). I will ask the ONNX folks for help.

In the meantime I have 2 suggestions to unblock you:

interesaaat commented 3 years ago

The ArgMax problem was fixed in this PR. Will wait for ORT 1.5 before closing this issue.

ksaur commented 3 years ago

ORT 1.5 has been out for a few months now, so I revisited this. It's available on PyPI.

I pulled from main in a fresh container with pip install -e .[onnx], and I still get ORT==1.4.0.

Collecting onnxruntime>=1.0.0; extra == "onnx" (from hummingbird-ml==0.1.0)
  Downloading https://files.pythonhosted.org/packages/52/99/b6618fbf5c9dde961bc4d555ce924f0a9cf0d12b3945b7a328c1b9592d11/onnxruntime-1.4.0-cp37-cp37m-manylinux2010_x86_64.whl (4.4MB)

In setup.py, we have "onnxruntime>=1.0.0", so I'm a little surprised I didn't get the latest in Pypi. Should we pin to "onnxruntime>=1.5.0" @interesaaat ?

interesaaat commented 3 years ago

Thanks for doing this @ksaur, I completely forgot about this issue.

> In setup.py, we have "onnxruntime>=1.0.0", so I'm a little surprised I didn't get the latest in Pypi. Should we pin to "onnxruntime>=1.5.0" @interesaaat ?

Yea it is strange that it didn't pull the latest ORT. And this is not even from the cache since it's a fresh container. In general I prefer to have the lowest supported version in the requirements, but I am ok with pinning to 1.5, if we cannot find any other workaround.

BTW does the test work if we use ORT 1.5?

ksaur commented 3 years ago

I learned the reason I was getting ORT==1.4.0 is that 1.5.1 requires an update to a newer version of pip. That makes me a bit cautious about pinning to onnxruntime==1.5.x just yet because I don't want to break other things...but maybe it's ok.
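One way to see which onnxruntime version actually landed in an environment, without changing the requirement in setup.py, is a small check like the sketch below. It uses only the standard library. The helper names (`parse_release`, `meets_minimum`, `check_package`) are hypothetical and not part of Hummingbird, and the comparison only looks at the numeric release segments, which is a simplification of full version ordering.

```python
from importlib.metadata import PackageNotFoundError, version


def parse_release(version_string):
    """Turn '1.5.1' into (1, 5, 1); stops at the first non-numeric part."""
    parts = []
    for piece in version_string.split("."):
        digits = ""
        for ch in piece:
            if not ch.isdigit():
                break
            digits += ch
        if digits == "":
            break
        parts.append(int(digits))
    return tuple(parts)


def meets_minimum(installed, minimum):
    """Return True when `installed` is at least `minimum` (release segments only)."""
    a, b = parse_release(installed), parse_release(minimum)
    # Pad the shorter tuple with zeros so that '1.5' compares equal to '1.5.0'.
    width = max(len(a), len(b))
    a += (0,) * (width - len(a))
    b += (0,) * (width - len(b))
    return a >= b


def check_package(name, minimum):
    """Describe whether the installed distribution `name` satisfies `minimum`."""
    try:
        installed = version(name)
    except PackageNotFoundError:
        return f"{name} is not installed"
    status = "OK" if meets_minimum(installed, minimum) else "too old"
    return f"{name} {installed} ({status}, need >= {minimum})"


if __name__ == "__main__":
    # Hypothetical threshold taken from the discussion above.
    print(check_package("onnxruntime", "1.5.0"))
```

Running this in the fresh container would show whether the resolver picked 1.4.0 or a 1.5.x release, which separates a pip problem from a requirements problem.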

All of our existing tests pass with the new ORT, but I still get an error with the above code snippet.

Investigating. :)

ksaur commented 3 years ago

We have tests for boosting_type='rf' in test_lightgbm_converter.py but not test_onnxml_lightgbm_converter.py, and for this example we need the latter (to call convert_lightgbm).

In this test case I wrote, it fails with:

name: "shape_tensor"
}, 'tree_implementation': 'tree_trav', 'post_transform': <function convert_gbdt_common.<locals>.apply_sigmoid at 0x7f2ca3f7af28>}.
It usually means the pipeline being converted contains a
transformer or a predictor with no corresponding converter implemented.
Please fill an issue at https://github.com/microsoft/hummingbird.

Also, I had to add a check on the way we determine shapes in tree_ensemble.py (in the same test case above). Without a check around n_classes = t_values.shape[1], we get:

  File "/root/hummingbird/hummingbird/ml/operator_converters/onnx/tree_ensemble.py", line 190, in _get_tree_infos_from_tree_ensemble
    tree_infos, classes, post_transform = _get_tree_infos_from_onnx_ml_operator(operator)
  File "/root/hummingbird/hummingbird/ml/operator_converters/onnx/tree_ensemble.py", line 119, in _get_tree_infos_from_onnx_ml_operator
    n_classes = t_values.shape[1]
IndexError: tuple index out of range
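The IndexError above is what indexing shape[1] gives when the leaf-value array is 1-D. A minimal sketch of the kind of guard described, written against a plain shape tuple so it runs without any dependencies, might look like the following. The function name `infer_n_classes` is hypothetical, and treating a 1-D array as a single output column is an assumption; this is not the actual change made in Hummingbird.

```python
def infer_n_classes(values_shape):
    """Number of output columns for leaf values with the given array shape.

    A 1-D shape such as (n_nodes,) is treated as a single output column
    instead of indexing shape[1], which would raise IndexError.
    """
    if len(values_shape) > 1:
        return values_shape[1]
    return 1


if __name__ == "__main__":
    print(infer_n_classes((255, 3)))  # 2-D leaf values
    print(infer_n_classes((255,)))    # 1-D leaf values, the failing case
```

With a NumPy array the same idea would be applied to `t_values.shape`.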

Having not worked with LGBM before, I dug around to find more info on boosting_type='rf', but didn't find much. (pointers?) However I see that a test with this passes for torch.

Are we missing a converter in onnx/lgbm somewhere?

interesaaat commented 3 years ago

I don't think we are missing a converter. Maybe the current ONNX converter gets misled by the fact that LightGBM is a gradient boosting algorithm but here it generates a random forest model. The tree implementations are OK, since the conversion works if we pass the LightGBM model directly as input, so I think it is something related to how the onnxmltools LightGBM converter translates the LightGBM model into an ONNX one. These problems are super tricky to solve.