onnx / onnxmltools

ONNXMLTools enables conversion of models to ONNX
https://onnx.ai
Apache License 2.0

Conversion of XGBClassifier fails #513

Open tdoublep opened 2 years ago

tdoublep commented 2 years ago

With the latest version of onnxmltools, we have started seeing errors when converting XGBoost models.

Here is a simple test to reproduce:

from sklearn.datasets import make_classification
from xgboost import XGBClassifier
from onnxmltools import convert_xgboost
from onnxmltools.convert.common.data_types import FloatTensorType

X, y = make_classification(random_state=42)  # small synthetic binary classification dataset

clf = XGBClassifier()
clf.fit(X, y)

initial_type = [("float_input", FloatTensorType([None, X.shape[1]]))]  # dynamic batch size, fixed feature count

onnx_model = convert_xgboost(clf, initial_types=initial_type)  # raises the RuntimeError shown below

This is now failing with the following error:

Traceback (most recent call last):
  File "test.py", line 13, in <module>
    onnx_model = convert_xgboost(clf, initial_types=initial_type)
  File "/Users/xxx/anaconda3/envs/onnx-debug/lib/python3.8/site-packages/onnxmltools/convert/main.py", line 177, in convert_xgboost
    return convert(*args, **kwargs)
  File "/Users/xxx/anaconda3/envs/onnx-debug/lib/python3.8/site-packages/onnxmltools/convert/xgboost/convert.py", line 43, in convert
    onnx_model = convert_topology(topology, name, doc_string, target_opset, targeted_onnx)
  File "/Users/xxx/anaconda3/envs/onnx-debug/lib/python3.8/site-packages/onnxconverter_common/topology.py", line 704, in convert_topology
    raise RuntimeError(("target_opset %d is higher than the number of the installed onnx package"
RuntimeError: target_opset 15 is higher than the number of the installed onnx package or the converter support (13).

I'm running the above test in a clean Anaconda environment (Python 3.8) with the following packages installed from pip:

certifi==2021.10.8
joblib==1.1.0
numpy==1.21.3
onnx==1.10.2
onnxconverter-common==1.8.1
onnxmltools==1.10.0
protobuf==3.19.0
scikit-learn==1.0.1
scipy==1.7.1
six==1.16.0
skl2onnx==1.10.0
sklearn==0.0
threadpoolctl==3.0.0
typing-extensions==3.10.0.2
xgboost==1.5.0

Please let me know if I can provide any additional information that would be useful.

Any help much appreciated!

rgreen1995 commented 2 years ago

I'm having the same problem. I found that manually setting target_opset=13 lets the model convert, but the probabilities from the loaded model are wrong when I use it to predict in an InferenceSession.

1138886114 commented 2 years ago

How do I preprocess data for onnxruntime without using the xgboost library? I want to completely replace xgb.DMatrix(input_data) from the xgboost library with numpy, but I don't know how. Thanks a lot.
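
One possible approach to the question above, as a minimal sketch rather than a definitive answer: onnxruntime accepts plain numpy arrays, so no DMatrix is needed as long as the input is a 2-D float32 array matching the FloatTensorType declared at conversion time. The helper names to_ort_input and predict_with_onnxruntime, and the model path passed in, are hypothetical and used only for illustration.

```python
import numpy as np


def to_ort_input(rows):
    """Return a 2-D float32 array, the layout a FloatTensorType([None, n]) input expects."""
    arr = np.asarray(rows, dtype=np.float32)
    if arr.ndim == 1:
        arr = arr.reshape(1, -1)  # a single sample becomes a batch of one
    return np.ascontiguousarray(arr)


def predict_with_onnxruntime(model_path, rows):
    """Run a saved ONNX model on numpy data; model_path is a hypothetical file path."""
    import onnxruntime as ort  # imported lazily so to_ort_input works without it

    sess = ort.InferenceSession(model_path, providers=["CPUExecutionProvider"])
    input_name = sess.get_inputs()[0].name  # "float_input" if converted as in this thread
    # For a converted classifier the outputs are typically labels, then probabilities.
    return sess.run(None, {input_name: to_ort_input(rows)})
```

The float32 cast matters because xgboost works with float32 features internally, and feeding float64 data to a float tensor input is rejected by onnxruntime.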

xadupre commented 2 years ago

You should write convert_xgboost(clf, initial_types=initial_type, target_opset=13).
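
As a sketch of how that suggestion could be applied without hard-coding the number, the snippet below retries the conversion with the opset reported in the error. The helpers parse_opset_error and convert_with_fallback are hypothetical, and the regular expression assumes the error text looks like the traceback in this thread, which may differ between versions. Passing target_opset explicitly, as suggested above, remains the simpler option.

```python
import re

# Matches the message seen in this thread:
# "target_opset 15 is higher than ... the converter support (13)."
_OPSET_ERROR = re.compile(r"target_opset (\d+) is higher than .*\((\d+)\)")


def parse_opset_error(message):
    """Return (requested, supported) opsets from the error text, or None if it does not match."""
    match = _OPSET_ERROR.search(message)
    if match is None:
        return None
    return int(match.group(1)), int(match.group(2))


def convert_with_fallback(clf, initial_types):
    """Convert an XGBoost model, retrying with the supported opset if the default is too high."""
    from onnxmltools import convert_xgboost  # imported lazily; requires onnxmltools

    try:
        return convert_xgboost(clf, initial_types=initial_types)
    except RuntimeError as exc:
        parsed = parse_opset_error(str(exc))
        if parsed is None:
            raise  # unrelated failure, do not hide it
        _, supported = parsed
        return convert_xgboost(clf, initial_types=initial_types, target_opset=supported)
```

With the environment listed in the original report, this would end up calling convert_xgboost with target_opset=13.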