thesps / conifer

Fast inference of Boosted Decision Trees in FPGAs
Apache License 2.0

ONNX Conversion Fails with ONNX model produced from xgboost #22

Open Chriisbrown opened 2 years ago

Chriisbrown commented 2 years ago

When running the ONNX conversion on a model that was trained in xgboost, I encounter the following error:

Traceback (most recent call last):
  File "/home/cebrown/Documents/Tracker/NewKF/Firmware/work/src/l1tk-for-emp/tq/scripts/conifer_convert.py", line 40, in <module>
    hdl_model = conifer.model(bdt_model, conifer.converters.onnx, conifer.backends.vhdl, cfg)
  File "/home/cebrown/anaconda3/envs/tq/lib/python3.9/site-packages/conifer/model.py", line 11, in __init__
    self._ensembleDict = converter.convert(bdt)
  File "/home/cebrown/anaconda3/envs/tq/lib/python3.9/site-packages/conifer/converters/onnx.py", line 29, in convert
    return convert_bdt(onnx_clf)
  File "/home/cebrown/anaconda3/envs/tq/lib/python3.9/site-packages/conifer/converters/onnx.py", line 8, in convert_bdt
    treelist,max_depth,base_values,no_features,no_classes=convert_graph(onnx_clf)
  File "/home/cebrown/anaconda3/envs/tq/lib/python3.9/site-packages/conifer/converters/onnx.py", line 32, in convert_graph
    if(onnx_clf.graph.node[1].name=='ZipMap'):
IndexError: list index (1) out of range

Digging into the ONNX model itself shows that the graph structure of the xgboost-converted model is different. The fundamental difference seems to be visible in the header. For an sklearn-converted model, as used in the unit tests, the header is:

ir_version: 4
producer_name: "skl2onnx"
producer_version: "1.9.2"
domain: "ai.onnx"
model_version: 0
doc_string: ""

And for an xgboost model:

ir_version: 7
producer_name: "OnnxMLTools"
producer_version: "1.7.0"
domain: "onnxconverter-common"
model_version: 0
doc_string: ""
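For anyone wanting to compare headers on their own models, here is a minimal sketch of how these fields could be collected. The helper name `describe_header` is hypothetical and not part of conifer. It only reads attributes, so it should work on the object returned by `onnx.load(path)` if onnx is installed; here a stand-in object with the values above is used so the snippet runs without it.

```python
from types import SimpleNamespace

# Top-level header fields shown in the dumps above (doc_string omitted as it is empty).
HEADER_FIELDS = ("ir_version", "producer_name", "producer_version",
                 "domain", "model_version")

def describe_header(model):
    """Collect the header fields of a model-like object into a dict.

    Hypothetical helper: 'model' is anything exposing these attributes.
    """
    return {field: getattr(model, field) for field in HEADER_FIELDS}

# Stand-in for the xgboost-converted model, using the values quoted above.
xgb_like = SimpleNamespace(
    ir_version=7,
    producer_name="OnnxMLTools",
    producer_version="1.7.0",
    domain="onnxconverter-common",
    model_version=0,
    doc_string="",
)

header = describe_header(xgb_like)
for key, value in header.items():
    print(f"{key}: {value}")
```

Checking `producer_name` this way would be one possible way for a converter to tell the two kinds of model apart before touching the graph.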

It seems that not all ONNX models are created equal: the graph structure is broadly similar, but there are some key differences that break the current conversion code.
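The traceback shows the converter indexing `graph.node[1]`, which fails when the graph has fewer than two nodes. One possible way around this, sketched below, is to look nodes up by `op_type` instead of by position. This is only an illustration and not conifer's actual fix: the helper names (`find_tree_ensemble`, `has_zipmap`, `fake_model`) are hypothetical, and the stand-in graphs are made up for the example rather than taken from the real models. `TreeEnsembleClassifier`, `TreeEnsembleRegressor` and `ZipMap` are operator types from the ONNX-ML domain.

```python
from types import SimpleNamespace

# ONNX-ML operator types that hold a tree ensemble.
TREE_OPS = {"TreeEnsembleClassifier", "TreeEnsembleRegressor"}

def find_tree_ensemble(model):
    """Return the first tree-ensemble node, found by op_type rather than index."""
    for node in model.graph.node:
        if node.op_type in TREE_OPS:
            return node
    raise ValueError("no tree ensemble node found in graph")

def has_zipmap(model):
    """True if any node in the graph is a ZipMap, wherever it sits."""
    return any(node.op_type == "ZipMap" for node in model.graph.node)

def fake_model(*op_types):
    """Hypothetical stand-in exposing only graph.node[i].op_type and .name."""
    nodes = [SimpleNamespace(op_type=t, name=t) for t in op_types]
    return SimpleNamespace(graph=SimpleNamespace(node=nodes))

# Made-up graphs: one with a single node, one with a trailing ZipMap.
single = fake_model("TreeEnsembleRegressor")
multi = fake_model("TreeEnsembleClassifier", "ZipMap")

for model in (single, multi):
    node = find_tree_ensemble(model)
    print(node.op_type, "| ZipMap present:", has_zipmap(model))
```

Because the lookup never assumes a fixed node count or order, a single-node graph no longer raises an `IndexError`; a graph with no tree ensemble at all fails with an explicit `ValueError` instead.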