onnx / onnxmltools

ONNXMLTools enables conversion of models to ONNX
https://onnx.ai
Apache License 2.0

How to convert XGBoost model with use_label_encoder=True #574

Open dkbarn opened 2 years ago

dkbarn commented 2 years ago

I have an old XGBClassifier model pickled to a file, which was created by xgboost 1.4.2, back when use_label_encoder=True was the default.

It seems that onnxmltools does not support XGBClassifiers with use_label_encoder=True, so I'm trying to figure out what sequence of steps I need to follow to get this old model into a state that will be accepted by onnxmltools.convert_xgboost.

As a first step, I have loaded the pickle file using xgboost 1.4.2, and then saved it out in JSON format:

import pickle

# Load the classifier pickled under xgboost 1.4.2, then re-save it as JSON.
with open(pickle_file, "rb") as f:
    xgb_classifier = pickle.load(f)
xgb_classifier.save_model(json_file)

Now I'm able to load this model using the latest version of xgboost. However, the .json file and the resulting XGBClassifier still specify use_label_encoder=True, which means it errors when passed to convert_xgboost:

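import onnxmltools
import xgboost
from onnxmltools.convert.common.data_types import FloatTensorType

# Load the JSON model with the latest xgboost, then attempt the ONNX conversion.
xgb_classifier = xgboost.XGBClassifier()
xgb_classifier.load_model(json_file)
onnx_model = onnxmltools.convert_xgboost(
    xgb_classifier,
    initial_types=[("float_input", FloatTensorType([None, len(features)]))],
)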

This throws the error: RuntimeError: Unable to interpret 'count', feature names should follow pattern 'f%d'.

Is there a way to modify the XGBClassifier to set use_label_encoder=False?
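Note that the error message above talks about feature names ('count' versus the pattern 'f%d') rather than about the label encoder, so one thing that may be worth trying is renaming the model's features to positional names before converting. The following is only a sketch under assumptions: the helper names `positional_feature_names` and `rename_features_in_place` are hypothetical, and the helper works on any object exposing a `feature_names` attribute. Whether applying it to the booster returned by `xgb_classifier.get_booster()` actually makes convert_xgboost succeed has not been verified in this thread.

```python
import re

# Pattern named in the error message: features are expected to be f0, f1, ...
_F_PATTERN = re.compile(r"^f\d+$")


def positional_feature_names(names):
    """Return ['f0', 'f1', ...], one entry per original feature name."""
    return [f"f{i}" for i in range(len(names))]


def rename_features_in_place(booster_like):
    """Rename features on an object that exposes a `feature_names` attribute.

    Hypothetical helper. Returns a dict mapping each new positional name to
    the original name, so the original column order can be recovered later.
    """
    original = list(booster_like.feature_names or [])
    if all(_F_PATTERN.match(name) for name in original):
        # Names already follow the expected pattern; leave them untouched.
        return {name: name for name in original}
    renamed = positional_feature_names(original)
    booster_like.feature_names = renamed
    return dict(zip(renamed, original))
```

Assuming the booster allows `feature_names` to be reassigned, usage would look like `mapping = rename_features_in_place(xgb_classifier.get_booster())` followed by the convert_xgboost call. The input columns at inference time would then need to follow the same order as the original feature names.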

AnouarITI commented 1 year ago

I am having the same issue. How can we solve it?