Closed by nedwebster 3 years ago
The JPMML-XGBoost library converts XGBoost models to PMML models in a way that fully preserves the original decision logic.
One of the key features of XGBoost is its native handling of missing values. Consequently, all PMML files generated by JPMML-XGBoost are also "missing-value aware": scoring continues until a leaf node is reached; it is not permitted to stop at some arbitrary point and bail out with an interim value.
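To illustrate what "missing-value aware" scoring means, here is a minimal sketch of a tree traversal in which every split carries a default branch for missing inputs. The tree, its field names, and the `score` helper are hypothetical and simplified for illustration; this is neither JPMML code nor the exact XGBoost dump format.

```python
import math

# Hypothetical toy tree. Each split names the branch to follow when the
# input value is missing, mirroring XGBoost's per-split default direction.
TREE = {
    "split": "x", "split_condition": 0.5, "missing": "no",
    "yes": {"leaf": -1.0},
    "no": {
        "split": "z", "split_condition": 2.0, "missing": "yes",
        "yes": {"leaf": 0.25},
        "no": {"leaf": 1.0},
    },
}


def is_missing(value):
    """Treat None and NaN as missing."""
    return value is None or (isinstance(value, float) and math.isnan(value))


def score(node, row):
    """Walk the tree until a leaf is reached; never stop at an interior node."""
    while "leaf" not in node:
        value = row.get(node["split"])
        if is_missing(value):
            branch = node["missing"]      # default direction for missing input
        elif value < node["split_condition"]:
            branch = "yes"
        else:
            branch = "no"
        node = node[branch]
    return node["leaf"]
```

Even a row with no values at all, such as `score(TREE, {})`, ends at a leaf rather than returning an interim value.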
XGBoost models can be represented in two ways: original (non-compacted) and compacted. Use the org.jpmml.xgboost.HasXGBoostOptions#OPTION_COMPACT conversion option to choose between the two.
In the SkLearn2PMML package, you can do so using the sklearn2pmml.pipeline.PMMLPipeline.configure(**pmml_options) method:
from sklearn2pmml import sklearn2pmml
from sklearn2pmml.pipeline import PMMLPipeline
from xgboost import XGBClassifier

pipeline = PMMLPipeline([
    ("classifier", XGBClassifier(...))
])
pipeline.fit(X, y)

# Compacted
pipeline.configure(compact = True)
sklearn2pmml(pipeline, "xgboost-compact.pmml")

# Non-compacted
pipeline.configure(compact = False)
sklearn2pmml(pipeline, "xgboost-non_compact.pmml")
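To check which strategies a given PMML file actually declares, one option is to read the attributes straight off its TreeModel elements. The sketch below uses only the Python standard library; the `tree_strategies` helper and the `SAMPLE` document are hypothetical, and the sample is hand-written for illustration rather than real converter output.

```python
import io
import xml.etree.ElementTree as ET

# Hand-written illustrative PMML fragment (an assumption, not converter output).
SAMPLE = b"""<PMML xmlns="http://www.dmg.org/PMML-4_4" version="4.4">
  <MiningModel functionName="regression">
    <Segmentation multipleModelMethod="sum">
      <Segment id="1">
        <True/>
        <TreeModel functionName="regression"
                   missingValueStrategy="defaultChild"
                   noTrueChildStrategy="returnLastPrediction">
          <Node score="0.0"><True/></Node>
        </TreeModel>
      </Segment>
    </Segmentation>
  </MiningModel>
</PMML>"""


def tree_strategies(pmml_source):
    """Collect (missingValueStrategy, noTrueChildStrategy) pairs from all TreeModel elements.

    pmml_source is a file path or a binary file object. A None entry means
    the attribute is absent, so the PMML default applies.
    """
    root = ET.parse(pmml_source).getroot()
    pairs = set()
    for element in root.iter():
        # Drop the "{namespace}" prefix so any PMML schema version matches.
        if element.tag.rsplit("}", 1)[-1] == "TreeModel":
            pairs.add((
                element.get("missingValueStrategy"),
                element.get("noTrueChildStrategy"),
            ))
    return pairs


print(tree_strategies(io.BytesIO(SAMPLE)))
```

Calling `tree_strategies("xgboost-compact.pmml")` and `tree_strategies("xgboost-non_compact.pmml")` on the two files written above would show whether either representation uses a strategy the consuming software accepts.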
Hi vruusmann,
Thank you for your speedy reply, your comment was extremely useful.
Hi,
I am trying to convert my Python XGBoost model to PMML, but the software that consumes the PMML file cannot accept 'lastPrediction' as the missingValueStrategy. Is there a way to specify the missingValueStrategy (as well as the noTrueChildStrategy) when building the PMML pipeline?
Many thanks.