jpmml / jpmml-xgboost

Java library and command-line application for converting XGBoost models to PMML
GNU Affero General Public License v3.0

How to generate fmap file #55

Closed: gilmahler closed this issue 3 years ago

gilmahler commented 3 years ago

I'm working on a Python project and would like to export an XGBoost model to text, so that I can later convert it to PMML using this guide: https://github.com/jpmml/jpmml-xgboost/blob/master/README.md. What I'm missing is how to generate the fmap file using Python.

vruusmann commented 3 years ago

The feature map is needed when converting a standalone XGBoost binary file.

The XGBoost file does not contain any feature information, so it looks as if the model was trained using only continuous numeric (32-bit float) features. A feature map overrides this default configuration, so that you can have continuous integer features, categorical string features, etc.
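For the standalone route, a feature map can be written by hand from Python. The following is a minimal sketch, assuming the plain-text layout that XGBoost uses for feature maps: one line per feature, holding a zero-based index, the feature name, and a type code (`q` for quantitative, `i` for binary indicator, `int` for integer), separated by tabs. The helper names `fmap_lines` and `write_fmap` and the example feature names are hypothetical. Replacing whitespace in names with underscores is a precaution, on the assumption that the file is parsed as whitespace-separated tokens.

```python
ALLOWED_TYPES = {"q", "i", "int"}  # quantitative, binary indicator, integer


def fmap_lines(features):
    """Build feature map lines from (name, type) pairs, in model column order."""
    lines = []
    for index, (name, ftype) in enumerate(features):
        if ftype not in ALLOWED_TYPES:
            raise ValueError(f"Unsupported feature type {ftype!r} for {name!r}")
        # Assumption: names must not contain whitespace, so collapse it to "_".
        safe_name = "_".join(str(name).split())
        lines.append(f"{index}\t{safe_name}\t{ftype}")
    return lines


def write_fmap(features, path):
    """Write the feature map to `path`, one feature per line."""
    with open(path, "w", encoding="utf-8") as f:
        for line in fmap_lines(features):
            f.write(line + "\n")
```

Usage would look like `write_fmap([("age", "int"), ("income", "q")], "xgboost.fmap")`. The order of the pairs has to match the column order of the training data, because the model file refers to features only by position.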

> What Im missing is how to generate the fmap file using python

If you embed the XGBoost estimator into a Scikit-Learn pipeline, then it's not necessary to provide an additional feature map, because the converter is able to extract all the feature information automatically from the surrounding Pipeline object.

Try it yourself:

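from sklearn_pandas import DataFrameMapper
from sklearn2pmml import sklearn2pmml
from sklearn2pmml.pipeline import PMMLPipeline
from xgboost import XGBClassifier

# The mapper declares the type of each column; its contents are elided here
pipeline = PMMLPipeline([
  ("mapper", DataFrameMapper([..])),
  ("classifier", XGBClassifier())
])

# The pipeline must be fitted before conversion;
# X and y stand for your own training DataFrame and labels
pipeline.fit(X, y)

sklearn2pmml(pipeline, "xgboost.pmml")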