What's the use of the ntree_limit argument?

The PMML representation evaluates all "member" decision tree models. It doesn't support subsetting member models (or other early stopping criteria) at the moment, although this is something that could be added (e.g. replacing True segment selection predicates with <SimplePredicate name="ntree_limit" operator="lessOrEqual" value="..."/>).
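To make the ntree_limit semantics concrete, here is a minimal sketch (synthetic data, not from this issue): in the XGBoost API, ntree_limit=0 is the default and means "evaluate all member trees", which is what the PMML representation always does; a non-zero limit currently has no PMML counterpart.

import numpy as np
import xgboost as xgb

# Tiny synthetic example, only to illustrate ntree_limit semantics.
X = np.random.rand(100, 4)
y = (X[:, 0] > 0.5).astype(int)
dmat = xgb.DMatrix(X, label=y)
bst = xgb.train({'objective': 'binary:logistic'}, dmat, num_boost_round=20)

# ntree_limit=0 (the default) evaluates all member trees, matching the PMML behaviour.
preds_all = bst.predict(dmat, ntree_limit=0)

# Restricting prediction to the first k trees has no PMML counterpart at the moment.
preds_10 = bst.predict(dmat, ntree_limit=10)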
I suspect there's something wrong with your feature map file. Does the prediction work correctly when you use Scikit-Learn's XGBRegressor and XGBClassifier estimator types, which do not need manual feature map specification? In other words, you should try the sklearn2pmml package for exporting XGBoost models.
Finally, what is the PMML evaluation engine that is misbehaving? Is it my JPMML-Evaluator library, or is it something else?
Point 1: ntree_limit was in place because I was planning to use early stopping. Currently it is set to its default value of 0, and in the worst case I can just retrain the model with the right number of trees.
For point 3, we are using your JPMML-Evaluator 1.3.1.
For point 2, I was using jpmml-xgboost to convert the models because I still get errors converting XGBoost models that use XGBClassifier. jpmml-xgboost now converts the same model without issue if I don't use the XGBClassifier class.
from sklearn.datasets import load_iris
from sklearn.decomposition import PCA
from sklearn2pmml.decoration import ContinuousDomain
from sklearn.linear_model import LogisticRegressionCV
from sklearn2pmml import sklearn2pmml
import xgboost as xgb
import pandas
import sklearn_pandas
params = {'n_estimators': 50, 'learning_rate': 1, 'seed':0, 'subsample': 0.8, 'colsample_bytree': 0.8,
'objective': 'binary:logistic', 'max_depth':4, 'min_child_weight':300, 'nthread': 50}
iris = load_iris()
iris_df = pandas.concat((pandas.DataFrame(iris.data[:, :], columns = ["Sepal.Length", "Sepal.Width", "Petal.Length", "Petal.Width"]), pandas.DataFrame(iris.target, columns = ["Species"])), axis = 1)
iris_mapper = sklearn_pandas.DataFrameMapper([
(["Sepal.Length", "Sepal.Width", "Petal.Length", "Petal.Width"], [ContinuousDomain(), PCA(n_components = 3)]),
("Species", None)
])
iris = iris_mapper.fit_transform(iris_df)
iris_X = iris[:, 0:3]
iris_y = iris[:, 3]
model = xgb.XGBClassifier(**params)
model.fit(iris_X, iris_y, verbose=True)
sklearn2pmml(model, iris_mapper, "test.pmml", with_repr = True, debug=True)
# iris_classifier = LogisticRegressionCV()
# iris_classifier.fit(iris_X, iris_y)
This gives the following error:
('python: ', '2.7.11')
('sklearn: ', '0.17.1')
('sklearn.externals.joblib:', '0.9.4')
('sklearn_pandas: ', '1.1.0')
('sklearn2pmml: ', '0.11.1')
java -cp /home/damien/.local/lib/python2.7/site-packages/sklearn2pmml/resources/guava-19.0.jar:/home/damien/.local/lib/python2.7/site-packages/sklearn2pmml/resources/istack-commons-runtime-2.21.jar:/home/damien/.local/lib/python2.7/site-packages/sklearn2pmml/resources/jaxb-core-2.2.11.jar:/home/damien/.local/lib/python2.7/site-packages/sklearn2pmml/resources/jaxb-runtime-2.2.11.jar:/home/damien/.local/lib/python2.7/site-packages/sklearn2pmml/resources/jcommander-1.48.jar:/home/damien/.local/lib/python2.7/site-packages/sklearn2pmml/resources/jpmml-converter-1.1.0.jar:/home/damien/.local/lib/python2.7/site-packages/sklearn2pmml/resources/jpmml-sklearn-1.1.1.jar:/home/damien/.local/lib/python2.7/site-packages/sklearn2pmml/resources/jpmml-xgboost-1.1.0.jar:/home/damien/.local/lib/python2.7/site-packages/sklearn2pmml/resources/pmml-agent-1.3.1.jar:/home/damien/.local/lib/python2.7/site-packages/sklearn2pmml/resources/pmml-model-1.3.1.jar:/home/damien/.local/lib/python2.7/site-packages/sklearn2pmml/resources/pmml-model-metro-1.3.1.jar:/home/damien/.local/lib/python2.7/site-packages/sklearn2pmml/resources/pmml-schema-1.3.1.jar:/home/damien/.local/lib/python2.7/site-packages/sklearn2pmml/resources/pyrolite-4.13.jar:/home/damien/.local/lib/python2.7/site-packages/sklearn2pmml/resources/serpent-1.12.jar:/home/damien/.local/lib/python2.7/site-packages/sklearn2pmml/resources/slf4j-api-1.7.21.jar:/home/damien/.local/lib/python2.7/site-packages/sklearn2pmml/resources/slf4j-jdk14-1.7.21.jar org.jpmml.sklearn.Main --pkl-estimator-input /data/tmp/damien/estimator-iEK2qc.pkl.z --repr-estimator XGBClassifier(base_score=0.5, colsample_bylevel=1, colsample_bytree=0.8,
gamma=0, learning_rate=1, max_delta_step=0, max_depth=4,
min_child_weight=300, missing=None, n_estimators=50, nthread=50,
objective='multi:softprob', reg_alpha=0, reg_lambda=1,
scale_pos_weight=1, seed=0, silent=True, subsample=0.8) --pkl-mapper-input /data/tmp/damien/mapper-nKUFWw.pkl.z --repr-mapper DataFrameMapper(features=[(['Sepal.Length', 'Sepal.Width', 'Petal.Length', 'Petal.Width'], TransformerPipeline(steps=[('continuousdomain', ContinuousDomain(invalid_value_treatment='return_invalid')), ('pca', PCA(copy=True, n_components=3, whiten=False))])), ('Species', None)],
sparse=False) --pmml-output test.pmml
('Preserved joblib dump file(s): ', '/data/tmp/damien/estimator-iEK2qc.pkl.z /data/tmp/damien/mapper-nKUFWw.pkl.z')
---------------------------------------------------------------------------
CalledProcessError Traceback (most recent call last)
<ipython-input-10-e99243315e4f> in <module>()
28 model = xgb.XGBClassifier(**params)
29 model.fit(iris_X, iris_y, verbose=True)
---> 30 sklearn2pmml(model, iris_mapper, "test.pmml", with_repr = True, debug=True)
31
32 # iris_classifier = LogisticRegressionCV()
/home/damien/.local/lib/python2.7/site-packages/sklearn2pmml/__init__.pyc in sklearn2pmml(estimator, mapper, pmml, with_repr, debug)
63 if(debug):
64 print(" ".join(cmd))
---> 65 subprocess.check_call(cmd)
66 finally:
67 if(debug):
/home/damien/python/lib/python2.7/subprocess.pyc in check_call(*popenargs, **kwargs)
538 if cmd is None:
539 cmd = popenargs[0]
--> 540 raise CalledProcessError(retcode, cmd)
541 return 0
542
CalledProcessError: Command '['java', '-cp', '/home/damien/.local/lib/python2.7/site-packages/sklearn2pmml/resources/guava-19.0.jar:/home/damien/.local/lib/python2.7/site-packages/sklearn2pmml/resources/istack-commons-runtime-2.21.jar:/home/damien/.local/lib/python2.7/site-packages/sklearn2pmml/resources/jaxb-core-2.2.11.jar:/home/damien/.local/lib/python2.7/site-packages/sklearn2pmml/resources/jaxb-runtime-2.2.11.jar:/home/damien/.local/lib/python2.7/site-packages/sklearn2pmml/resources/jcommander-1.48.jar:/home/damien/.local/lib/python2.7/site-packages/sklearn2pmml/resources/jpmml-converter-1.1.0.jar:/home/damien/.local/lib/python2.7/site-packages/sklearn2pmml/resources/jpmml-sklearn-1.1.1.jar:/home/damien/.local/lib/python2.7/site-packages/sklearn2pmml/resources/jpmml-xgboost-1.1.0.jar:/home/damien/.local/lib/python2.7/site-packages/sklearn2pmml/resources/pmml-agent-1.3.1.jar:/home/damien/.local/lib/python2.7/site-packages/sklearn2pmml/resources/pmml-model-1.3.1.jar:/home/damien/.local/lib/python2.7/site-packages/sklearn2pmml/resources/pmml-model-metro-1.3.1.jar:/home/damien/.local/lib/python2.7/site-packages/sklearn2pmml/resources/pmml-schema-1.3.1.jar:/home/damien/.local/lib/python2.7/site-packages/sklearn2pmml/resources/pyrolite-4.13.jar:/home/damien/.local/lib/python2.7/site-packages/sklearn2pmml/resources/serpent-1.12.jar:/home/damien/.local/lib/python2.7/site-packages/sklearn2pmml/resources/slf4j-api-1.7.21.jar:/home/damien/.local/lib/python2.7/site-packages/sklearn2pmml/resources/slf4j-jdk14-1.7.21.jar', 'org.jpmml.sklearn.Main', '--pkl-estimator-input', '/data/tmp/damien/estimator-iEK2qc.pkl.z', '--repr-estimator', "XGBClassifier(base_score=0.5, colsample_bylevel=1, colsample_bytree=0.8,\n gamma=0, learning_rate=1, max_delta_step=0, max_depth=4,\n min_child_weight=300, missing=None, n_estimators=50, nthread=50,\n objective='multi:softprob', reg_alpha=0, reg_lambda=1,\n scale_pos_weight=1, seed=0, silent=True, subsample=0.8)", '--pkl-mapper-input', '/data/tmp/damien/mapper-nKUFWw.pkl.z', '--repr-mapper', "DataFrameMapper(features=[(['Sepal.Length', 'Sepal.Width', 'Petal.Length', 'Petal.Width'], TransformerPipeline(steps=[('continuousdomain', ContinuousDomain(invalid_value_treatment='return_invalid')), ('pca', PCA(copy=True, n_components=3, whiten=False))])), ('Species', None)],\n sparse=False)", '--pmml-output', 'test.pmml']' returned non-zero exit status 1
Some differences I found between the SKlearn GBM PMML and the XGBoost PMML are:
SKlearn
<DataField name="label" optype="categorical" dataType="integer">
<Value value="0"/>
<Value value="1"/>
</DataField>
<Output>
<OutputField name="probability_0" feature="probability" value="0"/>
<OutputField name="probability_1" feature="probability" value="1"/>
</Output>
XGBoost
<DataField name="label" optype="categorical" dataType="string">
<Value value="0"/>
<Value value="1"/>
</DataField>
<Output>
<OutputField name="probability_0" optype="continuous" dataType="double" feature="probability" value="0"/>
<OutputField name="probability_1" optype="continuous" dataType="double" feature="probability" value="1"/>
</Output>
('sklearn2pmml: ', '0.11.1')
Please upgrade to sklearn2pmml version 0.12.0! The last couple of releases (0.11.2 and 0.12.0) were specifically about ensuring compatibility with XGBoost 0.6.
I still get the same error with 0.12.0. Also, we are using XGBoost 0.4a30, not 0.6, because we currently can't install 0.6: the compiler it needs is not supported by our version of CentOS. I have provided the updated files generated using 0.12.0.
('python: ', '2.7.11')
('sklearn: ', '0.17.1')
('sklearn.externals.joblib:', '0.9.4')
('sklearn_pandas: ', '1.1.0')
('sklearn2pmml: ', '0.12.0')
java -cp /home/damien/.local/lib/python2.7/site-packages/sklearn2pmml/resources/guava-19.0.jar:/home/damien/.local/lib/python2.7/site-packages/sklearn2pmml/resources/istack-commons-runtime-2.21.jar:/home/damien/.local/lib/python2.7/site-packages/sklearn2pmml/resources/jaxb-core-2.2.11.jar:/home/damien/.local/lib/python2.7/site-packages/sklearn2pmml/resources/jaxb-runtime-2.2.11.jar:/home/damien/.local/lib/python2.7/site-packages/sklearn2pmml/resources/jcommander-1.48.jar:/home/damien/.local/lib/python2.7/site-packages/sklearn2pmml/resources/jpmml-converter-1.1.1.jar:/home/damien/.local/lib/python2.7/site-packages/sklearn2pmml/resources/jpmml-sklearn-1.1.3.jar:/home/damien/.local/lib/python2.7/site-packages/sklearn2pmml/resources/jpmml-xgboost-1.1.1.jar:/home/damien/.local/lib/python2.7/site-packages/sklearn2pmml/resources/pmml-agent-1.3.3.jar:/home/damien/.local/lib/python2.7/site-packages/sklearn2pmml/resources/pmml-model-1.3.3.jar:/home/damien/.local/lib/python2.7/site-packages/sklearn2pmml/resources/pmml-model-metro-1.3.3.jar:/home/damien/.local/lib/python2.7/site-packages/sklearn2pmml/resources/pmml-schema-1.3.3.jar:/home/damien/.local/lib/python2.7/site-packages/sklearn2pmml/resources/pyrolite-4.13.jar:/home/damien/.local/lib/python2.7/site-packages/sklearn2pmml/resources/serpent-1.12.jar:/home/damien/.local/lib/python2.7/site-packages/sklearn2pmml/resources/slf4j-api-1.7.21.jar:/home/damien/.local/lib/python2.7/site-packages/sklearn2pmml/resources/slf4j-jdk14-1.7.21.jar org.jpmml.sklearn.Main --pkl-estimator-input /data/tmp/damien/estimator-jceJav.pkl.z --repr-estimator XGBClassifier(base_score=0.5, colsample_bylevel=1, colsample_bytree=0.8,
gamma=0, learning_rate=1, max_delta_step=0, max_depth=4,
min_child_weight=300, missing=None, n_estimators=50, nthread=50,
objective='multi:softprob', reg_alpha=0, reg_lambda=1,
scale_pos_weight=1, seed=0, silent=True, subsample=0.8) --pkl-mapper-input /data/tmp/damien/mapper-5hehV5.pkl.z --repr-mapper DataFrameMapper(features=[(['Sepal.Length', 'Sepal.Width', 'Petal.Length', 'Petal.Width'], TransformerPipeline(steps=[('continuousdomain', ContinuousDomain(invalid_value_treatment='return_invalid')), ('pca', PCA(copy=True, n_components=3, whiten=False))])), ('Species', None)],
sparse=False) --pmml-output test.pmml
('Preserved joblib dump file(s): ', '/data/tmp/damien/estimator-jceJav.pkl.z /data/tmp/damien/mapper-5hehV5.pkl.z')
---------------------------------------------------------------------------
CalledProcessError Traceback (most recent call last)
<ipython-input-1-e99243315e4f> in <module>()
28 model = xgb.XGBClassifier(**params)
29 model.fit(iris_X, iris_y, verbose=True)
---> 30 sklearn2pmml(model, iris_mapper, "test.pmml", with_repr = True, debug=True)
31
32 # iris_classifier = LogisticRegressionCV()
/home/damien/.local/lib/python2.7/site-packages/sklearn2pmml/__init__.pyc in sklearn2pmml(estimator, mapper, pmml, with_repr, debug)
63 if(debug):
64 print(" ".join(cmd))
---> 65 subprocess.check_call(cmd)
66 finally:
67 if(debug):
/home/damien/python/lib/python2.7/subprocess.pyc in check_call(*popenargs, **kwargs)
538 if cmd is None:
539 cmd = popenargs[0]
--> 540 raise CalledProcessError(retcode, cmd)
541 return 0
542
CalledProcessError: Command '['java', '-cp', '/home/damien/.local/lib/python2.7/site-packages/sklearn2pmml/resources/guava-19.0.jar:/home/damien/.local/lib/python2.7/site-packages/sklearn2pmml/resources/istack-commons-runtime-2.21.jar:/home/damien/.local/lib/python2.7/site-packages/sklearn2pmml/resources/jaxb-core-2.2.11.jar:/home/damien/.local/lib/python2.7/site-packages/sklearn2pmml/resources/jaxb-runtime-2.2.11.jar:/home/damien/.local/lib/python2.7/site-packages/sklearn2pmml/resources/jcommander-1.48.jar:/home/damien/.local/lib/python2.7/site-packages/sklearn2pmml/resources/jpmml-converter-1.1.1.jar:/home/damien/.local/lib/python2.7/site-packages/sklearn2pmml/resources/jpmml-sklearn-1.1.3.jar:/home/damien/.local/lib/python2.7/site-packages/sklearn2pmml/resources/jpmml-xgboost-1.1.1.jar:/home/damien/.local/lib/python2.7/site-packages/sklearn2pmml/resources/pmml-agent-1.3.3.jar:/home/damien/.local/lib/python2.7/site-packages/sklearn2pmml/resources/pmml-model-1.3.3.jar:/home/damien/.local/lib/python2.7/site-packages/sklearn2pmml/resources/pmml-model-metro-1.3.3.jar:/home/damien/.local/lib/python2.7/site-packages/sklearn2pmml/resources/pmml-schema-1.3.3.jar:/home/damien/.local/lib/python2.7/site-packages/sklearn2pmml/resources/pyrolite-4.13.jar:/home/damien/.local/lib/python2.7/site-packages/sklearn2pmml/resources/serpent-1.12.jar:/home/damien/.local/lib/python2.7/site-packages/sklearn2pmml/resources/slf4j-api-1.7.21.jar:/home/damien/.local/lib/python2.7/site-packages/sklearn2pmml/resources/slf4j-jdk14-1.7.21.jar', 'org.jpmml.sklearn.Main', '--pkl-estimator-input', '/data/tmp/damien/estimator-jceJav.pkl.z', '--repr-estimator', "XGBClassifier(base_score=0.5, colsample_bylevel=1, colsample_bytree=0.8,\n gamma=0, learning_rate=1, max_delta_step=0, max_depth=4,\n min_child_weight=300, missing=None, n_estimators=50, nthread=50,\n objective='multi:softprob', reg_alpha=0, reg_lambda=1,\n scale_pos_weight=1, seed=0, silent=True, subsample=0.8)", '--pkl-mapper-input', '/data/tmp/damien/mapper-5hehV5.pkl.z
', '--repr-mapper', "DataFrameMapper(features=[(['Sepal.Length', 'Sepal.Width', 'Petal.Length', 'Petal.Width'], TransformerPipeline(steps=[('continuousdomain', ContinuousDomain(invalid_value_treatment='return_invalid')), ('pca', PCA(copy=True, n_components=3, whiten=False))])), ('Species', None)],\n sparse=False)", '--pmml-output', 'test.pmml']' returned non-zero exit status 1
Oct 14, 2016 2:52:49 PM org.jpmml.sklearn.Main run
INFO: Parsing DataFrameMapper PKL..
Oct 14, 2016 2:52:49 PM org.jpmml.sklearn.Main run
INFO: Parsed DataFrameMapper PKL in 42 ms.
Oct 14, 2016 2:52:49 PM org.jpmml.sklearn.Main run
INFO: Converting DataFrameMapper..
Oct 14, 2016 2:52:49 PM org.jpmml.sklearn.Main run
INFO: Converted DataFrameMapper in 26 ms.
Oct 14, 2016 2:52:49 PM org.jpmml.sklearn.Main run
INFO: Parsing Estimator PKL..
Oct 14, 2016 2:52:49 PM org.jpmml.sklearn.Main run
INFO: Parsed Estimator PKL in 7 ms.
Oct 14, 2016 2:52:49 PM org.jpmml.sklearn.Main run
INFO: Converting Estimator..
Oct 14, 2016 2:52:49 PM org.jpmml.sklearn.Main run
SEVERE: Failed to convert Estimator
java.lang.RuntimeException: java.io.IOException
at xgboost.sklearn.Booster.loadLearner(Booster.java:53)
at xgboost.sklearn.Booster.getLearner(Booster.java:41)
at xgboost.sklearn.BoosterUtil.getNumberOfFeatures(BoosterUtil.java:35)
at xgboost.sklearn.XGBClassifier.getNumberOfFeatures(XGBClassifier.java:38)
at sklearn.Classifier.createSchema(Classifier.java:59)
at sklearn.EstimatorUtil.encodePMML(EstimatorUtil.java:47)
at org.jpmml.sklearn.Main.run(Main.java:189)
at org.jpmml.sklearn.Main.main(Main.java:107)
Caused by: java.io.IOException
at org.jpmml.xgboost.XGBoostDataInput.readReserved(XGBoostDataInput.java:82)
at org.jpmml.xgboost.GBTree.load(GBTree.java:61)
at org.jpmml.xgboost.Learner.load(Learner.java:98)
at org.jpmml.xgboost.XGBoostUtil.loadLearner(XGBoostUtil.java:34)
at xgboost.sklearn.Booster.loadLearner(Booster.java:51)
... 7 more
Exception in thread "main" java.lang.RuntimeException: java.io.IOException
at xgboost.sklearn.Booster.loadLearner(Booster.java:53)
at xgboost.sklearn.Booster.getLearner(Booster.java:41)
at xgboost.sklearn.BoosterUtil.getNumberOfFeatures(BoosterUtil.java:35)
at xgboost.sklearn.XGBClassifier.getNumberOfFeatures(XGBClassifier.java:38)
at sklearn.Classifier.createSchema(Classifier.java:59)
at sklearn.EstimatorUtil.encodePMML(EstimatorUtil.java:47)
at org.jpmml.sklearn.Main.run(Main.java:189)
at org.jpmml.sklearn.Main.main(Main.java:107)
Caused by: java.io.IOException
at org.jpmml.xgboost.XGBoostDataInput.readReserved(XGBoostDataInput.java:82)
at org.jpmml.xgboost.GBTree.load(GBTree.java:61)
at org.jpmml.xgboost.Learner.load(Learner.java:98)
at org.jpmml.xgboost.XGBoostUtil.loadLearner(XGBoostUtil.java:34)
at xgboost.sklearn.Booster.loadLearner(Booster.java:51)
... 7 more
Your example Python script runs just fine in my Python 2.7 environment:
('python: ', '2.7.11')
('sklearn: ', '0.18')
('sklearn.externals.joblib:', '0.10.2')
('sklearn_pandas: ', '1.1.0')
('sklearn2pmml: ', '0.12.0')
I'm using xgboost-0.6a2 that was downloaded minutes ago:
pip2.7 install --upgrade xgboost
What is the version of your Scikit-Learn's XGBoost package?
How do I check the version of Scikit-Learn's XGBoost package? I tried both scikit-learn 0.17 and 0.18. Or do you mean the jpmml-sklearn package? I believe that is 1.1.
Also, would it be an issue if I was using weights in the XGBoost dmatrix?
Printing the version of Scikit-Learn's XGBoost package:
import xgboost
print(xgboost.__version__)
You can use row weights if you want to. Weights are used during model training; they are not part of the "persistent state" of the model, and therefore they are not transferred over to the PMML representation of the model.
However, on a practical note, I would advise you to position the weights column as the last column of the XGBoost DMatrix. It may well be the case that the weight column is shifting data columns in your feature map specification file, which leads to incorrect PMML conversion results.
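To illustrate (column names here are hypothetical, not taken from the original code): keeping the weight column out of the feature matrix and passing it through the weight= argument keeps the feature map indices aligned with the data columns.

import numpy as np
import pandas
import xgboost as xgb

# Hypothetical training frame where a weight column sits next to the features.
df = pandas.DataFrame({
    'f1': np.random.rand(100),
    'precision_weight': np.random.rand(100),
    'f2': np.random.rand(100),
    'label': np.random.randint(0, 2, 100),
})
feature_names = ['f1', 'f2']

# The weight column is not a feature: pass it via weight=, not inside the matrix,
# so that feature map indices 0..n-1 correspond one-to-one to feature_names.
dtrain = xgb.DMatrix(df[feature_names].values,
                     label=df['label'].values,
                     weight=df['precision_weight'].values,
                     feature_names=feature_names)
bst = xgb.train({'objective': 'binary:logistic'}, dtrain, num_boost_round=10)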
As for the following java.io.IOException, this is something that I cannot/will not fix:
Caused by: java.io.IOException
at org.jpmml.xgboost.XGBoostDataInput.readReserved(XGBoostDataInput.java:82)
at org.jpmml.xgboost.GBTree.load(GBTree.java:61)
Per the latest XGBoost source code, there has to be a 32-element array of zero bytes in that location: https://github.com/dmlc/xgboost/blob/master/src/gbm/gbtree.cc#L107
If this assumption does not hold, then there's something wrong with your XGBoost installation. XGBoost source code suggests that it might be some sort of 32-bit/64-bit compatibility issue: https://github.com/dmlc/xgboost/blob/master/src/gbm/gbtree.cc#L111
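A quick first check of the environment is to confirm the bitness of the Python runtime on the server (only part of the picture, since the bitness of the compiled libxgboost library is what ultimately matters):

import platform
import struct

# Pointer size of the interpreter: 32 on a 32-bit Python, 64 on a 64-bit Python.
print('%d-bit Python' % (struct.calcsize('P') * 8))
print(platform.machine(), platform.platform())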
Thanks for the help. I agree it seems like there isn't much you can do with regards to the java.io.IOException. I will check whether moving the weights to the end, or removing them altogether, fixes the issue.
Okay, it does look like there is something going on with the server, maybe the 32-bit/64-bit compatibility issue you mentioned. On the bright side, I was able to get the scores validated when running on my laptop; I just need to figure out the source of the problem with the server.
There's another issue about R vs. PMML mismatch: https://github.com/jpmml/jpmml-xgboost/issues/9
The above issue relates to the use of the missing argument with the xgboost() function call. By any chance, does your use case include custom missing value indicators?
I was originally letting the DMatrix handle missing values, but thought it might cause a problem, so I started filling in missing values with zeroes before passing the data into the DMatrix.
xgtrain = xgb.DMatrix(train[clf.signal_names].values, label=train['label'].values, feature_names=clf.signal_names, weight=train.precision_weight)
xgtest = xgb.DMatrix(testing[clf.signal_names].values, label=testing['label'].values, feature_names=clf.signal_names, weight=testing.precision_weight)
xgeval = xgb.DMatrix(eval[clf.signal_names].values, label=eval['label'].values, feature_names=clf.signal_names, weight=eval.precision_weight)
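For illustration, a small sketch (synthetic data, not the real pipeline) of making the missing-value convention explicit when building the DMatrix, since a mismatch between these conventions is exactly what the linked issue is about:

import numpy as np
import xgboost as xgb

X = np.array([[1.0, 2.0], [0.0, 5.0], [3.0, 0.0]])
y = np.array([0, 1, 0])

# Option 1: convert the fill value to NaN and let XGBoost's default missing handling apply.
X_nan = np.where(X == 0.0, np.nan, X)
dtrain_nan = xgb.DMatrix(X_nan, label=y, missing=np.nan)

# Option 2: declare 0.0 itself as the missing value indicator.
dtrain_zero = xgb.DMatrix(X, label=y, missing=0.0)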
Hello, I have been using jpmml-xgboost to convert models trained with XGBoost 0.4a30 on a CentOS server. The scores generated with the booster's predict function

bst.predict(xgeval, ntree_limit=0)

are in some cases quite different from those generated from the PMML version. To get the model ready for conversion I create a feature map file and save the model with bst.save_model(). Conversion happens without any errors.
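Roughly, that preparation step looks like this (a simplified sketch, not my exact code; the converter flag names are as I understand them from the jpmml-xgboost README and should be double-checked):

import numpy as np
import xgboost as xgb

feature_names = ['Sepal.Length', 'Sepal.Width', 'Petal.Length', 'Petal.Width']
X = np.random.rand(150, len(feature_names))
y = (X[:, 2] > 0.5).astype(int)

dtrain = xgb.DMatrix(X, label=y, feature_names=feature_names)
bst = xgb.train({'objective': 'binary:logistic'}, dtrain, num_boost_round=50)

# Feature map file: one "<index>\t<name>\t<type>" line per feature ('q' = quantitative).
with open('xgboost.fmap', 'w') as fmap:
    for i, name in enumerate(feature_names):
        fmap.write('%d\t%s\tq\n' % (i, name))

# Save the binary model for the converter.
bst.save_model('xgboost.model')

# Conversion then happens on the command line, roughly:
#   java -jar converter-executable.jar --model-input xgboost.model \
#     --fmap-input xgboost.fmap --pmml-output xgboost.pmml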
Do you have any suggestions about what could be causing the difference?
Thanks!
Also, I want to mention that using jpmml-sklearn I was able to convert a GBM model into PMML, and the scores that I got were the same from the sklearn model and my PMML implementation.