Closed ShalvaBagaturia closed 3 years ago
Can reproduce locally.
Very interesting!
Gotcha - it's a missing value issue after all!
In your other issue (https://github.com/jpmml/sklearn2pmml/issues/303) you declare: "No missing data, no sparse / dense problem"
Yet, in your XGBRegressor parameterization you have the following assignment: missing = 1. This assignment means "if the training data matrix contains a 1 value, assume this cell contains a missing value instead". In other words, you didn't think so, but your XGBRegressor is/was actually fitted with a sparse data matrix (sparse == "contains missing values").
If you delete this missing = 1 assignment, then the PMML side makes correct predictions.
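To make the effect of that assignment concrete, here is a minimal NumPy sketch of the sentinel semantics described above. It only illustrates the idea and does not reproduce XGBoost internals; the helper name mask_sentinel and the toy matrix are made up for this example.

```python
import numpy as np


def mask_sentinel(X, missing):
    """Return a float copy of X where every cell equal to `missing` becomes NaN.

    Hypothetical helper: it mimics the "treat this value as missing" rule,
    nothing more.
    """
    X = np.asarray(X, dtype=float)
    if np.isnan(missing):
        # NaN is already the missing marker, so there is nothing to rewrite.
        return X.copy()
    return np.where(X == missing, np.nan, X)


# Toy matrix: two genuine 1 values and one -999 placeholder.
X = np.array([[1.0, 5.0, -999.0],
              [0.0, 1.0, 2.5]])

as_seen_with_1 = mask_sentinel(X, missing=1)
as_seen_with_999 = mask_sentinel(X, missing=-999)

# With missing = 1 the two genuine 1 values are hidden and -999 stays a number.
n_hidden_by_1 = int(np.isnan(as_seen_with_1).sum())
# With missing = -999 only the placeholder cell is hidden.
n_hidden_by_999 = int(np.isnan(as_seen_with_999).sum())

print(n_hidden_by_1, n_hidden_by_999)
```

Any pipeline stage that does not apply the same rule (for example, a PMML document that carries no information about the sentinel) sees different inputs than the fitted model did, which is one plausible way the two sets of predictions can drift apart.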
BTW, I'd suggest switching from PyPMML to JPMML-Evaluator-Python.
Appreciate!
Looking at your data matrix (x.csv file), I get the impression that missing values are encoded as -999 values.
If so, then you should be using the following configuration instead: XGBRegressor(missing = -999).
You may subscribe to the newly opened issue (#167) in order to receive a notification when support for the missing attribute gets implemented. I believe it should happen sometime this week already.
Thanks a lot for the upcoming update
Hello.
I face the following issue: when I build my model in Python, export it to a PMML file, and then load that PMML file to make predictions, I get different results. Here is an illustration:
xgboost version=1.4.2 sklearn2pmml version=0.74.4 pandas version=1.2.4
and I get different values in predict_from_python and predict_from_pmml.
Why might this happen?