jpmml / r2pmml

R library for converting R models to PMML
GNU Affero General Public License v3.0

Error on extracting pmml file of xgboost model #65

Closed axemixer closed 3 years ago

axemixer commented 3 years ago

Hi,

I'm trying to export a basic xgboost model to PMML, but the conversion fails with the error below. I've searched the related issues but could not find a workaround.

java.lang.ClassCastException: org.jpmml.rexp.RDoubleVector cannot be cast to org.jpmml.rexp.RIntegerVector
    at org.jpmml.rexp.XGBoostConverter.loadFeatureMap(XGBoostConverter.java:282)
    at org.jpmml.rexp.XGBoostConverter.loadFeatureMap(XGBoostConverter.java:265)
    at org.jpmml.rexp.XGBoostConverter.loadFeatureMap(XGBoostConverter.java:230)
    at org.jpmml.rexp.XGBoostConverter.ensureFeatureMap(XGBoostConverter.java:209)
    at org.jpmml.rexp.XGBoostConverter.encodeSchema(XGBoostConverter.java:66)
    at org.jpmml.rexp.ModelConverter.encodePMML(ModelConverter.java:70)
    at org.jpmml.rexp.Converter.encodePMML(Converter.java:39)
    at org.jpmml.rexp.Main.run(Main.java:149)
    at org.jpmml.rexp.Main.main(Main.java:97)

Exception in thread "main" java.lang.ClassCastException: org.jpmml.rexp.RDoubleVector cannot be cast to org.jpmml.rexp.RIntegerVector
    at org.jpmml.rexp.XGBoostConverter.loadFeatureMap(XGBoostConverter.java:282)
    at org.jpmml.rexp.XGBoostConverter.loadFeatureMap(XGBoostConverter.java:265)
    at org.jpmml.rexp.XGBoostConverter.loadFeatureMap(XGBoostConverter.java:230)
    at org.jpmml.rexp.XGBoostConverter.ensureFeatureMap(XGBoostConverter.java:209)
    at org.jpmml.rexp.XGBoostConverter.encodeSchema(XGBoostConverter.java:66)
    at org.jpmml.rexp.ModelConverter.encodePMML(ModelConverter.java:70)
    at org.jpmml.rexp.Converter.encodePMML(Converter.java:39)
    at org.jpmml.rexp.Main.run(Main.java:149)
    at org.jpmml.rexp.Main.main(Main.java:97)

This is my code:

library(xgboost) # v 1.2.0.1
library(r2pmml) # v 0.24.2

dev_samp = data.matrix(mtcars[,c(4,6,9)])

dev_samp[,3] = as.integer(dev_samp[,3])

set.seed(123)
bst <- xgboost(data=as.matrix(dev_samp[,1:2]), 
               label=as.integer(dev_samp[,3]),
               max_depth=2,
               eta=0.2,  
               nrounds=2,
               colsample_bytree = 0.5,
               lambda = 0.3,
               objective = "binary:logistic",
               eval_metric = "error")

fmap=data.frame(seq_along(bst$feature_names)-1, bst$feature_names, "q")
fmap[,2] = as.factor(fmap[,2])
fmap[,3] = as.factor(fmap[,3])

r2pmml(bst, "mtcars.pmml",
       fmap = fmap,
       response_name = "prediction",
       response_levels = c("0", "1"),
       missing = "")

vruusmann commented 3 years ago

The first column of an XGBoost feature map is the integer row index/identifier.

If you convert it from floating-point (aka double) to integer, it works:

fmap=data.frame(seq_along(bst$feature_names)-1, bst$feature_names, "q")
fmap[,1] = as.integer(fmap[,1]) # THIS!
fmap[,2] = as.factor(fmap[,2])
fmap[,3] = as.factor(fmap[,3])

Typically, you should construct the feature map using one of the r2pmml::as.fmap(..) utility functions.
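
To illustrate why the first column ends up as a double in the first place, here is a minimal base-R sketch. It does not call xgboost or r2pmml; `feature_names` is a hypothetical stand-in for `bst$feature_names` from the question (the two `mtcars` predictors used there). `seq_along()` returns integers, but subtracting the double literal `1` promotes the result to double, whereas the integer literal `1L` keeps it integer.

```r
# Hypothetical stand-in for bst$feature_names (columns 4 and 6 of mtcars).
feature_names <- c("hp", "wt")

# seq_along() yields an integer vector, but `1` is a double literal,
# so the subtraction promotes the result to double.
idx_double <- seq_along(feature_names) - 1

# `1L` is an integer literal, so the result stays integer.
idx_integer <- seq_along(feature_names) - 1L

typeof(idx_double)   # "double"
typeof(idx_integer)  # "integer"

# Same feature map layout as in the thread, with an integer first column
# from the start, so no later as.integer() conversion is needed.
fmap <- data.frame(idx_integer, feature_names, "q")
fmap[, 2] <- as.factor(fmap[, 2])
fmap[, 3] <- as.factor(fmap[, 3])

str(fmap)
```

This only covers the hand-built data frame; the `r2pmml::as.fmap(..)` route mentioned above is not shown, because its arguments are not spelled out in this thread.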

axemixer commented 3 years ago

Hi, thank you for the quick response.

With your suggestion applied, I get a different type of error, shown below. I found related issues for it as well, but their solutions did not work for my example.

java.lang.IllegalArgumentException: java.io.IOException: Expected 27-element array of zeroes, got [2, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0]
    at org.jpmml.rexp.XGBoostConverter.loadLearner(XGBoostConverter.java:244)
    at org.jpmml.rexp.XGBoostConverter.ensureLearner(XGBoostConverter.java:218)
    at org.jpmml.rexp.XGBoostConverter.encodeSchema(XGBoostConverter.java:80)
    at org.jpmml.rexp.ModelConverter.encodePMML(ModelConverter.java:70)
    at org.jpmml.rexp.Converter.encodePMML(Converter.java:39)
    at org.jpmml.rexp.Main.run(Main.java:149)
    at org.jpmml.rexp.Main.main(Main.java:97)
Caused by: java.io.IOException: Expected 27-element array of zeroes, got [2, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0]
    at org.jpmml.xgboost.XGBoostDataInput.readReserved(XGBoostDataInput.java:179)
    at org.jpmml.xgboost.Learner.load(Learner.java:82)
    at org.jpmml.xgboost.XGBoostUtil.loadLearner(XGBoostUtil.java:95)
    at org.jpmml.xgboost.XGBoostUtil.loadLearner(XGBoostUtil.java:57)
    at org.jpmml.xgboost.XGBoostUtil.loadLearner(XGBoostUtil.java:45)
    at org.jpmml.rexp.XGBoostConverter.loadLearner(XGBoostConverter.java:309)
    at org.jpmml.rexp.XGBoostConverter.loadLearner(XGBoostConverter.java:242)
    ... 6 more

Exception in thread "main" java.lang.IllegalArgumentException: java.io.IOException: Expected 27-element array of zeroes, got [2, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0]
    at org.jpmml.rexp.XGBoostConverter.loadLearner(XGBoostConverter.java:244)
    at org.jpmml.rexp.XGBoostConverter.ensureLearner(XGBoostConverter.java:218)
    at org.jpmml.rexp.XGBoostConverter.encodeSchema(XGBoostConverter.java:80)
    at org.jpmml.rexp.ModelConverter.encodePMML(ModelConverter.java:70)
    at org.jpmml.rexp.Converter.encodePMML(Converter.java:39)
    at org.jpmml.rexp.Main.run(Main.java:149)
    at org.jpmml.rexp.Main.main(Main.java:97)
Caused by: java.io.IOException: Expected 27-element array of zeroes, got [2, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0]
    at org.jpmml.xgboost.XGBoostDataInput.readReserved(XGBoostDataInput.java:179)
    at org.jpmml.xgboost.Learner.load(Learner.java:82)
    at org.jpmml.xgboost.XGBoostUtil.loadLearner(XGBoostUtil.java:95)
    at org.jpmml.xgboost.XGBoostUtil.loadLearner(XGBoostUtil.java:57)
    at org.jpmml.xgboost.XGBoostUtil.loadLearner(XGBoostUtil.java:45)
    at org.jpmml.rexp.XGBoostConverter.loadLearner(XGBoostConverter.java:309)
    at org.jpmml.rexp.XGBoostConverter.loadLearner(XGBoostConverter.java:242)
    ... 6 more

vruusmann commented 3 years ago

The example works with XGBoost 1.1.X (I tested with xgboost_1.1.0.1), but apparently not with 1.2.X, because they've rearranged the binary file format again.

Please follow the linked issue to be notified when XGBoost 1.2.X support lands. Should be fast.

vruusmann commented 3 years ago

@axemixer I've released R2PMML version 0.25.0, which includes an updated JPMML-XGBoost library.

After re-installing from GitHub, you should be able to work fine with xgboost_1.2.0.1 generated files.
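
For anyone unsure whether their setup is affected, a rough check along these lines may help. It is only a sketch built on the version numbers reported in this thread (xgboost 1.2.X model files needing r2pmml 0.25.0 or later); `needs_upgrade` is a hypothetical helper, not part of r2pmml, and it makes no claim about other version combinations.

```r
# Hypothetical helper, based only on the versions mentioned in this thread:
# model files from xgboost 1.2.X failed with r2pmml 0.24.2 and are
# supported as of r2pmml 0.25.0.
needs_upgrade <- function(r2pmml_version, xgboost_version) {
  package_version(xgboost_version) >= "1.2.0" &&
    package_version(r2pmml_version) < "0.25.0"
}

needs_upgrade("0.24.2", "1.2.0.1")  # TRUE  (the setup from the question)
needs_upgrade("0.25.0", "1.2.0.1")  # FALSE
needs_upgrade("0.24.2", "1.1.0.1")  # FALSE

# Apply the check to the locally installed packages, if both are present.
if (requireNamespace("r2pmml", quietly = TRUE) &&
    requireNamespace("xgboost", quietly = TRUE)) {
  installed_r2pmml <- as.character(packageVersion("r2pmml"))
  installed_xgboost <- as.character(packageVersion("xgboost"))
  if (needs_upgrade(installed_r2pmml, installed_xgboost)) {
    # The re-install itself is left to the user; one common way is
    # devtools::install_github("jpmml/r2pmml").
    message("r2pmml ", installed_r2pmml, " predates xgboost 1.2.X support; ",
            "consider re-installing from GitHub.")
  }
}
```

The helper takes version strings rather than reading them itself, so it can be tried out without either package installed.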

axemixer commented 3 years ago

Yes, it works perfectly. Thank you, now I do not have to downgrade to an older version of xgboost.