jpmml / jpmml-r

Java library and command-line application for converting R models to PMML
GNU Affero General Public License v3.0

Dropped Accuracy #2

Closed: monkshow92 closed this issue 7 years ago

monkshow92 commented 7 years ago

Hi! I was able to export an XGBoost model from R to PMML format using r2pmml. However, when I imported it into Java, the accuracy dropped: I get around 54% in R, but only around 30% with the PMML model. Do you know what could be causing this change? Thanks in advance!

vruusmann commented 7 years ago

What exactly do you mean by accuracy - wrong predictions, loss of precision, or something else? What is your objective function?

Most likely, you have provided an incorrect "feature map" definition file for the conversion engine. For example, you've used "q" column type in R and "int" column type in PMML, or vice versa.

You can see examples of correct workflows here: https://github.com/jpmml/jpmml-r/blob/master/src/test/R/xgboost.R
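As a rough illustration of the column type point above, here is a minimal base-R sketch that derives each feature map type from the column's actual storage type instead of hard-coding one value for every column. The `train` data frame, its column names, and the `feature_type` helper are hypothetical; the only assumption about the file itself is XGBoost's usual tab-separated `id`, `name`, `type` layout with ids starting at 0.

```r
# Hypothetical training frame: the first two columns (a row id and the target)
# are not features, so they are dropped before building the feature map.
train <- data.frame(
  row_id = 1:4,
  target = c(0, 1, 2, 1),
  age    = c(23.5, 41.0, 35.2, 52.9),  # continuous values -> "q"
  visits = c(1L, 4L, 2L, 7L)           # integer values    -> "int"
)
features <- train[, -(1:2), drop = FALSE]

# Hypothetical helper: choose the feature map type from the R storage type.
feature_type <- function(x) {
  if (is.integer(x)) "int" else "q"
}

fmap <- data.frame(
  id   = seq_along(features) - 1L,     # feature ids are 0-based
  name = names(features),
  type = unname(vapply(features, feature_type, character(1))),
  stringsAsFactors = FALSE
)

# Write one "<id>\t<name>\t<type>" line per feature, with no header or quotes.
fmap_path <- tempfile(fileext = ".fmap")
write.table(fmap, file = fmap_path, sep = "\t", quote = FALSE,
            row.names = FALSE, col.names = FALSE)
```

Reading the file back with `readLines(fmap_path)` is a quick way to confirm that the ids, names, and types line up with the columns of the matrix the model was actually trained on.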

monkshow92 commented 7 years ago

By accuracy I mean not only wrong predictions, but also that the predictions are far from the target's distribution. I have 6 classes and my majority class is 2, but with PMML the most frequently predicted class is 4. I'm using "multi:softprob" as the objective, and I generated the feature map like this:

    fmap <- data.frame(
      "id" = seq(from = 0, to = ncol(train) - 3),
      "name" = names(train)[-(1:2)],
      "type" = rep("q", ncol(train) - 2)
    )

I'll try that example to see if it fixes the issue. Thanks!

monkshow92 commented 7 years ago

Hi vruusmann! I tried that example and got the same results :( However, building the XGBoost model in Python gave me no problems! 👍 Therefore, I'll work with the Python implementation of XGBoost. Thanks!

vruusmann commented 7 years ago

> However, building the xgboost on Python gave me no problems!

Both R2PMML and SkLearn2PMML use the JPMML-XGBoost library for actual conversion work. So, if the R part wasn't working for you (but the Python part is working OK), then the error must be related to your R application code.