Closed: i3Jesterhead closed this issue 4 years ago
library(mlr)
learner = makeLearner("classif.xgboost", predict.type = "prob")
You're training XGBoost models using the mlr package (an abstraction layer). The mlr::train() function returns an mlr-specific mlr::WrappedModel object, not a generic xgboost::xgb.Booster object.
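A minimal sketch of the mismatch (assuming the mlr and xgboost packages are installed; the iris task is just an illustration):

```r
library(mlr)

# define the learner and a task, then train
learner <- makeLearner("classif.xgboost", predict.type = "prob")
task <- makeClassifTask(data = iris, target = "Species")
model <- train(learner, task)

class(model)                # an mlr "WrappedModel", not an xgb.Booster
class(model$learner.model)  # the enclosed native xgb.Booster
```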
The list of supported model classes is given in the README file of the JPMML-R library: https://github.com/jpmml/jpmml-r/blob/master/README.md#features
As you can see, the mlr package is not supported at the moment.
Maybe it's possible to extract the xgb.Booster object from the WrappedModel object and pass it to the r2pmml::r2pmml() function directly. Then again, I haven't studied the internals of the mlr package yet, and could be overly optimistic here.
Thank you for clearing that up! I will just use a generic xgboost::xgb.Booster object then.
Reopening - I have the mlr package on my TODO list; this issue will help to increase its priority.
I found a very convincing solution to the problem! With the function getLearnerModel(model, more.unwrap = TRUE) it's possible to extract the underlying xgb.Booster object from the WrappedModel. After that, converting to PMML was a piece of cake.
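A sketch of that workflow, assuming the learner and task from earlier in the thread, and the genFMap helper that r2pmml provided at the time:

```r
library(mlr)
library(r2pmml)

# train the wrapped model as before
model <- train(learner, task)

# unwrap all the way down to the native xgb.Booster object
booster <- getLearnerModel(model, more.unwrap = TRUE)

# build a feature map from the training features and convert to PMML
features <- getTaskData(task, target.extra = TRUE)$data
fmap <- genFMap(features)
r2pmml(booster, "model.pmml", fmap = fmap)
```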
The disadvantage of unwrapping with getLearnerModel(model, more.unwrap = TRUE) is that if predict.type is set to probability, we don't get the good:bad probability percentages; we get only either 0 or 1.
Is there any way to convert the mlr wrapped models to PMML?
@apremgeorge You can keep your original mlr::WrappedModel object as-is. If you want to convert the enclosed model object to PMML, then you should extract it into a separate temporary variable (instead of re-assigning the original variable).
Thanks for the reply.

library(pmml)
rf_mod
rf_pmml <- pmml(model = rf_mod)

The above code produces an error:

Error in UseMethod("pmml") : no applicable method for 'pmml' applied to an object of class "c('FilterModel', 'BaseWrapperModel', 'WrappedModel')"

So I use getLearnerModel. This works, but the extracted randomForest object gives only the response, not the truth, prob.0, prob.1 and response columns as given by rf_mod:

randomForest <- getLearnerModel(rf_mod, more.unwrap = TRUE)
rf_pmml <- pmml(model = randomForest)

Thanks for any help.
@apremgeorge This issue tracker is about the r2pmml package, not the pmml package. Please re-submit your issue somewhere else.
The issue is "no applicable method for object of class "c('FilterModel', 'BaseWrapperModel', 'WrappedModel')" in mlr wrapper
Thank you
This would indeed be nice to have. I would expect it to be straightforward in most cases, just extracting the mlr model's learner.model attribute. (Or, in mlr3, the model attribute.)
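For reference, a sketch of those two extractions (mlr_model and mlr3_learner are hypothetical trained objects, not code from this thread):

```r
# mlr (v2): the native model sits in the learner.model slot
# of the WrappedModel returned by train()
native_model <- mlr_model$learner.model

# mlr3: a trained Learner stores the native model in its model field
# (e.g. after mlr3_learner$train(task))
native_model3 <- mlr3_learner$model
```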
I wanted to point out one place where additional work would be needed. In mlr v2.16, or in mlr3, with an xgboost binary classifier, the labels get switched before fitting the xgboost model in order to properly generate metrics for early stopping: https://github.com/mlr-org/mlr/pull/2644 Extracting the underlying model and then using r2pmml therefore switches the final output probabilities.
E.g.,
library(mlr)
library(xgboost)

set.seed(314)
data("iris")

# make binary target
iris$Species <- as.integer(iris$Species)
iris$Species <- as.integer(abs(iris$Species - 2))

task <- makeClassifTask(data = iris, target = "Species")
xgb_learner <- makeLearner(
  'classif.xgboost',
  predict.type = 'prob',
  par.vals = list(
    objective = 'binary:logistic',
    eval_metric = 'auc',
    nrounds = 10
  )
)
mlr_model <- train(xgb_learner, task = task)
mlr_preds0 <- predictLearner(xgb_learner, mlr_model, iris[, names(iris) != 'Species'])
mlr_preds <- predict(mlr_model, task = task)
xgb_model <- mlr_model$learner.model
dmat <- xgb.DMatrix(data = as.matrix(iris[, names(iris) != 'Species']))
xgb_preds <- predict(xgb_model, dmat)
head(mlr_preds0)
head(mlr_preds$data)
head(xgb_preds)
# here the predictions are swapped. That persists if you convert to pmml:
xgb_fmap <- r2pmml::genFMap(iris[, names(iris) != 'Species'])
r2pmml::r2pmml(xgb_model, fmap = xgb_fmap, './r2pmml-xgb-test')
I've been wondering how R people train XGBoost models (Python people have excellent Scikit-Learn wrapper classes). It seems the mlr(3) package is rather fashionable these days.
This issue was fixed in https://github.com/jpmml/jpmml-r/commit/496248c908a2e8f0c4d39f26348ca40c91bcc524
There's an updated R2PMML package version 0.24.1 available on GitHub.
@bmreiniger The MLR+XGBoost example that you shared in https://github.com/jpmml/r2pmml/issues/46#issuecomment-589829633 works now; pay attention to the invert_levels decoration:
mlr_model <- train(xgb_learner, task = task)
xgb_fmap <- r2pmml::genFMap(iris[, names(iris) != 'Species'])
r2pmml(mlr_model, "iris.pmml", fmap = xgb_fmap)
r2pmml(mlr_model, "iris-inverted.pmml", invert_levels = TRUE, fmap = xgb_fmap)
@bmreiniger The above example is about a dataset that contains only continuous features. How do you approach a mix of continuous plus categorical features in the MLR package? I'd like to expand the MLR+XGBoost integration, but it would be easier if there were some pointers about how it's normally done.
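One pattern I have seen (a sketch, not necessarily the canonical mlr approach) is to one-hot encode factor columns with mlr::createDummyFeatures before building the task, since xgboost only accepts numeric input; the data frame below is purely illustrative:

```r
library(mlr)

# a toy dataset mixing continuous and categorical features
df <- data.frame(
  x1 = rnorm(6),
  x2 = factor(c("a", "b", "a", "c", "b", "c")),
  y  = factor(c("yes", "no", "yes", "no", "yes", "no"))
)

# expand factor features into 0/1 indicator columns, keeping the target intact
df_enc <- createDummyFeatures(df, target = "y")

task <- makeClassifTask(data = df_enc, target = "y")
```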
Hi @vruusmann,
I'm using the mlr package for fitting an xgboost model as described by i3Jesterhead above, but when I tried to apply the solution posted here, I realized that the function genFMap is no longer available in the package.
Is there any equivalent solution that works without this function?
Thanks very much for your help!
@LSym2 The function r2pmml::genFMap has been refactored into r2pmml::as.fmap (a generic function; specializations exist for the data.frame and matrix cases):
https://github.com/jpmml/r2pmml/blob/0.26.1/R/xgboost.R
More code examples here (not related to mlr, though): https://github.com/jpmml/jpmml-xgboost/blob/1.5.6/src/test/resources/xgboost.R
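In other words, the old genFMap call can be replaced one-for-one with as.fmap; a sketch, where xgb_model is a hypothetical previously trained xgb.Booster:

```r
library(r2pmml)

# build a feature map from the data.frame of training features
# (this replaces the earlier genFMap(...) call)
fmap <- as.fmap(iris[, names(iris) != "Species"])

r2pmml(xgb_model, "model.pmml", fmap = fmap)
```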
Hi @vruusmann,
OK, thanks very much, it works now!
Hi, I am trying to convert an xgboost classification model from the MLR library in R to a PMML file.
When trying to convert the trained model, I get the following error message.
Can you make any sense of the error message? The Java version should not be the problem, by the way.
Thanks in advance!