mlr-org / mlr3

mlr3: Machine Learning in R - next generation
https://mlr3.mlr-org.com
GNU Lesser General Public License v3.0

Extracting base model from learner generates different predictions to learner? #1192

Closed by ZekeMarshall 1 month ago

ZekeMarshall commented 1 month ago

Hi!

I'd like to extract the underlying model from an {mlr3} learner. However, the predictions generated from the extracted model are incorrect and do not match the predictions generated by the learner itself. Please see the reprex below.

# Load iris data
data(iris)

# Split into training and test data
train <- iris[3:nrow(iris),]
test <- subset(iris, select = -Species)[1:2,]

# Using e1071 package only (Correct)
model_iris <- e1071::svm(Species ~ ., data = train, probability = TRUE)
pred <- predict(model_iris, test, probability = TRUE)

# Using mlr3 wrapper ("classif.svm" is provided by mlr3learners, which must be loaded)
library(mlr3learners)
lrn <- mlr3::lrn("classif.svm", predict_type = "prob")
task <- mlr3::as_task_classif(train, target = "Species")
model_mlr3_iris <- lrn$train(task = task)

# Predict test data using mlr3 learner (Correct)
model_mlr3_iris$predict_newdata(test)

# Extract e1071 model
model_mlr3_iris_ex <- model_mlr3_iris$model

# Predict test data using e1071 model (Incorrect and does not match learner predictions)
predict(model_mlr3_iris_ex, test, probability = TRUE)

I think this is probably a case of user error, but I can't figure out what's going on!

Any help would be greatly appreciated and sorry if I've missed something obvious!

Best regards,

Zeke

sebffischer commented 1 month ago

You are missing some preprocessing steps that the learner applies to the test data before passing it to the model's predict function. You can check out the code here: https://github.com/mlr-org/mlr3learners/blob/b58ba3b0a529bb39ebfdaee3db81d3229b3f09ad/R/LearnerClassifSVM.R#L71-L84

ZekeMarshall commented 1 month ago

Hi @sebffischer, thanks for pointing this out! It was just a case of reordering the variables in the test data to match the order used for the support vectors. Thanks again!
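
For anyone landing here later, this is a minimal sketch of that reordering, not the exact code from mlr3learners. It assumes the learner was trained on the features in the order given by `task$feature_names`, and that this order can differ from the column order of the original data frame. The names `learner`, `feature_order` and `newdata` are just illustrative.

```r
library(mlr3)
library(mlr3learners)  # provides "classif.svm", which wraps e1071::svm

data(iris)
train <- iris[3:nrow(iris), ]
test <- subset(iris, select = -Species)[1:2, ]

task <- as_task_classif(train, target = "Species")
learner <- lrn("classif.svm", predict_type = "prob")
learner$train(task)

# Assumption: the extracted model expects the features in the task's order,
# which may not be the column order of `test`.
feature_order <- task$feature_names

# Reorder the test columns to that order and pass them as a numeric matrix
newdata <- as.matrix(test[, feature_order])

# Predict with the extracted e1071 model and with the mlr3 learner
pred_raw <- predict(learner$model, newdata, probability = TRUE)
pred_mlr3 <- learner$predict_newdata(test)

# Compare predicted classes and class probabilities
prob_raw <- attr(pred_raw, "probabilities")
same_class <- all.equal(as.character(pred_raw), as.character(pred_mlr3$response))
same_prob <- all.equal(
  prob_raw[, colnames(pred_mlr3$prob), drop = FALSE],
  pred_mlr3$prob,
  check.attributes = FALSE
)
```

If the assumption holds, `same_class` and `same_prob` should both be `TRUE`. Selecting the columns by name, rather than relying on position, also guards against the test data arriving in a different column order later on.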