Closed MislavSag closed 2 years ago
This does not really seem like an mlr3 bug to me @mllg
Bug in original R package or problem with data?
This yields the same error:
library(MASS)
data = as.data.frame(tsk("german_credit")$data())
lda(credit_risk ~ ., data = data)
There is a problem with the data after converting the data to a matrix and dummy encoding the factors. From MASS:::lda.formula():
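# simplified: lda.formula() builds the model frame from the call
m <- model.frame(credit_risk ~ ., data = data)
Terms <- attr(m, "terms")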
grouping <- model.response(m)
x <- model.matrix(Terms, m)
xint <- match("(Intercept)", colnames(x), nomatch = 0L)
if (xint > 0L)
x <- x[, -xint, drop = FALSE]
If you now look into the 44th column, grouped by credit risk, you see that there is no observation labeled with "1":
ftable(x[, 44] ~ data$credit_risk)
                 x[, 44]   0
data$credit_risk
good                     700
bad                      300
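The same kind of check can be automated for every column of the design matrix. The following is a minimal sketch on made-up toy data, not the german_credit task: the data frame `d`, its columns, and the helper variable names are all hypothetical. It assumes the constant column comes from a factor level that no observation uses, which is one common way to end up with an all-zero dummy column.

```r
# Toy data (hypothetical): factor `f` declares level "c", but no row uses it.
d <- data.frame(
  y = factor(rep(c("good", "bad"), each = 4)),
  f = factor(rep(c("a", "b"), times = 4), levels = c("a", "b", "c")),
  z = c(1.2, 0.7, 3.1, 2.4, 0.3, 1.9, 2.8, 0.5)
)

# Same steps as in the excerpt: design matrix without the intercept column.
m <- model.frame(y ~ ., data = d)
x <- model.matrix(attr(m, "terms"), m)
x <- x[, colnames(x) != "(Intercept)", drop = FALSE]

# A column is constant if it takes a single distinct value.
is_const <- apply(x, 2, function(col) length(unique(col)) == 1L)
const_cols <- names(which(is_const))
print(const_cols)

# Dropping unused factor levels removes the all-zero dummy column.
x_fixed <- model.matrix(y ~ ., data = droplevels(d))
print(colnames(x_fixed))
```

Whether `droplevels()` is enough for a given dataset depends on why the column is constant, so it is worth rerunning the check on the fixed matrix.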
You can control the tolerance for the singular matrix detection via the parameter tol, but having a constant feature will always result in an error if you try to fit an LDA.
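To see the failure in isolation, here is a small sketch that uses the built-in iris data instead of german_credit. The column `const` is a hypothetical constant feature added purely for illustration, and the call uses the default tol.

```r
library(MASS)

d <- iris
d$const <- 1  # hypothetical constant feature, added for illustration only

# Capture the error instead of aborting, so the message can be inspected.
res <- tryCatch(
  lda(Species ~ ., data = d),
  error = function(e) e
)

is_error <- inherits(res, "error")
if (is_error) message("lda() failed: ", conditionMessage(res))
```

Removing the constant column from `d` before the call lets the same model fit without complaint.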
FWIW, you can "repair" the learner via the robustify pipeline:
library(mlr3)
library(mlr3learners) # provides classif.lda
library(mlr3pipelines)
task = tsk("german_credit")
learner = as_learner(ppl("robustify") %>>% lrn("classif.lda"))
learner$param_set$values$encode.method = "treatment" # otherwise we get collinear features
learner$train(task)
learner$predict(task)
Thanks for the solution!
Expected Behaviour
classif.lda works as expected.
Actual Behaviour
It returns an error:
Reprex