mlr-org / mlr3learners

Recommended learners for mlr3
https://mlr3learners.mlr-org.com
GNU Lesser General Public License v3.0

CRAN version of kknn learner segfaults / predicts unreliably if k >= nrow(task) #191

Closed. mb706 closed this issue 3 years ago

mb706 commented 3 years ago
replicate(100, kknn::kknn(speed ~ dist, cars[1:3, ], cars[1:3, ], k = 7)$fitted.values)

This call segfaults. Maybe we should check in our learner that k < nrow(task) and stop() otherwise.
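A minimal sketch of such a guard, assuming it would run before the call to kknn::kknn(). The helper name `check_k` and its arguments are hypothetical and not part of mlr3learners or kknn; the sketch only illustrates the proposed condition.

```r
# Hypothetical helper (not part of mlr3learners): refuse to fit when k is
# not smaller than the number of training rows.
check_k <- function(k, n_train) {
  if (k >= n_train) {
    stop(sprintf(
      "k = %s must be smaller than the number of training rows (%s)",
      k, n_train
    ))
  }
  invisible(k)
}

train <- cars[1:3, ]

# k = 2 < 3 rows: passes silently
check_k(2, nrow(train))

# k = 7 >= 3 rows: raises an R error instead of reaching kknn
msg <- tryCatch(check_k(7, nrow(train)), error = function(e) conditionMessage(e))
print(msg)
```

Raising an ordinary R error here would also let the fallback learner take over under encapsulation, rather than crashing the session.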

I know we don't want to fix problems in other packages, but this

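library(mlr3learners)  # registers "regr.kknn" in the mlr3 learner dictionary
lr <- mlr3::lrn("regr.kknn", fallback = mlr3::lrn("regr.featureless"), encapsulate = c(train = "evaluate", predict = "evaluate"))
replicate(100, {
  set.seed(1)
  # train on 6 rows only (fewer than kknn's default k = 7), predict on the full task
  train_task <- mlr3::tsk("boston_housing")$select(c("age", "b"))$filter(1:6)
  test_task <- mlr3::tsk("boston_housing")$select(c("age", "b"))
  length(unique(lr$train(train_task)$predict(test_task)$data$response))
})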

is not deterministic, which is probably breaking some experiments in our group.

If the CRAN version of kknn is updated to the GitHub version, this would be solved as well (https://github.com/KlausVigo/kknn/issues/25).

jakob-r commented 3 years ago

Looks like something is going to happen upstream, so should we just sit it out? https://github.com/KlausVigo/kknn/commit/3becbeb277627ccb4d9e1dc96fb9c3a4d4af5933