mlr-org / mlr3proba

Probabilistic Learning for mlr3
https://mlr3proba.mlr-org.com/
GNU Lesser General Public License v3.0

RCLL/distr6 bug #321

Closed: bblodfon closed this issue 1 year ago

bblodfon commented 1 year ago

I am almost certain this comes from distr6. I tried to narrow it down: it's an edge case where the test set contains exactly one censored observation, and the error occurs when we subset a Matdist object with an index that selects only that observation. See below:

library(mlr3verse)
#> Loading required package: mlr3
library(mlr3proba)

# veteran task
taskv = as_task_surv(x = survival::veteran, id = 'veteran',
  time = 'time', event = 'status')
poe = po('encode')
taskv = poe$train(list(taskv))[[1L]]
#taskv

set.seed(42)
rr = resample(task = taskv, learner = lrn('surv.coxph'),
  resampling = rsmp('cv', folds = 6))
#> INFO  [17:12:04.611] [mlr3] Applying learner 'surv.coxph' on task 'veteran' (iter 1/6)
#> INFO  [17:12:04.711] [mlr3] Applying learner 'surv.coxph' on task 'veteran' (iter 2/6)
#> INFO  [17:12:04.755] [mlr3] Applying learner 'surv.coxph' on task 'veteran' (iter 3/6)
#> INFO  [17:12:04.792] [mlr3] Applying learner 'surv.coxph' on task 'veteran' (iter 4/6)
#> INFO  [17:12:04.832] [mlr3] Applying learner 'surv.coxph' on task 'veteran' (iter 5/6)
#> INFO  [17:12:04.874] [mlr3] Applying learner 'surv.coxph' on task 'veteran' (iter 6/6)

rcll = msr('surv.rcll')
rr$score(rcll) # error
#> Error in UseMethod("as.Distribution"): no applicable method for 'as.Distribution' applied to an object of class "c('double', 'numeric')"

# RCLL code check
dt = as.data.table(rr)
prediction = dt$prediction[[1]]
# prediction$score(rcll) # error

out = rep(-99L, length(prediction$row_ids))
truth = prediction$truth
event = truth[, 2] == 1
event_times = truth[event, 1]
cens_times = truth[!event, 1]

# The failure occurs here:
!event # only one TRUE, i.e. a single censored observation
#>  [1] FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE  TRUE FALSE
#> [13] FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE
prediction$distr[!event] # <= THIS FAILS, 1 column to subset edge case?
#> Error in UseMethod("as.Distribution"): no applicable method for 'as.Distribution' applied to an object of class "c('double', 'numeric')"

Created on 2023-02-14 with reprex v2.0.2
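
The class in the error message (`c('double', 'numeric')`) suggests that something which should have stayed matrix-like was reduced to a plain numeric vector. One plausible cause, which this thread does not confirm, is base R's default of dropping dimensions when a matrix subset selects a single row. The sketch below illustrates only that base R behaviour. The matrix `surv` and the index `keep` are hypothetical stand-ins, not distr6 internals:

```r
# Hypothetical stand-in for a matrix of survival probabilities:
# rows = observations, columns = time points (NOT the distr6 internals).
surv = matrix(
  c(0.90, 0.70, 0.40,
    0.80, 0.50, 0.20,
    0.95, 0.60, 0.30),
  nrow = 3, byrow = TRUE,
  dimnames = list(NULL, c("10", "50", "100"))
)

# Logical index that selects exactly one row, like `!event` in the reprex
keep = c(FALSE, TRUE, FALSE)

dropped = surv[keep, ]               # default drop = TRUE: dimensions are lost
kept    = surv[keep, , drop = FALSE] # drop = FALSE: result is still a matrix

class(dropped)   # "numeric"
is.matrix(kept)  # TRUE
dim(kept)        # 1 3
```

If this is indeed what happens, passing `drop = FALSE` wherever the underlying matrix is subset would keep the single-row case consistent with the multi-row case.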

RaphaelS1 commented 1 year ago

Thanks! https://github.com/alan-turing-institute/distr6/pull/284