kathoffman closed this issue 2 years ago
This is only an issue in the sl3 branches and is a bug from the port of SuperLearner to sl3. A variation check exists for the outcome regressions: if there is no variation in the outcome, only an intercept-only model is passed to the Super Learner. This check was not migrated properly from SuperLearner to sl3:
check_variation <- function(outcome, learners) {
  if (sd(outcome) < .Machine$double.eps) {
    return("SL.mean")
  }
  return(learners)
}
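For anyone reading along, here is a rough sketch of a more defensive version of the same idea. It is not the code used in lmtp; the name check_variation_safe and the fallback argument are made up for illustration. It counts distinct observed values instead of calling sd(), since sd() returns NA when the outcome has missing values or fewer than two observations, and an NA condition makes `if` throw an error.

```r
# Hypothetical variant of the variation check (not part of lmtp or sl3).
check_variation_safe <- function(outcome, learners, fallback = "SL.mean") {
  # Ignore missing outcomes, e.g. censored observations
  observed <- outcome[!is.na(outcome)]
  # Fewer than two distinct values means there is nothing to model
  # beyond an intercept, so use the fallback learner only
  if (length(unique(observed)) < 2) {
    return(fallback)
  }
  learners
}

# Constant outcome: only the intercept-only learner is returned
check_variation_safe(rep(0, 10), c("SL.glm", "SL.glmnet"))

# Outcome with variation: the requested learners are kept
check_variation_safe(c(0, 1, 0, 1), c("SL.glm", "SL.glmnet"))
```

Note that this only catches a completely constant outcome. A single event in the data still passes the check, which is the situation where some learners can fail on their own internal checks.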
Fixed in sl3-devel
THANK U
Ok, so some learners (e.g., Lrnr_glmnet) still seem to fail if there's a rare outcome. You can see the error if you add a line setting one of the Y.6 values back to 1 and test it with Lrnr_glmnet.
library(lmtp)
library(tidyverse)
library(sl3)

# modify example data so no one new gets the outcome at last time point
sim_point_surv_constant <-
  sim_point_surv %>%
  mutate(Y.6 = case_when(Y.6 == 1 & Y.5 == 0 ~ 0,
                         TRUE ~ Y.6))

# set a single Y.6 back to 1 so the outcome is rare rather than absent
sim_point_surv_constant[309, "Y.6"] <- 1

# Code modified from Example 5.1
a <- "trt"
y <- paste0("Y.", 1:6)
cens <- paste0("C.", 0:5)
baseline <- c("W1", "W2")

progressr::with_progress({
  psi5.1 <- lmtp_tmle(sim_point_surv_constant, a, y, baseline, cens = cens,
                      shift = static_binary_on, folds = 2,
                      outcome_type = "survival",
                      learners_outcome = sl3::make_learner("Lrnr_glmnet"),
                      learners_trt = sl3::make_learner("Lrnr_glmnet"))
})
The error message I'm getting is:
" |====================================== | 75%Error in private$.train(subsetted_task, trained_sublearners) :
All learners in stack have failed
Error in self$compute_step() :
Error in private$.train(subsetted_task, trained_sublearners) :
All learners in stack have failed
Failed on Stack
Warning message:
In private$.train(subsetted_task, trained_sublearners) :
Lrnr_glmnet_NULL_deviance_10_1_100_TRUE failed with message: Error in lognet(x, is.sparse, ix, jx, y, weights, offset, alpha, nobs, : one multinomial or binomial class has 1 or 0 observations; not allowed
. It will be removed from the stack"
@hoffmakl This is an issue with glmnet's internal checks. The way to prevent lmtp from failing is to include additional base learners that won't fail in this situation. For example, if you include Lrnr_glm in the learner stacks, the procedure will succeed with warnings that Lrnr_glmnet failed in some instances and was given weight zero.
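A minimal sketch of what such a stack could look like, assuming sl3 and glmnet are installed. Stack, Lrnr_glm, and Lrnr_glmnet are sl3 classes; whether the sl3 branch of lmtp accepts a Stack object directly for learners_outcome and learners_trt is an assumption here, so check the argument documentation for the version you have installed.

```r
library(sl3)

# Combine a learner that tolerates a rare outcome (Lrnr_glm) with the one
# that may fail glmnet's internal class-count check (Lrnr_glmnet).
lrnrs <- Stack$new(Lrnr_glm$new(), Lrnr_glmnet$new())

# Assumption: the stack is then supplied in place of the single learner, e.g.
#   learners_outcome = lrnrs, learners_trt = lrnrs
lrnrs
```

With both learners in the stack, a failure of Lrnr_glmnet on a particular task leaves Lrnr_glm available, so the stack as a whole no longer fails.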
I'm running into issues when I have a rare or non-occurring outcome at a certain time point. I think it'd be helpful if lmtp would automatically recognize when there are no new outcomes to predict, and would not try to run a learner that will fail. I am currently using lmtp_0.9.1.5001 and sl3_1.4.2.
This is a reprex where I run into issues. I've changed the last time point Y.6 from the sim_point_surv data to be 0 instead of 1 for all rows where a new outcome has occurred (defined as Y.6==1 and Y.5==0). lmtp_tmle fails at 50% with the error message "subscript out of bounds". It does this for the default library of "Lrnr_glm" and for any other learners I try, such as "Lrnr_mean".
If possible, could this be fixed when 1) an outcome doesn't occur at all, like this example, 2) an outcome doesn't occur within CV superlearning folds, and 3) an outcome doesn't occur within cross-fitting folds?
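To make cases 2) and 3) concrete, here is a rough base R sketch of a pre-check that flags folds whose training rows have no variation in the outcome. The helper folds_without_variation and the toy data are made up for illustration and are not part of lmtp or sl3. It assumes one fold id per row and that a fold's training set is every row outside that fold.

```r
# Hypothetical helper (not part of lmtp or sl3): return the ids of folds
# whose training rows contain fewer than two distinct observed outcomes.
folds_without_variation <- function(outcome, fold_id) {
  stopifnot(length(outcome) == length(fold_id))
  folds <- sort(unique(fold_id))
  constant <- vapply(folds, function(k) {
    train <- outcome[fold_id != k]            # training rows when fold k is held out
    length(unique(train[!is.na(train)])) < 2  # TRUE if the outcome is constant
  }, logical(1))
  folds[constant]
}

# Toy data: the only event is in fold 2, so holding out fold 2 leaves a
# training set made up entirely of zeros.
y    <- c(0, 0, 0, 1, 0, 0)
fold <- c(1, 1, 2, 2, 3, 3)
folds_without_variation(y, fold)
```

Any fold returned by a check like this is one where the outcome regression would need an intercept-only fallback, whether the folds come from cross-fitting or from the inner Super Learner cross-validation.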