biometrician / abe

An R package for Augmented Backward Elimination
GNU General Public License v3.0
3 stars 0 forks source link

#8: question regarding an error #8

Closed biometrician closed 1 year ago

biometrician commented 1 year ago

Maybe I do not see something obvious, but why does the second call (see below) to abe not work?

library(abe) library(Hmisc) library(shrink) library(survival)

data("GBSG", package = "shrink")

GBSG$enodes = exp(-0.12 * GBSG$posnodal) GBSG$prm.log2 <- log(GBSG$prm+1,2) GBSG$esm.log2 <- log(GBSG$esm+1,2) GBSG$tumgrad.f <- factor(GBSG$tumgrad, levels = c(2, 1, 3)) GBSG$age2<-GBSG$age**2

fit.q1 <- coxph(Surv(rfst, cens) ~ age+age2+factor(menostat)+ prm.log2 + esm.log2 + tumsize + enodes + tumgrad.f + strata(htreat), data = GBSG, x = TRUE, y = TRUE)

abe(fit.q1, GBSG, include = "htreat", active = "menostat", tau = 0.05, # this works exp.beta = TRUE, exact = TRUE, criterion = "alpha", alpha = 0.2, type.factor = "individual", verbose = FALSE)

abe(fit.q1, GBSG, include = "htreat", active = "menostat", tau = 0.05, # this does not work - The only difference is type.factor exp.beta = TRUE, exact = TRUE, criterion = "alpha", alpha = 0.2, type.factor = "factor", verbose = FALSE)

Error in dimnames(x) <- dn : length of 'dimnames' [2] not equal to array extent

rokblagus commented 1 year ago

I am quite sure that this is a bug, and I will try to fix it.

rokblagus commented 1 year ago

I think I have fixed this bug, but it required some major changes to the code, so it would be good to double-check. While correcting this bug I saw that it is possible to also adjust the code so that now we can enter polynomials of the form x+I(x^2)+I(x^3) directly in the formula which previously also did not work.

biometrician commented 1 year ago

Perfect. I will also check the code.

biometrician commented 1 year ago

Hi Rok,

Finally I have some time to work on abe again. I am working on the second example of the manuscript, where I present the option type.factor in the context of a Cox model.

I stumbled upon the above mentioned error message again.

**Error in dimnames(x) <- dn :
length of 'dimnames' [2] not equal to array extent**

When we had this issue before, I was working with a linear model. Is it possible that you did not fix the problem everywhere?

I have attached the code here:

# Load packages ---------------------------------------------------
library(shrink) # data set + shrinkage factors calculation (deepvein, shrink)
library(knitr) # for markdown (opts_chunk)
library(survival) # Cox model (coxph)
library(abe) # variable selection & model stability (abe, abe.resampling)
library(sandwich) # robust standard errors (sandwich)
library(lmtest) # robust standard errors (conftest)
library(doParallel) # parallel computing for abe.resampling

# Prepare parallel backend -------------------------------------------
# for information see e.g. https://cran.r-project.org/web/packages/doParallel/vignettes/gettingstartedParallel.pdf
#to run parallel computation on windows, using all but 2 cores

N_CORES <- detectCores()
cl <- makeCluster(N_CORES-2)
registerDoParallel(cl)
getDoParWorkers()

# Load data -------------------------------------------------------
data(deepvein)

levels(deepvein$sex) <- paste(".", levels(deepvein$sex), sep = "")
levels(deepvein$loc) <- paste(".", levels(deepvein$loc), sep = "")
levels(deepvein$fiimut) <- paste(".", levels(deepvein$fiimut), sep = "")
levels(deepvein$fvleid) <- paste(".", levels(deepvein$fvleid), sep = "")

# Set the reference level for all categorical variables------------
deepvein$sex <- relevel(deepvein$sex, ref = ".female")
deepvein$fiimut <- relevel(deepvein$fiimut, ref = ".absent")
deepvein$fvleid <- relevel(deepvein$fvleid, ref = ".absent")
deepvein$loc <- relevel(deepvein$loc, ref = ".PE")

pred <- c("sex", "age", "bmi", "log2ddim", "durther", "loc",  "fvleid", "fiimut")
pred_cont <- c("age",  "bmi", "log2ddim", "durther")

# global model
formula <-  paste("Surv(time, status) ~ cluster(pnr) +", paste(pred, collapse = "+"))
global_mod <- coxph(as.formula(formula), data = deepvein, x = TRUE)

# abe model
abe_mod <- abe(global_mod,
               data = deepvein,
               criterion = "AIC", 
               tau = 0.05,
               verbose = FALSE,
               type.factor="factor")
# This works:
stability1 <- 
  abe.resampling(global_mod, 
           data = deepvein, 
           criterion ="AIC", 
           tau = 0.05,
           type.resampling = "Wallisch2021",
           exact = TRUE,
           prop.sampling = 0.5,
           num.resamples = 10,     # 1000
           parallel = TRUE,
           seed = 8341,
           type.factor = "factor")

print(stability1)

# This does not work: the only difference if type.factor.
stability2 <- 
  abe.resampling(global_mod, 
           data = deepvein, 
           criterion ="AIC", 
           tau = 0.05,
           type.resampling = "Wallisch2021",
           exact = TRUE,
           prop.sampling = 0.5,
           num.resamples = 10,     # 1000
           parallel = TRUE,
           seed = 8340,
           type.factor = "individual")

print(stability2)

When you have some time, can you look on this?

Thanks a lot, Daniela

biometrician commented 1 year ago

Hi Rok,

please do not look at this issue at the moment. We found another problem with the print function for abe.resampling objects. We will solve this first. Maybe then the above mentioned issue does not exist any longer.

All the best, Daniela

rokblagus commented 1 year ago

I will wait, although I am quite sure that this is the same issue that we had before, and I am not expecting it could be fixed via print.

biometrician commented 1 year ago

Hi Rok,

I apologize for the wrong request yesterday. Apparently, I used a wrong branch in git and thus had old versions at some helper functions. I just double checked it again. Everything is working fine. :)

Thanks, Daniela