markvdwiel / GRridge

R package for better prediction by use of co-data: Adaptive group-regularized ridge regression

grridge model fitting and predicting for survival data fails #2

Closed HerrMo closed 5 years ago

HerrMo commented 5 years ago

Hi,

I'm trying to get grridge working for survival data and have run into problems both when fitting the model and when predicting new observations. Reproducible examples are below. Examples 1 and 2 use exactly the same code to fit a GRridge model; only the seed changes. Example 1 works, but Example 2 fails with an error. In Example 3, the model is fit with the same code as in Example 1 (except that standardizeX = FALSE), but predicting new observations then fails.

library(GRridge)
library(prodlim)
library(survival)  # provides Surv(), used in the fits below

# Example 1: fitting works --------------------------------------------------------

set.seed(4) 
n_col <- 200

# random design matrix 
X <- cbind(matrix(nrow=40, ncol=n_col, data=rnorm(40*n_col)),
           matrix(nrow=40, ncol=30, data=rnorm(40*30, mean=1, sd=2)),
           matrix(nrow=40, ncol=100, data=rnorm(40*100, mean=2, sd=3)))

# grouping
blocks <- rep(1:3, times=c(n_col, 30, 100))
blocks <- lapply(1:3, function(x) which(blocks==x))

# survival outcome
ysurv <- prodlim::SimSurv(40)[, c(3,7)]

# split in test and train set
n <- sample.int(nrow(X), 2/3*nrow(X))

train_x <- t(X[n, ])
test_x <- t(X[-n, ])

# fit 
tt <- GRridge::grridge(train_x,
                       Surv(ysurv[n, 1], ysurv[n, 2]),
                       partitions = blocks,
                       standardizeX = TRUE,
                       innfold = 5,
                       selectionEN = TRUE)

# Example 2: fitting does not work ----------------------------------------------------------

# same code as above except for seed
rm(list = ls())

set.seed(16) 

n_col <- 200

# random design matrix 
X <- cbind(matrix(nrow=40, ncol=n_col, data=rnorm(40*n_col)),
           matrix(nrow=40, ncol=30, data=rnorm(40*30, mean=1, sd=2)),
           matrix(nrow=40, ncol=100, data=rnorm(40*100, mean=2, sd=3)))

# grouping
blocks <- rep(1:3, times=c(n_col, 30, 100))
blocks <- lapply(1:3, function(x) which(blocks==x))

# survival outcome
ysurv <- prodlim::SimSurv(40)[, c(3,7)]

# split in test and train set
n <- sample.int(nrow(X), 2/3*nrow(X))

train_x <- t(X[n, ])
test_x <- t(X[-n, ])

# fit 
tt <- GRridge::grridge(train_x,
                       Surv(ysurv[n, 1], ysurv[n, 2]),
                       partitions = blocks,
                       standardizeX = TRUE,
                       innfold = 5,
                       selectionEN = TRUE)

Error in qr.coef(qr(a, LAPACK = TRUE), b) : 'qr' and 'y' must have the same number of rows
Error: Matrix inversion failed. Please increase lambda1 and/or lambda2

# Example 3: prediction fails ------------------------------------------------------------

# Code is the same as in Example 1 except 'standardizeX'-argument and prediction 
rm(list = ls())

set.seed(4) 

n_col <- 200

# random design matrix 
X <- cbind(matrix(nrow=40, ncol=n_col, data=rnorm(40*n_col)),
           matrix(nrow=40, ncol=30, data=rnorm(40*30, mean=1, sd=2)),
           matrix(nrow=40, ncol=100, data=rnorm(40*100, mean=2, sd=3)))

# grouping
blocks <- rep(1:3, times=c(n_col, 30, 100))
blocks <- lapply(1:3, function(x) which(blocks==x))

# survival outcome
ysurv <- prodlim::SimSurv(40)[, c(3,7)]

# split in test and train set
n <- sample.int(nrow(X), 2/3*nrow(X))

train_x <- t(X[n, ])
test_x <- t(X[-n, ])

# fit 
tt <- GRridge::grridge(train_x,
                       Surv(ysurv[n, 1], ysurv[n, 2]),
                       partitions = blocks,
                       standardizeX = FALSE, 
                       innfold = 5,
                       selectionEN = TRUE)

prd_tt <- predict.grridge(tt, test_x)

Error in dimnames(x) <- dn : length of 'dimnames' [2] not equal to array extent

The data in the examples are generated randomly for the sake of reproducibility, but I encounter the same problems when I use real data.

Any help would be highly appreciated. Thanks.
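
Since Examples 1 and 2 differ only in the seed, one way to see which seeds trigger the failure is to wrap the fit in tryCatch and loop over several seeds. The following is a minimal base-R sketch; `scan_seeds` and `toy_fit` are hypothetical names introduced here for illustration and are not part of GRridge. `toy_fit` is only a stand-in that fails for even seeds so that the sketch runs without the package.

```r
# Hypothetical helper: call fit_fun(seed) for each seed and record whether it
# raised an error, without stopping at the first failure.
scan_seeds <- function(seeds, fit_fun) {
  rows <- lapply(seeds, function(s) {
    msg <- tryCatch({
      fit_fun(s)
      NA_character_  # no error for this seed
    }, error = function(e) conditionMessage(e))
    data.frame(seed = s, ok = is.na(msg), message = msg,
               stringsAsFactors = FALSE)
  })
  do.call(rbind, rows)
}

# Stand-in for the real fit (assumption, for illustration only): it fails for
# even seeds so that both outcomes appear in the result.
toy_fit <- function(seed) {
  set.seed(seed)
  if (seed %% 2 == 0) stop("Matrix inversion failed (toy error)")
  rnorm(1)
}

res <- scan_seeds(1:4, toy_fit)
print(res)
```

To apply this to the examples above, `fit_fun` would set the seed, regenerate X, ysurv and the train/test split as in Example 1, and then call GRridge::grridge with the same arguments.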

markvdwiel commented 5 years ago

Thanks for your interest. The first issue was due to an infinite penalty; the second was a genuine error. Both issues should now be resolved. Please re-install GRridge using:

library(devtools)
install_github("markvdwiel/GRridge")
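
Before re-running the prediction from Example 3, it may help to rule out a layout mismatch between the training data, the new data and the partitions. The sketch below assumes the layout used in the examples (variables in rows, samples in columns, as produced by t(X[n, ])) and a single partition given as a list of index vectors. `check_layout` is a hypothetical helper, not a GRridge function, and it does not address the package error itself.

```r
# Hypothetical helper (not part of GRridge): verify that training data, new
# data and the partition agree, assuming variables are in rows.
check_layout <- function(train_x, test_x, partitions) {
  p <- nrow(train_x)
  idx <- unlist(partitions)
  stopifnot(
    is.matrix(train_x), is.matrix(test_x),
    nrow(test_x) == p,              # same variables in both sets
    all(idx >= 1L), all(idx <= p)   # partition indices point at existing rows
  )
  invisible(TRUE)
}

# Data with the same shape as in the examples: 40 samples, 200 + 30 + 100 variables.
set.seed(1)
X <- matrix(rnorm(40 * 330), nrow = 40)
blocks <- rep(1:3, times = c(200, 30, 100))
blocks <- lapply(1:3, function(g) which(blocks == g))

n <- sample.int(nrow(X), 26)  # 2/3 * 40 truncates to 26 training samples
train_x <- t(X[n, ])
test_x <- t(X[-n, ])

layout_ok <- check_layout(train_x, test_x, blocks)
dim(train_x)  # 330 26
dim(test_x)   # 330 14
```

If the check passes and predict.grridge still fails with the re-installed package, the problem is unlikely to be the input layout.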

HerrMo commented 5 years ago

Thank you very much, it is working now!