kogalur / randomForestSRC

DOCUMENTATION:
https://www.randomforestsrc.org/
GNU General Public License v3.0

Behavior of predict function in recent versions vs previous versions #228

Closed kgilpas closed 2 years ago

kgilpas commented 2 years ago

Hello,

I wanted to ask whether the output of the predict function has changed since earlier versions (<= 2.9.3). In the following code I fit a competing risk model with the latest version of randomForestSRC and, in the last line, call predict. The output columns only go up to 150.CIF.3, i.e. 150 per competing risk. I recall that in older versions (<= 2.9.3) I would see 720.CIF.3 and higher. Is there a way to output more than 150 columns as before, and what do these numbers (1.CIF.3, 2.CIF.3, ..., 150.CIF.3) represent?

Thank you

```r
library(randomForestSRC)
library(tictoc)
library(dplyr)

# Create data frame
set.seed(2022)
df <- data.frame(
  lifetime = sample(c(0:2400), size = 7400, replace = TRUE),
  target   = sample(c(0:3), size = 7400, replace = TRUE),
  x1  = sample(c(0:6400), size = 7400, replace = TRUE),
  x2  = sample(c(0:24700), size = 7400, replace = TRUE),
  x3  = sample(c(0:12400), size = 7400, replace = TRUE),
  x4  = sample(c(0:1800), size = 7400, replace = TRUE),
  x5  = sample(as.factor(LETTERS[1:8]), size = 7400, replace = TRUE),
  x6  = sample(c(0:2000), size = 7400, replace = TRUE),
  x7  = sample(as.factor(LETTERS[1:6]), size = 7400, replace = TRUE),
  x8  = sample(c(0:1500), size = 7400, replace = TRUE),
  x9  = sample(as.factor(LETTERS[1:10]), size = 7400, replace = TRUE),
  x10 = sample(c(0:3500), size = 7400, replace = TRUE)
)

# Add missing values
df$x1[sample(c(0:7400), size = 120)] <- NA
df$x3[sample(c(0:7400), size = 120)] <- NA
df$x10[sample(c(0:7400), size = 500)] <- NA
df$x9[sample(c(0:7400), size = 5200)] <- NA

# Run competing risk model
tic()
hsps.obj <- rfsrc(Surv(lifetime, target) ~ x1 + x2 + x3 + x4 + x5 + x6 + x7 + x8 + x9 + x10,
                  data = df,
                  nsplit = 2,
                  nodesize = 2,
                  mtry = 20,
                  ntree = 1500,
                  na.action = 'na.impute',
                  importance = 'none',
                  save.memory = TRUE)
toc()
hsps.obj

pred <- predict(hsps.obj, df, na.action = 'na.impute')
pred$cif %>% as.data.frame() %>% View()
```
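The column labels in the question (1.CIF.3, ..., 150.CIF.3) look like what base R produces when a three-dimensional array is passed to `as.data.frame()`. The sketch below uses a small stand-in array, not a fitted forest, and assumes that `pred$cif` is laid out as cases x time points x event types, with only the third dimension named (`CIF.1`, `CIF.2`, `CIF.3`). Under that assumption, the leading number in each column name is the index of a time point on the grid and the suffix identifies the competing event.

```r
# Stand-in for pred$cif (assumed layout: cases x time points x event types).
# The sizes and values here are made up purely for illustration.
n_cases <- 2
n_times <- 4
cif <- array(
  seq_len(n_cases * n_times * 3) / 100,
  dim = c(n_cases, n_times, 3),
  dimnames = list(NULL, NULL, c("CIF.1", "CIF.2", "CIF.3"))
)

# as.data.frame() flattens the 2nd and 3rd dimensions into columns.
# The unnamed time dimension is labelled by its index, so column names
# take the form <time index>.<event label>.
flat <- as.data.frame(cif)
print(names(flat))

# Each flattened column is one slice of the original array:
# column "2.CIF.3" holds cif[, 2, 3] (2nd time point, 3rd event type).
print(all.equal(flat[["2.CIF.3"]], cif[, 2, 3]))
```

If this reading is right, seeing 150 columns per event would mean the ensemble was evaluated on 150 time points. To my recollection, recent versions of `rfsrc` take an `ntime` argument that controls this grid, and the fitted object's `time.interest` element holds the actual times behind each index; both are worth confirming against the package documentation for the installed version.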