harrelfe / rms

Regression Modeling Strategies
https://hbiostat.org/R/rms

datadist cannot be found in the function environment but can be found in the global environment. #119

Open lizhiwei1994 opened 1 year ago

lizhiwei1994 commented 1 year ago

Context

I have a custom function, myfun1, that fits a Cox model. Before fitting, it has to prepare the data by running two lines of code: dd = datadist(data) and options(datadist = 'dd').

When dd exists only in the environment inside the function, myfun1 fails with an error.

But when I assign dd to the global environment with <<-, as in myfun2, the function works fine.

Question

Why does this happen?

How can I get myfun1 to run properly while keeping dd inside the function?
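My guess, sketched below in base R only, is that this is a scoping issue: a function that looks up an object by name does not see variables that are local to whichever function called it. Here lookup_by_name is a hypothetical stand-in for the lookup that I assume happens inside cph(); it is not rms code, and the real lookup may work differently.

```r
# Hypothetical stand-in for a package function that finds an object by name.
# exists()/get() search this function's own frame and then its enclosing
# environments (ending at the global environment and the search path),
# not the frame of whichever function happened to call it.
lookup_by_name <- function(name) {
  if (!exists(name)) stop("dataset ", name, " not found")
  get(name)
}

caller_local <- function() {
  dd_local <- "defined inside caller_local"  # visible only inside this call
  lookup_by_name("dd_local")
}

caller_global <- function() {
  dd_global <<- "assigned with <<-"          # ends up in the global environment
  lookup_by_name("dd_global")
}

# The local version fails, the global version succeeds.
res_local  <- tryCatch(caller_local(), error = function(e) conditionMessage(e))
res_global <- caller_global()
print(res_local)
print(res_global)
```

If rms behaves like this stand-in, that would explain why myfun1 fails while myfun2 works.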

Reproducible code

library(survival)
library(rms)
data(cancer)

myfun1 <- function(data, x){

  x = rlang::sym(x)  # sym() is from rlang, which is not attached above

  dd = datadist(data)
  options(datadist = 'dd')

  fit = rlang::inject(cph(Surv(time, status) ~ rcs(!!x), data = data))

  fit
}

myfun1(data = lung, x = 'meal.cal')

# Error in Design(data, formula, specials = c("strat", "strata")) : 
#   dataset dd not found for options(datadist=)

myfun2 <- function(data, x){

  x = rlang::sym(x)  # sym() is from rlang, which is not attached above

  dd <<- datadist(data) # Changed here compared to myfun1
  options(datadist = 'dd')

  fit = rlang::inject(cph(Surv(time, status) ~ rcs(!!x), data = data))

  fit
}

myfun2(data = lung, x = 'meal.cal')

# Frequencies of Missing Values Due to Each Variable
# Surv(time, status)           meal.cal 
#                  0                 47 
# 
# Cox Proportional Hazards Model
# 
# cph(formula = Surv(time, status) ~ rcs(meal.cal), data = data)
# 
# 
#                         Model Tests    Discrimination    
#                                               Indexes    
# Obs        181    LR chi2      0.72    R2       0.004    
# Events     134    d.f.            4    R2(4,181)0.000    
# Center -0.3714    Pr(> chi2) 0.9485    R2(4,134)0.000    
#                   Score chi2   0.76    Dxy      0.048    
#                   Pr(> chi2) 0.9443
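
A drawback of myfun2 is that dd stays in the global environment after the call. One possible refinement, as a sketch only, is to make the object global just for the duration of the fit and clean up afterwards. with_temp_global is a hypothetical helper written in base R; it is not part of rms, and the commented rms usage at the end is an untested assumption.

```r
# Hypothetical helper: expose `value` in the global environment under `name`
# while `code` is evaluated, then restore whatever was there before.
with_temp_global <- function(name, value, code) {
  genv <- globalenv()
  had_old <- exists(name, envir = genv, inherits = FALSE)
  if (had_old) old <- get(name, envir = genv, inherits = FALSE)
  assign(name, value, envir = genv)
  on.exit({
    if (had_old) assign(name, old, envir = genv)
    else rm(list = name, envir = genv)
  }, add = TRUE)
  force(code)  # the lazy argument is evaluated here, while the global exists
}

# Base-R demonstration: the object is visible during the call, gone after it.
seen <- with_temp_global("dd_demo", 42,
                         exists("dd_demo", envir = globalenv()))
left_behind <- exists("dd_demo", envir = globalenv(), inherits = FALSE)
print(seen)
print(left_behind)

# Possible use inside myfun1 (assumption, not run here; requires rms):
#   options(datadist = 'dd')
#   fit = with_temp_global('dd', datadist(data),
#     rlang::inject(cph(Surv(time, status) ~ rcs(!!x), data = data)))
```

This keeps the workaround from myfun2 but avoids overwriting or leaking a global dd once the function returns.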