sambrilleman / rstanarm

rstanarm R package for Bayesian applied regression modeling
http://mc-stan.org/interfaces/rstanarm.html
GNU General Public License v3.0
0 stars 1 forks source link

Errors when estimating model with binary biomarkers #62

Closed jburos closed 7 years ago

jburos commented 7 years ago

Summary:

Running into the following error when estimating a model with binary longitudinal biomarkers:

Error in object$family[[m]]$linkinv(y_eta[[m]]) : 
  Argument eta must be a nonempty numeric vector

This error occurs at the end of the MCMC sampling, and seems to occur whether or not the model samples well.

Reproducible Steps:

For example, testing on simulated data:

library(rstanarm)
simdat <- rstanarm::simjm(M = 1, 
                          betaLong_intercept = log(0.1), 
                          betaLong_binary = -0.6,         
                          betaLong_continuous = -0.1,
                          betaLong_slope = 0.1,
                          b_sd = c(1, 0.05),
                          betaEvent_intercept = -9,
                          betaEvent_assoc = 0.05, 
                          family = binomial())

dataLong <- simdat %>% 
  dplyr::select(id, Z1, Z2, tij, starts_with('Yij'))

dataEvent <- simdat %>%
  dplyr::select(id, Z1, Z2, event, eventtime) %>%
  dplyr::distinct(id, .keep_all = TRUE)

stan_jm_fit <- stan_jm(formulaLong = Yij_1 ~ Z1 + Z2 + tij + (1 + tij | id),
                       dataLong = dataLong,
                       family = list(binomial),
                       time_var = 'tij',
                       formulaEvent = Surv(eventtime, event) ~ Z1 + Z2,
                       dataEvent = dataEvent,
                       assoc = 'etavalue',
                       init = 'random'
                       )

RStanARM Version:

Here still using my fork of rstanarm, as follows:

devtools::install_github('jburos/rstanarm', ref = 'fix-posterior-predict-newdata', args = '--preclean', local = TRUE)

I will test on a clean install of develop2 branch & post results here.

R Version:

> getRversion()
[1] ‘3.3.3’
jburos commented 7 years ago

Problem appears to occur on lines 66-67 of stanjm.R, which read:

y_eta <- lapply(1:M, function(m) linear_predictor.default(y_coefs[[m]], object$x[[m]], object$offset))
y_mu <- lapply(1:M, function(m) object$family[[m]]$linkinv(y_eta[[m]]))

At this point in the code, the value of object$x[[m]] is NULL; thus y_eta is a char(0)

This appears to be the same value that returns for all stan_jm fits, including guassian family for example. But in this case the linkinv function on the 0-valued y_eta is problematic.

For now I'm planning to skip this & replace y_mu with a NULL value in order to avoid this problem.

sambrilleman commented 7 years ago

Yeah, I think I'm just going to remove eta/mu/residuals from the stanjm object, at least for now.

I had stopped returning X and Z as part of the fitted model, since there isn't just one design matrix, since fitting the model requires the design matrices evaluated at the quadrature points as well. So, I thought for the moment it is better to avoid confusing the user by not returning any design matrices rather than returning all the design matrices.

I guess we can always choose to add them onto the model object at a later date, but for now I think it is better to keep it cleaner and simpler.

Let me know if you think otherwise!

If, later on, we choose to return X and Z as part of the fitted model, then we can easily add eta/mu/residuals back onto the returned object too.