Open ErikRingen opened 1 week ago
I agree. I'll add a similar print statement as we have for repeated measures and distance matrices.
Sorry, I've been a bit behind on dealing with GitHub issues due to the workshop. Hoping to get to these soon.
Sounds good! And never be sorry for pace on the issues, only when you have the time/energy :)
Working on this today. @ErikRingen currently we remove cases when all coevolving variables are NA, and impute values only when they have at least one variable with observed data. Should we continue to do this when users set `complete.cases = FALSE`?
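To make the behaviour described above concrete, here is a minimal base R sketch. The helper name `split_by_missingness` and the toy data are hypothetical, not taken from the package. It drops rows where every coevolving variable is NA and counts the remaining NA cells that would be left for imputation.

```r
# Hypothetical helper for illustration only; not the package's implementation.
split_by_missingness <- function(data, variables) {
  # Logical matrix: TRUE where a coevolving variable is missing
  is_missing <- is.na(data[, variables, drop = FALSE])
  # Rows where every coevolving variable is missing
  all_missing <- rowSums(is_missing) == length(variables)
  list(
    kept = data[!all_missing, , drop = FALSE],
    n_dropped = sum(all_missing),
    # NA cells in the kept rows, i.e. the values left to impute
    n_to_impute = sum(is_missing[!all_missing, , drop = FALSE])
  )
}

# Toy data: taxon C is missing both variables
d <- data.frame(
  taxon = c("A", "B", "C", "D"),
  x = c(1.2, NA, NA, 0.4),
  y = c(0L, 1L, NA, NA)
)
res <- split_by_missingness(d, c("x", "y"))
```

In this toy example taxon C is dropped, while the single missing values for taxa B and D are left to be imputed.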
Yeah I think that is a sensible implementation. For imputing values when all variables are missing, we should probably handle that in the generated quantities block (I am thinking of, for example, predicting ancestral nodes).
Okay, cool. Don't we already have predictions for ancestral nodes in the `eta` parameters though?
On the latent scale yes. But not for the observation model (for non-Gaussian resps).
Yep, makes sense!
> currently we remove cases when all coevolving variables are NA, and impute values only when they have at least one variable with observed data. Should we continue to do this when users set complete.cases = FALSE?
Thinking about this a bit more, I think it is cleaner to just impute all NAs, even when taxa have missing data for all coevolving variables. As a possible use case for this, users might be interested to get posterior predictions for these taxa (informed by all variables in the model), even if these taxa don't contribute to estimating the parameters of the coevolutionary process. I think the model should probably fit fine in this case. It's also cleaner to describe in the documentation ("the model imputes all missing data").
What do you think @ErikRingen? Sorry for going back on what I said before!
Currently, missing data is imputed automatically and silently. Given that most statistical software does not do this by default (or at all), I think we should (1) give a message to users that missing values are imputed during model fitting and (2) add an argument named something like `complete.cases` that is FALSE by default, but if TRUE then we just perform rowwise deletion whenever there are missing values in the traits being modelled.
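A rough sketch of how the two proposals could fit together, in base R. The function name `handle_missing`, the message wording, and the toy data are assumptions for illustration; only the `complete.cases` argument name and its FALSE default come from the proposal above.

```r
# Hypothetical sketch of the proposal; names and messages are assumptions.
handle_missing <- function(data, variables, complete.cases = FALSE) {
  # Rows with at least one NA in the traits being modelled
  has_na <- rowSums(is.na(data[, variables, drop = FALSE])) > 0
  if (!any(has_na)) {
    return(data)
  }
  if (complete.cases) {
    # Rowwise deletion when the user opts in
    message("Removing ", sum(has_na),
            " row(s) with missing values in the modelled traits.")
    return(data[!has_na, , drop = FALSE])
  }
  # Default: keep all rows and tell the user that imputation will happen
  message("Missing values in the modelled traits will be imputed ",
          "during model fitting.")
  data
}

d <- data.frame(
  taxon = c("A", "B", "C", "D"),
  x = c(1.2, NA, NA, 0.4),
  y = c(0L, 1L, NA, NA)
)
kept_all <- handle_missing(d, c("x", "y"))
kept_complete <- handle_missing(d, c("x", "y"), complete.cases = TRUE)
```

With the default, all four rows are kept and a message announces the imputation; with `complete.cases = TRUE`, only taxon A remains.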