Open DavWest opened 6 years ago
Thanks for the suggestion! What we need is an efficient implementation of the CDFs of the (standard) multivariate normal and student-t distributions. The former is needed for the multivariate probit link, the latter as part of the copula for the multivariate logit-link (at least when following the approach of mvord).
Unfortunately, Stan does not include these CDFs yet (see for instance http://discourse.mc-stan.org/t/status-of-cdf-for-the-multivariate-normal-distribution/1661/2) so this issue has to wait until Stan is ready.
Thanks! It sounds to be quite difficult to implement a multivariate normal CDF in Stan and not the highest priority for the Stan team, so I was thinking about other solutions.
#current model block
model {
vector[N] mu_y1 = Xc_y1 * b_y1;
vector[N] mu_y2 = Xc_y2 * b_y2;
// priors including all constants
target += student_t_lpdf(temp_y1_Intercept | 3, 0, 10);
target += student_t_lpdf(temp_y2_Intercept | 3, 0, 10);
// likelihood including all constants
if (!prior_only) {
for (n in 1:N) {
target += ordered_logistic_lpmf(Y_y1[n] | mu_y1[n], temp_y1_Intercept);
}
for (n in 1:N) {
target += ordered_logistic_lpmf(Y_y2[n] | mu_y2[n], temp_y2_Intercept);
}
}
}
Below I created a linear predictor (mu_y) with both independent and correlated error terms. Do you think that this adds something to multivariate ordinal models and is feasible/efficient to implement?
#slightly adjusted model block
model {
vector[N] mu_y1 = Xc_y1 * b_y1 + u + e;
vector[N] mu_y2 = Xc_y2 * b_y2 + u + e;
//multivariate normal
vector[2] zero; // [2] because bivariate model, can be a larger number
vector[2] u;
matrix[2,2] Sigma;
target += multi_normal_lpdf(u | zero, Sigma);
target += ... //some prior for Sigma;
//adding errors to mu_y
vector[2] e; // [2] because bivariate model, can be a larger number
target += student_t_lpdf(e | 3, 0, 1);
// priors including all constants
//the same as above
}
I see multiple problems here.
First, why having two terms u
and e
? As far as I see, the relevant part for the residual correlation seem to be u
so what is e
doing?
Second, from a frequentist perspective, this model won't be identified (even when only including u
and not e
) so we will likely run into considerable convergence issues, if we don't specify strong priors. The vignette of mvord actually discusses different options to identify multivariate ordinal models, and the one you are proposing seems to have too many parameters.
Sure. You are right. u represents the residual correlation matrix. I based these additions on a recent paper in Statistics in Medicine, section 3 (https://dukespace.lib.duke.edu/dspace/handle/10161/15466), but now I see that they indeed put some constraints on the parameters to identify the model (page 3247 last paragraph and page 3248 first paragraph).
With regard to the multivariate normal cdf, a bivariate normal cdf is currently available in Stan (https://github.com/stan-dev/stan/issues/2356#issuecomment-322089731). Or is this too specific to implement in brms?
I guess the bivariate normal CDF is currently too specific to be worth implementing in brms.
Would it be possible to implement correlated multivariate ordinal models like done in the mvord package (https://cran.r-project.org/web/packages/mvord/vignettes/vignette_mvord.pdf)?