drizopoulos / JMbayes2

Extended Joint Models for Longitudinal and Survival Data
https://drizopoulos.github.io/JMbayes2/

Unexpected behavior when specifying a narrow prior #41

Closed AdamMS closed 2 years ago

AdamMS commented 2 years ago

Thank you in advance for considering my feedback.

I want to fit a model in which I restrict the prior on one of the longitudinal-submodel betas.

[More specifically, I want to use the baseline value of the response as a predictor, but I want to use the response value itself in the association function. So, I would like the prior on the baseline covariate intercept term to be N(1, \epsilon).]

When I attempt to assign a narrow prior to one of the betas (we'll call it beta1), I get unexpected estimates.

I have coded up an example based on the competing risks vignette, which I attach as a text file for reference. The only thing I have changed from the vignette is the prior on 'year' in the prothrombin submodel. MWE_Priors.txt

The vignette version returns the following outputs:

#> Longitudinal Outcome: log(serBilir) (family = gaussian, link = identity)
#>                   Mean  StDev    2.5%   97.5%      P   Rhat
#> (Intercept)     1.1528 0.1214  0.9134  1.3938 0.0000 1.0005
#> poly(year, 2)1 25.8670 2.8550 20.2399 31.3946 0.0000 1.0051
#> poly(year, 2)2  0.8610 2.1699 -3.4219  5.1293 0.6863 1.0018
#> drugD-penicil  -0.1588 0.1623 -0.4785  0.1566 0.3298 1.0004
#> p(,2)1         -1.8612 2.8932 -7.6179  3.7252 0.5235 1.0023
#> p(,2)2         -0.9272 2.4912 -5.7397  3.9439 0.7163 1.0000
#> sigma           0.3410 0.0153  0.3156  0.3747 0.0000 1.0041
#> 
#> Longitudinal Outcome: prothrombin (family = gaussian, link = identity)
#>                       Mean  StDev    2.5%   97.5%      P   Rhat
#> (Intercept)        10.6320 0.2094 10.2283 11.0407 0.0000 1.0000
#> year                0.2839 0.0769  0.1345  0.4360 0.0000 1.0002
#> drugD-penicil      -0.0914 0.2943 -0.6689  0.4837 0.7545 0.9999
#> year:drugD-penicil -0.0164 0.1024 -0.2184  0.1845 0.8700 1.0002
#> sigma               1.0820 0.0254  1.0367  1.1358 0.0000 1.0012

But the output from my example (using the same number of MCMC samples, with no expectation that it has converged) is:

Longitudinal Outcome: log(serBilir) (family = gaussian, link = identity)
                  Mean   StDev     2.5%   97.5%      P    Rhat
(Intercept)     8.4432 10.6819   0.6581 27.4133 0.0098 14.2345
poly(year, 2)1  8.1195 10.0260  -9.3468 20.8925 0.6495  6.3008
poly(year, 2)2 -1.6317  2.8288  -7.2137  4.0336 0.5223  1.1792
drugD-penicil  10.4412 15.3757  -0.6011 37.9708 0.8107 12.1046
p(,2)1          0.3375  5.1752 -10.0762  8.8311 0.8420  2.5454
p(,2)2         -1.3391  3.0385  -7.3115  4.6914 0.6405  1.1741
sigma          12.6094 17.2553   0.3212 42.7289 0.0000 15.1153

Longitudinal Outcome: prothrombin (family = gaussian, link = identity)
                        Mean   StDev      2.5%     97.5%      P   Rhat
(Intercept)           5.1614 10.2129   -7.9874   20.0370 0.7867 5.9491
year               1741.1538  0.0124 1741.1292 1741.1785 0.0000 1.0310
drugD-penicil       -12.5322 14.0377  -31.9819    6.7102 0.3557 6.7871
year:drugD-penicil   -3.8517  4.7470  -10.9456    3.4351 0.4045 5.9394
sigma              2758.4391 14.2312 2730.2376 2786.2170 0.0000 1.1861

In general, I think I understand that the sampler is not really built for this kind of application, and that the narrow variance on the beta1 prior is causing the sampler to explore the parameter space very inefficiently.

The key issue that sticks with me, and the point I want to bring to your attention, is: how does the posterior sampling stray so far from the prior in the first place? The prior on 'year' in the prothrombin submodel has a mean of 0.244 and a variance of 0.001 (precision = 1000), so it makes no sense for the MCMC sampler to be exploring values near 1741. I know from shorter runs of my example model that the 'year' parameter estimate goes astray very early in the posterior sampling (within the first 110 samples), and only then does the prior precision take over and limit exploration of the parameter space for 'year'. It feels like the prior is not used for an initial cycle of 100+ samples, and by the time the prior precision takes over, the sampler is already stuck in the wrong region of the parameter space.
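The dynamic described above can be illustrated with a toy random-walk Metropolis sampler in Python. This is only a conceptual sketch, not JMbayes2's actual sampler; the prior mean 0.244 and precision 1000 come from the example above, while the step size, iteration counts, and starting values are made up. A chain started far from a narrow prior rejects almost every move away from the mode and drifts back only a small step at a time, so once an early phase lets it wander off, recovery is very slow:

```python
import math
import random

random.seed(0)

def log_prior(x, mean=0.244, prec=1000.0):
    # N(mean, 1/prec) log-density, up to an additive constant
    return -0.5 * prec * (x - mean) ** 2

def metropolis(start, steps, step_sd):
    # Plain random-walk Metropolis targeting only the (narrow) prior
    x = start
    for _ in range(steps):
        prop = x + random.gauss(0.0, step_sd)
        if math.log(random.random()) < log_prior(prop) - log_prior(x):
            x = prop
    return x

# Started at the prior mean, the chain stays near it.
near = metropolis(0.244, 1000, 0.05)

# Started far away (like the stray 'year' estimate), nearly every
# proposal is downhill under the narrow prior, so the chain can only
# creep back toward 0.244 at roughly step_sd per accepted move.
far = metropolis(50.0, 1000, 0.05)
```

After 1000 iterations the far-started chain is still nowhere near the prior mass, which matches the intuition that a chain misdirected in its first hundred samples will look "stuck" for a very long time under a high-precision prior.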

And an important related question is: how do I ever know when a prior (even a default prior) is too specific?

Again, thank you for thinking about this. I am hopeful my feedback can help improve the MCMC sampler.

drizopoulos commented 2 years ago

It works if you set the precision to a much higher value, e.g.,

Tau_prior <- jFit_CR$priors$Tau_betas_HC
largeTau  <- Tau_prior %*% diag(c(1,1,1,1,1,1,1,1e10,1,1))

jFit_CR_largeTau <- jm(CoxFit_CR, list(fm1, fm2), time_var = "year", 
                       functional_forms = CR_forms, 
                       n_iter = 25000L, n_burnin = 5000L, n_thin = 5L,
                       priors = list("Tau_betas_HC" = largeTau))
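As an aside on what the `diag()` line above does: right-multiplying a matrix by a diagonal matrix scales its columns, so when `Tau_betas_HC` is (effectively) diagonal this multiplies only the 8th prior precision by 1e10. A small Python sketch with made-up toy values (not the actual `Tau_betas_HC`):

```python
# Toy stand-in for the 10x10 prior precision matrix Tau_betas_HC
# (diagonal values here are invented for illustration).
n = 10
Tau = [[0.5 if i == j else 0.0 for j in range(n)] for i in range(n)]

# Right-multiplying by diag(1,...,1e10,...,1) scales the COLUMNS of Tau,
# so for a diagonal Tau only the 8th prior precision changes.
scale = [1.0] * n
scale[7] = 1e10  # 8th coefficient, the 'year' beta in the example

largeTau = [[Tau[i][j] * scale[j] for j in range(n)] for i in range(n)]

# The 8th diagonal entry is now 0.5 * 1e10 = 5e9; all others are unchanged.
```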

Fixing the prior mean also helps:

Tau_prior <- jFit_CR$priors$Tau_betas_HC
largeTau  <- Tau_prior %*% diag(c(1,1,1,1,1,1,1,1e10,1,1))
mean_betas_HC <- jFit_CR$priors$mean_betas_HC
mean_betas_HC[8L] <- 0

jFit_CR_largeTau <- jm(CoxFit_CR, list(fm1, fm2), time_var = "year", 
                       functional_forms = CR_forms, 
                       n_iter = 25000L, n_burnin = 5000L, n_thin = 5L,
                       priors = list("Tau_betas_HC" = largeTau,
                                     "mean_betas_HC" = mean_betas_HC))

AdamMS commented 2 years ago

The issue resolved after I updated to the latest development version of JMbayes2 (and updated R and my packages). So, something has been fixed within the last few weeks.