The inconsistent results with the same input data

brianstock / MixSIAR

A framework for Bayesian mixing models in R:

http://brianstock.github.io/MixSIAR/


The inconsistent results with the same input data #33

Closed mymyabc5186 closed 9 years ago

mymyabc5186 commented 9 years ago

I have run the MixSIAR model several times with the same input data, but the model predictions (such as the median, i.e. 50th percentile, values) were inconsistent between runs. The Gelman diagnostic was not < 1.05 for some variables, and for some chains the Geweke diagnostic had more than the expected 5% of variables outside +/-1.96. Does this mean the results for my data are not reliable?

Many thanks!
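For readers unfamiliar with the Geweke diagnostic mentioned in the question, here is a minimal Python sketch of the idea: compare the mean of an early segment of a chain with the mean of a late segment, expressed as a z-score. For a converged chain the z-scores are approximately standard normal, which is why about 5% are expected to fall outside +/-1.96 by chance. This is only an illustration, not MixSIAR's code. The function names, the simulated chains, and the use of batch means for the standard errors (in place of a spectral density estimate) are all assumptions made for the example.

```python
import numpy as np


def batch_means_se(x, n_batches=20):
    """Standard error of the mean, estimated from non-overlapping batch means.

    Assumption for this sketch: batch means stand in for the spectral
    density estimate that a full Geweke implementation would use.
    """
    usable = len(x) - len(x) % n_batches  # drop the remainder so batches are equal
    batches = x[:usable].reshape(n_batches, -1).mean(axis=1)
    return batches.std(ddof=1) / np.sqrt(n_batches)


def geweke_z(chain, first=0.1, last=0.5):
    """Z-score comparing the first 10% of a chain with its last 50%."""
    chain = np.asarray(chain, dtype=float)
    n = len(chain)
    early = chain[: int(first * n)]
    late = chain[n - int(last * n):]
    se = np.hypot(batch_means_se(early), batch_means_se(late))
    return float((early.mean() - late.mean()) / se)


# Simulated draws (hypothetical, not real MixSIAR output).
rng = np.random.default_rng(0)
stationary = rng.normal(0.0, 1.0, size=5000)
# A chain that is still drifting toward its target, as with too little burn-in.
drifting = stationary + np.linspace(3.0, 0.0, 5000)

z_stationary = geweke_z(stationary)
z_drifting = geweke_z(drifting)
print(f"stationary chain z = {z_stationary:.2f}")
print(f"drifting chain   z = {z_drifting:.2f}")
```

A chain that is still drifting gives a z-score far outside +/-1.96, which is the pattern that suggests a longer burn-in or a longer run.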

JasonMHill commented 9 years ago

Sounds like your model hasn't reached convergence yet. As a first troubleshooting step, I would just run longer chains or increase the burn-in. What MCMC run length are you using in the GUI?

mymyabc5186 commented 9 years ago

I am using the 'long' MCMC run length in the GUI. So, if the model doesn't reach convergence, the model predictions can't be used. Could you tell me what I should do next?

Thank you very much!

ericward-noaa commented 9 years ago

That MCMC run length should be long enough, so I suspect it's something about the data not being in agreement with the model you're trying to fit. I always recommend starting by plotting the data as a biplot. If, for example, the sources look like they're grouped in clusters (consumers specializing on different things), and you haven't included random effects, the model won't converge.

brianstock commented 9 years ago

Agreed, your model has probably not converged. And yes, you're right, if the model has not converged then you shouldn't use the results. When looking at the diagnostics, ALL (not some) of the variables should have Gelman diagnostics < 1.01 ideally, and definitely < 1.05.

"Long" is 300,000 iterations, which is pretty long. You can try longer chains ("Very long" = 1,000,000: https://github.com/brianstock/MixSIAR/issues/17).

It could also be that the data don't make sense to the model, as @eric-ward just said.
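To make the Gelman threshold above concrete, here is a minimal Python sketch of the basic Gelman-Rubin statistic for a single variable: it compares the variance between chains with the variance within chains, and values near 1 mean the chains agree. It is a sketch under stated assumptions, not MixSIAR's implementation. The function name `gelman_rubin` and the simulated chains are hypothetical, and real implementations may apply additional corrections, so their values can differ slightly.

```python
import numpy as np


def gelman_rubin(chains):
    """Basic potential scale reduction factor for one variable.

    chains: array of shape (n_chains, n_draws), burn-in already discarded.
    """
    chains = np.asarray(chains, dtype=float)
    m, n = chains.shape
    if m < 2 or n < 2:
        raise ValueError("need at least 2 chains with at least 2 draws each")
    within = chains.var(axis=1, ddof=1).mean()      # mean within-chain variance
    between = n * chains.mean(axis=1).var(ddof=1)   # between-chain variance
    pooled = (n - 1) / n * within + between / n     # pooled variance estimate
    return float(np.sqrt(pooled / within))


# Simulated draws (hypothetical, not real MixSIAR output).
rng = np.random.default_rng(1)
mixed = rng.normal(0.0, 1.0, size=(3, 2000))        # three chains that agree
stuck = mixed + np.array([[0.0], [1.0], [2.0]])     # chains centred on different values

rhat_mixed = gelman_rubin(mixed)
rhat_stuck = gelman_rubin(stuck)
for name, rhat in [("mixed", rhat_mixed), ("stuck", rhat_stuck)]:
    status = "ok" if rhat < 1.05 else "not converged"
    print(f"{name}: Gelman diagnostic = {rhat:.3f} ({status})")
```

In practice the same check is applied to every variable in the model, and a single variable above the threshold is enough to treat the run as not converged.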

mymyabc5186 commented 9 years ago

Thanks for your kind help! The very long chains seem much better. All of the variables now have Gelman diagnostics < 1.05. Moreover, there was little difference among the results of runs made at different times. Thanks again.