Closed Craigdux closed 1 year ago
@andrewcparnell
Hello again. I have been using your model on some more recent data. My posterior predictive results show that between 29% and 47% of the observations fall outside the prediction interval.
I know you stated that when 50% of the observations fall outside the prediction intervals, the model can be considered a good fit. However, does a smaller proportion of observations falling outside (as above) suggest a better fit?
Thanks
Hi @Craigdux.
Thanks for the message. No - the proportion falling outside should be close to 50%. If it's much smaller than 50% then the model is over-fitting and not doing a good job. If it's much larger than 50% then the model is unable to capture the signal well and the uncertainty is inflated too much. As with all things statistical, what counts as 'much larger than 50%' or 'much smaller than 50%' is hard to quantify.
As a very approximate rule of thumb: if you've got very few observations, I would expect it to be quite noisy, so I wouldn't worry about it too much if it's between 25% and 75%. If you've got 50+ observations and it's still outside the 40% to 60% range, then I would think about what might be going wrong - perhaps some consumers eating a substantially different diet?
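For readers who want to see what this proportion measures, here is a minimal sketch in Python with NumPy. It is not simmr's own code (simmr is an R package); the function name `fraction_outside`, the array layout, and the simulated data are all assumptions made for illustration. It assumes a central 50% prediction interval, which is what the 50% target above implies.

```python
import numpy as np

def fraction_outside(observed, predictive_draws, prob=0.5):
    """Fraction of observations outside the central `prob` prediction interval.

    observed: shape (n_obs,)
    predictive_draws: shape (n_draws, n_obs), posterior predictive samples
    (hypothetical layout, one column per observation)
    """
    observed = np.asarray(observed, dtype=float)
    draws = np.asarray(predictive_draws, dtype=float)
    tail = (1.0 - prob) / 2.0
    # Per-observation interval limits taken across the draws
    lower = np.quantile(draws, tail, axis=0)
    upper = np.quantile(draws, 1.0 - tail, axis=0)
    outside = (observed < lower) | (observed > upper)
    return float(outside.mean())

# Simulated data only: observations drawn from the same distribution as the
# predictive draws, so the result should land somewhere near 0.5.
rng = np.random.default_rng(1)
draws = rng.normal(0.0, 1.0, size=(4000, 200))
obs = rng.normal(0.0, 1.0, size=200)
print(fraction_outside(obs, draws))
```

With a well-calibrated 50% interval, roughly half of the observations are expected to fall outside it, which is why a value near 50% is the target rather than a value near 0%.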
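The rule of thumb above could be written down roughly as follows. This is only an illustrative sketch: the function name `within_rule_of_thumb` is hypothetical, and treating every data set with fewer than 50 observations as "very few" is an assumption, since the comment does not say where that boundary lies.

```python
def within_rule_of_thumb(frac_outside, n_obs):
    """Return True if the proportion outside the interval looks acceptable.

    frac_outside: proportion of observations outside the prediction
                  interval, as a number between 0 and 1
    n_obs:        number of observations
    """
    if n_obs >= 50:
        # 50+ observations: expect 40% to 60% outside
        low, high = 0.40, 0.60
    else:
        # Assumption: anything under 50 observations gets the wider band
        low, high = 0.25, 0.75
    return low <= frac_outside <= high

# The 29% and 47% values from the question, with a hypothetical 60 observations
print(within_rule_of_thumb(0.29, 60))
print(within_rule_of_thumb(0.47, 60))
```

A `False` result is only a prompt to look more closely at the data and the sources, not a formal test.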
Hope that helps,
Andrew
@andrewcparnell
Thank you again! Now I get it!
These data are nitrate isotopes in groundwater. So, even more "messy", as the sources of nitrate overlap, as well as being mixed in the groundwater.
However, you do only have two observations so it's not a particularly robust statistic. I would recommend only really using this tool if you have at least 10 observations, and preferably more.
Originally posted by @andrewcparnell in https://github.com/andrewcparnell/simmr/issues/33#issuecomment-1278837475