andrewcparnell / simmr

A stable isotope mixing model in R
https://andrewcparnell.github.io/simmr/

Prediction Intervals as Gauge of Model Fit #35

Closed Craigdux closed 1 year ago

Craigdux commented 1 year ago
    Hi @Craigdux. Thanks for the query. The idea of this table is that, if the model is doing a good job capturing the uncertainty in the data, then approximately 50% of the observations should lie outside the 50% prediction intervals. For your data this is exactly what you're getting so this would give you confidence that the model is fitting well.

    However, you do only have two observations so it's not a particularly robust statistic. I would recommend only really using this tool if you have at least 10 observations, and preferably more.

Originally posted by @andrewcparnell in https://github.com/andrewcparnell/simmr/issues/33#issuecomment-1278837475
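
For readers following along, the check described in the quote can be sketched roughly as below. This is a minimal Python illustration, not simmr code: the function name `proportion_outside_50`, the input layout (one list of simulated values per observation), and the numbers are all assumptions made up for the example.

```python
from statistics import quantiles


def proportion_outside_50(observed, predictive_draws):
    """Fraction of observations outside their central 50% predictive interval.

    observed: one observed value per observation.
    predictive_draws: predictive_draws[i] holds simulated values for
        observation i (hypothetical input layout, not simmr output).
    """
    outside = 0
    for y, draws in zip(observed, predictive_draws):
        # Quartiles of the simulated values: the 25th and 75th percentiles
        # bound the central 50% interval.
        q25, _, q75 = quantiles(draws, n=4)
        if y < q25 or y > q75:
            outside += 1
    return outside / len(observed)


# Made-up numbers purely for illustration.
draws = [list(range(1, 100))] * 4  # same predictive distribution for each observation
observed = [10, 40, 60, 95]
print(proportion_outside_50(observed, draws))
```

A value near 0.5 is what the quoted explanation treats as a sign of a well-fitting model.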

Craigdux commented 1 year ago

@andrewcparnell

Hello again. I have been using your model on some more recent data. My posterior predictive results show that between 29% and 47% of the observations fall outside the prediction intervals.

I know you stated that when 50% of the observations fall outside the prediction intervals, the model can be considered a good fit. However, does a smaller percentage falling outside (as above) suggest a better fit?

thanks

andrewcparnell commented 1 year ago

Hi @Craigdux.

Thanks for the message. No - it should be close to 50%. If it's much smaller than 50% then the model is over-fitting and not doing a good job. If it's much larger than 50% then the model is unable to capture the signal well and the uncertainty is inflated too much. As with all things statistical, what counts as 'much larger than 50%' and 'much smaller than 50%' is hard to quantify.

As a very approximate rule of thumb, if you've got very few observations I would expect it to be quite noisy, so I wouldn't worry about it too much if it's between 25% and 75%. If you've got 50+ observations and it's still outside 40-60% then I would think about what might be going wrong - perhaps some consumers eating a substantially different diet?
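
To see why the acceptable band narrows as the number of observations grows, here is a rough sketch in Python (not simmr code). It assumes, as a simplification, that the model is perfectly calibrated and that each observation independently falls outside its 50% interval with probability 0.5, so the count outside follows a Binomial(n, 0.5) distribution. Real posterior predictive checks reuse the fitted data, so treat this only as an approximation. The function name `prob_within` is made up for illustration.

```python
from math import comb


def prob_within(n, low, high, p=0.5):
    """P(low <= K/n <= high) where K ~ Binomial(n, p).

    K is the number of observations outside the 50% interval, assuming
    (as a simplification) independent observations and a calibrated model.
    """
    total = 0.0
    for k in range(n + 1):
        if low <= k / n <= high:
            total += comb(n, k) * p**k * (1 - p) ** (n - k)
    return total


# Chance that a well-calibrated model lands inside each rule-of-thumb band.
for n in (2, 10, 50, 200):
    print(n, round(prob_within(n, 0.25, 0.75), 3), round(prob_within(n, 0.40, 0.60), 3))
```

Under these assumptions, with 10 observations about 89% of well-calibrated fits land in the 25-75% band but only about 66% land in the 40-60% band, so the tighter band is only informative with more data. With 2 observations the only possible values are 0%, 50% and 100%.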

Hope that helps,

Andrew

Craigdux commented 1 year ago

@andrewcparnell

Thank you again! Now I get it!

These data are nitrate isotopes in groundwater. So, even more "messy", as the sources of nitrate overlap, as well as being mixed in the groundwater.