Closed ManuelSpinola closed 3 years ago
Hi Manuel,
You're right that ubms returns poor results with this dataset. This particular slice of crossbill has very few detections and, more importantly, very few sites with multiple detections. The traceplots are a mess, with the chains constantly jumping to relatively large values or one chain getting stuck. I ran this analysis independently using Stan and JAGS and got basically the same poor results. It seems like MCMC/Bayesian approaches in general just have trouble with this dataset. Note that if you use other years from crossbill, the results are fine.
A general solution is to specify narrower priors. This keeps MCMC from getting stuck at unreasonably high values. When I did this I was able to get reasonable results in Stan and JAGS that were similar to unmarked. However, setting custom priors is not currently possible in ubms; this is a priority for me in the future.
If you are consistently seeing similar problems with your own dataset, I would try running many more iterations than the default, perhaps starting with 10,000 per chain (particularly if Rhats are poor). If that doesn't help, it might be that ubms is not a good choice for your dataset, at least until it is possible to adjust the priors.
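As a rough sketch of the longer-run suggestion, the code below refits the crossbill model from the original report with more iterations and then inspects convergence. It assumes that `iter` and `seed` are forwarded by `stan_occu` to `rstan::sampling()`; the object name `fit_long` and the seed value are arbitrary choices made for illustration.

```r
library(unmarked)
library(ubms)

# Rebuild the 1999 crossbill data used in the original report
data(crossbill)
site_covs <- crossbill[, c("id", "ele", "forest")]
y <- crossbill[, c("det991", "det992", "det993")]
date <- crossbill[, c("date991", "date992", "date993")]
umf <- unmarkedFrameOccu(y = y, siteCovs = site_covs, obsCovs = list(date = date))

# iter counts warmup plus sampling draws for each chain (assumed to be
# passed through to rstan::sampling); seed is an arbitrary illustrative value
fit_long <- stan_occu(~scale(date) ~ scale(forest) + scale(ele),
                      data = umf, chains = 4, iter = 10000, seed = 123)

# The printed summary reports n_eff and Rhat for each parameter
fit_long

# Visual check for chains that get stuck or jump to large values
traceplot(fit_long)
```

If the Rhat values stay well above 1 even after a long run, that points to the data problem described above rather than to run length.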
Ken
Thank you very much Ken.
I will try that.
Is there any rule of thumb for the number of detections and NAs when running occupancy models?
Manuel
-- Manuel Spínola, Ph.D. Instituto Internacional en Conservación y Manejo de Vida Silvestre, Universidad Nacional, Apartado 1350-3000, Heredia, COSTA RICA. Phone: (506) 8706 - 4662. Personal website: Lobito de río https://sites.google.com/site/lobitoderio/ Institutional website: ICOMVIS http://www.icomvis.una.ac.cr/
I don't know of a rule of thumb, but I find that if the vast majority of sites have either 0 detections or 1 detection, and only a handful of sites have >1 detection, model results are likely to be poor (especially when you have many covariates). There's just no way to get a good estimate of p with so little info. This outcome is of course more likely when you only have 2-3 surveys at each site (as with crossbill).
There are no issues with NAs specifically, except when there are so many NAs that you run into the situation above, where there are few detections.
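A quick way to check for this pattern is to tabulate detections per site before fitting anything. The sketch below uses a small made-up detection matrix, `y_demo`, purely for illustration (it is not the crossbill data); with real data you would substitute your own detection columns.

```r
# Hypothetical 6-site x 3-survey detection history
# (1 = detected, 0 = not detected, NA = survey not done)
y_demo <- matrix(c(0, 0,  0,
                   1, 0,  NA,
                   0, 0,  0,
                   1, 1,  0,
                   0, NA, NA,
                   0, 1,  0),
                 ncol = 3, byrow = TRUE)

# Detections and completed surveys per site, ignoring missing visits
n_det  <- rowSums(y_demo, na.rm = TRUE)
n_surv <- rowSums(!is.na(y_demo))

# Number of sites with 0, 1, 2, ... detections
det_table <- table(n_det)
print(det_table)

# Share of sites with more than one detection
prop_multi <- mean(n_det > 1)
print(prop_multi)

# Sites with fewer than two completed surveys cannot yield repeat detections
n_single_visit <- sum(n_surv < 2)
print(n_single_visit)
```

If nearly all sites fall in the 0 and 1 columns of the table, expect the estimate of p to be poorly supported, whichever package fits the model.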
Thank you very much Ken.
Manuel
I've implemented custom priors, and the new default priors for stan_occu result in more comparable estimates:
                     unmarked       stan
psi(Int)           -0.7434025 -0.6164383
psi(scale(forest))  0.9782374  1.0761340
psi(scale(ele))     0.5898850  0.5671107
p(Int)             -0.6738667 -0.7450999
p(scale(date))      0.5505878  0.5607567
Not in the CRAN version yet, but will be relatively soon.
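For readers on a ubms release that includes this feature, a minimal sketch of setting priors explicitly might look like the following. The argument names (`prior_intercept_state`, `prior_coef_state`, `prior_intercept_det`, `prior_coef_det`) and the `normal()` helper are taken from later released versions of ubms and should be checked against `?stan_occu` for the installed version. The `normal(0, 2.5)` values are illustrative assumptions, not the priors used to produce the table above.

```r
library(unmarked)
library(ubms)

# Same 1999 crossbill data as in the original report
data(crossbill)
umf <- unmarkedFrameOccu(
  y = crossbill[, c("det991", "det992", "det993")],
  siteCovs = crossbill[, c("id", "ele", "forest")],
  obsCovs = list(date = crossbill[, c("date991", "date992", "date993")])
)

# Illustrative priors on intercepts and slopes for both submodels;
# the location and scale values here are assumptions chosen for the example
fit_prior <- stan_occu(
  ~scale(date) ~ scale(forest) + scale(ele), data = umf, chains = 4,
  prior_intercept_state = normal(0, 2.5),
  prior_coef_state = normal(0, 2.5),
  prior_intercept_det = normal(0, 2.5),
  prior_coef_det = normal(0, 2.5)
)

# Compare against the maximum likelihood fit from unmarked
um_fit <- occu(~scale(date) ~ scale(forest) + scale(ele), data = umf)
cbind(unmarked = coef(um_fit), stan = coef(fit_prior))
```

Narrower priors pull the sampler away from extreme values on the logit scale, so it is worth refitting with a couple of different scales to see how sensitive the estimates are.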
Thank you very much Ken.
Manuel
Hi Ken,
I am trying to fit a ubms model to my data, but because I did not obtain estimates similar to those from unmarked, I tested both packages with example data from unmarked.
The ubms results sometimes do not show a good fit, and sometimes the R-hat values are not acceptable.
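library(unmarked)
library(ubms)
# Build the occupancy data frame from the 1999 crossbill surveys
data(crossbill)
site_covs <- crossbill[, c("id", "ele", "forest")]
y <- crossbill[, c("det991", "det992", "det993")]
date <- crossbill[, c("date991", "date992", "date993")]
umf <- unmarkedFrameOccu(y = y, siteCovs = site_covs, obsCovs = list(date = date))
# Fit the same model with ubms (Stan) and with unmarked (maximum likelihood)
stan_global <- stan_occu(~scale(date) ~ scale(forest) + scale(ele), data = umf, chains = 4)
um_global <- occu(~scale(date) ~ scale(forest) + scale(ele), data = umf)
# Compare the point estimates side by side
cbind(unmarked = coef(um_global), stan = coef(stan_global))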
                     unmarked       stan
psi(Int)           -0.7434025  1.4468469
psi(scale(forest))  0.9782374  2.4063905
psi(scale(ele))     0.5898850  1.4289875
p(Int)             -0.6738667 -0.9996905
p(scale(date))      0.5505878  0.5685348
Any suggestions on how to reach similar results?