laurawhipple opened 1 year ago
Hey Laura, thank you for your question, and it's great to hear that you're using the package. I can't say I've come across this exact issue with WAIC, although model estimates for the left-out model won't be as good as those from the full model, since a chunk of data is missing. Do you mind sending me some code and data so that I can have a closer look?
Hi Philip,
Apologies for the significant delay. I'm finally revisiting this package after a while and I'm still running into issues with nonsense model outputs, but I am now also experiencing it with the provided examples. I have not yet tried the model that was originally giving me trouble with the updated package. I am using the most recent version of PointedSDMs, R, and the November 26th 2023 testing version of INLA. Here is the output that I got from running the basic Tinamou example provided in the README:
```
Time used:
    Pre = 2.06, Running = 167, Post = 0.492, Total = 170

Fixed effects:
                  mean    sd 0.025quant 0.5quant 0.975quant   mode kld
NPP              0.000 0.000      0.000    0.000      0.000  0.000   0
eBird_intercept  0.000 0.000      0.000    0.000      0.000  0.000   0
Parks_intercept -0.668 0.342     -1.337   -0.668      0.002 -0.668   0
Gbif_intercept   0.000 0.000      0.000    0.000      0.000  0.000   0

Random effects:
  Name             Model
  shared_spatial   SPDE2 model

Model hyperparameters:
                            mean    sd 0.025quant 0.5quant 0.975quant  mode
Theta1 for shared_spatial  -4.69 0.001      -4.69    -4.69      -4.69 -4.69
Theta2 for shared_spatial  -2.03 0.001      -2.03    -2.03      -2.03 -2.03

Deviance Information Criterion (DIC) ...............: NA
Deviance Information Criterion (DIC, saturated) ....: NA
Effective number of parameters .....................: NA

Watanabe-Akaike information criterion (WAIC) ...: 2.29e+56
Effective number of parameters .................: 1.14e+56

Marginal log-Likelihood: 12870.83 is computed
Posterior summaries for the linear predictor and the fitted values are computed
(Posterior marginals needs also 'control.compute=list(return.marginals.predictor=TRUE)')
```
My only thought is that this is some sort of RAM issue with my computer, but any other ideas would be greatly appreciated. I can also send along anything else that may help. I will be trying out my original model shortly to see if the issue persists there as well.
Hey Laura,
That seems a bit strange -- are you running the latest GitHub version of the package (`devtools::install_github('PhilipMostert/PointedSDMs')`)? Otherwise, do you mind sending your session info? I don't think it should be a RAM issue since the example is quite small.
I was using the version currently available on CRAN. I have now updated to the GitHub version and am getting the same output for the example. Here is my session info; hopefully this is what you were looking for:
```
R version 4.3.2 (2023-10-31 ucrt)
Platform: x86_64-w64-mingw32/x64 (64-bit)
Running under: Windows 10 x64 (build 19045)

Matrix products: default

locale:
[1] LC_COLLATE=English_United States.utf8  LC_CTYPE=English_United States.utf8
[3] LC_MONETARY=English_United States.utf8 LC_NUMERIC=C
[5] LC_TIME=English_United States.utf8

time zone: America/Chicago
tzcode source: internal

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base

other attached packages:
[1] terra_1.7-55    PointedSDMs_1.3 R6_2.5.1        inlabru_2.10.0  fmesher_0.1.4
[6] sf_1.0-14       INLA_23.11.26   sp_2.1-2        Matrix_1.6-1.1

loaded via a namespace (and not attached):
 [1] gtable_0.3.4       ggplot2_3.4.4      raster_3.6-26      htmlwidgets_1.6.4
 [5] devtools_2.4.5     remotes_2.4.2.1    lattice_0.21-9     vctrs_0.6.5
 [9] tools_4.3.2        generics_0.1.3     parallel_4.3.2     tibble_3.2.1
[13] proxy_0.4-27       fansi_1.0.6        R.oo_1.25.0        pkgconfig_2.0.3
[17] KernSmooth_2.23-22 lifecycle_1.0.4    compiler_4.3.2     stringr_1.5.1
[21] MatrixModels_0.5-3 munsell_0.5.0      codetools_0.2-19   httpuv_1.6.13
[25] htmltools_0.5.7    usethis_2.2.2      class_7.3-22       later_1.3.2
[29] pillar_1.9.0       urlchecker_1.0.1   R.utils_2.12.3     ellipsis_0.3.2
[33] classInt_0.4-10    cachem_1.0.8       wk_0.9.1           sessioninfo_1.2.2
[37] mime_0.12          tidyselect_1.2.0   digest_0.6.33      stringi_1.8.3
[41] dplyr_1.1.4        purrr_1.0.2        splines_4.3.2      fastmap_1.1.1
[45] grid_4.3.2         colorspace_2.1-0   cli_3.6.2          magrittr_2.0.3
[49] base64enc_0.1-3    pkgbuild_1.4.3     utf8_1.2.4         e1071_1.7-14
[53] withr_2.5.2        scales_1.3.0       promises_1.2.1     R.methodsS3_1.8.2
[57] memoise_2.0.1      shiny_1.8.0        miniUI_0.1.1.1     s2_1.1.5
[61] profvis_0.3.8      rlang_1.1.2        Rcpp_1.0.11        xtable_1.8-4
[65] glue_1.6.2         DBI_1.1.3          blockCV_3.1-3      R.devices_2.17.1
[69] pkgload_1.3.3      rstudioapi_0.15.0  plyr_1.8.9         fs_1.6.3
[73] units_0.8-5
```
I've been encountering a similar issue (using the latest CRAN version, R 4.3.2, Windows 11): with my own data I get WAIC = NaN and effective number of parameters = Inf. For the solitary tinamou example (using the shared spatial fields example code from the vignette) I get WAIC = 1e15 and effective number of parameters = 5e14, which seem like suspiciously large values to me?
Same here, I'm getting very large, nonsensical coefficients. I'm trying to see whether more informative priors will help, but I don't know how to set them. Could there be more documentation about the prior settings in `$priorsFixed`? It also seems to me that the default precision is 0.001? Thanks.
Hi, I've added some new PC priors to the different model setups, which seems to fix the NA issue with the DIC. But the purpose of the vignette was more to give an illustrative example of the package in use than an extensive study. A lot of the issues related to this vignette could also arise from the patchy covariate data, which doesn't cover the whole mesh region.
Hi Philip, thanks for taking a look at this. I've tried re-running the solitary tinamou example using your updated PC priors (`fields$specifySpatial(sharedSpatial = TRUE, prior.range = c(50, 0.01), prior.sigma = c(0.1, 0.01))`), but I'm getting WAIC = 2e+65 and effective number of parameters = 1e+65.
> Same here, getting very large and nonsensical coefs... I'm trying to see if more informative priors will help, but don't know how to set it. Can there be more documentation about the prior settings `$priorsFixed`? It also seems to me that the default precision is 0.001? Thanks.
For the fixed effects, assuming you have used `intModel` to make an object called `dat`, you can do something like:

```r
dat$priorsFixed(Effect = "MyEffect",
                mean.linear = 0,
                prec.linear = 1)
```
For the intercept you'll also need to specify the dataset name (as there's a separate intercept for each), e.g.:
```r
dat$priorsFixed(Effect = "intercept",
                datasetName = "PO_data",
                mean.linear = 0,
                prec.linear = 1)

dat$priorsFixed(Effect = "intercept",
                datasetName = "PA_data",
                mean.linear = 0,
                prec.linear = 1)
```
Thanks @PhilipMostert, and understood that the examples are meant to be general. And thanks @Peter-Stewart for the additional examples, they are working :) Just to report that on my dataset, the spatial priors seem to stabilise inference more than the fixed-effect priors. Cheers.
Just an update on my progress with this issue - I have upgraded to the latest github version of the package, and have been looking at a range of different models to see if I can determine what is causing the issue with the WAIC calculation.
Running models using my own data on three different species, with two distinct sets of covariates (3 covariates in each set) I get:
So it seems the issue may be arising due to something about the particular combinations of species data and covariates.
Could this perhaps be a numerical underflow issue, similar to the one discussed here: https://ihrke.github.io/posts/waic_stan.html?
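For anyone following along, here is a minimal sketch (plain R with hypothetical numbers, not PointedSDMs code) of how the lppd term in WAIC can underflow, and the usual log-sum-exp fix:

```r
# WAIC's lppd term per observation is log(mean(exp(loglik_s))) over posterior
# samples s. When log-likelihoods are very negative (common for point-process
# likelihoods), exp() underflows to 0 and the naive version returns -Inf.
loglik <- c(-800, -801, -799)  # hypothetical per-sample log-likelihoods

naive <- log(mean(exp(loglik)))            # -Inf: exp(-800) underflows to 0

m <- max(loglik)                           # log-sum-exp trick:
stable <- m + log(mean(exp(loglik - m)))   # finite, close to -800
```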
I think you are correct that numerical underflow is the issue. Is there a specific dataset that's causing it? If you remove that dataset from the model, are you able to estimate the WAIC?
It doesn't seem to be confined to a single dataset - I've encountered the issue with several different species datasets, and with different sets of covariates. When I run separate models for Species A covariate set 2 (the case in which I observed WAIC = NaN above) and each of the covariates individually rather than together, I get one model in which WAIC is reasonable (857.61) and two in which the value is very large (5.24e+17 and 1.50e+15), and DIC = NA in all three models.
I posted about the issue on the R-INLA discussion forum (https://groups.google.com/g/r-inla-discussion-group/c/KAdWXxjCByk) and just received a very helpful response from Finn Lindgren. The key point from Finn's response regarding this issue is that WAIC is not appropriate for point-process models (and that this is generally true, not specific to inla/inlabru). I guess we can therefore close this issue, as the problems with the actual calculated value are no longer relevant?
Hello! I have been using this package to develop ISDMs from a camera trap dataset and a community observation dataset, and comparing the ISDM outputs with single-dataset SDMs via the DatasetOut function. I am continuously running into an issue where the model summary returns incredibly high covariate coefficient values and a WAIC of -Inf, which indicates to me that the models are not running as they should. The only workaround I have found is resaving the R objects and restarting my computer repeatedly until the models no longer give this -Inf output, and even that has not worked for every model.
I assume that this is an INLA issue, but I haven't been able to troubleshoot this problem on my own beyond what I already mentioned. Would I be able to receive some insight into why this is happening, and perhaps how to proceed with troubleshooting? Happy to send along any additional information.