Closed jeffreyhanson closed 6 months ago
For example, here's a reprex, where I try to manually specify pseudo-absences for a PPM model. It just occurred to me - I'm assuming PPM models actually use pseudo-absences, maybe I've got this wrong and they don't use pseudo-absences so my question is invalid (e.g., something like asking "what is the best way to remove the scales from a bear? what do you mean, bears don't have scales...")
# load packages
library(ibis.iSDM)
# load data
bg_data <-
system.file("extdata/europegrid_50km.tif", package = "ibis.iSDM") |>
terra::rast()
spp_data <-
system.file("extdata/input_data.gpkg", package = "ibis.iSDM") |>
sf::read_sf()
env_data <-
system.file("extdata/predictors/", package = "ibis.iSDM") |>
list.files("*.tif", full.names = TRUE) |>
terra::rast()
# add pseudo-absences
psa_sett <- pseudoabs_settings(background = bg_data, nrpoints = 200, method = "random")
spp_data2 <- add_pseudoabsence(df = spp_data, field_occurrence = "Observed", settings = psa_sett)
# define model specification
model <-
distribution(bg_data) |>
add_predictors(env = env_data, transform = "scale", derivates = "none") |>
add_biodiversity_poipo(spp_data2, field_occurrence = "Observed") |>
engine_inlabru()
#> [Setup] 2024-02-02 11:55:49.360766 | Provide a background with a valid projection!
#> [Setup] 2024-02-02 11:55:49.376749 | Creating distribution object...
#> [Setup] 2024-02-02 11:55:49.41185 | Adding predictors...
#> [Setup] 2024-02-02 11:55:49.413278 | Transforming predictors...
#> [Setup] 2024-02-02 11:55:49.484272 | Adding poipo dataset...
#> [Setup] 2024-02-02 11:55:49.651038 | Absence points found. Potentially this data needs to be added as presence-absence instead?
Heya, a few things:
1) In your example you are taking a presence-only dataset and manually adding pseudo-absence points to it. This changes the dataset to a presence-absence dataset, and during model building the package correctly complains that absence points were found. If you want to manually add absence points prior to fitting, then add the biodiversity dataset via add_biodiversity_poipa() instead of add_biodiversity_poipo().
2) There are some basic plotting functionalities in any BiodiversityDataset, which you can simply access via x$plot(). For example, if points are added as presence-absence in your above example, this looks like this:
model <-
distribution(bg_data) |>
add_predictors(env = env_data, transform = "scale", derivates = "none") |>
add_biodiversity_poipa(spp_data2, field_occurrence = "Observed") |>
engine_inlabru()
model$biodiversity$plot()
3) If you need to control any pseudo-absence generation in add_biodiversity_poipo(), you could pass a specific settings object (created with pseudoabs_settings()) there to the parameter pseudoabsence_settings. This changes the default behaviour for sampling any pseudo-absence data throughout. INLA, for example, treats every single node on a mesh as background by default for any lgcp inferences...
4) If you want to access the biodiversity data in your model object, this can be found in model$biodiversity. The respective functions for this (sorry for missing documentation still) would be to first query the id of the dataset and then return the data as an sf object.
Example:
model$biodiversity$get_data( model$biodiversity$get_ids()[[1]] )
Similar ways exist to query the point data from fitted DistributionModel objects by looking within the fit$model$biodiversity object, which contains all data used for inference.
Hope that helps.
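As a rough sketch of point 4, the lookup can be repeated for every dataset in the model rather than only the first one. This only uses the calls shown in this thread (get_ids(), get_data()) plus sf::st_geometry(); it assumes that model$biodiversity is already populated before an engine is added, and that get_data() returns an sf object as described above. Whether automatically generated pseudo-absences show up in these data is not confirmed here.

```r
# load packages
library(ibis.iSDM)
# load the example data shipped with the package (same files as in the reprex)
bg_data <-
  system.file("extdata/europegrid_50km.tif", package = "ibis.iSDM") |>
  terra::rast()
spp_data <-
  system.file("extdata/input_data.gpkg", package = "ibis.iSDM") |>
  sf::read_sf()
# add pseudo-absences manually and register the result as presence-absence data
psa_sett <- pseudoabs_settings(background = bg_data, nrpoints = 200, method = "random")
spp_data2 <- add_pseudoabsence(df = spp_data, field_occurrence = "Observed", settings = psa_sett)
# assumption: no engine is needed just to inspect the stored point data
model <-
  distribution(bg_data) |>
  add_biodiversity_poipa(spp_data2, field_occurrence = "Observed")
# query all dataset ids, then fetch the data behind each id
ids <- model$biodiversity$get_ids()
pts <- lapply(ids, function(id) model$biodiversity$get_data(id))
# report the size and columns of each dataset, then map its point locations
for (d in pts) {
  print(nrow(d))
  print(names(d))
  plot(sf::st_geometry(d))
}
```

The same loop should apply to a fitted object by swapping model$biodiversity for fit$model$biodiversity, as mentioned above.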
Thanks for explaining all that - that's really helpful!
Just to clarify, if I'm using presence-only data (via add_biodiversity_poipo()) with the inlabru engine (via engine_inlabru()), then the INLA mesh is used for the pseudo-absence points and the pseudoabsence_settings parameter of add_biodiversity_poipo() is ignored?
For INLA, so far yes (code starting here), although I think this can actually be passed on as well somehow via method stack. TBD when I have time to think about the other INLA issue.
Will report back.
For other engines this is already the default behaviour (for a Bayesian engine you could try it out with engine_breg() and a single data type).
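A minimal sketch of that suggestion, for anyone following along: a single presence-only dataset, a custom settings object passed through pseudoabsence_settings, and engine_breg() instead of engine_inlabru(). The settings values are simply the ones from the reprex, and engine_breg() is assumed to need its backend package installed. The model is only specified here, not fitted.

```r
# load packages
library(ibis.iSDM)
# load the example data shipped with the package (same files as in the reprex)
bg_data <-
  system.file("extdata/europegrid_50km.tif", package = "ibis.iSDM") |>
  terra::rast()
spp_data <-
  system.file("extdata/input_data.gpkg", package = "ibis.iSDM") |>
  sf::read_sf()
env_data <-
  system.file("extdata/predictors/", package = "ibis.iSDM") |>
  list.files("*.tif", full.names = TRUE) |>
  terra::rast()
# settings that control how pseudo-absences are sampled
psa_sett <- pseudoabs_settings(background = bg_data, nrpoints = 200, method = "random")
# keep the data presence-only and hand the settings to add_biodiversity_poipo()
model <-
  distribution(bg_data) |>
  add_predictors(env = env_data, transform = "scale", derivates = "none") |>
  add_biodiversity_poipo(spp_data, field_occurrence = "Observed",
                         pseudoabsence_settings = psa_sett) |>
  engine_breg()
# the next step would be fitting with train() (not run here)
```

With this setup the points stay presence-only, so the "Absence points found" message from the original reprex should not appear.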
Brilliant - thanks! Yeah, I'm mainly interested in using INLA for the integrated modelling, so understanding how it uses the mesh and how that relates to pseudo-absences was my main question/uncertainty here. Sorry, I should have been more explicit about that in the original post.
Aye, understood. INLA has so far been the hardest to maintain, thus the many changes and relatively messy code still :D
I'm working on fitting some species distribution models and I would like to be able to access the pseudo-absence data that is automatically generated when using add_biodiversity_poipo(). For example, I would be interested in visualizing the spatial distribution of the pseudo-absences and also using them for model evaluation. Is it possible to extract these from a BiodiversityDistribution (output from distribution()) or DistributionModel (output from train()) object?
Alternatively, if this isn't possible, is it possible to manually specify the pseudo-absence points for add_biodiversity_poipo()? I see that the documentation talks about add_pseudoabsence(), which can be used to add pseudo-absence points to a presence-only points dataset. However, I'm not sure if such a combined point dataset can be used with add_biodiversity_poipo()? Although one of the vignettes shows how a dataset with presences and pseudo-absences can be used with add_biodiversity_poipa(), my understanding is that such an approach would mean that the modelling process treats the pseudo-absences as "true absences" -- which is not what I intend?
Let me know if you'd like a reprex?