biomodhub / biomod2

BIOMOD is a computer platform for ensemble forecasting of species distributions, enabling the treatment of a range of methodological uncertainties in models and the examination of species-environment relationships.

Help with BIOMOD_Modeling() - [provide environmental raster directly as dataframe in spatio-temporal data modeling] #404

carmerlun closed this issue 7 months ago

carmerlun commented 8 months ago

Context and question

When I fit species distribution models (SDMs) using biomod2 in R, I need to provide the locations (XY coordinates) of the species (both the presence and pseudo-absence or background points, usually from a matrix or data frame) and the environmental variables (which are usually available as raster layers). The function BIOMOD_FormatingData() then extracts the environmental info at the specific locations of the presence/absence points, before the models are fitted with BIOMOD_Modeling(), such as:

mydataset <- BIOMOD_FormatingData(
  resp.var = Sp$PA, #response variable: the study species as presence/absence (0, 1)
  expl.var = env_info_raster, #rasters containing the environmental info but this does not cope with different time periods!
  resp.name = Sp$Species[1], #name of the species
  resp.xy = Sp[, c('x', 'y')] #coordinates of the response variable
)

fitted_models <- BIOMOD_Modeling(
  models = c('SRE', 'GLM','RF'),
  bm.format = mydataset,
  metric.eval = c('TSS', 'ROC')
)
#Note: The R objects 'Sp' and 'env_info_raster' were previously loaded.

But is there a way to provide the environmental info so that the function doesn't need to do that extraction job? I need this because I have several time periods, and I have already extracted the environmental info at the specific place and time of each observation. I also generated the random background points, assigning them to different areas and times, and created the corresponding data frame with the presence/absence points and environmental info.

Can I provide directly this dataframe to either BIOMOD_FormatingData() or BIOMOD_Modeling() or another function in the biomod2 package?

Environment Information

I'm using biomod2 version 4.2-4 and R version 4.2.2 running under Debian bookworm.

MayaGueguen commented 7 months ago

Hello there,

Thank you for your nice issue with plenty of details :star2: :pray:

You can definitely give your data as data.frame to avoid the extraction of environmental values over the same environmental raster layers. To do so, you need to give 1 vector and 2 data.frame objects to resp.var, resp.xy and expl.var parameters respectively within the BIOMOD_FormatingData function.

Let's say for the example that you have 110 presences, 1000 pseudo-absence points that you selected yourself, and 3 environmental variables :
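A minimal sketch of that setup could look like the following. It uses simulated values in place of real pre-extracted data, and the object names (`myResp`, `myRespXY`, `myExpl`), the variable names (`temp`, `precip`, `elev`) and the species name are illustrative assumptions. It also assumes the pseudo-absences are coded as 0, so they are treated like absences. The `BIOMOD_FormatingData()` call only runs if biomod2 is installed.

```r
# Simulated stand-in data: replace with your own pre-extracted values.
set.seed(42)
n_pres <- 110    # number of presences
n_pa   <- 1000   # number of user-selected pseudo-absences
n_tot  <- n_pres + n_pa

# 1) resp.var: one vector, 1 = presence, 0 = pseudo-absence (assumed coding)
myResp <- c(rep(1, n_pres), rep(0, n_pa))

# 2) resp.xy: data.frame of coordinates, one row per point, same order as myResp
myRespXY <- data.frame(
  x = runif(n_tot, min = -10, max = 10),
  y = runif(n_tot, min = 35, max = 45)
)

# 3) expl.var: data.frame of environmental values already extracted for each
#    point (each row may come from a different time period), same row order
myExpl <- data.frame(
  temp   = rnorm(n_tot, mean = 15, sd = 5),
  precip = rnorm(n_tot, mean = 800, sd = 200),
  elev   = runif(n_tot, min = 0, max = 2000)
)

# The three objects must describe the same points in the same order
stopifnot(
  length(myResp) == nrow(myRespXY),
  nrow(myRespXY) == nrow(myExpl)
)

# Format the data without any raster extraction (needs biomod2 installed)
if (requireNamespace("biomod2", quietly = TRUE)) {
  myBiomodData <- biomod2::BIOMOD_FormatingData(
    resp.var  = myResp,
    resp.xy   = myRespXY,
    expl.var  = myExpl,
    resp.name = "My.species"  # hypothetical species name
  )
}
```

The resulting object would then be passed to `BIOMOD_Modeling()` through `bm.format`, as in the original question.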

This way, environmental values can come from different temporal layers between points. The important thing is to keep the same order of points between the 3 parameters.

Note that you can also combine resp.var and resp.xy in one by giving a SpatVector object to resp.var if you prefer.
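As a rough illustration of the SpatVector option, the sketch below builds a point SpatVector with the terra package from a small data frame. The toy values, the column name `PA`, the object names and the CRS (EPSG:4326) are assumptions made for the example. The terra part only runs if terra is installed.

```r
# Toy table: coordinates plus a 0/1 response column (hypothetical values)
pts <- data.frame(
  x  = c(1.5, 2.0, 3.2),
  y  = c(43.1, 44.0, 42.7),
  PA = c(1, 0, 1)
)

if (requireNamespace("terra", quietly = TRUE)) {
  # Points built from the x/y columns; PA is kept as an attribute
  myRespVect <- terra::vect(pts, geom = c("x", "y"), crs = "EPSG:4326")

  # Coordinates and response stay row-aligned inside one object
  print(terra::crds(myRespVect))
  print(myRespVect$PA)
}
```

According to the note above, an object like `myRespVect` could be given to `resp.var` in place of separate `resp.var` and `resp.xy` arguments. The `expl.var` data.frame still has to follow the same row order.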

Hope it helps, Maya

carmerlun commented 7 months ago

Thank you so much, Maya, for your quick and neat reply. I followed your directions and the function works perfectly. Best regards!