The blockCV package creates spatially or environmentally separated training and testing folds for cross-validation to provide a robust error estimation in spatially structured environments.
Error while triggering BIOMOD_Modeling function (during model fitting) in BIOMOD package using DataSplitTable option from blockCV #29

rahulkgour commented 1 year ago

Dear Sir,

I am receiving an error when running the code below. Earlier, when I used the same model-fitting function from the biomod package without DataSplitTable, it worked fine.

I am pasting the code snippet below for your reference:

# Generate block CV folds
sb_test <- cv_spatial(x = pa_data,
                      column = "occ",
                      r = rasters,
                      k = 5,
                      size = 1000,
                      selection = "random",
                      iteration = 50,
                      progress = FALSE,
                      biomod2 = TRUE)

# Using blockCV in biomod2 package
library(biomod2)
DataSpecies <- points
myRespName <- "occ"
myResp <- as.numeric(DataSpecies[, myRespName])
myRespXY <- DataSpecies[, c("x", "y")]
RasterValues <- terra::extract(rasters, pa_data, df = TRUE, ID = FALSE)

myBiomodData <- BIOMOD_FormatingData(resp.var = myResp,
                                     expl.var = RasterValues,
                                     resp.xy = myRespXY,
                                     resp.name = myRespName,
                                     PA.nb.rep = 1,
                                     PA.nb.absences = 1000,
                                     PA.strategy = 'random',
                                     na.rm = TRUE)

# Defining the folds for DataSplitTable
DataSplitTable <- sb_test$biomod_table

# Model options, using the defaults
myBiomodOption <- BIOMOD_ModelingOptions()

# Model fitting
myBiomodModelOut <- BIOMOD_Modeling(myBiomodData,
                                    models = c('GLM', 'RF', 'GBM', 'MAXENT.Phillips', 'ANN'),
                                    models.options = myBiomodOption,
                                    NbRunEval = 1,
                                    DataSplitTable = DataSplitTable,
                                    Yweights = NULL,
                                    VarImport = 3,
                                    models.eval.meth = c('ROC', 'TSS', 'ACCURACY'),
                                    SaveObj = TRUE,
                                    rescal.all.models = TRUE,
                                    do.full.models = FALSE,
                                    modeling.id = "test")

Error in BIOMOD_Modeling(myBiomodData, models = c("GLM", "RF", "GBM", : unused arguments (models.options = myBiomodOption, NbRunEval = 1, DataSplitTable = DataSplitTable, Yweights = NULL, VarImport = 3, models.eval.meth = c("ROC", "TSS", "ACCURACY"), SaveObj = TRUE, rescal.all.models = TRUE)
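An "unused arguments" error of this kind usually means the installed version of a function no longer accepts those argument names. As a rough way to check, the sketch below compares the names from the failing call with the formal arguments of the installed function. The helper `unsupported_args` is hypothetical (it is not part of biomod2 or blockCV), and the biomod2 check only runs if that package is installed.

```r
# Hypothetical helper (not part of biomod2 or blockCV):
# return the argument names that `fun` does not accept.
unsupported_args <- function(fun, arg_names) {
  accepted <- names(formals(fun))
  # A function with `...` in its signature accepts any named argument.
  if ("..." %in% accepted) {
    return(character(0))
  }
  setdiff(arg_names, accepted)
}

# Argument names taken from the error message in this issue.
old_names <- c("models.options", "NbRunEval", "DataSplitTable", "Yweights",
               "VarImport", "models.eval.meth", "SaveObj", "rescal.all.models")

# Only query biomod2 if it is installed on this machine.
if (requireNamespace("biomod2", quietly = TRUE)) {
  print(packageVersion("biomod2"))
  print(unsupported_args(biomod2::BIOMOD_Modeling, old_names))
  # The names the installed version does accept:
  print(names(formals(biomod2::BIOMOD_Modeling)))
}
```

Any name printed by `unsupported_args` would need to be replaced by its counterpart in the installed version, as listed in that version's help page (`?BIOMOD_Modeling`).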

rvalavi commented 1 year ago

Hi @rahulkgour

What data are you using for this?

Can you also add the result of sessionInfo() here?

rahulkgour commented 1 year ago

Thank you, @rvalavi, for your prompt response.

We are using fire occurrence data from 2012 to 2020 from NASA's FIRMS archives, together with data from a field survey conducted between 2019 and 2020.

As requested, I am pasting the result of sessionInfo() below:

> sessionInfo()
R version 4.2.2 (2022-10-31)
Platform: aarch64-apple-darwin20 (64-bit)
Running under: macOS Ventura 13.2

rvalavi commented 1 year ago

Thanks for the report @rahulkgour. The issue comes from an update to the biomod package, which renamed several arguments of BIOMOD_Modeling; I will update the blockCV vignette for this new version soon. You just need to update your code as follows:

# 4. Model fitting
myBiomodModelOut <- BIOMOD_Modeling(myBiomodData,
                                    models = c('GLM','MARS','GBM'),
                                    bm.options = myBiomodOption,
                                    data.split.table = DataSplitTable, # blocking folds
                                    var.import = 0,
                                    metric.eval = c('ROC'),
                                    do.full.models = TRUE)

# 5. Model evaluation
# get all models evaluation
myBiomodModelEval <- get_evaluations(myBiomodModelOut)
myBiomodModelEval[c("run", "algo", "metric.eval", "calibration", "validation")]
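
For readers unfamiliar with the table passed as `data.split.table`, here is a minimal sketch of its general shape, built with base R only: a logical matrix with one row per observation and one column per run, where TRUE marks a training (calibration) record and FALSE a testing (validation) record. The fold assignment `fold_ids` and the `RUN` column names are assumptions for illustration; in practice the table comes from `sb_test$biomod_table`, and the column naming expected by your biomod2 version may differ.

```r
# Hypothetical fold assignment: one fold id (1..k) per observation.
fold_ids <- c(1, 2, 3, 1, 2, 3, 1, 2)
k <- max(fold_ids)

# One column per run: TRUE = training (calibration), FALSE = testing (validation).
split_table <- sapply(seq_len(k), function(i) fold_ids != i)
colnames(split_table) <- paste0("RUN", seq_len(k))  # assumed naming

# Sanity checks worth running before passing such a table to biomod2:
# it is logical, has one row per observation, and every observation
# is held out for testing in exactly one run.
stopifnot(is.logical(split_table))
stopifnot(nrow(split_table) == length(fold_ids))
stopifnot(all(rowSums(!split_table) == 1))

print(split_table)
```

The same checks can be applied to the real table, for example by comparing `nrow(DataSplitTable)` with the number of records in the formatted biomod data.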

On a separate note: biomod might not give you very good results with its default options. Other ensemble methods predict much better when they use tuned models.

You can read more at https://esajournals.onlinelibrary.wiley.com/doi/10.1002/ecm.1486 and in the literature cited there. The code is at https://rvalavi.github.io/Presence-Only-SDM/

The proposed ensemble method performs the best (on average) even with spatial cross-validation: https://onlinelibrary.wiley.com/doi/full/10.1111/geb.13639

rahulkgour commented 1 year ago

Dear @rvalavi, I really appreciate your quick support and am grateful for it.

I will update the code, re-run the whole thing, and get back to you soon.

Actually, I had planned to go with an ensemble model, but before that I was doing a trial run with blockCV, as I had never used the blockCV package before. I am really thankful for the references you provided; they are going to be really helpful.

rvalavi commented 1 year ago

Don't get me wrong! biomod isn't a bad model, but using it with the default settings is. I'm glad that it was helpful. I'll close this issue.

rahulkgour commented 1 year ago

Dear @rvalavi, I was trying the ensemble approach given in https://onlinelibrary.wiley.com/doi/full/10.1111/geb.13639

But after loading the Docker image from OSF, when I try to connect to RStudio from the container, I get an "RStudio Initializing Error". I have attached a screenshot for your reference.

