rvalavi / blockCV

The blockCV package creates spatially or environmentally separated training and testing folds for cross-validation to provide a robust error estimation in spatially structured environments. See
https://doi.org/10.1111/2041-210X.13107
GNU General Public License v3.0

Error while triggering BIOMOD_Modeling function (during model fitting) in BIOMOD package using DataSplitTable option from blockCV #29

Closed rahulkgour closed 1 year ago

rahulkgour commented 1 year ago

Dear Sir,

I am receiving an error when running the code below. Earlier, when I used the same model-fitting function from the biomod2 package without DataSplitTable, it worked fine.

I am pasting the code snippet below for your reference:

Generated block CV folds

sb_test <- cv_spatial(x = pa_data,
                      column = "occ",
                      r = rasters,
                      k = 5,
                      size = 1000,
                      selection = "random",
                      iteration = 50,
                      progress = FALSE,
                      biomod2 = TRUE)

Using blockCV in biomod2 package

library(biomod2)
DataSpecies <- points
myRespName <- "occ"
myResp <- as.numeric(DataSpecies[,myRespName])
myRespXY <- DataSpecies[,c("x","y")]
RasterValues <- terra::extract(rasters, pa_data, df = TRUE, ID = FALSE)

myBiomodData <- BIOMOD_FormatingData(resp.var = myResp,
                                     expl.var = RasterValues,
                                     resp.xy = myRespXY,
                                     resp.name = myRespName,
                                     PA.nb.rep = 1,
                                     PA.nb.absences = 1000,
                                     PA.strategy = 'random',
                                     na.rm = TRUE)

Defining the folds for DataSplitTable

DataSplitTable <- sb_test$biomod_table
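For readers new to this format, here is a minimal sketch of the kind of table that a split table holds: a logical matrix with one row per record and one column per run, where TRUE marks a record used for training and FALSE a held-out record. The fold assignment `fold_id`, the object name `split_table`, and the column names are made up for illustration; they are assumptions, not output from cv_spatial.

```r
# Hypothetical fold assignment for 10 records and k = 5 folds.
# In practice cv_spatial() computes the folds; these values are made up.
fold_id <- c(1, 2, 3, 4, 5, 1, 2, 3, 4, 5)
k <- 5

# One column per run: TRUE = record used for training, FALSE = held out.
split_table <- sapply(seq_len(k), function(i) fold_id != i)
colnames(split_table) <- paste0("RUN", seq_len(k))  # illustrative names

dim(split_table)       # records x runs
colSums(!split_table)  # number of held-out records in each run
```

Checking the dimensions and the per-run hold-out counts like this is a cheap sanity test before passing any split table to a modelling function.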

Models Options using default options.

myBiomodOption <- BIOMOD_ModelingOptions()

Model fitting

myBiomodModelOut <- BIOMOD_Modeling(myBiomodData,
                                    models = c('GLM', 'RF', 'GBM', 'MAXENT.Phillips', 'ANN'),
                                    models.options = myBiomodOption,
                                    NbRunEval = 1,
                                    DataSplitTable = DataSplitTable,
                                    Yweights = NULL,
                                    VarImport = 3,
                                    models.eval.meth = c('ROC', 'TSS', 'ACCURACY'),
                                    SaveObj = TRUE,
                                    rescal.all.models = TRUE,
                                    do.full.models = FALSE,
                                    modeling.id = "test")

Error in BIOMOD_Modeling(myBiomodData, models = c("GLM", "RF", "GBM", : unused arguments (models.options = myBiomodOption, NbRunEval = 1, DataSplitTable = DataSplitTable, Yweights = NULL, VarImport = 3, models.eval.meth = c("ROC", "TSS", "ACCURACY"), SaveObj = TRUE, rescal.all.models = TRUE)

rvalavi commented 1 year ago

Hi @rahulkgour

What data are you using for this?

Can you also add the result of sessionInfo() here?

rahulkgour commented 1 year ago

Thank you, @rvalavi, for your prompt response.

We are using fire occurrence data from 2012 to 2020 from NASA's FIRMS archives, together with field surveys conducted between 2019 and 2020.

As requested, I am pasting the result of sessionInfo() below:

> sessionInfo()
R version 4.2.2 (2022-10-31)
Platform: aarch64-apple-darwin20 (64-bit)
Running under: macOS Ventura 13.2

Matrix products: default
LAPACK: /Library/Frameworks/R.framework/Versions/4.2-arm64/Resources/lib/libRlapack.dylib

locale:
[1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8

attached base packages:
[1] stats graphics grDevices utils datasets methods base

other attached packages:
[1] shiny_1.7.4 automap_1.0-16 tmap_3.3-3 terra_1.7-3
[5] sf_1.0-9 blockCV_3.0-1 rasterVis_0.51.5 lattice_0.20-45
[9] raster_3.6-14 sp_1.6-0 gridExtra_2.3 ggplot2_3.4.0
[13] biomod2_4.2-3

loaded via a namespace (and not attached):
[1] systemfonts_1.0.4 lwgeom_0.2-11 plyr_1.8.8
[4] splines_4.2.2 crosstalk_1.2.0 leaflet_2.1.1
[7] gstat_2.1-0 usethis_2.1.6 digest_0.6.31
[10] foreach_1.5.2 htmltools_0.5.4 earth_5.3.2
[13] fansi_1.0.4 magrittr_2.0.3 memoise_2.0.1
[16] remotes_2.4.2 xts_0.12.2 prettyunits_1.1.1
[19] jpeg_0.1-10 colorspace_2.1-0 textshaping_0.3.6
[22] dplyr_1.1.0 rgdal_1.6-4 leafem_0.2.0
[25] jsonlite_1.8.4 callr_3.7.3 crayon_1.5.2
[28] hexbin_1.28.2 survival_3.4-0 zoo_1.8-11
[31] iterators_1.0.14 glue_1.6.2 stars_0.6-0
[34] PresenceAbsence_1.1.11 gtable_0.3.1 pkgbuild_1.4.0
[37] abind_1.4-5 scales_1.2.1 DBI_1.1.3
[40] miniUI_0.1.1.1 Rcpp_1.0.10 plotrix_3.8-2
[43] viridisLite_0.4.1 xtable_1.8-4 units_0.8-1
[46] foreign_0.8-83 proxy_0.4-27 Formula_1.2-4
[49] intervals_0.15.2 profvis_0.3.7 htmlwidgets_1.6.1
[52] FNN_1.1.3.1 RColorBrewer_1.1-3 wk_0.7.1
[55] ellipsis_0.3.2 urlchecker_1.0.1 pkgconfig_2.0.3
[58] reshape_0.8.9 XML_3.99-0.13 farver_2.1.1
[61] sass_0.4.5 nnet_7.3-18 deldir_1.0-6
[64] utf8_1.2.3 tidyselect_1.2.0 labeling_0.4.2
[67] rlang_1.0.6 reshape2_1.4.4 later_1.3.0
[70] tmaptools_3.1-1 munsell_0.5.0 TeachingDemos_2.12
[73] tools_4.2.2 cachem_1.0.6 cli_3.6.0
[76] generics_0.1.3 devtools_2.4.5 stringr_1.5.0
[79] fastmap_1.1.0 ragg_1.2.5 maxnet_0.1.4
[82] processx_3.8.0 leafsync_0.1.0 fs_1.6.1
[85] s2_1.1.2 purrr_1.0.1 randomForest_4.7-1.1
[88] nlme_3.1-160 mime_0.12 compiler_4.2.2
[91] rstudioapi_0.14 curl_5.0.0 png_0.1-8
[94] e1071_1.7-13 tibble_3.1.8 spacetime_1.2-8
[97] bslib_0.4.2 stringi_1.7.12 plotmo_3.6.2
[100] ps_1.7.2 desc_1.4.2 Matrix_1.5-1
[103] classInt_0.4-8 gbm_2.1.8.1 vctrs_0.5.2
[106] pillar_1.8.1 lifecycle_1.0.3 jquerylib_0.1.4
[109] data.table_1.14.6 cowplot_1.1.1 maptools_1.1-6
[112] httpuv_1.6.8 R6_2.5.1 latticeExtra_0.6-30
[115] promises_1.2.0.1 KernSmooth_2.23-20 sessioninfo_1.2.2
[118] codetools_0.2-18 dichromat_2.0-0.1 MASS_7.3-58.1
[121] pkgload_1.3.2 rprojroot_2.0.3 withr_2.5.0
[124] mgcv_1.8-41 parallel_4.2.2 grid_4.2.2
[127] rpart_4.1.19 class_7.3-20 mda_0.5-3
[130] pROC_1.18.0 base64enc_0.1-3 interp_1.1-3

rvalavi commented 1 year ago

Thanks for the report @rahulkgour. The issue comes from the recent update of the biomod2 package, which renamed several arguments of BIOMOD_Modeling; I will update the blockCV vignette for this new version soon. You just need to update your code like this:

# 4. Model fitting
myBiomodModelOut <- BIOMOD_Modeling(myBiomodData,
                                    models = c('GLM','MARS','GBM'),
                                    bm.options = myBiomodOption,
                                    data.split.table = DataSplitTable, # blocking folds
                                    var.import = 0,
                                    metric.eval = c('ROC'),
                                    do.full.models = TRUE,
                                    modeling.id="test")
# 5. Model evaluation
# get all models evaluation
myBiomodModelEval <- get_evaluations(myBiomodModelOut)
myBiomodModelEval[c("run", "algo", "metric.eval", "calibration", "validation")]
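An "unused arguments" error means the call used names the function no longer accepts. One quick way to spot them is to compare the names used in a call with the function's formals. The sketch below uses a toy stand-in, `toy_modeling`, whose argument names mirror the updated call above; both `toy_modeling` and the helper `unused_args` are hypothetical and not part of biomod2 or blockCV.

```r
# Hypothetical helper: return the argument names that `fun` does not accept.
unused_args <- function(fun, arg_names) {
  accepted <- names(formals(fun))
  # A function with `...` accepts any named argument, so nothing is unused.
  if ("..." %in% accepted) return(character(0))
  setdiff(arg_names, accepted)
}

# Toy stand-in with the newer-style argument names (not the real function).
toy_modeling <- function(bm.format, models, bm.options,
                         data.split.table, var.import, metric.eval) NULL

# Argument names taken from the older-style call earlier in this thread.
old_names <- c("models", "models.options", "DataSplitTable", "VarImport")
rejected <- unused_args(toy_modeling, old_names)
print(rejected)
```

With biomod2 installed, the same helper could be pointed at `biomod2::BIOMOD_Modeling` to list which names in an old script need renaming for the installed version.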

On a separate note, biomod may not give very good results with its default options; other ensemble methods predict much better when they use tuned models. See:

image

You can read more at https://esajournals.onlinelibrary.wiley.com/doi/10.1002/ecm.1486 and in the literature cited there. The code is available at https://rvalavi.github.io/Presence-Only-SDM/

The proposed ensemble method performs the best (on average) even with spatial cross-validation: https://onlinelibrary.wiley.com/doi/full/10.1111/geb.13639

rahulkgour commented 1 year ago

Dear @rvalavi, I really appreciate your quick support and am grateful for it.

I will update the code, re-run everything, and get back to you soon.

I had actually planned to use an ensemble model, but before that I wanted to do a trial run with blockCV, as I had never used the package before. I am really thankful for the references you provided; they will be very helpful.

Regards, Rahul

rvalavi commented 1 year ago

Don't get me wrong! biomod isn't a bad model, but using it with the default settings is. I'm glad this was helpful. I'll close this issue.

rahulkgour commented 1 year ago

Dear @rvalavi, I was trying the ensemble approach you suggested: https://onlinelibrary.wiley.com/doi/full/10.1111/geb.13639

After loading the Docker image from OSF, when I try to connect to RStudio from the container I get an "RStudio Initializing Error". Screenshots are attached for your reference.

Screenshot 2023-02-13 at 00 59 53 Screenshot 2023-02-13 at 00 53 11