biomodhub / biomod2

BIOMOD is a computer platform for ensemble forecasting of species distributions, enabling the treatment of a range of methodological uncertainties in models and the examination of species-environment relationships.

Error in BIOMOD_EnsembleModeling - [cannot open connection when dir.name in BIOMOD_FormatingData is a path] #433

Closed hancock-da closed 1 month ago

hancock-da commented 3 months ago

Error and context

I'm trying to build ENMs with biomod2 v4.2.4 for several populations within each species. My output structure is therefore hierarchical: dir.name in BIOMOD_FormatingData is a path to the species folder, and the output for each population is written inside that folder (the code below should make this clearer). This mostly works. With dir.name = "output_test/symphodus.melops", formatting and modelling run fine and the correct directories are created within that path, but BIOMOD_EnsembleModeling can no longer find the ...mergedAlgo.predictions file it needs (see the error message below).

With exactly the same code but dir.name = "output_test", everything runs smoothly; the error is thrown only when I change dir.name to "output_test/symphodus.melops".

Code used to get the error

myBiomodData <- BIOMOD_FormatingData(resp.name = "symphodus.melops.test",
                                     dir.name = "output_test/symphodus.melops",
                                     resp.var = samp,
                                     expl.var = cropped_present[[g1_present$select_var]],
                                     PA.nb.rep = 10,
                                     PA.nb.absence = length(samp),
                                     PA.strategy = 'random',
                                     na.rm = TRUE,
                                     filter.raster = TRUE)

show(myBiomodData)
-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-= BIOMOD.formated.data -=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=

dir.name =  output_test/symphodus.melops

sp.name =  symphodus.melops.test

     79 presences,  0 true absences and  838 undefined points in dataset

     10 explanatory variables

  salinitymean    salinityrange       curvelmin          curvelmean         curvelmax        curvelrange         tempmin          tempmean       
 Min.   : 3.926   Min.   : 0.2067   Min.   :0.003254   Min.   :0.000606   Min.   :0.03603   Min.   :0.00003   Min.   :-1.895   Min.   :-0.06203  
 1st Qu.:34.180   1st Qu.: 0.5147   1st Qu.:0.081847   1st Qu.:0.021859   1st Qu.:0.11402   1st Qu.:0.02833   1st Qu.: 1.090   1st Qu.: 7.12719  
 Median :34.922   Median : 0.8957   Median :0.116212   Median :0.034996   Median :0.17025   Median :0.05756   Median : 4.672   Median : 9.64826  
 Mean   :32.391   Mean   : 2.4202   Mean   :0.127692   Mean   :0.051103   Mean   :0.18629   Mean   :0.08888   Mean   : 4.415   Mean   : 8.90239  
 3rd Qu.:35.157   3rd Qu.: 2.0864   3rd Qu.:0.158359   3rd Qu.:0.061754   3rd Qu.:0.22816   3rd Qu.:0.12002   3rd Qu.: 6.930   3rd Qu.:10.56483  
 Max.   :35.699   Max.   :21.4860   Max.   :0.447958   Max.   :0.300407   Max.   :0.64304   Max.   :0.56289   Max.   :11.472   Max.   :15.18282  
   temprange        bathymean    
 Min.   : 4.229   Min.   :-4826  
 1st Qu.: 6.703   1st Qu.:-2013  
 Median : 7.998   Median : -364  
 Mean   : 9.883   Mean   :-1076  
 3rd Qu.:11.298   3rd Qu.:  -79  
 Max.   :21.156   Max.   :   96  

 10 Pseudo Absences dataset available ( PA1, PA2, PA3, PA4, PA5, PA6, PA7, PA8, PA9, PA10 ) with  84 (PA1, PA2, PA3, PA4, PA5, PA6, PA7, PA8, PA9, PA10) 
pseudo absences

-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=

show(cropped_present[[g1_present$select_var]])
class       : SpatRaster 
dimensions  : 323, 442, 10  (nrow, ncol, nlyr)
resolution  : 0.08333333, 0.08333333  (x, y)
extent      : -15.5, 21.33333, 46.41667, 73.33333  (xmin, xmax, ymin, ymax)
coord. ref. : +proj=longlat +datum=WGS84 +no_defs 
source(s)   : memory
names       : salinitymean, salin~range, curvelmin, curvelmean, curvelmax, curvelrange, ... 
min values  :     3.925527,    0.204385,  0.000463,   0.000095,  0.019337,    0.000003, ... 
max values  :    35.726175,   22.492664,  0.486692,   0.311682,  0.718712,    0.615551, ... 

test_mod <- BIOMOD_Modeling(
  bm.format = myBiomodData,
  # bm.options = opt.b,
  models = c("MAXNET", "RF"),
  CV.strategy = "random",
  CV.perc = 0.7,
  CV.nb.rep = 5,
  prevalence = 0.5,
  var.import = 5, 
  metric.eval = c('ROC','TSS','ACCURACY'),
  scale.models = TRUE,
  CV.do.full.models = FALSE,
  modeling.id = paste("symphodus.melops.test","models",sep="_")
)
show(test_mod)
-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-= BIOMOD.models.out -=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=

Modeling folder : output_test/symphodus.melops

Species modeled : symphodus.melops.test

Modeling id : symphodus.melops.test_models

Considered variables : salinitymean salinityrange curvelmin curvelmean curvelmax curvelrange tempmin tempmean temprange bathymean

Computed Models :  symphodus.melops.test_PA1_RUN1_RF symphodus.melops.test_PA1_RUN1_MAXNET symphodus.melops.test_PA1_RUN2_RF 
symphodus.melops.test_PA1_RUN2_MAXNET symphodus.melops.test_PA1_RUN3_RF symphodus.melops.test_PA1_RUN3_MAXNET symphodus.melops.test_PA1_RUN4_RF 
symphodus.melops.test_PA1_RUN4_MAXNET symphodus.melops.test_PA1_RUN5_RF symphodus.melops.test_PA1_RUN5_MAXNET symphodus.melops.test_PA2_RUN1_RF 
symphodus.melops.test_PA2_RUN1_MAXNET symphodus.melops.test_PA2_RUN2_RF symphodus.melops.test_PA2_RUN2_MAXNET symphodus.melops.test_PA2_RUN3_RF 
symphodus.melops.test_PA2_RUN3_MAXNET symphodus.melops.test_PA2_RUN4_RF symphodus.melops.test_PA2_RUN4_MAXNET symphodus.melops.test_PA2_RUN5_RF 
symphodus.melops.test_PA2_RUN5_MAXNET symphodus.melops.test_PA3_RUN1_RF symphodus.melops.test_PA3_RUN1_MAXNET symphodus.melops.test_PA3_RUN2_RF 
symphodus.melops.test_PA3_RUN2_MAXNET symphodus.melops.test_PA3_RUN3_RF symphodus.melops.test_PA3_RUN3_MAXNET symphodus.melops.test_PA3_RUN4_RF 
symphodus.melops.test_PA3_RUN4_MAXNET symphodus.melops.test_PA3_RUN5_RF symphodus.melops.test_PA3_RUN5_MAXNET symphodus.melops.test_PA4_RUN1_RF 
symphodus.melops.test_PA4_RUN1_MAXNET symphodus.melops.test_PA4_RUN2_RF symphodus.melops.test_PA4_RUN2_MAXNET symphodus.melops.test_PA4_RUN3_RF 
symphodus.melops.test_PA4_RUN3_MAXNET symphodus.melops.test_PA4_RUN4_RF symphodus.melops.test_PA4_RUN4_MAXNET symphodus.melops.test_PA4_RUN5_RF 
symphodus.melops.test_PA4_RUN5_MAXNET symphodus.melops.test_PA5_RUN1_RF symphodus.melops.test_PA5_RUN1_MAXNET symphodus.melops.test_PA5_RUN2_RF 
symphodus.melops.test_PA5_RUN2_MAXNET symphodus.melops.test_PA5_RUN3_RF symphodus.melops.test_PA5_RUN3_MAXNET symphodus.melops.test_PA5_RUN4_RF 
symphodus.melops.test_PA5_RUN4_MAXNET symphodus.melops.test_PA5_RUN5_RF symphodus.melops.test_PA5_RUN5_MAXNET symphodus.melops.test_PA6_RUN1_RF 
symphodus.melops.test_PA6_RUN1_MAXNET symphodus.melops.test_PA6_RUN2_RF symphodus.melops.test_PA6_RUN2_MAXNET symphodus.melops.test_PA6_RUN3_RF 
symphodus.melops.test_PA6_RUN3_MAXNET symphodus.melops.test_PA6_RUN4_RF symphodus.melops.test_PA6_RUN4_MAXNET symphodus.melops.test_PA6_RUN5_RF 
symphodus.melops.test_PA6_RUN5_MAXNET symphodus.melops.test_PA7_RUN1_RF symphodus.melops.test_PA7_RUN1_MAXNET symphodus.melops.test_PA7_RUN2_RF 
symphodus.melops.test_PA7_RUN2_MAXNET symphodus.melops.test_PA7_RUN3_RF symphodus.melops.test_PA7_RUN3_MAXNET symphodus.melops.test_PA7_RUN4_RF 
symphodus.melops.test_PA7_RUN4_MAXNET symphodus.melops.test_PA7_RUN5_RF symphodus.melops.test_PA7_RUN5_MAXNET symphodus.melops.test_PA8_RUN1_RF 
symphodus.melops.test_PA8_RUN1_MAXNET symphodus.melops.test_PA8_RUN2_RF symphodus.melops.test_PA8_RUN2_MAXNET symphodus.melops.test_PA8_RUN3_RF 
symphodus.melops.test_PA8_RUN3_MAXNET symphodus.melops.test_PA8_RUN4_RF symphodus.melops.test_PA8_RUN4_MAXNET symphodus.melops.test_PA8_RUN5_RF 
symphodus.melops.test_PA8_RUN5_MAXNET symphodus.melops.test_PA9_RUN1_RF symphodus.melops.test_PA9_RUN1_MAXNET symphodus.melops.test_PA9_RUN2_RF 
symphodus.melops.test_PA9_RUN2_MAXNET symphodus.melops.test_PA9_RUN3_RF symphodus.melops.test_PA9_RUN3_MAXNET symphodus.melops.test_PA9_RUN4_RF 
symphodus.melops.test_PA9_RUN4_MAXNET symphodus.melops.test_PA9_RUN5_RF symphodus.melops.test_PA9_RUN5_MAXNET symphodus.melops.test_PA10_RUN1_RF 
symphodus.melops.test_PA10_RUN1_MAXNET symphodus.melops.test_PA10_RUN2_RF symphodus.melops.test_PA10_RUN2_MAXNET symphodus.melops.test_PA10_RUN3_RF 
symphodus.melops.test_PA10_RUN3_MAXNET symphodus.melops.test_PA10_RUN4_RF symphodus.melops.test_PA10_RUN4_MAXNET symphodus.melops.test_PA10_RUN5_RF 
symphodus.melops.test_PA10_RUN5_MAXNET

Failed Models :  none

-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=

myBiomodEM <- BIOMOD_EnsembleModeling(bm.mod = test_mod,
                                      models.chosen = 'all',
                                      em.by = 'all',
                                      em.algo = c('EMwmean'),
                                      metric.select = c('TSS'),
                                      metric.select.thresh = c(0.7),
                                      metric.eval = c('TSS'),
                                      var.import = 5)

-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-= Build Ensemble Models -=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=

   ! all models available will be included in ensemble.modeling
  ! Ensemble Models will be filtered and/or weighted using validation dataset (if possible). Please use `metric.select.dataset` for alternative options.
   > Evaluation & Weighting methods summary :
      TSS over 0.7

  > mergedData_mergedRun_mergedAlgo ensemble modeling
   ! Additional projection required for ensemble models merging several pseudo-absence dataset...
-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-= Do Single Models Projection -=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=

    > Projecting symphodus.melops.test_PA1_RUN1_MAXNET ...
    > Projecting symphodus.melops.test_PA1_RUN1_RF ...
    > Projecting symphodus.melops.test_PA1_RUN2_MAXNET ...
    > Projecting symphodus.melops.test_PA1_RUN2_RF ...
    > Projecting symphodus.melops.test_PA1_RUN3_MAXNET ...
    > Projecting symphodus.melops.test_PA1_RUN3_RF ...
    > Projecting symphodus.melops.test_PA1_RUN4_MAXNET ...
    > Projecting symphodus.melops.test_PA1_RUN4_RF ...
    > Projecting symphodus.melops.test_PA1_RUN5_MAXNET ...
    > Projecting symphodus.melops.test_PA1_RUN5_RF ...
-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-= Done -=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=

(skipped several lines detailing the projections here)

original models scores =  1 1 1 1 1 0.92 0.96 0.96 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 0.92 0.878 1 1 0.96 0.96 1 1 1 1 0.96 0.96 1 1 1 1 0.96 0.96 1 1 1 1 0.958 1 1 1 0.96 0.96 0.96 0.96 0.875 0.875 0.878 0.878 1 1 1 1 1 1 1 1 0.96 1 0.96 0.96 0.96 1 1 1 1 1 1 1 1 1 1 1 1 1 1 0.96 0.84 0.96 1 1 1 1 1 1 1 1 1 1
          final models weights =  0.01 0.01 0.01 0.01 0.01 0.009 0.01 0.01 0.01 0.01 0.01 0.01 0.01 0.01 0.01 0.01 0.01 0.01 0.01 0.01 0.01 0.01 0.01 0.01 0.01 0.01 0.009 0.009 0.01 0.01 0.01 0.01 0.01 0.01 0.01 0.01 0.01 0.01 0.01 0.01 0.01 0.01 0.01 0.01 0.01 0.01 0.01 0.01 0.01 0.01 0.01 0.01 0.01 0.01 0.01 0.01 0.009 0.009 0.009 0.009 0.01 0.01 0.01 0.01 0.01 0.01 0.01 0.01 0.01 0.01 0.01 0.01 0.01 0.01 0.01 0.01 0.01 0.01 0.01 0.01 0.01 0.01 0.01 0.01 0.01 0.01 0.01 0.01 0.009 0.01 0.01 0.01 0.01 0.01 0.01 0.01 0.01 0.01 0.01 0.01
   > Probabilities weighting mean by TSS ...Error in { : 
  task 1 failed - "task 1 failed - "task 1 failed - "cannot open the connection"""
In addition: Warning message:
In gzfile(file, "wb") :
  cannot open compressed file 'output_test/symphodus_melops/symphodus.melops.test/.BIOMOD_DATA/symphodus.melops.test_models/ensemble.models/ensemble.models.predictions/symphodus.melops.test_EMwmeanByTSS_mergedData_mergedRun_mergedAlgo.predictions', probable reason 'No such file or directory'

Environment Information

R version 4.2.2 (2022-10-31 ucrt)
Platform: x86_64-w64-mingw32/x64 (64-bit)
Running under: Windows 10 x64 (build 22631)

Matrix products: default

locale:
[1] LC_COLLATE=English_United Kingdom.utf8  LC_CTYPE=English_United Kingdom.utf8    LC_MONETARY=English_United Kingdom.utf8
[4] LC_NUMERIC=C                            LC_TIME=English_United Kingdom.utf8    

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
 [1] geosphere_1.5-18        fuzzySim_4.10.7         RStoolbox_0.4.0         CoordinateCleaner_3.0.1 reshape2_1.4.4          ggplot2_3.4.4          
 [7] biomod2_4.2-4           sf_1.0-15               rnaturalearthdata_0.1.0 rnaturalearth_1.0.1     terra_1.7-65            dismo_1.3-14           
[13] raster_3.6-26           sp_2.1-2                sdmpredictors_0.2.15    dplyr_1.1.4            

loaded via a namespace (and not attached):
  [1] colorspace_2.1-0       class_7.3-20           rgdal_1.6-7            rstudioapi_0.15.0      proxy_0.4-27           farver_2.1.1          
  [7] listenv_0.9.1          earth_5.3.2            prodlim_2023.08.28     fansi_1.0.6            lubridate_1.9.3        xml2_1.3.6            
 [13] codetools_0.2-18       splines_4.2.2          pkgload_1.3.4          Formula_1.2-5          modEvA_3.13.3          jsonlite_1.8.8        
 [19] mda_0.5-4              pROC_1.18.5            caret_6.0-94           rgeos_0.6-4            oai_0.4.0              compiler_4.2.2        
 [25] httr_1.4.7             Matrix_1.5-1           lazyeval_0.2.2         cli_3.6.2              tools_4.2.2            gtable_0.3.4          
 [31] glue_1.7.0             maps_3.4.2             Rcpp_1.0.12            PresenceAbsence_1.1.11 vctrs_0.6.5            nlme_3.1-160          
 [37] iterators_1.0.14       timeDate_4032.109      gower_1.0.1            exactextractr_0.10.0   stringr_1.5.1          globals_0.16.2        
 [43] maxnet_0.1.4           timechange_0.3.0       lifecycle_1.0.4        XML_3.99-0.16.1        future_1.33.1          MASS_7.3-58.1         
 [49] scales_1.3.0           ipred_0.9-14           parallel_4.2.2         TeachingDemos_2.12     rpart_4.1.19           reshape_0.8.9         
 [55] stringi_1.8.3          foreach_1.5.2          plotrix_3.8-4          randomForest_4.7-1.1   e1071_1.7-14           rgbif_3.7.9           
 [61] hardhat_1.3.0          lava_1.7.3             shape_1.4.6            rlang_1.1.3            pkgconfig_2.0.3        tidyterra_0.5.2       
 [67] lattice_0.22-5         purrr_1.0.2            labeling_0.4.3         recipes_1.0.9          tidyselect_1.2.0       parallelly_1.36.0     
 [73] gbm_2.1.9              plyr_1.8.9             magrittr_2.0.3         R6_2.5.1               generics_0.1.3         DBI_1.2.1             
 [79] pillar_1.9.0           whisker_0.4.1          withr_3.0.0            mgcv_1.8-41            units_0.8-5            survival_3.4-0        
 [85] abind_1.4-5            nnet_7.3-18            tibble_3.2.1           future.apply_1.11.1    xgboost_1.7.6.1        KernSmooth_2.23-20    
 [91] utf8_1.2.4             grid_4.2.2             data.table_1.15.0      ModelMetrics_1.2.2.2   plotmo_3.6.2           digest_0.6.34         
 [97] classInt_0.4-10        tidyr_1.3.0            stats4_4.2.2           munsell_0.5.0          glmnet_4.1-8           viridisLite_0.4.2   

Additional information

The directory in the path given in the error does exist ('output_test/symphodus.melops/symphodus.melops.test/.BIOMOD_DATA/symphodus.melops.test_models/ensemble.models/ensemble.models.predictions/'), and BIOMOD_EnsembleModeling fails only when dir.name is "output_test/symphodus.melops".
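As a quick sanity check on the paths involved, here is a minimal base-R sketch (no biomod2 calls). It takes the dir.name and the file named in the gzfile() warning, both copied from this report, and tests whether that file sits under dir.name and whether its directory exists on disk. The helper is_under() is a hypothetical name introduced for illustration only.

```r
# Paths copied from the report: the dir.name passed to BIOMOD_FormatingData
# and the file named in the gzfile() warning.
dir_name  <- "output_test/symphodus.melops"
warn_file <- paste(
  "output_test/symphodus_melops/symphodus.melops.test/.BIOMOD_DATA",
  "symphodus.melops.test_models/ensemble.models/ensemble.models.predictions",
  "symphodus.melops.test_EMwmeanByTSS_mergedData_mergedRun_mergedAlgo.predictions",
  sep = "/"
)

# Hypothetical helper (not part of biomod2): TRUE if `path` lies under `root`.
is_under <- function(path, root) {
  startsWith(path, paste0(root, "/"))
}

# Directory the file was supposed to be written into.
warn_dir <- dirname(warn_file)

cat("file under dir.name:    ", is_under(warn_file, dir_name), "\n")
cat("target directory exists:", dir.exists(warn_dir), "\n")
```

With the strings from this report, the first check prints FALSE: the warning names symphodus_melops where dir.name has symphodus.melops. Whether that difference is related to the failure is not established here; the second check depends on what is actually on disk.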

HeleneBlt commented 3 months ago

Hi!

On Windows, you may be running into a path-length issue, in which case I suggest moving your folder higher in the folder hierarchy to get a shorter path. You can also have a look at issue https://github.com/biomodhub/biomod2/issues/412, where Joost found another solution. Tell me if this does the trick!

Hélène
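The path-length suggestion above can be checked with a few lines of base R. This is a rough sketch, assuming the classic Windows MAX_PATH limit of 260 characters applies to the session and that relative paths resolve against getwd(). The helper full_path_length() is a hypothetical name, and the relative path is assembled from the directory and file name quoted in the report.

```r
# Classic Windows MAX_PATH limit (characters, including the drive prefix).
max_path <- 260L

# Hypothetical helper (not part of biomod2): length of a relative path
# once the working directory is prepended.
full_path_length <- function(rel_path, wd = getwd()) {
  nchar(file.path(wd, rel_path))
}

# File that BIOMOD_EnsembleModeling tried to write, relative to the
# working directory, using dir.name as set in the report.
rel_file <- paste(
  "output_test/symphodus.melops/symphodus.melops.test/.BIOMOD_DATA",
  "symphodus.melops.test_models/ensemble.models/ensemble.models.predictions",
  "symphodus.melops.test_EMwmeanByTSS_mergedData_mergedRun_mergedAlgo.predictions",
  sep = "/"
)

len <- full_path_length(rel_file)
cat("full path length:", len, "\n")
if (len >= max_path) {
  cat("path reaches the", max_path, "character limit; try a shorter root\n")
}
```

If the reported length is close to or above the limit, moving the project nearer to the drive root or shortening resp.name and modeling.id are the obvious levers; if it is well below, path length is probably not the cause.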