ConsBiol-unibern / SDMtune

Performs variable selection and model tuning for Species Distribution Models (SDMs). It also provides several utilities to display results.
https://consbiol-unibern.github.io/SDMtune/

[Bug]: doJk freezes when a continuous variable has many zeros #21

Closed arnanaraza closed 1 year ago

arnanaraza commented 2 years ago

Describe the bug

Hello Sergio,

I've been using SDMtune for a while (thanks for that!), training all the available model types so that I can ensemble them afterwards. I run a jackknife test for each model, but SDMtune::doJk seems to freeze when one of the continuous variables has many zeros and the model is a random forest.

Please see the snippet below, where SDMtune::doJk freezes at 6%.

Thanks, Arnan

Steps to reproduce the bug

library(SDMtune)

# Acquire environmental variables (example rasters shipped with dismo)
files <- list.files(path = file.path(system.file(package = "dismo"), "ex"),
                    pattern = "grd", full.names = TRUE)
predictors <- raster::stack(files)

# Modify one raster to have many zeros (still a continuous variable)
ras_zero <- predictors[[1]]
ras_zero  # print the layer summary
ras_zero[ras_zero < 285] <- 0
predictors[[1]] <- ras_zero
raster::plot(predictors[[1]])

# Presence-absence data
p_coords <- virtualSp$presence
a_coords <- virtualSp$absence

# Create SWD object
data <- prepareSWD(species = "Virtual species", p = p_coords, a = a_coords,
                   env = predictors)

# Cross-validation and jackknife test
folds <- randomFolds(data, k = 10, only_presence = FALSE, seed = 25)
model <- SDMtune::train("RF", data, folds = folds, ntree = 500)
doJk(model, metric = "auc")  # stops at 6%

Session information

R version 4.2.1 (2022-06-23 ucrt)
Platform: x86_64-w64-mingw32/x64 (64-bit)
Running under: Windows 10 x64 (build 19044)

Matrix products: default

Random number generation:
 RNG:     Mersenne-Twister 
 Normal:  Inversion 
 Sample:  Rounding 

locale:
[1] LC_COLLATE=English_United Kingdom.utf8  LC_CTYPE=English_United Kingdom.utf8   
[3] LC_MONETARY=English_United Kingdom.utf8 LC_NUMERIC=C                           
[5] LC_TIME=English_United Kingdom.utf8    

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
[1] SDMtune_1.1.6 sp_1.5-0     

loaded via a namespace (and not attached):
 [1] progress_1.2.2       tidyselect_1.1.2     terra_1.6-17        
 [4] purrr_0.3.4          sf_1.0-8             lattice_0.20-45     
 [7] colorspace_2.0-3     vctrs_0.4.1          generics_0.1.3      
[10] utf8_1.2.2           rlang_1.0.6          e1071_1.7-11        
[13] pillar_1.8.1         glue_1.6.2           DBI_1.1.3           
[16] foreach_1.5.2        lifecycle_1.0.3      stringr_1.4.1       
[19] munsell_0.5.0        gtable_0.3.1         raster_3.6-3        
[22] codetools_0.2-18     class_7.3-20         fansi_1.0.3         
[25] Rcpp_1.0.9           KernSmooth_2.23-20   scales_1.2.1        
[28] classInt_0.4-7       ggplot2_3.3.6        hms_1.1.2           
[31] stringi_1.7.8        dplyr_1.0.10         dismo_1.3-9         
[34] grid_4.2.1           rgdal_1.5-32         cli_3.4.1           
[37] tools_4.2.1          magrittr_2.0.3       proxy_0.4-27        
[40] tibble_3.1.8         randomForest_4.7-1.1 pacman_0.5.1        
[43] crayon_1.5.1         pkgconfig_2.0.3      ellipsis_0.3.2      
[46] prettyunits_1.1.1    assertthat_0.2.1     rstudioapi_0.14     
[49] iterators_1.0.14     R6_2.5.1             units_0.8-0         
[52] compiler_4.2.1

Additional information

No response

sgvignali commented 1 year ago

Hi, thank you for reporting the issue.

I believe this is not a problem with SDMtune. The same script runs without freezing with all the other modelling algorithms (i.e. BRT, ANN, Maxent, Maxnet), so the problem seems to be caused by RF, and there is nothing I can do about it. In addition, the script works if you run it without cross-validation.

Please also note that SDMtune no longer supports raster and has moved to terra, so you would have to change your code slightly. Moreover, the variable biome should be treated as a factor.
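As a rough illustration of these suggestions, here is a minimal sketch of how the reported script might look after moving to terra. It assumes a terra-based SDMtune release (newer than the 1.1.6 listed in the session information) with dismo and terra installed, declares biome as categorical through the categorical argument of prepareSWD, and runs the jackknife without cross-validation, the configuration reported above as working. It has not been checked against the exact setup of the report.

```r
library(SDMtune)

# Load the example predictors shipped with dismo as a terra SpatRaster
files <- list.files(path = file.path(system.file(package = "dismo"), "ex"),
                    pattern = "grd", full.names = TRUE)
predictors <- terra::rast(files)

# Modify the first layer to have many zeros, as in the original report
ras_zero <- predictors[[1]]
ras_zero[ras_zero < 285] <- 0
predictors[[1]] <- ras_zero

# Create the SWD object, declaring biome as a categorical variable
data <- prepareSWD(species = "Virtual species",
                   p = virtualSp$presence, a = virtualSp$absence,
                   env = predictors, categorical = "biome")

# Train a random forest without cross-validation and run the jackknife test
model <- train(method = "RF", data = data, ntree = 500)
jk <- doJk(model, metric = "auc")
```

If this version completes, passing folds from randomFolds back into train would be the next thing to try in order to isolate the cross-validation case.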

I am closing this issue because it is not related to SDMtune.