Status: Closed (closed by scrameri 3 years ago)
Hi Simon, thanks for reporting the problem.
The error occurs because predict.randomForest() expects a factor for the variable biome, but in the raster stack object the variable is numeric.
Please refer to a previous issue for the explanation: https://github.com/ConsBiol-unibern/SDMtune/issues/8.
As you can read in the other issue, it is possible to solve the problem by passing the argument factors to the predict() function. However, this was not possible for modelReport(), so I have added the new argument factors to it.
Please install the GitHub version and let me know if this solves the problem.
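
For readers landing here, the type mismatch described above can be illustrated with base R alone. This is only a minimal sketch: the data frames train and newdata and their values are made up for illustration, and the loop shows the general idea of recoding numeric raster values with the training factor levels, not SDMtune's actual internals.

```r
# Hypothetical training data: biome is stored as a factor
train <- data.frame(bio1  = c(10.2, 12.5, 8.1, 15.0),
                    biome = factor(c(1, 3, 3, 7)))

# Hypothetical values extracted from a raster layer: biome comes back numeric
newdata <- data.frame(bio1 = c(11.0, 9.4), biome = c(3, 7))

# Named list of factor levels, one element per categorical predictor
# (the same shape as the list passed to the factors argument in this thread)
factors <- lapply(train[vapply(train, is.factor, logical(1))], levels)

# Recode each categorical column using the levels seen during training
for (v in names(factors)) {
  newdata[[v]] <- factor(newdata[[v]], levels = factors[[v]])
}

str(newdata)
```

After the recoding, newdata$biome is a factor with the same levels as train$biome, which is the form a model trained on a factor predictor expects.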
Hi Sergio,
Thanks very much for implementing the factors argument in modelReport(), it works!
By sampling many background points, I made sure that all factor levels of categorical.predictors are represented in the training dataset and RF model. Using SDMtune version 1.1.3.9000 and the code below, all types of predictors and all the factor levels match up. One has to make sure that the factors argument only contains elements (with factor levels) for variables used in the model. This also works with an empty named list (i.e. when no categorical predictors are used in the model).
Best wishes, Simon
> categorical.predictors
[1] "eco2017" "geology" "vegetation"
> used <- names(model@data@data)
> f <- lapply(as.list(data@data[,categorical.predictors]), levels)[used[used %in% categorical.predictors]]
> f # here, geology was not used in the model, and was removed from the list before executing modelReport()
$eco2017
[1] "1" "2" "3" "4" "5" "6" "7"
$vegetation
[1] "1" "2" "3" "4" "5" "6" "7" "9" "10" "11" "12" "13" "14" "15" "16" "19" "22" "23" "25"
> modelReport(model = model, folder = folder, test = test, type = NULL,
              response_curves = FALSE, only_presence = TRUE,
              jk = FALSE, env = predictors[[used]], clamp = TRUE, permut = 10,
              factors = f)
── Model Report - method: RF ──────────────────────────────────── chermezonii ──
✓ Saving files...
✓ Plotting ROC curve...
✓ Computing thresholds...
✓ Predicting distribution map...
✓ Computing variable importance...
✓ Writing model settings...
> sessionInfo()
R version 4.0.2 (2020-06-22)
Platform: x86_64-apple-darwin17.0 (64-bit)
Running under: macOS Catalina 10.15.7
Matrix products: default
BLAS: /System/Library/Frameworks/Accelerate.framework/Versions/A/Frameworks/vecLib.framework/Versions/A/libBLAS.dylib
LAPACK: /Library/Frameworks/R.framework/Versions/4.0/Resources/lib/libRlapack.dylib
locale:
[1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8
attached base packages:
[1] stats graphics grDevices utils datasets methods base
other attached packages:
[1] leaflet_2.0.3 ConR_1.3.0 plotROC_2.2.1 ggplot2_3.3.3 kableExtra_1.3.1
[6] raster_3.4-5 SDMtune_1.1.3.9000 sp_1.4-4
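
The construction of the factors list in the transcript above can be tried out on a toy data frame. This is a hedged sketch, not the code that was actually run: swd_data stands in for data@data, its values are invented, and intersect() with single-bracket indexing is a variation on the original one-liner, chosen because it keeps the data frame structure even when only one categorical predictor is used.

```r
# Hypothetical stand-in for data@data, with three categorical predictors
swd_data <- data.frame(
  bio1       = c(10.2, 12.5, 8.1, 15.0),
  eco2017    = factor(c(1, 2, 2, 7)),
  geology    = factor(c("a", "b", "a", "c")),
  vegetation = factor(c(4, 9, 10, 4))
)
categorical.predictors <- c("eco2017", "geology", "vegetation")

# Variables retained in the (hypothetical) final model; geology was dropped
used <- c("bio1", "eco2017", "vegetation")

# Keep only categorical predictors that are used in the model
keep <- intersect(used, categorical.predictors)
f <- lapply(swd_data[keep], levels)
f

# With no categorical predictors in the model the result is an empty list
used_numeric_only <- "bio1"
f_none <- lapply(swd_data[intersect(used_numeric_only, categorical.predictors)], levels)
length(f_none)
```

The list f contains one character vector of levels per categorical predictor in the model, and nothing for geology.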
Dear Sergio et al.,
I've been trying out SDMtune, and I really like the streamlined analysis approach, visual feedback, and the genetic algorithm for reducing the hyperparameter search space. Good job!
Today I experimented with different model methods, and everything works fine so far with the Maxnet, Maxent, BRT and ANN methods. However, there is an issue with the RF method; see the bug report below.
The same error appears using my own data, after variable selection, hyperparameter tuning and model parsimony optimization. The error message suggests that predict.randomForest() cannot handle the passed argument newdata, but I couldn't figure out what happens. Am I doing something wrong? Any help would be warmly appreciated.
Many thanks and best wishes from Zurich, Simon
Describe the bug
modelReport() with the RF method cannot write the predicted distribution map using the default virtualSp dataset.
To Reproduce
Expected behavior
The modelReport() function is expected to run through using various model methods.
Add here the error message:
Additional Context