biomodhub / biomod2

BIOMOD is a computer platform for ensemble forecasting of species distributions, enabling the treatment of a range of methodological uncertainties in models and the examination of species-environment relationships.

Help with BIOMOD_4.2-5-2 - New "bm_ModelingOptions" function #481

Closed guimaricato closed 3 months ago

guimaricato commented 3 months ago

Hello, biomod2 community,

I've recently migrated to the new version of biomod2 and am having some trouble transitioning from the BIOMOD_ModelingOptions function to the new bm_ModelingOptions function.

Previously, I used to specify the path to maxent.jar on my computer (instead of having multiple copies in working directories), but I'm not sure how to do this now. Additionally, I'm struggling with setting the directory for the explanatory .asc variables and the maximum number of background points created. I also used to set the GAM to a maximum of 4 knots.

path.to.maxent.jar <- file.path("D:/Documents/maxent", "maxent.jar")
bm.opt <- BIOMOD_ModelingOptions(GAM = list(k = 4),
                                 MAXENT = list(path_to_maxent.jar = path.to.maxent.jar,
                                               background_data_dir = maxent.background.dat.dir,
                                               maximumbackground = 10000))

Another question I have is regarding the "bigboss" and "tuned" strategies. Is it possible to manually adjust some algorithms using "user.defined" while having others follow the "bigboss" or "tuned" strategies, for example?

Thank you very much.

HeleneBlt commented 3 months ago

Hi !

First, if you haven't already done so, have a look at the Modeling Options vignette.

For your case, I would advise using the GitHub version, biomod2 4.2-6, as it will be easier. You can install it with: devtools::install_github("biomodhub/biomod2", dependencies = TRUE)

It will look like this:

user.gam <- list('for_all_datasets' = list(k = 4))

user.maxent <- list('for_all_datasets' = list(path_to_maxent.jar = path.to.maxent.jar,
                                              background_data_dir = maxent.background.dat.dir))

user.val <- list( GAM.binary.mgcv.gam = user.gam,
                  MAXENT.binary.MAXENT.MAXENT = user.maxent)

bm.opt <- bm_ModelingOptions(data.type = 'binary',
                             models = c('GAM', 'MAXENT', 'GLM'),
                             strategy = "user.defined",
                             user.val = user.val,
                             user.base = "bigboss",
                             bm.format = myBiomodData)

Please note that we removed the maximumbackground option as it was redundant with the number of PA.

In this case, the options you haven't changed, and the options for GLM (which I added for the example), will come from user.base: here, bigboss.
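
As a rough mental model of this fallback, the sketch below uses plain R lists and utils::modifyList() to overlay user values on a base set. It is only an illustration: biomod2 stores options in its own objects, and the base values shown here are placeholders, not the actual bigboss settings.

```r
# Toy illustration only: plain lists stand in for biomod2's option objects.
# Placeholder base values, as if they had been supplied through user.base.
base.opts <- list(
  GAM = list(k = -1, method = "GCV.Cp"),
  GLM = list(type = "quadratic")
)

# Values set by the user, for GAM only.
user.opts <- list(GAM = list(k = 4))

# modifyList() replaces matching entries recursively and keeps everything else.
merged.opts <- utils::modifyList(base.opts, user.opts)

str(merged.opts)
```

Here GAM ends up with the user's k and the base method, while GLM is taken from the base unchanged.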

If you want to tune one model, you can do it beforehand:

myOpt  <- bm_ModelingOptions(data.type = 'binary',
                             models = c('GAM', 'MAXENT', 'GLM', 'RF'),
                             strategy = 'bigboss',
                             bm.format = myBiomodData)

tuned.rf <- bm_Tuning(model = 'RF',
                      tuning.fun = 'rf', ## see in ModelsTable
                      bm.options = myOpt@options$RF.binary.randomForest.randomForest,
                      bm.format = myBiomodData)

user.val <- list( GAM.binary.mgcv.gam = user.gam,
                  MAXENT.binary.MAXENT.MAXENT = user.maxent,
                  RF.binary.randomForest.randomForest = tuned.rf)

bm.opt <- bm_ModelingOptions(data.type = 'binary',
                             models = c('GAM', 'MAXENT', 'GLM' ,'RF'),
                             strategy = "user.defined",
                             user.val = user.val,
                             user.base = "bigboss",
                             bm.format = myBiomodData)
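
The names used in user.val (GAM.binary.mgcv.gam, RF.binary.randomForest.randomForest, ...) appear to follow a model.datatype.package.function pattern. Assuming that pattern holds, a small hypothetical helper (option_key is not a biomod2 function) can build them and help avoid typos:

```r
# Hypothetical helper, not part of biomod2: join the four parts with dots.
option_key <- function(model, data.type, package, fun) {
  paste(model, data.type, package, fun, sep = ".")
}

gam.key <- option_key("GAM", "binary", "mgcv", "gam")
rf.key  <- option_key("RF", "binary", "randomForest", "randomForest")

# Build a user.val-style list keyed by the generated name.
user.val <- stats::setNames(
  list(list(for_all_datasets = list(k = 4))),
  gam.key
)

print(names(user.val))
```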

Hope it helps! Don't hesitate to ask if anything is still unclear.

Hélène

guimaricato commented 3 months ago

Hey Hélène. That was very helpful and I was able to move forward following your script, thank you so much!

However, I'm still having problems. When I open the user.val file, it seems that the script is prioritizing 'args.default' for all algorithms, instead of 'args.values' (defined by me). For example, when I run bm_ModelingOptions, it says that there is no maxent.jar in getwd().

Do you know how I can “force” the script to prioritize user.defined? Thank you again!

[Three screenshots attached: arc1, arc2, arc3]

HeleneBlt commented 3 months ago

Hi !

Your code is good. I didn't realize that this error message could be confusing 😬 When you run BIOMOD_Modeling with your bm.opt, it will look at the options in 'args.values'. 'args.default' is there so you can compare (and the warning appears when we construct 'args.default').

I need to find better wording or remove it completely.

Thanks for noticing and letting me know!

Hélène

guimaricato commented 3 months ago

Hey Hélène!

When testing earlier (before sending the last message), Maxent didn't run. It seems that it actually recognized the path described in args.default, instead of following what was defined in args.values. I've now tested by manually moving maxent.jar to getwd() and Maxent runs normally (following the args.default path), so I think there's a conflict between args.values and args.default. Anyway, I was able to run it by moving maxent.jar to getwd(), so it's not a big deal. Thank you very much for your help!

I'd also like to ask you something else, but if you think it might mess up this thread, I can open another one specifically for that. I'd like to understand the arguments of data.type. I'm using binary, but I assumed it would be better to use binary.PA since I don't have true absences. However, when testing binary.PA, the following message appeared:

Error in validObject(.Object) :
  invalid class “BIOMOD.models.options” object: 1: invalid object for slot "models" in class "BIOMOD.models.options": got class "NULL", should be or extend class "character"
invalid class “BIOMOD.models.options” object: 2: invalid object for slot "options" in class "BIOMOD.models.options": got class "NULL", should be or extend class "list"

Could you please explain what they are or where I can find out more about the other data.type arguments (binary.PA, abundance, and compositional)? I looked at the website but couldn't find them. I'd also like to know what I need to change so that the above error doesn't occur.

Thank you very much for your attention and all the improvements in the biomod2.

HeleneBlt commented 3 months ago

Hello !

This error is still strange. If you want to investigate further, could you send me the error message? Also, I read too quickly at first and have now reread your earlier message: path_to_maxent.jar must be the path to the directory containing maxent.jar, not to the file itself, so here 'D:/OneDrive/Documents/R/maxent'!

But if you find a solution to run it smoothly, that's fine !

As for data.type, we got a bit ahead of ourselves. We're working on adding support for abundance data, so we created this data.type argument, but only "binary" is accepted in this version of biomod2. "binary.PA" will probably disappear, as it's a little confusing.

Hélène

guimaricato commented 3 months ago

Hey Hélène!

Sorry if my messages were confusing, I'll try to better explain and will send the error below.

My maxent.jar file is in 'D:/OneDrive/Documents/R/maxent', the default directory of my computer. That way, I don't need to duplicate maxent.jar in each work directory. So I defined this address in args.values.

Regardless of what I define in args.values, it seems that the script always recognizes the address in args.default, i.e. getwd(), since it shows a warning saying that it didn't find maxent.jar in the work directory ('D:/OneDrive/Documents/R/Scripts/23_TursiopsBiscayne/Output/P1_Forage') and doesn't run Maxent afterwards in BIOMOD_Modeling.

To solve this, I had to make a copy of maxent.jar in the current work directory. This way it's working, but I'm reporting this potential bug in case it can be fixed in a future version (if it really is a bug; it might just be a user problem haha).

Thank you! Cheers.

> maxent.background.dat.dir <- "maxent_bg"
>
> path.to.maxent.jar <- file.path("D:/OneDrive/Documents/R/maxent", "maxent.jar")
>
> user.gam <- list ('for_all_datasets' = list(k = 4))
> 
> user.maxent <- list('for_all_datasets' = list(path_to_maxent.jar = path.to.maxent.jar,
+                                               background_data_dir = maxent.background.dat.dir))
> 
> user.val <- list(GAM.binary.mgcv.gam = user.gam,
+                  MAXENT.binary.MAXENT.MAXENT = user.maxent)
> 
> bm.opt <- bm_ModelingOptions(data.type = 'binary',
+                              models = c('GLM', 'GAM', 'RF', 'GBM', 'MAXENT'),
+                              strategy = 'user.defined',
+                              user.val = user.val,
+                              user.base = 'bigboss',
+                              bm.format = bm.Tursiops)

-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-= Build Modeling Options -=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=

    >  GLM options (datatype: binary , package: stats , function: glm )...
    >  GAM options (datatype: binary , package: mgcv , function: gam )...
    >  RF options (datatype: binary , package: randomForest , function: randomForest )...
    >  GBM options (datatype: binary , package: gbm , function: gbm )...
    >  MAXENT options (datatype: binary , package: MAXENT , function: MAXENT )...

-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-= Done -=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=
Warning messages:
1: In .bm_ModelingOptions.check.args(data.type = data.type, models = models,  :
  Only one GAM model can be activated. 'GAM.mgcv.gam' has been set (other available : 'GAM.gam.gam' or 'GAM.mgcv.bam')
2: In .BIOMOD.options.default.check.args(mod, typ, pkg, fun) :
  'maxent.jar' file is missing in current working directory (D:/OneDrive/Documents/R/Scripts/23_TursiopsBiscayne/Output/P1_Forage).
It must be downloaded (https://biodiversityinformatics.amnh.org/open_source/maxent/) and put in the working directory.

> {TursiopsModel <- BIOMOD_Modeling(bm.Tursiops,
+                                   modeling.id = "model_Tursiops",
+                                   models = c("GLM", "GAM", "RF", "GBM", "MAXENT"),
+                                   bm.options = bm.opt,
+                                   CV.strategy = "kfold",
+                                   CV.nb.rep = 10,
+                                   CV.k = 5,
+                                   CV.perc = 0.7,
+                                   CV.do.full.models = F,
+                                   prevalence = 0.5,
+                                   var.import = 3,
+                                   metric.eval = c("TSS", "ROC"))
+ 
+ beepr::beep(8)}

-=-=-=--=-=-=- Species_PA1_RUN1_MAXENT 
Warning in bm_RunModel(model = list.data[[ii]]$modi, run.name = list.data[[ii]]$run.name,  :
  'maxent.jar' file is missing in specified directory (D:/OneDrive/Documents/R/Scripts/23_TursiopsBiscayne/Output/P1_Forage).
It must be downloaded (https://biodiversityinformatics.amnh.org/open_source/maxent/) and put in that directory.

*** Error in MAXENT, no executable file
*** inherits(g.pred,'try-error')
   ! Note :  Species_PA1_RUN1_MAXENT failed!

HeleneBlt commented 3 months ago

Hello again!

I see what is going on here! (You really are finding all the gaps in our documentation!)

So:

  1. Your path must be: path.to.maxent.jar <- "D:/OneDrive/Documents/R/maxent" and not
    path.to.maxent.jar <- file.path("D:/OneDrive/Documents/R/maxent", "maxent.jar")

  2. Please use the argument OPT.user and not bm.options. The latter is deprecated. We have kept it so that old scripts still run, but we will remove it completely at some point. (In fact, maybe I should have removed it in version 4.2-6 😅) So you just have to replace bm.options with OPT.user.
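
To catch the first point early, a small check in base R can confirm that the directory actually contains maxent.jar before the options are built. maxent_dir below is a hypothetical helper, not part of biomod2, and the temporary folder only stands in for a real Maxent installation:

```r
# Hypothetical helper (not part of biomod2): accept either a directory or a
# path ending in .jar, and return the directory that holds maxent.jar.
maxent_dir <- function(path) {
  if (grepl("\\.jar$", path, ignore.case = TRUE)) {
    path <- dirname(path)
  }
  if (!file.exists(file.path(path, "maxent.jar"))) {
    stop("maxent.jar not found in: ", path)
  }
  path
}

# Demonstration with a temporary directory and an empty placeholder file.
demo.dir <- file.path(tempdir(), "maxent_demo")
dir.create(demo.dir, showWarnings = FALSE)
file.create(file.path(demo.dir, "maxent.jar"))

dir.from.file <- maxent_dir(file.path(demo.dir, "maxent.jar"))
dir.from.dir  <- maxent_dir(demo.dir)
```

Both calls return the same directory, which is the value to pass as path_to_maxent.jar.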

Sorry for all the unclear information! Documentation is a big part of the job, and it can be hard to think of everything!

Hope everything will run smoothly now ! Hélène

guimaricato commented 3 months ago

Hi Hélène. It worked, thank you!

As you commented above, the warning continued but the model ran normally!

I'll probably have more questions soon 😂, but that's all for now. I know these updates are a lot of hard work, but the package is getting even better. Thanks to you and the whole team for that.

Cheers!

guimaricato commented 3 months ago

Hi, it's me again! 😅

Today I went back to run the script but couldn't. I didn't make any changes to the script I had run (previous message); I just updated the dev version (from 4.2-6 to 4.2-6-1). I checked my 'bm.opt' object and all the algorithms have both '_allData_allRun' and 'for_all_datasets', as well as '_PA1_allRun, _PA2_allRun, ...'. Any thoughts on what might be happening?

Thank you!

> path.to.maxent.jar <- 'D:/OneDrive/Documents/R/maxent'
> 
> user.gam <- list ('for_all_datasets' = list(k = 4))
> 
> user.maxent <- list('for_all_datasets' = list(path_to_maxent.jar = path.to.maxent.jar,
+                                               background_data_dir = maxent.background.dat.dir))
> 
> user.val <- list(GAM.binary.mgcv.gam = user.gam,
+                  MAXENT.binary.MAXENT.MAXENT = user.maxent)
> 
> bm.opt <- bm_ModelingOptions(bm.format = bm.Tursiops,
+                              data.type = 'binary',
+                              models = c('GLM', 'GAM', 'RF', 'GBM', 'MAXENT'),
+                              strategy = 'user.defined',
+                              user.val = user.val,
+                              user.base = 'bigboss')

-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-= Build Modeling Options -=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=

    >  GLM options (datatype: binary , package: stats , function: glm )...
    >  GAM options (datatype: binary , package: mgcv , function: gam )...
    >  RF options (datatype: binary , package: randomForest , function: randomForest )...
    >  GBM options (datatype: binary , package: gbm , function: gbm )...
    >  MAXENT options (datatype: binary , package: MAXENT , function: MAXENT )...

-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-= Done -=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=
Warning messages:
1: In .bm_ModelingOptions.check.args(data.type = data.type, models = models,  :
  Only one GAM model can be activated. 'GAM.mgcv.gam' has been set (other available : 'GAM.gam.gam' or 'GAM.mgcv.bam')
2: In .BIOMOD.options.default.check.args(mod, typ, pkg, fun) :
  'maxent.jar' file is missing in current working directory (D:/OneDrive/Documents/R/Scripts/23_TursiopsBiscayne/Output/P1_Forage).
It must be downloaded (https://biodiversityinformatics.amnh.org/open_source/maxent/) and put in the working directory.
> 
> {TursiopsModel <- BIOMOD_Modeling(bm.format = bm.Tursiops,
+                                   modeling.id = 'model_Tursiops',
+                                   models = c('GLM', 'GAM', 'RF', 'GBM', 'MAXENT'),
+                                   OPT.user = bm.opt,
+                                   CV.strategy = 'kfold',
+                                   CV.nb.rep = 10,
+                                   CV.k = 5,
+                                   CV.perc = 0.7,
+                                   CV.do.full.models = F,
+                                   prevalence = 0.5,
+                                   var.import = 3,
+                                   metric.eval = c('TSS', 'ROC'))
+ 
+ beepr::beep(8)}

-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-= Build Single Models -=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=

Checking Models arguments...

    > Automatic weights creation to rise a 0.5 prevalence
Creating suitable Workdir...

Checking Cross-Validation arguments...

   > k-fold cross-validation selection
Error in BIOMOD_Modeling(bm.format = bm.Tursiops, modeling.id = "model_Tursiops",  : 

names(OPT.user@options[['GLM.binary.stats.glm']]@args.values) must be '_PA1_RUN1', '_PA1_RUN2', '_PA1_RUN3', '_PA1_RUN4', '_PA1_RUN5', '_PA1_RUN6', '_PA1_RUN7', '_PA1_RUN8', '_PA1_RUN9', '_PA1_RUN10', '_PA1_RUN11', '_PA1_RUN12', '_PA1_RUN13', '_PA1_RUN14', '_PA1_RUN15', '_PA1_RUN16', '_PA1_RUN17', '_PA1_RUN18', '_PA1_RUN19', '_PA1_RUN20', '_PA1_RUN21', '_PA1_RUN22', '_PA1_RUN23', '_PA1_RUN24', '_PA1_RUN25', '_PA1_RUN26', '_PA1_RUN27', '_PA1_RUN28', '_PA1_RUN29', '_PA1_RUN30', '_PA1_RUN31', '_PA1_RUN32', '_PA1_RUN33', '_PA1_RUN34', '_PA1_RUN35', '_PA1_RUN36', '_PA1_RUN37', '_PA1_RUN38', '_PA1_RUN39', '_PA1_RUN40', '_PA1_RUN41', '_PA1_RUN42', '_PA1_RUN43', '_PA1_RUN44', '_PA1_RUN45', '_PA1_RUN46', '_PA1_RUN47', '_PA1_RUN48', '_PA1_RUN49', '_PA1_RUN50', '_PA2_RUN1', '_PA2_RUN2', '_PA2_RUN3', '_PA2_RUN4', '_PA2_RUN5', '_PA2_RUN6', '_PA2_RUN7', '_PA2_RUN8', '_PA2_RUN9', '_PA2_RUN10', '_PA2_RUN11', '_PA2_RUN12', '_PA2_RUN13', '_PA2_RUN14', '_PA2_RUN15', '_PA2_RUN16', '_PA2_RUN17', 

HeleneBlt commented 3 months ago

Hi !!

I'm sorry: we were trying to write better error messages, and I didn't realize it would cause this error afterwards. 🙈 I just pushed a commit; it should work now!
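
For context, the names listed in the error message follow a _PA<i>_RUN<j> pattern, and the 50 runs per pseudo-absence set are consistent with CV.nb.rep = 10 times CV.k = 5 in the call above. The base R sketch below rebuilds such names; dataset_names is a hypothetical helper, the number of PA sets (2) is an assumption since the message is truncated, and biomod2's own handling of 'for_all_datasets' may differ:

```r
# Hypothetical helper, not part of biomod2: build names like _PA1_RUN1.
dataset_names <- function(n.pa, n.run) {
  # expand.grid() varies its first column fastest, so runs cycle within each PA set.
  grid <- expand.grid(run = seq_len(n.run), pa = seq_len(n.pa))
  paste0("_PA", grid$pa, "_RUN", grid$run)
}

n.run <- 10 * 5  # CV.nb.rep * CV.k, as in the BIOMOD_Modeling call above
ds.names <- dataset_names(n.pa = 2, n.run = n.run)  # n.pa = 2 is an assumption

# Repeat one set of options under every dataset name (illustration only).
gam.opts <- list(k = 4)
per.dataset <- stats::setNames(rep(list(gam.opts), length(ds.names)), ds.names)

print(head(ds.names, 3))
```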

Sorry again 🌸

Hélène

guimaricato commented 3 months ago

Hey Hélène :)

I ignored the Maxent warning as you suggested and it worked perfectly at the time 🙏, but this time I was referring to an error message that appeared when BIOMOD_Modeling could not run. I remember that I had managed to run the script without any problems, so I went back to it (after updating to v4.2-6-1) and this error happened.

Cheers!

Error in BIOMOD_Modeling(bm.format = bm.Tursiops, modeling.id = "model_Tursiops", : 

names(OPT.user@options[['GLM.binary.stats.glm']]@args.values) must be...

HeleneBlt commented 3 months ago

Hello,

I'm a little confused! Does this morning's new version (still called v4.2-6-1, even though I made a small change) not fix your error, or do you have a new one? Could you try reinstalling biomod2 to test? 🙏

Hélène

guimaricato commented 3 months ago

I'm sorry, I didn't realize you released a new version this morning. Now it's running! Hopefully, it'll be a while before I have to come back and bother you again haha. Merci!

HeleneBlt commented 3 months ago

No problem at all! It's perfectly normal to open an issue for this kind of case! Glad to have someone who reacts quickly so that the error doesn't stay online for long! 💪