biomodhub / biomod2

BIOMOD is a computer platform for ensemble forecasting of species distributions, enabling the treatment of a range of methodological uncertainties in models and the examination of species-environment relationships.

Question about tuning Maxent #516

Open ls320 opened 2 days ago

ls320 commented 2 days ago

Hello, I am trying to tune a Maxent model but I am not sure how it should be done. In my code below I specified a 5-fold CV in BIOMOD_Modeling, but the message says a 10-fold CV is used. How is the training done in this process? Which CV setting is used?

Also, I didn't provide a range for the model parameters to be tuned, but the code still runs. Does that mean the model was not tuned, or is there a default setting?

Thank you very much

code

data=BIOMOD_FormatingData(resp.name = 'SP1',resp.var=data_final$pr_ab,expl.var=data_final[c("wc2.1_2.5m_bio_1",'wc2.1_2.5m_bio_8')], PA.strategy='user.defined',
                            resp.xy=vd_final[c('lon','lat')], PA.user.table=sp_pseudo_table,PA.nb.rep=5)
model_output=BIOMOD_Modeling(data,models=c('MAXENT'),CV.strategy = 'kfold',
                             CV.nb.rep=1,CV.k=5,var.import=100,OPT.strategy='tuned')

message of BIOMOD_Modeling

-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-= Build Single Models -=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=

Checking Models arguments...

    > Automatic weights creation to rise a 0.5 prevalence
Creating suitable Workdir...

Checking Cross-Validation arguments...

   > k-fold cross-validation selection

-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-= Build Modeling Options -=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=

    >  MAXENT options (datatype: binary , package: MAXENT , function: MAXENT )...
 No optimization of formula for MAXENT
        > Dataset _PA1_RUN1
            > Tuning parameters...Package ecospat is not installed, so Continuous Boyce Index (CBI) cannot be calculated.
*** Running initial checks... ***

* Variable values were input along with coordinates and not as raster data, so no raster predictions can be generated and AICc is calculated with background data for Maxent models.
* Model evaluations with random 10-fold cross validation...

*** Running ENMeval v2.0.4 with maxnet from maxnet package v0.1.4 ***

  |                                                                            |   0%
MayaGueguen commented 1 day ago

Hello there :wave:

Maxent tuning is a bit specific: it is done through the ENMevaluate function from the ENMeval package.

We set partitions = "randomkfold" and partition.settings = list(kfolds = 10), which explains why you see the message:

Model evaluations with random 10-fold cross validation...

This random cross-validation is applied to each of your PA x CV datasets. Here you have 5 pseudo-absence datasets x 5 k-fold datasets = 25 models to run, and tuning is computed for each of them.
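As a rough illustration of that count, the PA x CV combinations can be enumerated in base R. This is only a sketch: the labels copy the `_PA1_RUN1` pattern from the log above, the object names are made up for the example, and any extra models biomod2 may build beyond these combinations are not counted.

```r
# Values taken from the call in the question.
n_pa  <- 5  # PA.nb.rep = 5 pseudo-absence datasets
n_run <- 5  # CV.nb.rep = 1 repetition x CV.k = 5 folds

# One row per PA x CV combination (RUN varies fastest).
datasets <- expand.grid(RUN = seq_len(n_run), PA = seq_len(n_pa))

# Hypothetical labels, mimicking the "_PA1_RUN1" naming seen in the log.
datasets$name <- sprintf("_PA%d_RUN%d", datasets$PA, datasets$RUN)

n_datasets <- nrow(datasets)
print(head(datasets$name, 3))
print(n_datasets)
```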

Otherwise, yes, the tuned parameters cannot be changed by the user; they are set to tune.args = list(rm = seq(0.5, 1, 0.5), fc = c("L")). If you want finer control over the tuning, run the ENMevaluate function yourself, then pass your tuned parameter values to the biomod2 functions using the OPT.strategy = "user.defined" option.
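To make the size of that default grid concrete, here is a minimal sketch in base R, followed by one possible way to call ENMevaluate directly with a wider grid. The wider grid, the object names and the wrapper function `tune_directly` are assumptions for illustration only. The wrapper is defined but not run here, and it expects `occs` and `bg` to be data frames with longitude and latitude in the first two columns followed by the predictor values.

```r
# Grid quoted in the reply: two regularization multipliers, linear features only.
default_args <- list(rm = seq(0.5, 1, 0.5), fc = c("L"))
default_grid <- expand.grid(default_args, stringsAsFactors = FALSE)
print(default_grid)  # two candidate settings: rm = 0.5 and rm = 1, both with fc = "L"

# Hypothetical wider grid (illustrative values, not a recommendation).
wide_args <- list(rm = seq(0.5, 4, 0.5), fc = c("L", "LQ", "H", "LQH"))
wide_grid <- expand.grid(wide_args, stringsAsFactors = FALSE)
print(nrow(wide_grid))  # 8 multipliers x 4 feature classes

# Hypothetical wrapper around ENMeval::ENMevaluate (requires ENMeval >= 2.0).
# The partition settings mirror the ones quoted in the reply.
tune_directly <- function(occs, bg, tune.args) {
  ENMeval::ENMevaluate(
    occs = occs,
    bg = bg,
    tune.args = tune.args,
    algorithm = "maxnet",
    partitions = "randomkfold",
    partition.settings = list(kfolds = 10)
  )
}
```

The selected settings would then still have to be translated into the option names that biomod2 expects for MAXENT; that mapping is not shown here.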

:eyes: I see that you set var.import = 100. I suggest lowering it to 10, or it will take waaaaaaaaay too long.

Hope it helps, Maya

ls320 commented 1 day ago

Maya, thank you for your explanation and advice! Does it mean that, for each run, a 10-fold CV by ENMevaluate is first applied to the training folds I set, in order to tune the model? Or is it applied to the whole training + validation folds?

Thanks

MayaGueguen commented 21 hours ago

Yes, exactly! For example, the PA1_RUN1 dataset is first split into 10 folds and tuning is done with that cross-validation (but see the ENMevaluate function for more details). The same goes for each of the other datasets.
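The nesting described here can be sketched in base R. This is only an illustration under stated assumptions: it assumes that the PA1_RUN1 dataset means the calibration rows of that run (all outer folds except the held-out one), and the row count and object names are made up. It does not reproduce how biomod2 or ENMeval actually assign folds.

```r
set.seed(42)

n <- 200  # hypothetical number of rows in one pseudo-absence dataset

# Outer split: 5 folds, as requested with CV.k = 5.
outer_fold <- sample(rep(1:5, length.out = n))

# Assumption: RUN1 holds out outer fold 1 and calibrates on the rest.
calib_rows <- which(outer_fold != 1)

# Inner split: the calibration rows are divided again into 10 random folds,
# which is where the tuning cross-validation would take place.
inner_fold <- sample(rep(1:10, length.out = length(calib_rows)))

print(length(calib_rows))
print(table(inner_fold))
```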

Maya