Open amythor opened 2 days ago
Hi Amy,
Thanks a lot for this very detailed issue !
If I understand well, you want to have an ensemble model for each pseudo-absences dataset. In this case, you can use em.by = "PA"
.
The models will be first filtered with the metric and the threshold you gave (TSS > 0.7 for you ) and after will be regrouped by PA datasets.
🖼️ If you want to have an illustration of biomod2 difference merging possibilities, you can have a look at this presentation (2.b ensemble models slide 33) 👀
If it is not what you want, feel free to ask ! Hélène
Hi Hélène,
Thank you so much for your prompt response, I super appreciate the biomod2 team's work on github and elsewhere, has made this such an easier process.
I guess my main question is if I can still merge using em.by = 'all' if I assigned different PA approaches to different algorithms. I would like my end product to be one final ensemble model (and ensemble forecast to project to future conditions) and I am struggling to do that after assigning different PA numbers to different algorithms (see my formatting and modelling functions).
Is this possible to combine all at once and have one final output using this PA approach? After working on it more this morning I think it could be I am misunderstanding the output in the console but it looked to me like it was only using the RF and GBM algos for which I assigned multiple PA datasets, leaving out the GLM, GAM, and MAXENT for which I assigned a singular PA approach?
What function can I use to see which models made it into the ensemble after the filtering process? This will probably help answer my question, I think I may just be confused because of what I was seeing in the console while it was processing.
Thanks again, Amy
Hello Amy,
First, to see the models which made it into the filtering process, you can use the function get_kept_models(POSK_Ensemble)
.
It is totally possible to use em.by = "all"
with different PA datasets for different algorithms. 💪
In your case, you can first look at the kept models with the function above. If you don't have the GLM, GAM and MAXENT, my first guess will be that the filter is a little bit too selective and rejects the GLM, GAM and MAXENT models.
You can have a look at the results with get_evaluations(POSK_ModelOutput)
to see how to adjust your filter. 📑
Let's me know if it corresponds to your case, Hélène
Hey biomod2 team,
I am struggling a bit to get the ensemble forecasting down. I had originally run my models using the same PA approach for all algos and found the ensemble process easy as I used em.by = 'all', which left me with a single output map when projecting for current and future periods. However I am now trying to use a different number of PA's per algo and am having trouble getting just one final output map when using the ensemble forecasting function. I understand I need to combine by PA+run first but is there a way to combine these again into one, and also if I do this is the code also first filtering models out that do not meet the threshold of TSS > 0.7? Or are all models being included (something I do not want)?
When I use em.by = 'all' i get this message in the console (and then it only uses RF and GBM for the ensemble):
! all models available will be included in ensemble.modeling
!!! Removed models using the Full dataset as ensemble models cannot merge repetition dataset (RUN1, RUN2, ...) with Full dataset unless em.by = 'PA+run'.
..and so on...
When I use em.by = 'PA+run' i get this message in the console: ! all models available will be included in ensemble.modeling
...And so on until....
Any help is much appreciated and please let me know if you require further code or output information. I tried to see if there was another question to this effect but could not find one. If this has already been addressed, please feel free to just link that github issue page.
Thank you in advance,
Amy
See code below:
format data
POSK_formatted <- BIOMOD_FormatingData( resp.name = "Species2", resp.var = POSK_FILTERED2, expl.var = var_det_clipped, dir.name = "Species1/Outputs_PowSki", resp.xy = POSK_projected[, c('X', 'Y')], na.rm = TRUE, filter.raster = TRUE, PA.nb.rep = 5, PA.nb.absences = c(10000, 1000, 1000, 1000, 1000), PA.strategy = 'disk', PA.dist.min = 5000, PA.dist.max = 100000 )
Perform cross-validation with both presences and absences
table_cv_POSK <- bm_CrossValidation( bm.format = POSK_formatted, # Your formatted biomod2 data strategy = "kfold", # Cross-validation strategy nb.rep = 3, # Number of repetitions k = 5, # Number of folds (0 for random cross-validation) do.full.models = TRUE # Whether to build full models )
calib_summary_POSK <- summary(POSK_formatted, calib.lines = table_cv_POSK) %>% filter(dataset == "calibration")
make model
POSK_ModelOutput <- BIOMOD_Modeling( bm.format = POSK_formatted, modeling.id = "AllModels", models = c("GAM", "GBM", "GLM", "MAXENT", "RF"), models.pa = list(RF = c("PA2", "PA3", "PA4", "PA5"), GBM = c("PA2", "PA3", "PA4", "PA5"), GLM = "PA1", GAM = "PA1", MAXENT = "PA1"), CV.strategy = "user.defined", CV.user.table = table_cv_POSK, CV.do.full.models = TRUE, OPT.user = model_parameters, OPT.strategy = "user.defined", weights = NULL, prevalence = NULL, metric.eval = c("KAPPA", "TSS", "ROC", "BOYCE"), var.import = 5, scale.models = FALSE, nb.cpu = 1, seed.val = 42, do.progress = TRUE )
Ensemble
POSK_Ensemble <- BIOMOD_EnsembleModeling( bm.mod = POSK_ModelOutput, models.chosen = 'all', em.by = "PA+run", em.algo = "EMmean", metric.select = "TSS", metric.select.thresh = c(0.7), metric.select.table = NULL, metric.select.dataset = 'validation', metric.eval = c("KAPPA", "TSS", "ROC", "BOYCE"), var.import = 5, EMci.alpha = 0.05, EMwmean.decay = "proportional", nb.cpu = 1, seed.val = NULL, do.progress = TRUE, )
2011-2040 SSP1
POSK_Proj1 <- BIOMOD_Projection(bm.mod = POSK_ModelOutput, proj.name = 'SSP1_2011-2040', new.env = proj1_clipped, models.chosen = 'all', metric.binary = 'all', metric.filter = 'all', build.clamping.mask = TRUE)
Project ensmble
Project ensemble models (building single projections) - Projecting current?
POSK_Ensemble_Proj <- BIOMOD_EnsembleForecasting(bm.em = POSK_Ensemble, new.env = var_det_clipped, models.chosen = 'all', metric.binary = 'all', metric.filter = 'all', proj.name = 'CurrentEM_SSP1_2011_2040') plot(POSK_Ensemble_Proj)
Project ensemble models (from single projections) - Projecting future?
POSK_Ensemble_Proj1_Future <- BIOMOD_EnsembleForecasting(bm.em = POSK_Ensemble, bm.proj = POSK_Proj1, models.chosen = 'all', metric.binary = 'all', metric.filter = 'all', proj.name = 'Ensemble_Proj_SSP1_2011_2040') plot(POSK_Ensemble_Proj1_Future)