biomodhub / biomod2

BIOMOD is a computer platform for ensemble forecasting of species distributions, enabling the treatment of a range of methodological uncertainties in models and the examination of species-environment relationships.
85 stars 22 forks source link

BIOMOD_PresenceOnly error #106

Closed chenyongpeng1 closed 2 years ago

chenyongpeng1 commented 2 years ago

Dear,all: When I run these codes,it is error!

Evaluate models with Boyce index and MPA

myBiomodPO <- BIOMOD_PresenceOnly(bm.mod = myBiomodModelOut, bm.em = myBiomodEM) myBiomodPO

Evaluate models with Boyce index and MPA (using background data)

myBiomodPO <- BIOMOD_PresenceOnly(bm.mod = myBiomodModelOut, bm.em = myBiomodEM, bg.env = getValues(myExpl)) myBiomodPO

HOW TO solve it?

cheers,chen

rpatin commented 2 years ago

Hello Chen, Did you try to run the code from the function examples ? Unfortunately I could not reproduce the error on my end. But we have made some correction to this function in the past few days. Could you:

  1. update the package with: devtools::install_github("biomodhub/biomod2")
  2. re-run your example to confirm that you still have an error with the new package version.
  3. If you still have an error, please make sure to send us a reproducible examples to help us understand the error you encounter.

Cheers, Rémi

chenyongpeng1 commented 2 years ago

Dear biomod team:

Thank you for your time and help!I tried to use the new package "biomod4.1“.And I used my own species data,the code is as follows.

  1. My thinned species have 10000 present record(species,lat,long),no absence.so I randomly selecct 10000 Pseudo-absences points. If I add human distribution,land use,and light environment data ,my TSS is uder 0.6,even 0.5.So how to improve my tss? 2.> BIOMOD_PresenceOnly is error! image 3.If I want to konw the ssp245 ensemble model myBiomodRangeSize,How should I select CurrentProj and FutureProj, image I remember the old version select ### name.model_EMcaByTSS_mergedAlgo_mergedRun_mergedData_TSSbin.img,but in this version I can not it.

Thank you very much,cheers chen library(devtools) devtools::install_github("biomodhub/biomod2", dependencies = TRUE) library(biomod2) library(ggplot2) library(gridExtra) library(raster) library(rasterVis) library(lattice) library(sp) library(spThin) setwd("E:/Apis_9.24")

DataSpecies <- read.csv('apis_thinned.csv')

data(DataSpecies) head( DataSpecies )

Successfully running rnorm function

thinned_dataset_full <- thin( loc.data = DataSpecies, lat.col = "latitude", long.col = "longitude", spec.col = "species", thin.par =20, reps = 1, locs.thinned.list.return = TRUE, write.files = TRUE, max.files = 5, out.dir = "apis_thinned_full/", out.base = "apis_thinned", write.log.file = TRUE, log.file = "apis_thinned_full_log_file.txt" )

head(DataSpecies)

myRespName <- 'apis mellifera' DataSpecies <- data.frame(DataSpecies, occ=1) myResp <- as.numeric(DataSpecies[, "occ"])

myRespXY <- DataSpecies[, c('longitude', 'latitude')]

bio_3<- raster('D:/bio/bioclim/bioclim/bio_3.tif')

bio_8<- raster('D:/bio/bioclim/current/bio_8.tif')

bio_9<- raster('D:/bio/bioclim/current/bio_9.tif')

bio_10<- raster('D:/bio/bioclim/current/bio_10.tif')

bio_11<- raster('D:/bio/bioclim/current/bio_11.tif')

bio_1 <-raster("bioclim/bio_1.tif") bio_2 <-raster("bioclim/bio_2.tif") bio_3 <-raster("bioclim/bio_3.tif") bio_4 <-raster("bioclim/bio_4.tif") bio_5 <-raster("bioclim/bio_5.tif") bio_6 <-raster("bioclim/bio_6.tif") bio_7 <-raster("bioclim/bio_7.tif") bio_8 <-raster("bioclim/bio_8.tif") bio_9 <-raster("bioclim/bio_9.tif") bio_10 <-raster("bioclim/bio_10.tif") bio_11 <-raster("bioclim/bio_11.tif") bio_12 <-raster("bioclim/bio_12.tif") bio_13 <-raster("bioclim/bio_13.tif") bio_14 <-raster("bioclim/bio_14.tif") bio_15 <-raster("bioclim/bio_15.tif") bio_16 <-raster("bioclim/bio_16.tif") bio_17 <-raster("bioclim/bio_17.tif") bio_18 <-raster("bioclim/bio_18.tif") bio_19 <-raster("bioclim/bio_19.tif") elev<-raster("bioclim/elev.tif")

land_use<-raster("E:/land_use/LULC-files/Historical/LULC_850_2015/LULC_2015__CMIP6.1.tif")

light<-raster("E:/land_use/Harmonized_DN_NTL_2017_Light.tif") human<-raster("E:/land_use/hii-global-geo-grid/hii_v2geo1.tif") print("area1 range raster loaded")

范围一致性

extent(land_use) <- extent(bio_1)

重采样进行修改土地利用与bio_1的的dimension,resolution.

land_use <- projectRaster(land_use,bio_1,method = 'ngb')

light

extent(light) <- extent(bio_1)

light<- projectRaster(light,bio_1,method = 'bilinear')

human

extent(human) <- extent(bio_1)

human<- projectRaster(human,bio_1,method = 'bilinear')

myExpl <- stack(c(bio_1,bio_2,bio_12,bio_17,bio_11,bio_4,bio_18,elev))

plot(myExpl)

myBiomodData <- BIOMOD_FormatingData(resp.var = myResp, expl.var = myExpl, resp.xy = myRespXY, resp.name = myRespName) myBiomodData plot(myBiomodData)

Pseudo-absences extraction

Transform true absences into potential pseudo-absences

myResp.PA <- ifelse(myResp == 1, 1, NA)

Format Data with pseudo-absences : random method

myBiomodData.r <- BIOMOD_FormatingData(resp.var = myResp.PA, expl.var = myExpl, resp.xy = myRespXY, resp.name = myRespName, PA.nb.rep = 2, PA.nb.absences = 10000, PA.strategy = 'random') plot(myBiomodData.r)

Create the different validation datasets

myBiomodCV <- BIOMOD_CrossValidation(bm.format = myBiomodData.r) head(myBiomodCV) myBiomodOptions <- BIOMOD_ModelingOptions()

myBiomodModelOut <- BIOMOD_Modeling(bm.format = myBiomodData.r, modeling.id = 'AllModels', bm.options = myBiomodOptions, nb.rep = 2, data.split.perc = 80, metric.eval = c('TSS','ROC'), var.import = 3, do.full.models = FALSE, nb.cpu = 8) myBiomodModelOut

Get evaluation scores & variables importance

get_evaluations(myBiomodModelOut) get_variables_importance(myBiomodModelOut, as.data.frame = TRUE)

Represent evaluation scores & variables importance

bm_PlotEvalMean(bm.out = myBiomodModelOut) bm_PlotEvalBoxplot(bm.out = myBiomodModelOut, group.by = c('algo', 'run')) bm_PlotVarImpBoxplot(bm.out = myBiomodModelOut, group.by = c('expl.var', 'algo', 'algo')) bm_PlotVarImpBoxplot(bm.out = myBiomodModelOut, group.by = c('expl.var', 'algo', 'dataset')) bm_PlotVarImpBoxplot(bm.out = myBiomodModelOut, group.by = c('algo', 'expl.var', 'dataset'))

Represent response curves

bm_PlotResponseCurves(bm.out = myBiomodModelOut, models.chosen = get_built_models(myBiomodModelOut)[c(1:3, 12:14)], fixed.var = 'median') bm_PlotResponseCurves(bm.out = myBiomodModelOut, models.chosen = get_built_models(myBiomodModelOut)[c(1:3, 12:14)], fixed.var = 'min') bm_PlotResponseCurves(bm.out = myBiomodModelOut, models.chosen = get_built_models(myBiomodModelOut)[9], fixed.var = 'median', do.bivariate = TRUE)

Ensemble models

Model ensemble models

myBiomodEM <- BIOMOD_EnsembleModeling(bm.mod = myBiomodModelOut, models.chosen = 'all', em.by = 'all', metric.select = c('TSS'), metric.select.thresh = c(0.6), var.import = 3, metric.eval = c('TSS', 'ROC'), prob.mean = TRUE, prob.median = TRUE, prob.cv = TRUE, prob.ci = TRUE, prob.ci.alpha = 0.05, committee.averaging = TRUE, prob.mean.weight = TRUE, prob.mean.weight.decay = 'proportional') myBiomodEM

Get evaluation scores & variables importance

get_evaluations(myBiomodEM, as.data.frame = TRUE) get_variables_importance(myBiomodEM, as.data.frame = TRUE)

Represent evaluation scores & variables importance

bm_PlotEvalMean(bm.out = myBiomodEM, group.by = 'model') bm_PlotEvalBoxplot(bm.out = myBiomodEM, group.by = c('model', 'model')) bm_PlotVarImpBoxplot(bm.out = myBiomodEM, group.by = c('expl.var', 'model', 'model')) bm_PlotVarImpBoxplot(bm.out = myBiomodEM, group.by = c('expl.var', 'model', 'dataset')) bm_PlotVarImpBoxplot(bm.out = myBiomodEM, group.by = c('model', 'expl.var', 'dataset'))

Represent response curves

bm_PlotResponseCurves(bm.out = myBiomodEM, models.chosen = get_built_models(myBiomodEM)[c(1, 6, 7)], fixed.var = 'median') bm_PlotResponseCurves(bm.out = myBiomodEM, models.chosen = get_built_models(myBiomodEM)[c(1, 6, 7)], fixed.var = 'min') bm_PlotResponseCurves(bm.out = myBiomodEM, models.chosen = get_built_models(myBiomodEM)[7], fixed.var = 'median', do.bivariate = TRUE)

Evaluate models with Boyce index and MPA( Do Presence-Only Evaluation)

myBiomodPO <- BIOMOD_PresenceOnly(bm.mod = myBiomodModelOut, bm.em = myBiomodEM) myBiomodPO

Evaluate models with Boyce index and MPA (using background data)

myBiomodPO <- BIOMOD_PresenceOnly(bm.mod = myBiomodModelOut, bm.em = myBiomodEM, bg.env = getValues(myExpl)) myBiomodPO myBiomodProj <- BIOMOD_Projection(bm.mod = myBiomodModelOut, proj.name = 'Current', new.env = myExpl, models.chosen = 'all', metric.binary = 'all', metric.filter = 'all', build.clamping.mask = TRUE) myBiomodProj plot(myBiomodProj)

Project ensemble models (from single projections)

myBiomodEMProj <- BIOMOD_EnsembleForecasting(bm.em = myBiomodEM, bm.proj = myBiomodProj, models.chosen = 'all', metric.binary = 'all', metric.filter = 'all')

Project ensemble models (building single projections)

myBiomodEMProj <- BIOMOD_EnsembleForecasting(bm.em = myBiomodEM, proj.name = 'CurrentEM', new.env = myExpl, models.chosen = 'all', metric.binary = 'all', metric.filter = 'all') myBiomodEMProj plot(myBiomodEMProj)

Load environmental variables extracted from BIOCLIM (bio_3, bio_4, bio_7, bio_11 & bio_12)

myFiles = paste0('external/bioclim/future/bio', c(3, 4, 7, 11, 12), '.grd')

myExplFuture = raster::stack(system.file(myFiles, package = 'biomod2'))

ssp245_2030_bio_1 <-raster("E:/Apis_9.24/2030_ssp245_ensemble/bio_1.tif") ssp245_2030_bio_2 <-raster("E:/Apis_9.24/2030_ssp245_ensemble/bio_2.tif") ssp245_2030_bio_4 <-raster("E:/Apis_9.24/2030_ssp245_ensemble/bio_4.tif") ssp245_2030_bio_11 <-raster("E:/Apis_9.24/2030_ssp245_ensemble/bio_11.tif") ssp245_2030_bio_12 <-raster("E:/Apis_9.24/2030_ssp245_ensemble/bio_12.tif") ssp245_2030_bio_17 <-raster("E:/Apis_9.24/2030_ssp245_ensemble/bio_17.tif") ssp245_2030_bio_18 <-raster("E:/Apis_9.24/2030_ssp245_ensemble/bio_18.tif") myExplFuture_ssp245_2030 <- stack(c(ssp245_2030_bio_1,ssp245_2030_bio_2,ssp245_2030_bio_4, ssp245_2030_bio_11,ssp245_2030_bio_12,ssp245_2030_bio_17,ssp245_2030_bio_18,elev))

print("2030 ssp245 BIOMOD_EnsembleForcasting complete")

Project onto future conditions

myBiomodProjectionFuture <- BIOMOD_Projection(bm.mod = myBiomodModelOut, proj.name = 'Future', new.env = myExplFuture_ssp245_2030, models.chosen = 'all', metric.binary = 'TSS', build.clamping.mask = TRUE) myBiomodProjectionFuture plot(myBiomodProjectionFuture)

Project ensemble models (from single projections)

myBiomodEMProj_ssp245_2030 <- BIOMOD_EnsembleForecasting(bm.em = myBiomodEM, bm.proj = myBiomodProjectionFuture, models.chosen = 'all', metric.binary = 'all', metric.filter = 'all')

Project ensemble models (building single projections)

myBiomodEMProj_ssp245_2030 <- BIOMOD_EnsembleForecasting(bm.em = myBiomodEM, proj.name = 'CurrentEM', new.env = myExplFuture_ssp245_2030, models.chosen = 'all', metric.binary = 'all', metric.filter = 'all')

myBiomodEMProj_ssp245_2030 plot(myBiomodEMProj)

Load current and future binary projections

CurrentProj <- stack("apis.mellifera/proj_Current/proj_Current_apis.mellifera_ensemble_TSSbin.gri") FutureProj <- stack("apis.mellifera/proj_Future/proj_Future_apis.mellifera_ensemble_TSSbin.gri")

Compute differences

myBiomodRangeSize <- BIOMOD_RangeSize(proj.current = CurrentProj, proj.future = FutureProj) myBiomodRangeSize$Compt.By.Models plot(myBiomodRangeSize$Diff.By.Pixel)

Represent main results

gg = bm_PlotRangeSize(bm.range = myBiomodRangeSize, do.count = TRUE, do.perc = TRUE, do.maps = TRUE, do.mean = TRUE, do.plot = TRUE, row.names = c("Species", "Dataset", "Run", "Algo"))

rpatin commented 2 years ago

Dear Chen, A few answers to your questions:

  1. Low model performance may have many non-exclusive explanations, each associated with different questions:

    • Important explanatory variable may be missing. What do current knowledge on the species' ecology tells ?
    • Presence data may suffer from many collection bias that can hinder model performance. How were data collected ?
    • Selection method for pseudo-absences have ecological assumptions associated with them. Learn more with videos available on biomod2 website https://github.com/biomodhub/biomod2 and especially the one on biomod2 specificity : https://www.youtube.com/watch?v=hEjhdURRy3o
    • The study scale may have important consequences on results. How was chosen your extent ?
    • Performance may differ a lot across models for many reasons (more appropriate model, default configuration is sometimes inefficient, too many or not enough pseudo-absences ...). Have all models low performance ? Or do some of them perform better ?
  2. Did you reinstalled the biomod2 package as suggested ? Thanks for the code. Unfortunately we could not reproduce the error. Could you send us your data or run your code with a public dataset (e.g. Gulo Gulo) so that we may understand your error.

  3. You may access all Ensemble models in the folder you showed. They are stacked within the proj_CurrentEM_GuloGulo_ensemble_TSSbin.grd (or the equivalent name for your project). Alternatively you may access individual ensemble models (such as EMca) in the folder individual_projections.

Hope this helps, Cheers, Rémi

chenyongpeng1 commented 2 years ago

Thank you for your help and reply

MayaGueguen commented 2 years ago

Hello Chen,

Thanks for reporting :pray: I just made an update correcting the error you found with BIOMOD_PresenceOnly : hopefully it is completely corrected now :heavy_check_mark:

Sorry for the inconvenience, Maya