jamiemkass / ENMeval

R package for automated runs and evaluations of ecological niche models.
https://jamiemkass.github.io/ENMeval/

ENMeval: error in task 5 failed - "cannot evaluate a model without absence and presence data that are not NA" #121

Closed PetiteTong closed 2 years ago

PetiteTong commented 2 years ago

Hi,

I'm struggling with an error message from ENMevaluate. I found that Patrick hit the same bug in 2019; I referred to that issue but still have the problem. I used the function "get.checkerboard" and ran extract() on its output, and all the values come back as NA. But when I plot occs.grp and bg.grp, all the points fall inside the area with values, and I also checked the points in ArcGIS. I don't know how to deal with it; maybe I did something wrong without realizing it. I hope you can help. Thanks!

## extracting at the occurrence points works
envs_stack <- raster::mask(envs_stack, background) %>% raster::stack()
envs_stack$soil <- raster::as.factor(envs_stack$soil)
occs.cells <- raster::extract(envs_stack, occs, cellnumbers = TRUE)
occs.cells
## no NA values
cb2 <- get.checkerboard2(occs, envs_stack, bg, aggregation.factor = c(5, 5), gridSampleN = 10000)
occs.cells1 <- raster::extract(envs_stack, cb2$occs.grp)
occs.cells1
## shows NA values
occs.cells2 <- raster::extract(envs_stack, cb2$bg.grp)
## shows NA values
evalplot.grps(pts = occs, pts.grp = cb2$occs.grp, envs = envs_stack)
## points are in the right places on the map
evalplot.grps(pts = bg, pts.grp = cb2$bg.grp, envs = envs_stack)
## points are in the right places on the map

e.user <- ENMevaluate(occs, envs_stack, bg, algorithm = "maxnet", tune.args = tune.args,
                      partitions = "checkerboard2", doClamp = FALSE, parallel = TRUE, numCores = 20)

## error in task 5 failed - "cannot evaluate a model without absence and presence data that are not NA"
## bug about NA

Thanks, Petite Tong

jamiemkass commented 2 years ago

Not sure what the problem with your ENMevaluate error is, but cb2$occs.grp should give you back a numeric vector that tells you which group each point is in, not the coordinates of the occurrence points. If you do something like occs[cb2$occs.grp == 1,], you'll get all the occurrence coordinates of the records that are in group 1. If you do extract() this way, do you still see NAs?
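The indexing described above can be sketched in base R. This is only an illustration: the coordinates and the `occs.grp` vector below are made up and stand in for the real `occs` and `cb2$occs.grp`.

```r
# Hypothetical occurrence coordinates (not from this thread).
occs <- data.frame(
  longitude = c(100.1, 100.2, 100.3, 100.4, 100.5, 100.6),
  latitude  = c(25.1, 25.2, 25.3, 25.4, 25.5, 25.6)
)

# Hypothetical partition vector standing in for cb2$occs.grp:
# one group number per row of occs, not a set of coordinates.
occs.grp <- c(1, 2, 3, 4, 1, 2)

# Coordinates of the records assigned to group 1.
occs.grp1 <- occs[occs.grp == 1, ]

# Number of records in each partition group.
grp.counts <- table(occs.grp)
```

With the real data, the subset of coordinates (here `occs.grp1`) is what would go to `raster::extract(envs_stack, ...)`, not the group vector itself. `raster::extract()` treats a plain numeric vector as cell numbers, so group values such as 1 to 4 are probably being looked up as the first few cells of the raster, which may well be NA after masking.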

PetiteTong commented 2 years ago

Thanks a lot. I tried the example again and realized I had misunderstood cb2$occs.grp, just as you said. So I wonder: if the rasters in envs_stack have the same cell size and the same extent, but their NA cells along the edges do not match, can ENMevaluate still work well? Could this be a problem for running ENMevaluate?

jamiemkass commented 2 years ago

Not sure I understand what you mean, but if you have occurrence points that land on the edge of a land grid cell and the other adjacent ones are ocean, for example, you'll need to either physically move it closer to the centroid of the land grid cell (i.e., fixing the georeferencing) or parameterize extract() to return cell values by "touching" instead of by proximity to centroid. I am pretty sure the extract() function in the terra package (kind of like an update for raster) can do this.

PetiteTong commented 2 years ago

Hello, thanks a lot. When I checked the model and the predictors by bisection, I found that there was no problem with the model itself, but adding the soil type (a categorical variable) to the predictors produces an error. I have marked this variable with as.factor and converted it to a data frame to check the NA values; it has the same number of NA values as the other variables. Why does adding this variable cause an error? Its categories are numeric codes, such as "23120201". How should I handle this layer? Have you met this problem before?

jamiemkass commented 2 years ago

Using categorical variables does not cause errors for ENMevaluate(), so that by itself shouldn't cause issues. There might be problems if some partition groups are missing categories or have different ones from the training data. If you have many categories this could be the case. But the initial error you had made it seem that the input occurrences all had NA when using extract(). Are you still getting this error? Or is this a different issue?

PetiteTong commented 2 years ago

Yes, I still get this error. It only goes away when I leave out the soil variable.

PetiteTong commented 2 years ago

Yes, maybe some partition groups are missing categories or have different ones from the training data. I want to use the block or checkerboard2 method. If that is the cause, does that mean I can only use the k-fold method?

jamiemkass commented 2 years ago

I think there would be an issue if the training data is missing a category that the validation data has. This usually shouldn't happen if the categories are spread fairly evenly across the study extent. However, if you do spatial cross-validation and one of the blocks is the only area with some land use category, for example, the model trained on the other land uses cannot make a prediction for the validation data. I haven't tested this, but I'm pretty sure it would result in an error. If you try random k-fold, do you avoid the error?
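One way to check for this situation before fitting is to compare, for each partition group, the categories in the validation fold against those in the remaining folds. The base R sketch below uses made-up soil codes and a made-up group vector; with real data, `soil` would be the category extracted at each point and `grp` the partition vector (for example `cb2$occs.grp`). It looks only at occurrence records and ignores background points, which also contribute to the training data.

```r
# Hypothetical soil codes at six occurrence points.
soil <- c("23120201", "23120201", "23120202",
          "23120202", "23120203", "23120201")

# Hypothetical partition groups, one per point.
grp <- c(1, 1, 2, 2, 3, 3)

# For each group k, find categories that occur in the validation fold (group k)
# but never in the training folds (all other groups).
groups <- sort(unique(grp))
unseen <- lapply(groups, function(k) {
  setdiff(unique(soil[grp == k]), unique(soil[grp != k]))
})
names(unseen) <- groups

# Groups whose validation data contain at least one unseen category.
problem.folds <- names(unseen)[lengths(unseen) > 0]
```

If `problem.folds` is not empty for a given partitioning method, another method or merging rare categories may be worth trying.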

PetiteTong commented 2 years ago

Yes, I will try the randomkfold and checkerboard2 methods again.

PetiteTong commented 2 years ago

block doesn't fit my data, but checkerboard2 is OK! I will use it. Thanks a lot.

jamiemkass commented 2 years ago

Excellent -- please let me know if you have any issues.