biomodhub / biomod2

BIOMOD is a computer platform for ensemble forecasting of species distributions, enabling the treatment of a range of methodological uncertainties in models and the examination of species-environment relationships.

Help with categorical data in biomod2 #464

Closed JurSeuren closed 3 weeks ago

JurSeuren commented 1 month ago

Hi everyone, I am currently using biomod2 to model the distribution of a group of reptile species in the Netherlands. For these models, I am also trying to incorporate some categorical data (like land use). I have made sure the categorical layers match the climatic variables in extent, resolution and CRS, and the models were computed without any issues. However, when I try to project the models onto a new set of (the same) environmental variables, I get the following error:

Error in checkForRemoteErrors(val) : 2 nodes produced errors; first error: task 1 failed - "factor land use has new levels pastureland"

My guess is that this particular level of the land use variable was somehow not sampled during model fitting and is therefore not recognized by the model when projecting. Does anyone know of a way to ensure that all levels of the categorical input data are sampled before I project my models?

Thanks in advance for any suggestions!
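
For readers hitting the same "has new levels" error: the underlying check is a set difference between the levels seen during fitting and the levels present in the projection data. The snippet below is a minimal, library-independent sketch in Python, not biomod2 code. The function name and all land use values other than "pastureland" are hypothetical and chosen only for illustration.

```python
def unseen_levels(training_values, projection_values):
    """Return levels present in the projection data but absent from the training data."""
    return sorted(set(projection_values) - set(training_values))


# Hypothetical land use values; only "pastureland" comes from the error message.
training = ["forest", "urban", "forest", "heathland"]
projection = ["forest", "pastureland", "urban", "heathland"]

missing = unseen_levels(training, projection)
print(missing)
```

An empty result would mean every level in the projection data was also available at fitting time; any level listed is one the fitted model has never seen.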

MayaGueguen commented 1 month ago

Hello there,

Could you check two things please? :eyes:

From the message you get, it seems that the pastureland level is missing from the variable you supplied as input for model fitting. Normally, if your variable is correctly specified as categorical, biomod2 should manage by itself to sample all available levels.

:warning: If your error still occurs, do not hesitate to give me more information about your variables and your code :pray:

Maya

JurSeuren commented 1 month ago

Hi, First of all: thank you very much for your reply! Regarding the points you raise:

The thing I find weird is that the data I used in BIOMOD_FormatingData() is very similar to the data for BIOMOD_Projection(). The underlying data is actually the same; I just cut some areas out of the input data for BIOMOD_FormatingData() so that these areas could be used for a case study on model performance later on. Using the bm_SampleFactorLevels() function, I also made sure that each level of the categorical data present in the complete dataset would still be represented in the data that was supplied to BIOMOD_FormatingData().
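
The idea of keeping every level represented in a subset can be sketched as follows. This is an illustrative Python sketch of the general technique only, not the implementation of bm_SampleFactorLevels(). The function name, the land use values and the sampled indices are all hypothetical.

```python
def indices_covering_levels(values, already_sampled):
    """Return extra indices so that every level in `values` appears at least once."""
    # Levels already covered by the existing sample.
    seen = {values[i] for i in already_sampled}
    extra = []
    for i, level in enumerate(values):
        if level not in seen:
            # First occurrence of a level that is not yet covered: add it.
            seen.add(level)
            extra.append(i)
    return extra


# Hypothetical categorical values for five cells and an initial sample of three.
land_use = ["forest", "urban", "pastureland", "forest", "heathland"]
sampled = [0, 1, 3]

extra = indices_covering_levels(land_use, sampled)
covered = {land_use[i] for i in sampled + extra}
print(extra, covered == set(land_use))
```

Adding the returned indices to the original sample guarantees that each level occurs at least once in the data used for fitting.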

Thanks again for your help!

Jur

MayaGueguen commented 1 month ago

Hello Jur,

Haha indeed, I would have been a bit disappointed with variable and level bodemsoort and dijk :smile:

Hence, I will need more information to be able to help you. Could you please send me the output of:

? :pray:

Maya

JurSeuren commented 1 month ago

Hi Maya,

Thanks again for your reply. I no longer run into this issue; simply rerunning the models seems to have done the trick. Are you still interested in the output you mentioned, or should we leave it at this?

Best, Jur

MayaGueguen commented 3 weeks ago

If you have fixed it, cool! Do not hesitate to come back if you encounter a similar error again.

Maya