Hi @ZoGo94, Thank you for reporting this and for filling in the issue template :pray:
The modeling and projection with categorical variables had many issues before, and we tried to improve their handling in the last update. Apparently it also broke some behaviour that used to work, sorry for that :confused:
However it will be quite hard to debug without a reproducible example, or without access to your data (raster and points), as this kind of issue is usually tightly linked to the specific formatting of the data. If possible, could you send the data in question? Here is my mail: remi.patin@univ-grenoble-alpes.fr.
Concerning the evaluations, we had an error in the previous version where validation metrics were calculated with a threshold optimized on validation data, instead of the threshold optimized on calibration data. In short, it is therefore expected to get lower validation metrics now. The point was raised in the changelog:
> validation metric calculation now properly use the calibration threshold (i.e. a threshold optimized on calibration data instead of validation data). This can lead to less optimistic threshold-dependent validation metric.
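If you want to check this on your side, here is a minimal sketch of how calibration and validation scores can be compared with `get_evaluations()` (`myBiomodModelOut` is a placeholder for your fitted model object, and the exact column names may differ between biomod2 versions):

```r
library(biomod2)

# Retrieve the evaluation table of a fitted biomod2 model (placeholder object name)
evals <- get_evaluations(myBiomodModelOut)

# Compare calibration vs validation scores for a threshold-dependent metric such as TSS
subset(evals,
       metric.eval == "TSS",
       select = c("metric.eval", "cutoff", "calibration", "validation"))
```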
Best regards, Rémi
Hi Rémi,
Many thanks for your very fast response and information!
I would be happy to send you the data, however my email provider will not let me send to your address and reports that it is no longer valid?
All the best,
Zoe
I will also add that the error seems to be intermittent - sometimes it works, sometimes it doesn't :/
Hi Zoe, That is surprising indeed, as the mail address has no mistake. You can also try to send the data through our university cloud (https://nextcloud.osug.fr/index.php/s/Wk4HybpZ7b4yRTM) - it is set up as upload only, so only I can read the data sent there.
If the error is intermittent, it may be linked to your data (presence/pseudo-absence) not sampling all factor levels, although that is supposed to be handled properly internally. I can however imagine an edge case where all factor levels are initially sampled but some are then discarded because of NA in other variables, which would then lead to an error when projecting with BIOMOD_Projection, as some factor levels would be unknown to the model.
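A rough way to check for that situation is to compare the factor levels present in the raster with the levels that remain at your points once rows with NA in any variable are dropped. A minimal sketch with terra (`myResp.xy` is a placeholder name for your presence/pseudo-absence coordinates, and `extract()` may return numeric codes rather than labels depending on how the categorical layer was built):

```r
library(terra)

# Values of all explanatory variables at the presence/pseudo-absence points
sampled <- as.data.frame(terra::extract(myExpl, myResp.xy))

# Keep only points without NA in ANY variable, i.e. those usable for calibration
sampled <- sampled[complete.cases(sampled), ]

# Factor levels present in the raster but never sampled once NA rows are removed
setdiff(freq(myExpl$SoilType)$value,
        unique(as.character(sampled$SoilType)))
```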
If it worked in a previous version, it is also possible that the variable was not treated as a categorical variable at that time.
Best, Rémi
Hi Zoe,
Thanks a lot for the data you sent, it makes it much easier for us to provide help :pray:
I corrected some additional internal issues with the management of categorical variables. If you update to the new GitHub version with `devtools::install_github('biomodhub/biomod2')`, it should hopefully work.
Note also that some cells in your rasters have NA in at least one layer, so some factor levels disappeared and the projection will be restricted to areas without any NA in the data. In your situation this excludes most urban centers. If you want projections in those areas, you have to restrict yourself to variables that have values in those places as well. Here is what the mask looks like:
```r
library(terra)

# Cells with a value in every layer become 1, cells with NA in any layer become NA
new.env.mask <- classify(as.numeric(!any(is.na(myExpl))),
                         matrix(c(0, NA,
                                  1, 1),
                                ncol = 2, byrow = TRUE))
```
and you can apply it on your environmental raster with:

```r
myExpl.masked <- mask(myExpl, new.env.mask)
```
With that, you can check the frequency of all factor levels in the data that will effectively be used by the model:

```r
freq(myExpl.masked$SoilType)
```
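The masked raster can then be used both when formatting the data and as the new environment for the projection. A minimal sketch, assuming the biomod2 4.x argument names (`myBiomodModelOut` is a placeholder for your fitted model object):

```r
# Sketch only: argument names assume biomod2 >= 4.x
myBiomodProj <- BIOMOD_Projection(bm.mod = myBiomodModelOut,
                                  proj.name = "current",
                                  new.env = myExpl.masked)
```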
Best, Rémi
Hi Rémi,
Thanks so, so much for your time and feedback on this! Really can't thank you enough!
All the best,
Zoe
I have an issue when using factor data in the modelling process. When I use the same environmental data layers (as a SpatRaster or as a RasterStack, I've tried both) for modelling, for generating response curves, and for projecting the results, it thinks there are new levels in the categorical variables even though the exact same files have been used.
Error when plotting response curves and also when projecting:
Code run:
I get the same error when using this code:
And the same error when trying to project the ensemble with the same data.
Also, I'm only getting this error since the May 2023 package update - it ran fine before. I've also noticed that the validation evaluation results indicate my models are poorer than they were before the package update.