sylvainschmitt / SSDM

Stacked Species Distribution Modelling R package

Normalization issue in model projection #98

Closed CarvalhoResearch closed 4 years ago

CarvalhoResearch commented 4 years ago

Hello Sylvain,

I stacked the future environmental variables with fenv<-stack(paste("/home/eduardo/preditor/future/",list.files('/home/eduardo/preditor/future/',pattern=''),sep='')) and projected with algoritm_projection <- project(algoritm,fenv), but some of the maps fill the entire area, which looks strange to me. See the figures (the second map is the future distribution).

[Figures: Anadenanthera_colubrina, Aspidosperma_multiflorum, Aspidosperma_pyrifolium, Bauhinia_cheilantha, Bauhinia_pentandra, Caesalpinia_bracteosa, Chloroleucon_dumosum, Cnidoscolus_phyllacanthus, Combretum_leprosum, Commiphora_leptophloeos, Cordia_oncocalyx, Erythroxylum_pungens, Libidibia_ferrea, Licania_rigida, Mimosa_caesalpiniifolia, Mimosa_tenuiflora, Ziziphus_joazeiro]

lukasbaumbach commented 4 years ago

Hi, such results can have a whole bunch of different reasons (strange model extrapolation behaviour, over-/underfitted models, errors in environmental data, etc.). To be able to help you, however, we need to see at least your call to the stack/ensemble/modelling function.

Best

CarvalhoResearch commented 4 years ago

My functions use the default options for the 9 algorithms, e.g. GLM <- modelling('GLM', occ, env, Xcol = 'longitude', Ycol = 'latitude') and GBM <- modelling('GBM', occ, env, Xcol = 'longitude', Ycol = 'latitude').

The original script is attached:

run_sdm.txt

Thank you!

lukasbaumbach commented 4 years ago

From your script, it seems that you loaded your environmental variables for model training with load_var and normalization (Norm=TRUE), but simply stacked your future conditions without normalization. As a side note: since normalization rescales the training values to the 0-1 range, your unnormalized future values are probably far above the training value range. The projections above therefore already show that your models are very liberal (maybe underfitted) and seem to have a linear response (higher variable values result in higher occurrence probability). In the future, I recommend always checking your inputs before you go down the rabbit hole of modelling.
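To illustrate the mismatch, here is a minimal sketch in base R of min-max scaling that reuses the training range. It is not SSDM's internal code: the function rescale_with_training_range and the temperature values are hypothetical, and plain numeric vectors stand in for raster layers.

```r
# Min-max scaling that reuses the training range (illustrative, base R only).
# rescale_with_training_range is a hypothetical helper, not an SSDM function.
rescale_with_training_range <- function(x, train_min, train_max) {
  (x - train_min) / (train_max - train_min)
}

# Hypothetical temperature values (degrees C) for a few grid cells.
current_temp <- c(18, 22, 25, 27, 30)
future_temp  <- c(20, 24, 27, 29, 33)

# The scaling parameters come from the training (current) data only.
train_min <- min(current_temp)
train_max <- max(current_temp)

current_scaled <- rescale_with_training_range(current_temp, train_min, train_max)
future_scaled  <- rescale_with_training_range(future_temp, train_min, train_max)

# Training values span exactly 0-1 after scaling.
print(range(current_scaled))

# Raw future values lie far above that range, so a model with an
# increasing response would predict presence almost everywhere.
print(range(future_temp))

# Scaled future values are comparable to the training data; values
# above 1 flag cells where the model has to extrapolate.
print(future_scaled)
```

Whatever tool does the scaling, the point is the same: the future layers have to go through the same transformation as the training layers before they are passed to project.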

CarvalhoResearch commented 4 years ago

Very good! It really was that. Now I'm getting a projection, but another problem has arisen: the future projections are very similar to the current projections. Is it because the models are too liberal (maybe underfitted) or because the environmental variables are very similar? Another question is why the model is underfitted and what can I do to fit the model.

Thank you very much.

sylvainschmitt commented 4 years ago

Another question is why the model is underfitted and what can I do to fit the model.

Have a look at the previous issues: How improve species distribution models? #97

sylvainschmitt commented 4 years ago

Is it because the models are too liberal (maybe underfitted) or because the environmental variables are very similar?

You're the one who can answer that, not us. Maybe it's a bit of both. But you have to check the variable correlations and the fit quality yourself.
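As a starting point for that check, here is a rough sketch in base R. It uses simulated, hypothetical values in data frames named current and future (one row per grid cell, one column per variable) in place of real raster layers. It looks at two things: how strongly the predictors are correlated with each other, and how much each variable actually changes between the current and future scenarios relative to its current range.

```r
set.seed(42)  # reproducible hypothetical data
n <- 200

# Hypothetical current conditions: precipitation decreases with temperature.
temp_current <- runif(n, 18, 30)
current <- data.frame(
  temp = temp_current,
  prec = 1500 - 30 * temp_current + rnorm(n, sd = 50)
)

# Hypothetical future conditions: slightly warmer and slightly drier.
future <- data.frame(
  temp = current$temp + 0.5,
  prec = current$prec * 0.98
)

# Pairwise Pearson correlation among the predictors.
predictor_cor <- cor(current)
print(round(predictor_cor, 2))

# Mean absolute change per variable, as a fraction of its current range.
# Small values mean the two scenarios are nearly identical for that variable.
relative_shift <- sapply(names(current), function(v) {
  mean(abs(future[[v]] - current[[v]])) / diff(range(current[[v]]))
})
print(round(relative_shift, 3))
```

If the relative shifts are small, similar current and future projections are expected regardless of how well the models fit. Fit quality still has to be judged separately from the model evaluation metrics.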

sylvainschmitt commented 4 years ago

Very good! It really was that.

OK, so I'm closing this issue. By the way, thanks @lukasbaumbach for always being the first "at the frontline" for new issues.