Closed LorenzoBernicchi closed 1 year ago
Hello Lorenzo,
Thank you for reporting :pray:
Unfortunately, we are no `gam` experts. I found an (old) post related to that error:
That error message and the warning indicate that the algorithm (iteratively reweighted least squares or IRLS) for fitting the parameters cannot converge, which can be because the model is over-specified. As you see, when you remove one of the terms you do get parameter estimates and I would guess that, even then, the model will be over-specified.
So this may be an issue that arises in some models that are overparameterized (which can depend on the randomness of the data split if you use cross-validation).
In that post, they advise using the `REML` method, which might help solve your issue (or understand it, if this still fails). You can change the `gam` method as follows:
myBiomodOptions <- BIOMOD_ModelingOptions(GAM = list("method" = "REML"))
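For reference, here is a minimal sketch of what switching to REML means at the `mgcv` level, outside `biomod2`. The data are simulated and the names (`dat`, `x1`, `x2`, `fit_reml`) are illustrative assumptions; this is not meant to reproduce how `biomod2` calls `gam` internally.

```r
library(mgcv)  # recommended package, shipped with standard R installations

set.seed(42)
n  <- 200
x1 <- runif(n)
x2 <- runif(n)

# Simulated presence/absence response (illustrative only)
p   <- plogis(-1 + 3 * sin(2 * pi * x1) + x2)
y   <- rbinom(n, size = 1, prob = p)
dat <- data.frame(y = y, x1 = x1, x2 = x2)

# Same model formula, with smoothing parameters selected by REML
fit_reml <- gam(y ~ s(x1) + s(x2), family = binomial,
                data = dat, method = "REML")

fit_reml$method     # smoothing selection criterion actually used
fit_reml$converged  # whether the iterative fit converged
sum(fit_reml$edf)   # total effective degrees of freedom of the fit
```

Comparing `sum(fit_reml$edf)` with the number of presences gives a rough idea of whether the model may be over-specified for the data at hand.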
If that does not help, you might have to ask people more familiar with `gam`.
Best regards, Rémi
Dear @rpatin, thank you very much for your answer!
I tried changing the `method` in the `gam` algorithm, but nothing seems to have changed.
This time, however, the error message looks a bit different. Here it is:
Error in gam.fit3(x = X, y = y, sp = L %*% lsp3 + lsp0, Eb = Eb, UrS = UrS, :
inner loop 3; can't correct step size
In addition: Warning messages:
1: In bm_RunModelsLoop(bm.format = bm.format, weights = weights, calib.lines = calib.lines, :
Parallelisation with foreach is not available for Windows. Sorry.
2: executing %dopar% sequentially: no parallel backend registered
Error in h(simpleError(msg, call)) :
error in evaluating the argument 'object' in selecting a method for function 'predict': object 'model.bm' not found
Would you suggest that I ask someone more familiar with `gam`, as you said? May I ask where I could post this question? Maybe on stackexchange?
Thank you in advance, I wish you a good day!
Dear Lorenzo,
Thank you for the update.
You may also try using package `gam` instead of `mgcv`:
myBiomodOptions <- BIOMOD_ModelingOptions(GAM = list("algo" = "GAM_gam")) # edit fixed mistake in "GAM_gam"
If that still does not work, you can indeed try to look for help elsewhere. I would advise stackoverflow rather than stackexchange, as this is more a technical issue than a statistical question (however, I am no expert on those websites).
If you finally get an answer or solution, it would be great if you could update the issue with it.
Best regards, Rémi
Dear @rpatin,
So would you suggest that I specify `BIOMOD_ModelingOptions` like this?
myBIOMODoption <- BIOMOD_ModelingOptions(GAM = list(algo = "GAM_mgcv", method = "REML"))
Should I discard the `method` part, and should I use `GAM_gam` or `GAM_mgcv`? I am afraid I misunderstood you.
Thank you very much!
Dear Lorenzo, Just as follows:
myBiomodOptions <- BIOMOD_ModelingOptions(GAM = list("algo" = "GAM_gam"))
It would not matter if you kept `method = "REML"`, though; it would just be ignored.
Sorry for the typo in "GAM_gam"; it should be clearer now!
Best, Rémi
Thank you very much, I will try this now, and if it does not work I will ask on stackoverflow.
Thank you again, I wish you a good day!
Great, feel free to post the link to your question here. I can have a look if there are points you are not sure how to manage through `biomod2`.
Thanks, have a good day as well!
It's still me, I have a little update:
I tried to use the `GAM_gam` algorithm, but only the following appears on the console. After just a few seconds (2-3), the next algorithm starts processing, so it seems that the `gam` is not working:
-=-=-=--=-=-=- Capr.Carso_PA1_RUN2_GAM
Model=GAM
GAM_gam algorithm chosen
Moreover, after processing the last algorithm of the last run, I get this error message:
Error in { :
task 3 failed - "the namespace ‘mgcv’ is imported from ‘vegan’ thus cannot be downloaded"
In addition: Warning messages:
1: In bm_RunModelsLoop(bm.format = bm.format, weights = weights, calib.lines = calib.lines, :
Parallelisation with `foreach` is not available for Windows. Sorry.
2: executing %dopar% sequentially: no parallel backend registered
I took a look at my folders, and all the single models were correctly created except for the `GAM`.
However, I did have the `vegan` package in my R environment and I also tried to upload it, but nothing changed.
Packages `gam` and `mgcv` conflict with each other, and it is best not to load them together.
The error message is slightly misleading; what you should do is unload package `vegan`:
detach("package:vegan", unload=TRUE)
Hopefully this should work. Best, Rémi
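If it is unclear which packages keep `mgcv` loaded, a small base-R sketch like the one below can list them before detaching. The variable name `importers` is only illustrative, and the `detach` call is guarded so that the snippet also runs when `vegan` is not attached.

```r
# List the currently loaded namespaces that import mgcv
# (these are the ones that prevent mgcv from being unloaded)
importers <- Filter(
  function(pkg) "mgcv" %in% names(getNamespaceImports(pkg)),
  loadedNamespaces()
)
print(importers)

# Is mgcv itself loaded in this session?
isNamespaceLoaded("mgcv")

# Detach vegan only if it is actually attached
if ("package:vegan" %in% search()) {
  detach("package:vegan", unload = TRUE)
}
```

Restarting R and loading only `biomod2` before running the models is another way to start from a clean session.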
Dear @rpatin ,
I tried again and this time I get something a bit different.
When the `gam` started working, I could read this in my console:
-=-=-=--=-=-=- Capr.Carso_PA1_RUN10_GAM
Model=GAM
GAM_gam algorithm chosen
*** single value predicted
! Note : Capr.Carso_PA1_RUN10_GAM failed!
Do you know why this could happen?
Dear Lorenzo, Perhaps we are starting to see a way out, then. Is it happening for all the cross-validation folds, or just some of them? It would be interesting to see which fold the GAM is failing with and look at the data within that fold.
You can do something like the following to start exploring this idea:
plot(myBiomodData, calib.lines = myBiomodCV,
run = "RUN10", PA = "PA1") # use plot.type = "raster" if you have a huge dataset, this can be nicer
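Alongside the plot, counting presences in the calibration and validation parts of each fold can help spot folds with very few presences. The sketch below uses made-up data: the response vector `resp`, the number of pseudo-absences (500), the 80/20 split and the logical matrix `calib` are all hypothetical stand-ins, not the actual structure returned by `biomod2`.

```r
set.seed(1)

# Hypothetical response: 35 presences (1) and 500 pseudo-absences (0)
resp <- c(rep(1, 35), rep(0, 500))

# Hypothetical calibration matrix, one column per run:
# TRUE = point used for calibration, FALSE = point kept for validation
n_runs <- 10
calib <- sapply(seq_len(n_runs), function(i) runif(length(resp)) < 0.8)
colnames(calib) <- paste0("RUN", seq_len(n_runs))

# Number of presences falling in each part of each run
fold_summary <- data.frame(
  run             = colnames(calib),
  presences_calib = colSums(calib & resp == 1),
  presences_valid = colSums(!calib & resp == 1)
)
print(fold_summary)
```

With only a few dozen presences, some runs can end up with a handful of presences in the validation part, which makes evaluation scores very unstable.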
It happened for ALL the cross-validation runs (10 times, since I set `CV.nb.rep = 10`).
I am trying to look at what you asked, but may I ask what the `myBiomodCV` object should be?
Sorry, `myBiomodCV` was the output of either:
- `bm_CrossValidation`
- `get_calib_lines(myBiomodModel)`, with `myBiomodModel` the output of `BIOMOD_Modeling`

(depending on how you managed your cross-validation).

If that's common to all your folds, this may be easy to reproduce. I could try to have a look if you can send me the dataset (remi.patin@univ-grenoble-alpes.fr).
Are your other algorithms OK? Do they have good performance without overfitting, for instance?
I am modeling a species in a sub-region of Italy (quite small actually) using only 35 presence points of my species.
This is the result of the `plot` you asked for earlier; if this is not what you were looking for, just let me know:
However, my other algorithms do not seem OK. The `RF` is still overfitting (all the `TSS` and `AUC` values are 1; I don't know if you remember an old issue I posted last month), and maybe the `GBM` doesn't work as it should either. I attached here a table with the scores of all models.
| algo | metric.eval | cutoff | sensitivity | specificity | calibration | validation | evaluation |
| -- | -- | -- | -- | -- | -- | -- | -- |
| GLM | TSS | 478 | 83,333 | 58,886 | 0,424 | 0,413 | NA |
| GLM | ROC | 478,5 | 83,333 | 59,043 | 0,665 | 0,649 | NA |
| GBM | TSS | 514 | 100 | 98,414 | 0,984 | 0,163 | NA |
| GBM | ROC | 520,5 | 100 | 98,514 | 0,996 | 0,885 | NA |
| RF | TSS | 143 | 100 | 100 | 1 | 0 | NA |
| RF | ROC | 143 | 100 | 100 | 1 | 0,679 | NA |
| MAXENT | TSS | 487 | 79,167 | 90,771 | 0,7 | 0,354 | NA |
| MAXENT | ROC | 488,5 | 79,167 | 90,914 | 0,902 | 0,817 | NA |
| MAXNET | TSS | 488 | 75 | 90,471 | 0,655 | 0,349 | NA |
| MAXNET | ROC | 491,5 | 75 | 90,571 | 0,889 | 0,792 | NA |
| GLM | TSS | 440 | 91,667 | 57,671 | 0,495 | 0,302 | NA |
| GLM | ROC | 444,5 | 91,667 | 58,671 | 0,738 | 0,643 | NA |
| GBM | TSS | 543 | 100 | 98,671 | 0,987 | 0,166 | NA |
| GBM | ROC | 548 | 100 | 98,714 | 0,995 | 0,693 | NA |
| RF | TSS | 185 | 100 | 100 | 1 | 0 | NA |
| RF | ROC | 186 | 100 | 100 | 1 | 0,657 | NA |
| MAXENT | TSS | 216 | 95,833 | 76,986 | 0,729 | 0,039 | NA |
| MAXENT | ROC | 217,5 | 95,833 | 77,243 | 0,931 | 0,682 | NA |
| MAXNET | TSS | 213 | 100 | 74,3 | 0,745 | 0,201 | NA |
| MAXNET | ROC | 222,5 | 100 | 76,171 | 0,93 | 0,694 | NA |
| GLM | TSS | 407 | 87,5 | 65,886 | 0,535 | 0,389 | NA |
| GLM | ROC | 412,5 | 87,5 | 66,386 | 0,839 | 0,71 | NA |
| GBM | TSS | 544 | 100 | 98,814 | 0,988 | -0,012 | NA |
| GBM | ROC | 549,5 | 100 | 98,857 | 0,996 | 0,741 | NA |
| RF | TSS | 181 | 100 | 100 | 1 | 0 | NA |
| RF | ROC | 181 | 100 | 100 | 1 | 0,648 | NA |
| MAXENT | TSS | 226 | 95,833 | 77,329 | 0,732 | 0,319 | NA |
| MAXENT | ROC | 233,5 | 95,833 | 78,2 | 0,932 | 0,751 | NA |
| MAXNET | TSS | 411 | 83,333 | 88,186 | 0,716 | 0,153 | NA |
| MAXNET | ROC | 412,5 | 83,333 | 88,386 | 0,911 | 0,741 | NA |
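One simple way to read these scores is to look at the gap between calibration and validation, since a large gap suggests overfitting. The sketch below is only illustrative: it uses the TSS values of the first five rows of the table (with decimal commas converted to points), and the column name `gap` is an assumption of mine, not a `biomod2` output.

```r
# TSS scores taken from the first five TSS rows of the table
scores <- data.frame(
  algo        = c("GLM", "GBM", "RF", "MAXENT", "MAXNET"),
  calibration = c(0.424, 0.984, 1, 0.7, 0.655),
  validation  = c(0.413, 0.163, 0, 0.354, 0.349)
)

# Calibration minus validation: the larger the gap,
# the less the calibration score can be trusted
scores$gap <- scores$calibration - scores$validation

# Sort from the largest gap to the smallest
scores[order(-scores$gap), ]
```

In these rows, `RF` and `GBM` show by far the largest gaps, while `GLM` scores almost the same in calibration and validation.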
Hello everyone, I am modeling a species distribution as I have done hundreds of times before.
I am using six different algorithms, namely `GLM, GBM, GAM, RF, MAXENT and MAXNET`. I did not encounter any problems, except for the `GAM`. This algorithm takes a very long time to finish a single run (sometimes more than one hour), and in some runs I received this error: `Error in gam.fit3(x = X, y = y, sp = L %*% lsp3 + lsp0, Eb = Eb, UrS = UrS, : inner loop 3; can't correct step size`. I am using the default modeling options for every algorithm, and I have already used the same modeling procedure with the same data, but this is the first time I am facing this error.
Hope you can help me; let me know if you need additional information. Thanks in advance, best regards.
Lorenzo Bernicchi