padpadpadpad / nls.multstart

Non-linear regression in R allowing multiple start parameters
18 stars 4 forks source link

Changed the AIC to AICc #19

Closed MEladawi closed 1 year ago

MEladawi commented 1 year ago

I changed the AIC score to compare the models to AICc as AIC is restricted to data with large sample size, while AICc can work with any sample size.

mathesong commented 1 year ago

Hi Mahmoud,

Thanks so much for the PR, and for your contribution to this package!

Just a comment: I'm also relatively new to GitHub and its etiquette, and I'm certainly still learning, but there are a few things which are typically done. In this case, if one wishes to submit a PR, it's customary to first raise an issue about the changes that one thinks would be valuable, and to ask as part of that issue whether one could submit a PR to fix it. That way, the package maintainers can tell you that it is either a great suggestion and that they would like you to do so, or that someone else is working on something similar, or that the addition is of low priority and that they would prefer not to modify the package in that way.

Firstly, this PR involves the addition of the MuMIn package to the dependencies list in order to use the MuMIn::AICc function. This is all fine and well, except for the fact that in this case, and please correct me if I am wrong here, I think that using either the AIC, or the AICc, would yield identical results. If I understand correctly:

$$ AIC(M) = -2l + 2p $$

and

$$ AICc(M) = -2l + 2p + 2\frac{2p(p+1)}{n - p - 1} $$

where $M$ is the model, $l$ is the log-likelihood, $p$ is the number of parameters, and $n$ is the number of data points.

For the use case in nls.multstart, we are fitting the same model (i.e. the same $p$) multiple times to the same data (i.e. the same $n$), but with different starting parameters. As such, the only parameter which will differ between runs will be the log-likelihood, $l$. Hence

$$ \Delta AIC = -2 \Delta l $$

and

$$ \Delta AICc = -2 \Delta l $$

Do please tell me if I am missing something, but if I am correct, then I think it would preferable to maintain our use of the default stats package function instead of importing another package.

All the best, Granville