lbbe-software / nlstools

Tools for Nonlinear Regression Analysis
6 stars 1 forks source link

Help with calculating CI from an nls model #4

Closed trosendal closed 2 years ago

trosendal commented 2 years ago

Hi,

We are running a regression on some data that gives strange results since the confidence interval of the decay parameter includes 0 which results in a negative half life. We are likely misunderstanding something about the model specification or the estimates from the confint2 function. I wonder if it is obvious to you how the following is invalid?

library(nlstools)
df <- data.frame(week = c(0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L,
                          1L, 1L, 1L, 1L, 1L, 1L),
                 count = c(1.3e+09, 9e+08, 4e+09, 1.7e+09, 1.2e+09,
                           3e+09, 1.58e+09, 1.6e+09, 3e+09, 1e+09,
                           1100000, 2e+06, 1700000, 4e+06, 1400000,
                           4e+06))

## taking log to get starting values for nls
model_lm <- lm(log(count)~week, data = df)

## extract the coefficients and to convert values back to non-log
intercpt <- exp(model_lm$coefficients[1])
coeff<- model_lm$coefficients[2]

## Run the nls model for exponential decay
model <- nls(count ~ b0 * exp(-b1 * week),
             start = list(b0 = intercpt,
                          b1 = coeff),
             trace=T,
             data = df)

## Point estimate of decay parameter
b1 <- coef(model)[2]

## 95% CI for the decay parameter
low <- confint2(model)[2,1]
high <- confint2(model)[2,2]

## What is the half life ?
log(2)/b1
log(2)/high
log(2)/low
mldelignette commented 2 years ago

Your example is an example of many things we should never do ! See following comments added in your initial script :

library(nlstools)
df <- data.frame(week = c(0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L,
                          1L, 1L, 1L, 1L, 1L, 1L),
                 count = c(1.3e+09, 9e+08, 4e+09, 1.7e+09, 1.2e+09,
                           3e+09, 1.58e+09, 1.6e+09, 3e+09, 1e+09,
                           1100000, 2e+06, 1700000, 4e+06, 1400000,
                           4e+06))

## taking log to get starting values for nls
(model_lm <- lm(log(count)~week, data = df))
plot(log(count)~week, data = df)

# BE CAREFUL !
# it is not reasonable at all to fit a model with only
# two different values of the independent variable

plot(count ~ week, data = df)
# BE CAREFUL !
# non linear regression assumes a Gaussian error model
# with constant variance !
# It is clearly not the case on counts without log tranformation

## extract the coefficients and to convert values back to non-log
(intercpt <- exp(model_lm$coefficients[1]))
(coeff <- model_lm$coefficients[2])
# BE CAREFUL, you miss - : b1 is positive in the model you wrote below

## Run the nls model for exponential decay
model <- nls(count ~ b0 * exp(-b1 * week),
             start = list(b0 = intercpt,
                          b1 = coeff),
             trace=T,
             data = df)
# NOT REASONABLE AT ALL due to 
# non respect of the Gaussian error model 
# You should always look at the residuals !
plot(nlsResiduals(model))

## Point estimate of decay parameter
(b0 <- coef(model)[1])
(b1 <- coef(model)[2])

## 95% CI for the decay parameter
(low <- confint2(model)[2,1])
(high <- confint2(model)[2,2])

# The default option of confint2( ) is the asymptotic calculation
# of confidence intervals, which cannot give reasonable results
# in case of strong violation of conditions of use on non
# linear regression, which is the case here !

# If you had more than two conditions for week, I would recommend
# you to stay in linear regression, for the respect of the
# conditions on the error model