DoseResponse / drc

Fitting dose-response models in R
https://doseresponse.github.io/drc/
21 stars 16 forks source link

mselect results not consistent with modelFit - LOF only 0 #31

Open hreinwald opened 1 year ago

hreinwald commented 1 year ago

Dear drc-Team,

for some reason, mselect() only returns 0 values for the LOF (Lack of fit) when comparing multiple models. However, when I am running modelFit() I get p values clearly above the sign. threshold of 0.05.

To reproduce the error I used the ryegrass dataset as shown below:

> # Fitting a four-parameter log-logistic model
> ryegrass.LL.4 <- drm(rootl ~ conc, data = ryegrass, fct = LL.4())
> modelFit(ryegrass.LL.4) # LOF pvalue of 0.9451 <- this is NOT identical with the results of mselect
Lack-of-fit test

          ModelDf    RSS Df F value p value
ANOVA          17 5.1799                   
DRC model      20 5.4002  3  0.2411  0.8665

> AIC(ryegrass.LL.4) # IC = 41.82703 <- this is identical with the results of mselect
[1] 42.31029

> m.ls = list(LL.5(),LL.3(),W2.4(),W1.4())
> mselect(ryegrass.LL.4, m.ls)
        logLik       IC Lack of fit   Res var
W2.4 -15.91352 41.82703           0 0.2646283
LL.4 -16.15514 42.31029           0 0.2700107
LL.5 -15.87828 43.75656           0 0.2777393
W1.4 -17.46720 44.93439           0 0.3012075
LL.3 -18.60413 45.20827           0 0.3153724

From my understanding of the mselect() and modelFit() function the computation of the Lack of fit values is based on a more general ANOVA model. Hence it should return similar p-values, shouldn't it?

I am using drc package version 3.2.0 and R version 4.2.1 on a Windows System.

OnofriAndreaPG commented 1 year ago

With mselect() you can get LOFs only if you add the argument 'nested = T'. See the code below:

> ryegrass.LL.4 <- drm(rootl ~ conc, data = ryegrass, fct = LL.4())
> modelFit(ryegrass.LL.4) 
Lack-of-fit test

          ModelDf    RSS Df F value p value
ANOVA          17 5.1799                   
DRC model      20 5.4002  3  0.2411  0.8665
> summary(ryegrass.LL.4)$resVar 
[1] 0.2700107
> AIC(ryegrass.LL.4)
[1] 42.31029
> logLik(ryegrass.LL.4)
'log Lik.' -16.15514 (df=5)
> 
> mselect(ryegrass.LL.4, list(LL.5(),LL.3(),W2.4(),W1.4()), 
+         nested = T)
        logLik       IC Lack of fit   Res var Nested F test
W2.4 -15.91352 41.82703   0.9450713 0.2646283    0.03645316
LL.4 -16.15514 42.31029   0.8664830 0.2700107            NA
LL.5 -15.87828 43.75656   0.8538476 0.2777393    0.51346021
W1.4 -17.46720 44.93439   0.4505676 0.3012075           NaN
LL.3 -18.60413 45.20827   0.3531679 0.3153724    0.11555973
Warning message:
In pf(testStat, dfDiff[2], df2) : NaNs produced

Considering the output, the LOFs are OK, while it is more difficult to understand what the nested F tests are. Indeed, nested F tests make sense only for the LL.3() and LL.5() models, which are, indeed, nested with LL.4() (I mean, LL3 is nested within LL.4 and LL.4 is nested within LL.5). Look at the following code:

> ryegrass.LL.3 <- drm(rootl ~ conc, data = ryegrass, fct = LL.3())
> ryegrass.LL.4 <- drm(rootl ~ conc, data = ryegrass, fct = LL.4())
> ryegrass.LL.5 <- drm(rootl ~ conc, data = ryegrass, fct = LL.5())
> anova(ryegrass.LL.4, ryegrass.LL.5)

1st model
 fct:      LL.4()
2nd model
 fct:      LL.5()

ANOVA table

          ModelDf    RSS Df F value p value
1st model      20 5.4002                   
2nd model      19 5.2770  1  0.4435  0.5135
> anova(ryegrass.LL.5, ryegrass.LL.3)

1st model
 fct:      LL.3()
2nd model
 fct:      LL.5()

ANOVA table

          ModelDf    RSS Df F value p value
2nd model      21 6.6228                   
1st model      19 5.2770  2  2.4227  0.1156
> anova(ryegrass.LL.3, ryegrass.LL.4)

1st model
 fct:      LL.3()
2nd model
 fct:      LL.4()

ANOVA table

          ModelDf    RSS Df F value p value
1st model      21 6.6228                   
2nd model      20 5.4002  1   4.528   0.046

I do not know what the nested F test is for the model W2.4(), which is not nested with any other models...