[BUGZILLA #1187] anova.glm and explicit contrast matrices

MichaelChirico commented 4 years ago

From: Douglas Bates <bates@<::CENSORED -- SEE ORIGINAL ON BUGZILLA::>>

anova.glm does not calculate the degrees of freedom properly when an explicit contrast has been set on a factor and the contrast has fewer than (length(levels(thisfactor)) - 1) columns.

> load("/p/stat/course/st849-bates/public/slides/figs/src/viet.rda")
> fm1 <- glm(accuracy ~ proficiency * task + proficiency/learner, viet,
+    family = binomial())
> anova(fm1)
Analysis of Deviance Table

Model: binomial, link: logit

Response: accuracy

Terms added sequentially (first to last)

                      Df Deviance Resid. Df Resid. Dev
NULL                                   5592     4921.9
proficiency            3     69.5      5589     4852.4
task                   4    132.9      5585     4719.5
proficiency:task      12     96.2      5573     4623.4
proficiency:learner    2     35.0      5571     4588.4
> contrasts(viet$proficiency, 1) <- contrasts(viet$proficiency)
> contrasts(viet$proficiency)
          .L
l -0.6708204
m -0.2236068
h  0.2236068
a  0.6708204
> fm2 <- glm(accuracy ~ proficiency * task + proficiency/learner, viet,
+   family = binomial())
> anova(fm2)
Analysis of Deviance Table

Model: binomial, link: logit

Response: accuracy

Terms added sequentially (first to last)

                      Df Deviance Resid. Df Resid. Dev
NULL                                   5592     4921.9
proficiency            3     69.5      5589     4852.4
task                   4    132.9      5585     4719.5
proficiency:task      12     96.2      5573     4623.4
proficiency:learner   -6     13.9      5579     4609.5

The degrees of freedom for proficiency should be 1, for proficiency:task should be 4 and for proficiency:learner should be 4.

The coefficient count is correct.
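
Since viet.rda is not available outside the reporter's file system, the setup can only be approximated with simulated data. The sketch below is a stand-in under stated assumptions: the data frame `d`, its balanced layout, and the 0.7 success probability are all invented for illustration, and the `proficiency/learner` term is left out. With a one-column contrast on the four-level `proficiency` and a five-level `task`, the expected Df are 1, 4, and 1 x 4 = 4, so a current R should report those values rather than 3, 4, 12.

```r
# Hypothetical stand-in for the unavailable viet.rda data set.
set.seed(1)
d <- expand.grid(proficiency = c("l", "m", "h", "a"),
                 task = paste0("t", 1:5),
                 rep = 1:20)
d$proficiency <- factor(d$proficiency, levels = c("l", "m", "h", "a"),
                        ordered = TRUE)
d$task <- factor(d$task)
d$accuracy <- rbinom(nrow(d), 1, 0.7)   # made-up binary response

# Keep only the first (linear) polynomial contrast, as in the report.
contrasts(d$proficiency, 1) <- contrasts(d$proficiency)

fm <- glm(accuracy ~ proficiency * task, data = d, family = binomial())
tab <- anova(fm)
print(tab)

# Df per term, dropping the NULL row.
df_terms <- tab$Df[-1]
```

If `df_terms` comes back as 3, 4, 12 instead, the installed R would still be showing the behaviour described in this report.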

--please do not edit the information below--

Version:
 platform = i386-pc-linux-gnu
 arch = i386
 os = linux-gnu
 system = i386, linux-gnu
 status = Under development (unstable)
 major = 1
 minor = 4.0
 year = 2001
 month = 11
 day = 28
 language = R

Search Path:
 .GlobalEnv, package:Devore5, package:ctest, Autoloads, package:base

METADATA

Bug author - Jitterbug compatibility account
Creation time - 2001-11-30 03:13:01 UTC
Bugzilla link
Status - CLOSED FIXED
Alias - None
Component - Models
Version - old
Hardware - All Linux
Importance - P5 normal
Assignee - Jitterbug compatibility account
URL -
Modification time - 2001-11-30 16:53 UTC

MichaelChirico commented 4 years ago

From: Prof Brian Ripley <ripley@<::CENSORED -- SEE ORIGINAL ON BUGZILLA::>>

Quick workaround:

> fm2 <- glm(accuracy ~ proficiency * task + proficiency/learner, viet,
+    family = binomial(), x=TRUE)
> anova(fm2)
Analysis of Deviance Table

Model: binomial, link: logit

Response: accuracy

Terms added sequentially (first to last)

                      Df Deviance Resid. Df Resid. Dev
NULL                                   5592     4921.9
proficiency            1     57.5      5591     4864.4
task                   4    131.7      5587     4732.6
proficiency:task       4     58.6      5583     4674.1
proficiency:learner    4     64.6      5579     4609.5

which looks more sensible.

The model matrix is being reconstructed incorrectly, hence the discrepancies. I will look into that later.
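
One way to see what the workaround relies on is to compare the model matrix stored at fit time (`x = TRUE`) with one rebuilt from a fit that did not store it. This is only a sketch on simulated data: the data frame `d` and the names `fm_plain`, `fm_x`, `stored`, and `rebuilt` are hypothetical, and the `proficiency/learner` term is omitted. On a current R the two matrices should have the same columns; a mismatch in column count would point to the reconstruction problem described above.

```r
# Hypothetical simulated data; not the reporter's viet data set.
set.seed(2)
d <- expand.grid(proficiency = c("l", "m", "h", "a"),
                 task = paste0("t", 1:5),
                 rep = 1:10)
d$proficiency <- factor(d$proficiency, levels = c("l", "m", "h", "a"),
                        ordered = TRUE)
d$task <- factor(d$task)
d$accuracy <- rbinom(nrow(d), 1, 0.7)
contrasts(d$proficiency, 1) <- contrasts(d$proficiency)  # one contrast column

# Same model fitted twice: without and with the stored model matrix.
fm_plain <- glm(accuracy ~ proficiency * task, data = d, family = binomial())
fm_x     <- glm(accuracy ~ proficiency * task, data = d, family = binomial(),
                x = TRUE)

stored  <- fm_x$x                  # matrix saved when the model was fitted
rebuilt <- model.matrix(fm_plain)  # matrix reconstructed from the fitted object

cat("stored columns: ", ncol(stored), "\n")
cat("rebuilt columns:", ncol(rebuilt), "\n")
cat("coefficients:   ", length(coef(fm_plain)), "\n")
```

The design choice here is to fit the model twice, because `model.matrix()` on a fit that already carries an `x` component simply returns that stored matrix instead of rebuilding it.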

--
Brian D. Ripley,                  ripley@<::CENSORED -- SEE ORIGINAL ON BUGZILLA::>
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
<CENSORING FROM DETECTED PHONE NUMBER ONWARDS; SEE BUGZILLA>

METADATA

Comment author - Jitterbug compatibility account
Timestamp - 2001-11-30 16:04:16 UTC

MichaelChirico commented 4 years ago

From: Prof Brian Ripley <ripley@<::CENSORED -- SEE ORIGINAL ON BUGZILLA::>>

Now fixed (the bug was in model.matrix.default). I will commit the fix once I can get a clean test running on R-devel.

--
Brian D. Ripley,                  ripley@<::CENSORED -- SEE ORIGINAL ON BUGZILLA::>
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
<CENSORING FROM DETECTED PHONE NUMBER ONWARDS; SEE BUGZILLA>

METADATA

Comment author - Jitterbug compatibility account
Timestamp - 2001-11-30 16:53:45 UTC

MichaelChirico commented 4 years ago

NOTES: Fixed for R 1.4.0; the error was in model.matrix.default.

METADATA

Comment author - Jitterbug compatibility account
Timestamp - 2001-12-01 13:08:00 UTC

MichaelChirico commented 4 years ago

Audit (from Jitterbug):
Sat Dec 1 08:04:32 2001 ripley changed notes
Sat Dec 1 08:04:32 2001 ripley foobar
Sat Dec 1 08:04:32 2001 ripley moved from incoming to Models-fixed

METADATA

Comment author - Jitterbug compatibility account
Timestamp - 2001-12-01 14:04:32 UTC
