markmfredrickson / RItools

Randomization inference tools for R
GNU General Public License v2.0

extraneous columns when `makeDesigns` called w/ "`-1`" in formula #88

Closed benthestatistician closed 6 years ago

benthestatistician commented 6 years ago

Here's what we need to fix: with "-1" in the formula, the output gains an extraneous neFALSE row.

library(RItools)  # provides balanceTest() and the nuclearplants data
data(nuclearplants)
testdata <- nuclearplants
testdata$ne <- as.logical(testdata$ne)
balanceTest(pr ~ ne + ct + strata(pt) - 1, data=testdata)
##                 strata       pt        
##                 stat   std.diff       z
##vars                                    
##neFALSE                   0.0497  0.1445
##neTRUE                   -0.0497 -0.1445
##ct                       -0.2657 -0.5828
##(element weight)             NaN      NA

The rows of the output should be as in this example:

balanceTest(pr ~ ne + ct, data=testdata)
##                 strata  Unstrat       
##                 stat   std.diff      z
##vars                                   
##neTRUE                    -0.166 -0.433
##ct                        -0.311 -0.812
##(element weight)             NaN     NA

The underlying problem lies in makeDesigns:

testdata$W <- 1 # to satisfy `makeDesigns` that there are "weights"
colnames(testdata) <- gsub("W", "(weights)", colnames(testdata))
colnames(RItools:::makeDesigns(pr ~ ne + ct + strata(pt)-1, data=testdata)@Covariates)
## [1] "neFALSE" "neTRUE"  "ct"     
colnames(RItools:::makeDesigns(pr ~ ne + ct , data=testdata)@Covariates)
## [1] "neTRUE" "ct"    

benthestatistician commented 6 years ago

The problem appears to have causes similar to those described in the comments on #60, namely the processing of exclude-intercept directives given to model.matrix. I thought I had eliminated those in addressing that issue, but perhaps I didn't, or perhaps we regressed. Tests introduced while resolving that issue ([master 5ffd819]) may be useful starting points for regression tests of an eventual fix. Those tests were removed in [master 2672306], quite likely by mistake.
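
For reference, the base R behavior that seems to be involved can be reproduced without RItools. This is only a minimal sketch on a made-up data frame (`d` is an illustration, not the nuclearplants data): when the intercept is dropped, `model.matrix` codes the first factor-like term with a full set of indicator columns rather than with contrasts.

```r
# Toy data, assumed for illustration only
d <- data.frame(ne = c(TRUE, FALSE, TRUE, FALSE),
                ct = c(0, 1, 1, 0))

# With an intercept, the logical `ne` gets a single contrast column
with_int <- colnames(model.matrix(~ ne + ct, data = d))
print(with_int)
## [1] "(Intercept)" "neTRUE"      "ct"

# With "-1", `ne` is expanded to one indicator column per level
without_int <- colnames(model.matrix(~ ne + ct - 1, data = d))
print(without_int)
## [1] "neFALSE" "neTRUE"  "ct"
```

The second result has the same column pattern as the `makeDesigns` output reported in this issue, which is consistent with the "-1" reaching `model.matrix` unmodified.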

benthestatistician commented 6 years ago

[master 235a1a4] adds some tests, but not the corresponding fixes. Tests of the broken functionality are commented out for the moment.

(It turns out that the tests that were earlier removed in [master 2672306] weren't necessarily removed inadvertently, as that commit made them no longer applicable. [master 235a1a4] adapts those previously omitted tests and then restores them.)

benthestatistician commented 6 years ago

A good place to make the fix would be inside of design_matrix() (in file Design.R), close to the spot where intercepts get removed.
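
One possible shape for such a fix, sketched in base R only: force the intercept on in the terms object before calling `model.matrix`, then drop the intercept column afterwards. The helper name `covariates_without_intercept` and the toy data frame `d` are hypothetical; this is not the code in design_matrix(), and whether it coexists with the strata() handling there is untested.

```r
# Hypothetical helper, not part of RItools: returns covariate columns coded
# as if an intercept were present, whether or not the formula contains "-1".
covariates_without_intercept <- function(fmla, data) {
  tt <- terms(fmla, data = data)
  attr(tt, "intercept") <- 1   # override any "-1" so factors get contrast coding
  mm <- model.matrix(tt, data = data)
  mm[, colnames(mm) != "(Intercept)", drop = FALSE]
}

# Toy data, assumed for illustration only
d <- data.frame(ne = c(TRUE, FALSE, TRUE, FALSE),
                ct = c(0, 1, 1, 0))

cols_minus1 <- colnames(covariates_without_intercept(~ ne + ct - 1, d))
cols_plain  <- colnames(covariates_without_intercept(~ ne + ct, d))
print(cols_minus1)
## [1] "neTRUE" "ct"
print(cols_plain)
## [1] "neTRUE" "ct"
```

With this approach both formulas yield the same covariate columns, which is the behavior requested at the top of this issue.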