insongkim / wfe

20 stars 4 forks source link

bug when factor labels for time variable are not integers #6

Open jaspercooper opened 5 years ago

jaspercooper commented 5 years ago

Hi @insongkim, thanks so much for this amazing package and contribution!

I was playing around with the package for a project of mine, in which the time variable is a factor derived from a date. So all the factor labels are dates, not integers.

It turns out this causes a bug that I've tracked down. Should be an easy fix I think.

See here for a minimum working example using the third example (observational 2-way DiD) from the ?wfe documentation:

# Works fine:
mod.did <- wfe(y~ tr+x1+x2, data = Data.obs, treat = "tr",
                             unit.index = "unit", time.index = "time", method = "unit",
                             qoi = "ate", estimator ="did", hetero.se=TRUE, auto.se=TRUE,
                             White = TRUE, White.alpha = 0.05, verbose = TRUE)

# Change factor labels to non-integers
Data.obs$time <- factor(letters[Data.obs$time])

# breaks:
mod.did <- wfe(y~ tr+x1+x2, data = Data.obs, treat = "tr",
                             unit.index = "unit", time.index = "time", method = "unit",
                             qoi = "ate", estimator ="did", hetero.se=TRUE, auto.se=TRUE,
                             White = TRUE, White.alpha = 0.05, verbose = TRUE)

The error occurs here, lines 466-473 of the script with wfe() in the version currently on CRAN:

idx <- paste(orig.unit.idx, orig.time.idx, sep = "_")
a <- units
b <- times
Wv <- as.vector(W)
a1 <- rep(as.character(a$unit), each = nrow(W))
b1 <- rep(as.character(b$time), ncol(W))
idxall <- paste(a1, b1, sep = "_")
idxall.sub <- idxall[which(idxall %in% idx)]

The last line is valued zero because idxall assumes integers for the date variable, and therefore finds no matches with idx.

Hope this is helpful for future reference!