benbhansen-stats / propertee

Prognostic Regression Offsets with Propagation of ERrors, for Treatment Effect Estimation (IES R305D210029).
https://benbhansen-stats.github.io/propertee/

fixed bug in .make_uoa_cluster_df #173

Closed adamSales closed 6 months ago

adamSales commented 6 months ago

This fixes an error that occurs with tibbles by changing line 302 of R/DesignUtilities.R, in .make_uoa_cluster_df().

An MRE:

devtools::load_all('~/flexida')
library(propertee)
library(tibble)

set.seed(1)
dat <- data.frame(y=rnorm(100),z=rbinom(100,1,.5),id=factor(1:100))
tib <- as_tibble(dat)

desDat <- obs_design(z~unitid(id),data=dat)
desTib <- obs_design(z~unitid(id),data=tib)

With a data.frame, it works fine:

> summary(estDat <- lmitt(y~1,design=desDat,data=dat))

Call:
lmitt(y ~ 1, design = desDat, data = dat)

 Treatment Effects :
   Estimate Std. Error t value Pr(>|t|)
z.  0.01889    0.18954     0.1    0.921
Std. Error calculated via type "CR0"

With a tibble, there is an error:

> summary(estTib <- lmitt(y~1,design=desTib,data=tib))
Error in fix.by(by.y, y) : 'by' must specify a uniquely valid column

The problem comes from this line:

  q_df <- q_df[, c(uoa_cols, cluster), drop = FALSE]

When passed a tibble:

> debugonce(.make_uoa_cluster_df)
> summary(estTib <- lmitt(y~1,design=desTib,data=tib))

...

debug at  /R/DesignUtilities.R#293: q_df <- q_df[, c(uoa_cols, cluster), drop = FALSE]
Browse[1]> uoa_cols
[1] "id"
Browse[1]> cluster
[1] "id"
Browse[1]> 
Browse[1]> names(q_df)
[1] "id" "id"

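The duplicated column name "id" leads to the error, because fix.by() (called inside merge()) requires each 'by' name to match exactly one column.

When passed a data.frame instead, base R makes the duplicated column names unique, which avoids the problem: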

> debugonce(.make_uoa_cluster_df)
> summary(estDat <- lmitt(y~1,design=desDat,data=dat))

...

Browse[1]> names(q_df)
[1] "id"   "id.1"
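
The difference can be reproduced outside the package with base R alone. The following is a minimal sketch, not code from propertee: the data frame `df`, the vector `cols`, and the object `dup` are made up for illustration, and `dup` only imitates the duplicated names that the tibble subset produces in the debug session above.

```r
# Illustrative data; only the column name `id` mirrors the MRE.
df <- data.frame(id = factor(1:3), z = c(0, 1, 0))
cols <- c("id", "id")  # what c(uoa_cols, cluster) evaluates to in the MRE

# `[.data.frame` makes repeated column names unique when subsetting.
names(df[, cols, drop = FALSE])
#> [1] "id"   "id.1"

# Imitate the tibble result by forcing the duplicated names back on.
dup <- df[, cols, drop = FALSE]
names(dup) <- cols

# merge() cannot resolve `by = "id"` when two columns share that name.
msg <- tryCatch(
  merge(dup, df, by = "id"),
  error = function(e) conditionMessage(e)
)
msg
#> [1] "'by' must specify a uniquely valid column"
```

The message matches the one reported for the tibble case, which suggests the duplicated names, rather than anything else tibble-specific, are what trips the merge.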

Changing

q_df <- q_df[, c(uoa_cols, cluster), drop = FALSE]

to

q_df <- q_df[, unique(c(uoa_cols, cluster)), drop = FALSE]

solves the problem:

> summary(estTib <- lmitt(y~1,design=desTib,data=tib))

Call:
lmitt(y ~ 1, design = desTib, data = tib)

 Treatment Effects :
   Estimate Std. Error t value Pr(>|t|)
z.  0.01889    0.18954     0.1    0.921
Std. Error calculated via type "CR0"
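
As a quick check that `unique()` only changes the case where the unit of assignment and the cluster are the same column, here is a small sketch. The helper `select_cols()` and the example `q_df` are hypothetical stand-ins for the patched line, not the package's actual function or data.

```r
# Hypothetical stand-in for the patched subsetting step.
select_cols <- function(q_df, uoa_cols, cluster) {
  q_df[, unique(c(uoa_cols, cluster)), drop = FALSE]
}

# Made-up data with a unit id and a coarser grouping column.
q_df <- data.frame(
  id = 1:4,
  school = c("a", "a", "b", "b"),
  z = c(0, 1, 0, 1)
)

# Same column named twice: it is selected once.
names(select_cols(q_df, uoa_cols = "id", cluster = "id"))
#> [1] "id"

# Distinct columns: selection and order are unchanged.
names(select_cols(q_df, uoa_cols = "id", cluster = "school"))
#> [1] "id"     "school"
```

Because the column vector passed to `[` never contains a repeated name, data.frames and tibbles should end up with the same column names after this step.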

jwasserman2 commented 6 months ago

Although I'll let @josherrickson suggest how to handle merges now that the repo is public, i.e., what the DESCRIPTION should be updated to.