benbhansen-stats / propertee

Prognostic Regression Offsets with Propagation of ERrors, for Treatment Effect Estimation (IES R305D210029).
https://benbhansen-stats.github.io/propertee/

Warning when using block() in rd_design(): damod_mm longer than w #131

Closed kirkvanacore closed 1 year ago

kirkvanacore commented 1 year ago

@adamSales and I received a warning when requesting the summary of an lmitt object produced with a blocked RD design. The circumstances are detailed below:

When running this code:

des <- rd_design(Z ~ forcing(R) + unitid(id) + block(problem_id), data = ad[ad$R > -1 & ad$R < 11, ])
m1_bw2 <- glm(Y ~ R + Z, data = ad[ad$R > -1 & ad$R < 11, ], family = binomial)
res_BW2_1 <- lmitt(Y ~ 1, design = des, offset = cov_adj(m1_bw2), weights = "ate", data = ad[ad$R > -1 & ad$R < 11, ])
summary(res_BW2_1)

...we receive this warning along with the summary:

Warning message:
In damod_mm[msk, , drop = FALSE] * w :
  longer object length is not a multiple of shorter object length

josherrickson commented 1 year ago

I suspect @jwasserman2 may be the better person to help debug this, but do you have missing data? This has been something that's come up occasionally, especially with Adam's real data, that we didn't account for properly.

kirkvanacore commented 1 year ago

There are no missing data for the variables used in this example.

adamSales commented 1 year ago

There may be some blocks with 0 weights, though

On Tue, Jun 27, 2023 at 3:17 PM Kirk Vanacore @.***> wrote:

There are no missing data for the variables used in this example.


benthestatistician commented 1 year ago

It would be great if you could share the data. Can you, @kirkvanacore? Or, better yet, a stripped-down, privacy-preserving, data-usage-agreement-compliant version of it that still manifests the warning?

kirkvanacore commented 1 year ago

Here is a synthetic data set that produces the warning:

synth_dat_issue131.csv

jwasserman2 commented 1 year ago

Thanks for posting the data, @kirkvanacore. ate() is returning NAs, causing rows to be dropped. In .get_a21(), this results in w, which is x$weights, being shorter than damod_mm, which has been created by passing na.pass to model.frame():


> nrow(ad[ad$R > -1 & ad$R < 11, ])
[1] 12283
> nrow(model.frame(res_BW2_1))
[1] 11930
> sum(is.na(ate(des, data = ad[ad$R > -1 & ad$R < 11, ])))
[1] 353
> 12283 - 11930
[1] 353
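
The mechanism described above can be reproduced outside propertee. The following is a minimal sketch in base R, not the package's own code: the toy data frame and all of its column names are made up for illustration, and lm() stands in for the fitted model whose weights contain NAs. Rows with NA weights are dropped at fit time, while a model matrix built with na.pass keeps every row, so multiplying the two triggers the same recycling warning.

```r
# Toy data (hypothetical): two of five rows have NA weights,
# standing in for the NA values returned by ate().
toy <- data.frame(
  y = c(1.1, 2.0, 2.9, 4.2, 5.1),
  x = c(1, 2, 3, 4, 5),
  w = c(1, NA, 1, NA, 1)
)

# lm() builds its model frame with the default na.action (na.omit),
# so the rows with NA weights are dropped before fitting.
fit <- lm(y ~ x, data = toy, weights = w)
n_fit <- length(fit$weights)

# A model matrix built from a na.pass model frame keeps every row.
mm <- model.matrix(~ x, model.frame(~ x, data = toy, na.action = na.pass))
n_mm <- nrow(mm)

# The elementwise product recycles the shorter weights vector over the
# matrix; capture the resulting warning message instead of printing it.
recycle_msg <- tryCatch(
  {
    mm * fit$weights
    NA_character_
  },
  warning = function(cond) conditionMessage(cond)
)

print(c(rows_in_fit = n_fit, rows_in_model_matrix = n_mm))
print(recycle_msg)
```

Because the product still returns a result, the mismatch only surfaces as a warning, which is why the summary was printed alongside it.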

benthestatistician commented 1 year ago

Thanks, all. I wonder what characterizes the blocks with NA weights?

josherrickson commented 1 year ago

Block IDs are numeric and run through 252, but only 226 blocks exist; e.g., there is no block 41 or 42. When we expand e_z (the block-level ratio of the number treated to the total number) to the observation level, we use e_z[blocks(design)[, 1]] (R/weights.R#L132). However, if for example we're looking at block 245, e_z[245] returns the 245th element of a 226-length vector, which is NA. What we want is e_z["245"], which returns the entry named "245".

Solution could be as easy as e_z[as.character(blocks(design)[,1])], but I don't have time right now to test it. I'll try and get to it this afternoon if no one else does.
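
To illustrate the difference between positional and name-based indexing described above, here is a small base R sketch. The vector e_z below is a made-up stand-in with only three blocks, and block_of_obs is a hypothetical substitute for blocks(design)[, 1]; neither comes from the package.

```r
# Hypothetical block-level treated proportions, named by block ID.
# Block IDs are not contiguous: there is no block 3 through 244.
e_z <- c("1" = 0.50, "2" = 0.40, "245" = 0.60)

# Hypothetical numeric block ID for each observation.
block_of_obs <- c(1, 2, 245, 245, 1)

# Numeric indexing is positional: position 245 does not exist in a
# length-3 vector, so those observations get NA.
by_position <- e_z[block_of_obs]

# Character indexing looks entries up by name instead.
by_name <- e_z[as.character(block_of_obs)]

print(by_position)
print(by_name)
```

Positional indexing only happens to work when block IDs are exactly 1, 2, ..., K in the same order as the vector, which may be why the problem did not show up until some block IDs were missing.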

josherrickson commented 1 year ago

Fix was as easy as expected. @kirkvanacore I no longer get the warning with the synthetic data; please test with your real data and let me know.

benthestatistician commented 1 year ago

When you do get to testing this w/ the real data, @kirkvanacore, please also check whether the lmitt(<...>, absorb=T) problem has been fixed as well. Josh E suspects that 5ffed0d4 will have taken care of it.

benthestatistician commented 1 year ago

Hi @kirkvanacore, could you check against the real data and verify that you no longer get a warning (or other sign of trouble)? If so, this issue can be closed.

kirkvanacore commented 1 year ago

@benthestatistician @josherrickson My apologies for the delay. I no longer receive the error when running lmitt(<...>, absorb=T) against the real data.

benthestatistician commented 1 year ago

Thanks Kirk!