Closed YutingYale closed 1 year ago
I am having the same issue with panel data, and so is one of my students. The covariates are time-invariant for each unit and have no NA values, but we still get this error :( Any ideas for fixes?
@YutingYale This error message is due to some of the covariates being perfectly collinear. You can try dropping some of the covariates.
@shrabasteebanerjee adjusting for baseline covariates is the default behavior of our approach, so, in principle, this should work off the shelf. If you are getting the same error message, it is the same issue: some of the covariates are perfectly collinear.
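Also, all of our estimates are computed at the group-time level (to allow for treatment effect heterogeneity), so these could be "local" versions of perfect collinearity: covariates that are collinear only within the subset of data used for a particular group and time period.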
Hope this helps! Brant
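One way to look for this kind of "local" collinearity is to compute the rank of the covariate design matrix within each group-time subset. The sketch below uses only base R. `local_rank_check` is a hypothetical helper, not part of the `did` package, and the subset rule (units in group g plus never-treated or not-yet-treated units, in period t) is only an approximation of a "notyettreated" comparison sample; the package's internal subsetting may differ, for example in how it handles base periods. The toy data and column names are made up for illustration.

```r
# Hypothetical helper (not part of the did package): for each group g and
# period t, build the covariate design matrix on an approximate comparison
# sample and compare its rank to its number of columns.
local_rank_check <- function(data, xformla, gname, tname) {
  g_all   <- data[[gname]]
  groups  <- sort(unique(g_all[g_all > 0]))   # 0 is assumed to mean never treated
  periods <- sort(unique(data[[tname]]))
  res <- expand.grid(group = groups, period = periods)
  res$n_cols <- NA_integer_
  res$rank   <- NA_integer_
  for (i in seq_len(nrow(res))) {
    g <- res$group[i]
    t <- res$period[i]
    # Assumed subset: group g, plus never-treated and not-yet-treated units, in period t
    keep <- data[[tname]] == t & (g_all == g | g_all == 0 | g_all > t)
    X <- model.matrix(xformla, data = data[keep, , drop = FALSE])
    res$n_cols[i] <- ncol(X)
    res$rank[i]   <- qr(X)$rank
  }
  res$deficient <- res$rank < res$n_cols
  res
}

# Toy data (illustrative only): the dummy `other` varies in the full sample
# but is 0 for everyone outside the first_treat == 2002 group.
toy <- expand.grid(id = 1:12, year = 2000:2003)
toy$first_treat <- c(0, 2000, 2002)[(toy$id - 1) %% 3 + 1]
toy$age   <- 30 + toy$id
toy$other <- as.integer(toy$first_treat == 2002 & toy$id %% 2 == 0)

# Rank of the design matrix on the full sample
full_rank <- qr(model.matrix(~ age + other, data = toy))$rank

res <- local_rank_check(toy, ~ age + other, gname = "first_treat", tname = "year")
print(res[res$deficient, ])
```

In this toy example the design matrix has full column rank on the whole sample, yet some group-time subsets are rank deficient because `other` is constant there. Pairwise correlations on the full data would not reveal this.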
Thank you so much for your response. I really appreciate it. However, I have checked the correlations between the covariates, and they are not perfectly collinear (the highest correlation coefficient is 0.6, between some of the race variables). Would you mind advising?
Yes, do you mind pasting in the code that you ran here?
Sure. Please see the code below:
attgt_all <- att_gt(yname = "anyhelp", tname = "year", gname = "first_treat", xformla = ~ age + edu + male + married + childnumber + non_hispanicwhite + non_hispanicblack + non_hispanicother, data = data, panel = F, clustervars = id, control_group = "notyettreated")
The error shows up when I add non_hispanicother. There is no perfect collinearity in the overall data, but this dummy (non_hispanicother) equals 0 for everyone in one particular treated group (the first_treat==2000 group) in a few years. Would this be an issue?
Thank you so much in advance for your help with the issue!
Yes, that could be an issue. If non_hispanicother is 0 for every unit with first_treat==2000, then within the subset made up of that group and the units not yet treated by time t, seeing non_hispanicother=1 tells me with certainty that the unit is in the control group. It may also be the case that, in the subset used for some other group, non_hispanicother=1 perfectly predicts being in the treated group.
Does this make sense?
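To see whether a dummy perfectly predicts the treated or control arm in a given subset, a cross-tabulation is usually enough. This is a minimal base R sketch, not code from the `did` package: `separation_check` is a hypothetical helper, the subset rule is the same rough approximation of a "notyettreated" comparison sample, and the toy data and column names are invented for illustration.

```r
# Hypothetical helper (not part of the did package): cross-tabulate a 0/1
# covariate against treated/control status for one group g and period t.
separation_check <- function(data, var, gname, tname, g, t) {
  g_all <- data[[gname]]
  # Assumed subset: group g, plus never-treated (0) and not-yet-treated units, in period t
  keep <- data[[tname]] == t & (g_all == g | g_all == 0 | g_all > t)
  sub  <- data[keep, , drop = FALSE]
  arm  <- ifelse(sub[[gname]] == g, "treated", "control")
  tab  <- table(
    arm   = factor(arm, levels = c("control", "treated")),
    value = factor(sub[[var]], levels = c(0, 1))
  )
  # An empty cell means some value of the dummy occurs in only one arm
  # (or does not occur at all in this subset).
  list(table = tab, separated = any(tab == 0))
}

# Toy data (illustrative only): `other` is 0 for every unit in the
# first_treat == 2000 group.
toy <- expand.grid(id = 1:12, year = 2000:2003)
toy$first_treat <- c(0, 2000, 2002)[(toy$id - 1) %% 3 + 1]
toy$other <- as.integer(toy$first_treat == 2002 & toy$id %% 2 == 0)

chk <- separation_check(toy, "other", "first_treat", "year", g = 2000, t = 2000)
print(chk$table)
print(chk$separated)
```

An empty "treated" cell for value 1 is the situation described above: within that subset, any unit with the dummy equal to 1 must be a control unit.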
Got it. This makes sense. I really appreciate your clarification on this issue. Thanks again!
@YutingYale could you please share what you did after that?
I dropped the covariates that may cause the collinearity issue.
I was wondering if anyone has used the Sant'Anna did package to adjust for baseline covariates in the repeated cross-sectional setting? I got the error "Error in qr.solve((crossprod(wold.x.post.treat,int.cov)/n): singular matrix 'a' in solve". Is it because of an empirical overlap issue? In addition, it seems infeasible to adjust for state fixed effects in the repeated cross-sectional setting with the package. Is that right?