bcallaway11 / did

Difference in Differences with Multiple Periods, website: https://bcallaway11.github.io/did

Errors when Adding Matching Variables #98

Closed · adebiasi21 closed this issue 2 years ago

adebiasi21 commented 2 years ago

I run `att_gt` with several matching variables, all of which are dichotomous with the exception of `length_ln`:

```r
gt_1 <- att_gt(yname = "crime", tname = "new_qrtr", idname = "parcelid",
               gname = "new_group",
               xformla = ~ tpc_binary + residential + commercial +
                 industrial + other + length_ln,
               data = mydata0, print_details = TRUE,
               clustervars = "neighbor10", est_method = "dr")
```

I am notified that `There were 50 or more warnings (use warnings() to see the first 50)`, all of which relate to there not being enough control units for groups 2 and 3 in select time periods.

I then run summary(gt_1) and receive the following error:

```
Group-Time Average Treatment Effects:
Error in Math.data.frame(list(mpobj$group = c(2, 2, 2, 2, 2, 2, 2, 2, :
  non-numeric variable(s) in data frame: mpobj$att
```
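For reference, `mpobj$att` in the error message corresponds to the `att` component of the fitted object, so one quick way to inspect it (a diagnostic sketch, not something prescribed by the package docs):

```r
# The error points at mpobj$att; inspect the same component of gt_1.
# If some group-time cells failed to estimate, this may come back as a
# list rather than a numeric vector, which would explain the
# "non-numeric variable(s)" complaint from summary().
str(gt_1$att)
```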

I then attempt to run `aggte` to get group-specific and dynamic treatment effects. Both calls fail. I receive two errors:

```r
gse <- aggte(gt_1, type = "group", clustervars = "neighbor10", na.rum = TRUE)
```

```
Error in glist[gnotna] : invalid subscript type 'list'
```

```r
dyn <- aggte(gt_1, type = "dynamic", clustervars = "neighbor10", na.rm = TRUE)
```

```
Error in max(t) : invalid 'type' (list) of argument
```
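For reference, the first call above spells `na.rm` as `na.rum`; with the intended spelling, the two aggregation calls would be:

```r
# Same calls with na.rm spelled as intended; na.rm = TRUE drops any
# group-time estimates that came back as NA before aggregating
gse <- aggte(gt_1, type = "group", clustervars = "neighbor10", na.rm = TRUE)
dyn <- aggte(gt_1, type = "dynamic", clustervars = "neighbor10", na.rm = TRUE)
```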

How should I interpret these errors? And are there workarounds?

bcallaway11 commented 2 years ago

The later errors are likely downstream of the first-step errors, and I'm not 100% sure what is going wrong there. Here is one guess, though:

How big is your group of untreated units? If it is relatively small, then I think it could lead to this sort of issue. If you feel comfortable using units that are not treated yet in the comparison group, then you could set `control_group = "notyettreated"` inside the call to `att_gt`. That might fix this issue.
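Concretely, that would be your call from above with one argument added (a sketch, untested on my end):

```r
# Same specification, but compare against not-yet-treated units
gt_1 <- att_gt(yname = "crime", tname = "new_qrtr", idname = "parcelid",
               gname = "new_group",
               xformla = ~ tpc_binary + residential + commercial +
                 industrial + other + length_ln,
               data = mydata0, print_details = TRUE,
               clustervars = "neighbor10", est_method = "dr",
               control_group = "notyettreated")
```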

adebiasi21 commented 2 years ago

As always, I appreciate your quick response.

Below, `new_qrtr` is the time unit (quarters 1 through 32), and `post` is a dichotomous variable indicating whether a unit has received treatment (1) in a given quarter. As you can see, my pool of untreated units (0) in each quarter is quite large. Given this, I didn't expect any issues with finding good comparison groups.

I'll try setting `control_group = "notyettreated"` and see what happens. I've done this in the past, albeit with a somewhat different set of controls, and was not successful. I'll give it another go. I appreciate the suggestion!

```r
mytable <- xtabs(~ post + new_qrtr, data = mydata0)
ftable(mytable)
```


| new_qrtr | post = 0 | post = 1 |
| --- | --- | --- |
| 1 | 97299 | 0 |
| 2 | 97030 | 269 |
| 3 | 96819 | 480 |
| 4 | 96545 | 754 |
| 5 | 96423 | 876 |
| 6 | 96307 | 992 |
| 7 | 96137 | 1162 |
| 8 | 95815 | 1484 |
| 9 | 95543 | 1756 |
| 10 | 95320 | 1979 |
| 11 | 95090 | 2209 |
| 12 | 94761 | 2538 |
| 13 | 94474 | 2825 |
| 14 | 94146 | 3153 |
| 15 | 93834 | 3465 |
| 16 | 93579 | 3720 |
| 17 | 93466 | 3833 |
| 18 | 93305 | 3994 |
| 19 | 93172 | 4127 |
| 20 | 93021 | 4278 |
| 21 | 92883 | 4416 |
| 22 | 92679 | 4620 |
| 23 | 92502 | 4797 |
| 24 | 92299 | 5000 |
| 25 | 92136 | 5163 |
| 26 | 91978 | 5321 |
| 27 | 91762 | 5537 |
| 28 | 91605 | 5694 |
| 29 | 91397 | 5902 |
| 30 | 91160 | 6139 |
| 31 | 90892 | 6407 |
| 32 | 90653 | 6646 |

bcallaway11 commented 2 years ago

Yes, given your response, I don't think that my first answer was correct.

adebiasi21 commented 2 years ago

As you can tell, I am working with a rather large dataset. Without knowing much, is it possible that this issue has to do with matrix size? Anyway, I'll try running it again with `control_group = "notyettreated"` and see what happens. The program takes a few days to run, even when using a cluster.

bcallaway11 commented 2 years ago

If you subset your data (say, dropping a large fraction of "ids" and time periods), do you still get the same issue?

Another idea would be to try changing `est_method` to `"reg"`.
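Putting both suggestions together, something along these lines (a rough sketch; the 10% sample, the 16-quarter cutoff, and the names `keep_ids`, `mydata_small`, and `gt_try` are just placeholders):

```r
# Keep a random 10% of parcels and only the first 16 quarters
set.seed(1)
ids <- unique(mydata0$parcelid)
keep_ids <- sample(ids, size = floor(0.1 * length(ids)))
mydata_small <- subset(mydata0, parcelid %in% keep_ids & new_qrtr <= 16)

# Re-run the same specification on the subset, switching to the
# outcome-regression estimator
gt_try <- att_gt(yname = "crime", tname = "new_qrtr", idname = "parcelid",
                 gname = "new_group",
                 xformla = ~ tpc_binary + residential + commercial +
                   industrial + other + length_ln,
                 data = mydata_small, clustervars = "neighbor10",
                 est_method = "reg")
```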

I tried to find a way to reproduce the same error and couldn't. If you send me some data that reproduces it, I'll look into it.

adebiasi21 commented 2 years ago

Thanks for your generous offer to take a look at the data. I'll take you up on it! I'll reach out to you via email once I've received permission on my end to share the data.

bcallaway11 commented 2 years ago

Note to self: everything appears to be working here now.