Closed grantmcdermott closed 1 year ago
Frankly, I don't really understand what is going on, and I don't have time for a deep dive in the code and blog post now. Maybe you can give me a numbered list of the post-processing steps that we want to apply to pmod?
Totally. Again, sorry for tagging and running earlier. No rush on this; I should keep digging on my own.
Here's what emfx(pmod, type = "event") is doing underneath the hood:
1. It retrieves the dataset used in the pmod estimation. It then subsets this dataset to rows where ".Dtreat==1" (i.e. treatment was switched on) and adds an "event" column (i.e., year minus first year of treatment).
2. It passes this dataset to marginaleffects(), highlighting the ".Dtreat" variable by "event".
The manual version of emfx(pmod, type = "event") is thus:
library(marginaleffects)
mfx = marginaleffects(
pmod,
variable = ".Dtreat",
by = "event",
newdata = within(
subset(fixest:::fetch_data(pmod), .Dtreat==1),
event <- year - first.treat
)
)
summary(mfx)
#> Term Contrast event Effect Std. Error z value Pr(>|z|) 2.5 % 97.5 %
#> 1 .Dtreat mean(dY/dX) 0 -22.849 16.17 -1.41348 0.15751423 -54.53 8.834
#> 2 .Dtreat mean(dY/dX) 1 3.654 41.81 0.08739 0.93035790 -78.30 85.607
#> 3 .Dtreat mean(dY/dX) 2 -71.521 22.06 -3.24264 0.00118430 -114.75 -28.291
#> 4 .Dtreat mean(dY/dX) 3 -97.798 25.45 -3.84321 0.00012144 -147.67 -47.923
#>
#> Model type: etwfe
#> Prediction type: response
Now I should clarify that I believe that this is the same thing that jwdid -> estat is doing in Stata. At least, we get the exact same results across R and Stata with these steps for linear models. (The README provides an example of this.) But as you can see below, we get different results from the equivalent Poisson model in Stata. Again, note that the actual Poisson regressions are identical (etwfe == jwdid). It's only with the emfx / estat post-estimation where things diverge.
use https://friosavila.github.io/playingwithstata/drdid/mpdta.dta, clear
gen emp=exp(lemp)
qui jwdid emp lpop, ivar(countyreal) tvar(year) gvar(first_treat) method(poisson)
estat event
[omitting what you dont want to see]
---------------------------------------------------------------
| Delta-method
| Contrast std. err. [95% conf. interval]
--------------+------------------------------------------------
_at@__event__ |
(2 vs 1) 0 | -37.35025 15.75917 -68.23764 -6.462846
(2 vs 1) 1 | -117.3918 33.88822 -183.8115 -50.97213
(2 vs 1) 2 | -204.6293 39.72574 -282.4903 -126.7683
(2 vs 1) 3 | -182.7539 39.14667 -259.48 -106.0278
---------------------------------------------------------------
@friosavila, are you able to confirm the exact margins command that is being run with estat event here? Many thanks.
So, what I do in Stata is: 1) just like you did, create the “event” variable; 2) make predictions for the treated sample only, separately for each “event” group; 3) make the predictions using either Treatment = 1 or Treatment = 0.
Margins does the rest (in terms of SE).
Here is the manual version:
ssc install frause
frause mpdta, clear
gen emp=exp(lemp)
jwdid emp lpop, ivar(countyreal) tvar(year) gvar(first_treat) method(poisson)
replace __tr__=0
predict yhat0
replace __tr__=1
predict yhat1
gen att=yhat1-yhat0
gen event = year - first_treat if first!=0
tabstat att, by(event)
event | Mean
---------+----------
-4 | 0
-3 | 0
-2 | 0
-1 | 0
0 | -37.35025
1 | -117.3918
2 | -204.6293
3 | -182.754
---------+----------
Total | -22.95819
--------------------
This is extremely weird, but I don’t think the problem is related to marginaleffects. Here’s an attempt to replicate the “manual” Stata calculations using purely base R code and the fixest:::predict() method:
library(etwfe)
data("mpdta", package = "did")
mpdta$emp = exp(mpdta$lemp)
pmod = etwfe(
emp ~ lpop,
tvar = year, gvar = first.treat, gref = 0, data = mpdta,
vcov = ~countyreal, family = "poisson", fe = "none")
nd <- fixest:::fetch_data(pmod) |>
subset(.Dtreat == 1) |>
transform(event = year - first.treat)
p <- data.frame(
p1 = predict(pmod, transform(nd, .Dtreat = 1)),
p0 = predict(pmod, transform(nd, .Dtreat = 0)),
event = nd$year - nd$first.treat) |>
transform(att = p1 - p0)
aggregate(att ~ event, FUN = mean, data = p)
#> event att
#> 1 0 -25.349748
#> 2 1 1.091751
#> 3 2 -75.124632
#> 4 3 -101.823979
Since the base R code doesn’t replicate Stata, I think we have to conclude that this is either a difference in fixest:::predict() or in the creation of the event variable.
For completeness, note that you can replicate the results above with marginaleffects:
library(marginaleffects)
comparisons(
pmod,
variables = ".Dtreat",
newdata = nd,
by = "event") |>
summary()
#> Term Contrast event Effect Std. Error z value Pr(>|z|)
#> 1 .Dtreat mean(1) - mean(0) 0 -25.350 15.89 -1.59556 0.1105874
#> 2 .Dtreat mean(1) - mean(0) 1 1.092 40.30 0.02709 0.9783867
#> 3 .Dtreat mean(1) - mean(0) 2 -75.125 23.16 -3.24441 0.0011769
#> 4 .Dtreat mean(1) - mean(0) 3 -101.824 27.09 -3.75899 0.0001706
#> 2.5 % 97.5 %
#> 1 -56.49 5.79
#> 2 -77.89 80.08
#> 3 -120.51 -29.74
#> 4 -154.92 -48.73
#>
#> Model type: etwfe
#> Prediction type: response
If you show me base R code that does what you want to do, I can probably show you an easier way to achieve the same result with marginaleffects (with standard errors). But I’d need to know exactly how we can replicate the Stata code in @friosavila’s comment.
Thanks both!
I think we're getting closer. I'll try to investigate more during the week. But I just quickly wanted to show that I get the same answer using base R glm as with fixest (as per Vincent's inquiry). The difference is probably some difference in the underlying defaults of R and Stata here, rather than a specific package.
data("mpdta", package = "did")
mpdta$emp = exp(mpdta$lemp)
mpdta = mpdta |>
within({
.Dtreat = as.integer(year >= first.treat & first.treat != 0)
lpop_dm = ave(lpop, first.treat, year, FUN = function(x) x - mean(x, na.rm=TRUE))
})
# head(mpdta)
# Base R equivalent of etwfe...
pmod_base = glm(
emp ~ .Dtreat:as.factor(first.treat):as.factor(year) / lpop_dm +
lpop + as.factor(first.treat):lpop + as.factor(year):lpop +
as.factor(first.treat) + as.factor(year),
family = "poisson", dat = mpdta
)
# summary(pmod_base)
nd <- mpdta |>
subset(.Dtreat == 1) |>
transform(event = year - first.treat)
p <- data.frame(
p1 = predict(pmod_base, transform(nd, .Dtreat = 1), type = "response"),
p0 = predict(pmod_base, transform(nd, .Dtreat = 0), type = "response"),
event = nd$year - nd$first.treat) |>
transform(att = p1 - p0)
#> Warning in predict.lm(object, newdata, se.fit, scale = 1, type = if (type == :
#> prediction from a rank-deficient fit may be misleading
#> Warning in predict.lm(object, newdata, se.fit, scale = 1, type = if (type == :
#> prediction from a rank-deficient fit may be misleading
aggregate(att ~ event, FUN = mean, data = p)
#> event att
#> 1 0 -25.349748
#> 2 1 1.091751
#> 3 2 -75.124632
#> 4 3 -101.823979
library(marginaleffects)
comparisons(
pmod_base,
variables = ".Dtreat",
newdata = nd,
by = "event") |>
summary()
#> Warning: Model matrix is rank deficient. Some variance-covariance parameters are
#> missing.
#> Term Contrast event Effect Std. Error z value Pr(>|z|)
#> 1 .Dtreat mean(1) - mean(0) 0 -25.350 3.448 -7.3529 1.9391e-13
#> 2 .Dtreat mean(1) - mean(0) 1 1.092 7.081 0.1542 0.87747
#> 3 .Dtreat mean(1) - mean(0) 2 -75.125 12.553 -5.9844 2.1717e-09
#> 4 .Dtreat mean(1) - mean(0) 3 -101.824 13.014 -7.8244 5.1024e-15
#> 2.5 % 97.5 %
#> 1 -32.11 -18.59
#> 2 -12.79 14.97
#> 3 -99.73 -50.52
#> 4 -127.33 -76.32
#>
#> Model type: glm
#> Prediction type: response
Created on 2022-10-16 with reprex v2.0.2
Hi there, I went down the rabbit hole. Wherever we predict values with .Dtreat=0, we obtain results that are quite different, and I'm not sure why. Consider countyreal = 8001 in year 2007. With .Dtreat=0, Stata predicts an outcome of 5081.009; R instead predicts 5431.64298. My main guess is that in R's version there are many unestimated coefficients. In my first implementations of JWDID, I had problems with margins understanding that, so I ended up collecting all the relevant coefficients rather than letting Stata make all the interactions. I guess the only way to figure this out is to reduce the problem and compute the predicted values manually.
Best wishes Fernando
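As a rough illustration of what "computing the predicted values manually" could look like in R, here is a minimal sketch on made-up toy data (the data frame toy and its columns x, d, y are hypothetical, not the mpdta variables). It builds the counterfactual design matrix with the treatment dummy switched off, multiplies it by the estimated coefficients, and applies the inverse log link. Treating any NA (unestimated) coefficients as zero is an assumption of this sketch.

```r
# Toy Poisson example; every value here is simulated for illustration only
set.seed(123)
toy = data.frame(x = rnorm(30), d = rep(0:1, 15))
toy$y = rpois(30, lambda = exp(1 + 0.4 * toy$x + 0.2 * toy$d))

fit = glm(y ~ x + d, family = "poisson", data = toy)

# Counterfactual data: switch the dummy off for every row
toy0 = transform(toy, d = 0)

# Manual prediction: design matrix times coefficients, then inverse link
X0 = model.matrix(~ x + d, data = toy0)
beta = coef(fit)
beta[is.na(beta)] = 0  # assumption: unestimated terms contribute nothing
manual = as.numeric(exp(X0 %*% beta))

# Built-in prediction for comparison
auto = as.numeric(predict(fit, newdata = toy0, type = "response"))
all.equal(manual, auto)
```

Doing the same row-by-row comparison for a single observation (such as the county and year mentioned above) in both R and Stata would show which coefficients drive the gap.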
Ah, that's very helpful. Thanks for the sleuthing @friosavila!
Just confirming visually what you're saying: The only differences occur with .Dtreat==0.
The highlighted differences, in red, are at an arbitrarily small cut-off... and even the largest differences are pretty small on log scale (the original scale of the data).
It's not clear to me that either one of R or Stata is obviously (in)correct here. If I collect the predictions from R and Stata and compare against the true emp (e.g. for yhat0 where .Dtreat==0), R has a slightly lower MAE and median absolute prediction error. But again the differences are very small (less than half a percent).
I'll leave this issue open and continue digging. But it might ultimately be a convergence (or even rounding) issue that's hard to pin down and not worth stressing too much about.
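For readers who want to run the same kind of accuracy comparison, here is a minimal sketch of the two error summaries mentioned above (MAE and median absolute prediction error). The helper name pred_error_summary and all numbers are hypothetical placeholders, not values from this thread.

```r
# Hypothetical helper: mean and median absolute prediction error
pred_error_summary = function(actual, predicted) {
  abs_err = abs(actual - predicted)
  c(mae = mean(abs_err), median_ae = median(abs_err))
}

# Placeholder vectors standing in for the true outcome and two sets of predictions
actual = c(100, 250, 400)
pred_a = c(98, 255, 410)
pred_b = c(101, 240, 395)

res = rbind(
  source_a = pred_error_summary(actual, pred_a),
  source_b = pred_error_summary(actual, pred_b)
)
res
```

In practice, actual would be the observed emp and the two prediction vectors would be the yhat0 columns exported from R and Stata.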
Given the warning above, my best guess is that the design matrix is rank deficient, and that in order to compute predictions, R and Stata both drop a somewhat arbitrary set of columns. If the columns that R and Stata happen to drop are not exactly the same, maybe this could give rise to some numerical issues and minor discrepancies.
For reference, in R this is done by the drop() call on line 686 inside predict.lm():
https://github.com/SurajGupta/r-source/blob/master/src/library/stats/R/lm.R#L679
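To make the rank-deficiency conjecture concrete, here is a small sketch on simulated toy data (the variables x1, x2, x3 are made up and unrelated to mpdta). It shows how base R reports a perfectly collinear term as NA, and that refitting without that term leaves the fitted values unchanged. Which column gets dropped is software-specific, so this only illustrates the mechanism and is not evidence about what Stata does.

```r
# Toy data with a perfectly collinear regressor (x3 = x1 + x2); values are simulated
set.seed(1)
toy = data.frame(x1 = rnorm(50), x2 = rnorm(50))
toy$x3 = toy$x1 + toy$x2
toy$y = rpois(50, lambda = exp(0.5 + 0.3 * toy$x1 - 0.2 * toy$x2))

fit = glm(y ~ x1 + x2 + x3, family = "poisson", data = toy)

# The aliased (collinear) term is reported as NA rather than estimated
na_terms = names(which(is.na(coef(fit))))
na_terms

# Refitting without the aliased term reproduces the same fitted values
fit2 = glm(y ~ x1 + x2, family = "poisson", data = toy)
same_fit = isTRUE(all.equal(fitted(fit), fitted(fit2), tolerance = 1e-6))
same_fit
```

If two programs dropped different columns from the same collinear set, the individual coefficients would differ even though the in-sample fit would agree, which is why counterfactual predictions are the place to look for discrepancies.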
Given the warning above, my best guess is that the design matrix is rank deficient, and that in order to compute predictions, R and Stata both drop a somewhat arbitrary set of columns.
Thanks @vincentarelbundock. I think this is the same point @friosavila was making earlier in the thread. For the record, though I don't think rank deficiency is the issue. fixest (what etwfe uses underneath the hood) drops these collinear terms at run time prior to estimation. We can also confirm the same results with base glm if we construct the dataset and remove all the collinear terms ahead of time.
Show "preconstructed" results
Ok, it took a while of looking at the code! The source of the difference comes from how things are being interacted.
Consider the Stata code:
qui poisson emp c.tr#i.first_treat#i.year c.tr#i.first_treat#i.year#c._x_lpop ///
    lpop c.lpop#i.first_treat c.lpop#i.year i.first_treat i.year, ///
    cluster(countyreal)
asd
On Mon, Nov 28, 2022 at 6:31 PM Grant McDermott @.***> wrote:
data("mpdta", package = "did")
dat = mpdta |>
  transform(
    emp = exp(lemp),
    event = ifelse(first.treat != 0, year - first.treat, NA),
    .Dtreat = as.integer(year >= first.treat & first.treat != 0),
    lpop_dm = ave(lpop, first.treat, year, FUN = function(x) x - mean(x, na.rm=TRUE)),
    first.treat = factor(first.treat),
    year = factor(year)
  )
gmod = glm(
  emp ~ .Dtreat:first.treat:year / lpop_dm + lpop * (first.treat + year),
  family = 'poisson', data = dat
)
# get rid of NA (collinear) coefs and re-estimate ----
coefs = coefficients(gmod)
coefs = names(coefs[which(!is.na(coefs))])
mm = model.matrix(gmod)
mm = mm[, coefs]
dat2 = as.data.frame(cbind(emp = dat$emp, mm[, -1]))
gmod2 = glm(
  emp ~ ., family = "poisson", data = dat2
)
# summary(gmod2)
# construct counterfactual datasets
dat2_0 = dat2_1 = dat2
Dcols = grep("Dtreat", colnames(dat2))
dat2_0[, Dcols] = 0
# dat2_1[, Dcols] = 1 # already true since .Dtreat = 1 by definition
# get ATTs ----
dat = dat |>
  transform(
    yhat1 = predict(gmod2, newdata = dat2_1, type = "response"),
    yhat0 = predict(gmod2, newdata = dat2_0, type = "response")
  ) |>
  transform(
    att = yhat1 - yhat0
  )
aggregate(att ~ event, FUN = mean, data = dat)
#>   event         att
#> 1    -4    0.000000
#> 2    -3    0.000000
#> 3    -2    0.000000
#> 4    -1    0.000000
#> 5     0  -25.349748
#> 6     1    1.091751
#> 7     2  -75.124632
#> 8     3 -101.823979
Created on 2022-11-28 with reprex v2.0.2 https://reprex.tidyverse.org
All of this prompted me to try implementing the same manual version of the ETWFE regression in Stata. Here (finally) I think we're getting closer to the underlying issue. As you can see, I get exactly the same ATT results from this manual Stata version as I do in R.
. frause mpdta, clear
(Written by R. )
. gen emp = exp(lemp)
. qui jwdid emp lpop, ivar(countyreal) tvar(year) gvar(first_treat) method(poisson)
. qui poisson emp c.tr#i.first_treat#i.year c.tr#i.first_treat#i.year#c._x_lpop ///
    lpop c.lpop#i.first_treat c.lpop#i.year i.first_treat i.year, ///
    cluster(countyreal)
. replace tr = 1
(2,209 real changes made)
. predict yhat1
(option n assumed; predicted number of events)
. replace tr = 0
(2,500 real changes made)
. predict yhat0
(option n assumed; predicted number of events)
. gen att=yhat1-yhat0
. gen event = year - first_treat if first!=0
(1,545 missing values generated)
. tabstat att, by(event)
Summary for variables: att
Group variable: event
   event |      Mean
---------+----------
      -4 |         0
      -3 |         0
      -2 |         0
      -1 |         0
       0 | -25.34976
       1 |  1.091738
       2 | -75.12462
       3 |  -101.824
---------+----------
   Total | -8.707092
--------------------
@friosavila Do you have any thoughts on this? Does the manual Stata version above correspond to what jwdid is doing underneath the hood?
Again, consider the code:
qui poisson emp c.tr#i.first_treat#i.year c.tr#i.first_treat#i.year#c._x_lpop ///
    lpop c.lpop#i.first_treat c.lpop#i.year i.first_treat i.year, ///
    cluster(countyreal)
The second term (c.tr#i.first_treat#i.year#c._x_lpop) is interacted with c.tr. However, the way I programmed it in jwdid, I do not interact that term with the treatment. Right now I'm not sure which one is the correct approach, so I will go back to the Wooldridge paper to see what he suggests.
At the very least, we now know the source of the difference! Fernando
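For comparison on the R side, the base glm() formula used earlier in the thread relies on the `/` nesting operator. A quick check with terms() suggests that it expands into the treatment-by-cohort-by-year cells plus those same cells interacted with lpop_dm, which appears to correspond to the c.tr#i.first_treat#i.year#c._x_lpop term in the Stata call (assuming lpop_dm and _x_lpop are the same demeaned covariate). This is only a sketch of the formula expansion, not of what etwfe does internally.

```r
# Formula fragment taken from the base glm() call earlier in the thread
f = emp ~ .Dtreat:first.treat:year / lpop_dm

# terms() expands the nesting operator without needing any data
labs = attr(terms(f), "term.labels")
labs
```

The second label is the treatment interaction with the demeaned covariate, i.e. the piece that the two implementations handled differently.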
Going a bit further: in the linear model, it does not matter whether c.tr is interacted with c._x_lpop, because the expected value of _x_lpop is zero. But that is not the case for nonlinear models (the average impact will not be zero).
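A small numerical sketch of this point, using simulated values that are purely illustrative (the parameters a, d, b and the covariate x_dm are hypothetical, not estimates from this thread): with a demeaned covariate, the interaction term averages out of the mean effect under an identity link, but not under a log link.

```r
# Simulated demeaned covariate; all parameter values are made up for illustration
set.seed(42)
x_dm = rnorm(1000)
x_dm = x_dm - mean(x_dm)  # sample mean is zero by construction
a = 1                     # baseline linear index
d = 0.5                   # treatment shift
b = 0.3                   # slope on the treatment x covariate interaction

# Identity link: average effect with and without the interaction term
lin_with = mean((a + d + b * x_dm) - a)
lin_without = (a + d) - a
lin_gap = lin_with - lin_without

# Log link (Poisson-type): same comparison on the response scale
exp_with = mean(exp(a + d + b * x_dm) - exp(a))
exp_without = exp(a + d) - exp(a)
exp_gap = exp_with - exp_without

c(linear = lin_gap, exponential = exp_gap)
```

The linear gap is zero up to floating-point error, while the exponential gap is strictly positive because the exponential function is convex (Jensen's inequality). That is consistent with the linear R and Stata results matching while the Poisson ones diverged.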
Ah, good catch @friosavila!
I haven't had time to look back into the Wooldridge paper myself yet. I've got some imminent work deadlines, but will dive back into it afterwards.
In the meantime, please do let me know if you figure out which is the right way. It's quite possible I made an error translating his original do files.
G
I just went through them, and I was the one who missed the interaction. Just submitted the new version to SSC Fernando
I just went through them, and I was the one who missed the interaction. Just submitted the new version to SSC
Super. Thanks for confirming, Fernando. Really appreciate you helping me to troubleshoot and stoked that we cracked it in the end. Congrats on the new SSC release. I'll aim to submit this pkg to CRAN shortly too.
C.f. with the jwdid Poisson model at the very bottom of this post. The main results are exactly the same.
But the marginal effects are different.
@vincentarelbundock Do you have any ideas? (Sorry for tagging and running, but wanted to get this down quickly before I have to head out!)