vdquadros / immigration_enclave


Pre-trends #14

Open vdquadros opened 5 years ago

vdquadros commented 5 years ago

For each of the plots below, I do the following:

  1. Run one regression for each year (1980, 1990, 2000) and store the coefficient on the instrument. For each regression

    • the dependent variable is the appropriate one (either difference in mean wage residuals for HS-equivalent workers or for College-equivalent workers)
    • the set of controls is the same across all regressions. Note: None of the regressions includes the lagged dependent variable in the set of controls
    • for the "country" regressions, the instrument is the share of immigrants from that country living in location l in 1980
    • for the "aggregate bartik" regressions, the instrument is the appropriate one (either predicted inflow of HS-equivalent workers or of College-equivalent workers)
  2. Plot the coefficients on the instrument.
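
The two steps above could be scripted roughly as follows. This is a sketch, not the code actually used for the plots. It assumes the MSA-level dataset is in memory and borrows `resgap802`, `shric1`, the control list, and the weight expression from the Stata commands posted later in this thread. `resgap902` and `resgap002` are hypothetical names for the 1990 and 2000 outcomes, and `rf_coefs` is a made-up file name.

```stata
* Sketch only. resgap902 / resgap002 are hypothetical variable names.
local depvars  resgap802 resgap902 resgap002
local years    1980 1990 2000
local controls logsize80 logsize90 coll80 coll90 nres80 ires80 mfg80 mfg90

* Collect one row (year, coefficient, standard error) per regression
tempname results
postfile `results' year b se using rf_coefs, replace

forvalues i = 1/3 {
    local y    : word `i' of `depvars'
    local year : word `i' of `years'
    * Reduced form: outcome regressed directly on the instrument plus controls
    quietly regress `y' shric1 `controls' [fweight = round(count90)]
    post `results' (`year') (_b[shric1]) (_se[shric1])
}
postclose `results'

* Note: this replaces the data in memory with the stored coefficients
use rf_coefs, clear
twoway connected b year, xlabel(1980 1990 2000)
```

Passing the three census years to `xlabel()` keeps the x-axis from showing ticks for years without data.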

Note: The x-axis is showing 5-year intervals, but we only have data for 1980, 1990, and 2000. I will fix that.

High School equivalent workers

[Figures: six plots]

College equivalent workers

[Figures: six plots]

econisaac commented 5 years ago

Hi Victoria,

Thanks---thinking about this more...in the paper, we studied the reduced form for Autor, Dorn and Hanson because the endogenous variable (imports from China to the U.S., scaled by industry composition) isn't really defined in the 1970s and 1980s (since there were no imports from China to the U.S.).

I believe that in this context it is possible to define the endogenous variable in 1980 and 1990 and so it might make sense to report the IV estimates in all 3 periods. Does this make sense?

Sorry to be inconsistent. best, isaac

vdquadros commented 5 years ago

Hi Isaac,

I actually do not understand what you mean by:

so it might make sense to report the IV estimates in all 3 periods.

But I agree that it's possible to define the endogenous variable in 1980 and 1990, since the endogenous variable is just the share of immigrants living in a city.

I can think more about what you are trying to say, but if it's easy to explain it further, it might be more efficient.

Thanks, Victoria

econisaac commented 5 years ago

Yup (that was unclear---sorry!).

What I mean is that you'd use the time-invariant "z" (of shares of people from country k that live in location l in 1980 relative to local population) (or the "aggregate immigrant enclave" instrument) as an instrument for the endogenous variable (measured in 1980/1990/2000).

So in building Table 6, you ran a first stage:

(relative shares in 2000) = \pi (immigrant enclave instrument)

and then:

(relative group wages in 2000) = beta_0 (stuff) + beta (relative shares in 2000).

To compute the beta_k in the Rotemberg weight table, you replaced "immigrant enclave instrument" with "country specific shares".

My proposal is to just replace the "relative shares in 2000" and the "relative wages in 2000" with their 1990 and then their 1980 values.
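
One way to write this proposal in symbols, as a sketch: the notation is introduced here for illustration and is not taken from the paper. Let $y_{l,t}$ be relative group wages in location $l$ and year $t$, $x_{l,t}$ the relative shares, $z_l$ the time-invariant instrument (country-specific 1980 shares or the aggregate enclave instrument), and $W_l$ the controls ("stuff"), which 2SLS includes in both stages. For each $t \in \{1980, 1990, 2000\}$:

```latex
\begin{align*}
x_{l,t} &= \pi_{0,t} + \pi_t\, z_l + W_l'\gamma_t + u_{l,t}
  && \text{(first stage)} \\
y_{l,t} &= \beta_{0,t} + \beta_t\, \hat{x}_{l,t} + W_l'\delta_t + \varepsilon_{l,t}
  && \text{(second stage)}
\end{align*}
```

Only the year of $y$ and $x$ changes across the three estimates; $z_l$ stays fixed.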

Hopefully that is clearer....

isaac

vdquadros commented 5 years ago

Hi Isaac,

Okay. I got it now. Instead of doing the reduced form, just do the 2SLS with the dependent and endogenous variables for each year.

I will do it in the morning.

Thank you for explaining,

Best, Victoria

vdquadros commented 5 years ago

Let's take as an example the "Mexico - High School" as well as the "Aggregate Bartik - High School" plots.

ivregress 2sls resgap802 logsize80 logsize90 coll80 coll90 nres80 ires80 mfg80 mfg90 ///
                (relshs80 = shric1) [fweight = round(count90)]

So the dependent variable is "relative wages in 1980" (resgap802), the endogenous variable is "relative shares in 1980" (relshs80) and the instrument is country specific shares shric1 (the time-invariant "z").

ivregress 2sls resgap802 logsize80 logsize90 coll80 coll90 nres80 ires80 mfg80 mfg90 ///
                (relshs80 = hsiv) [fweight = round(count90)]

Now the instrument is the "aggregate immigrant enclave", and everything else is as in the previous regression.

The other regressions follow the same pattern. For 1990 and 2000 I use the appropriate dependent and endogenous variables (calculated in those years).
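
The pattern could be looped over years and instruments along these lines. This is a hedged sketch rather than the code that produced the plots: only `resgap802` and `relshs80` appear in this thread, so the 1990 and 2000 names (`resgap902`, `resgap002`, `relshs90`, `relshs00`) are guesses at the naming scheme, and the dataset is assumed to be in memory.

```stata
* Sketch only. The 90/00 variable names are hypothetical.
local controls logsize80 logsize90 coll80 coll90 nres80 ires80 mfg80 mfg90

foreach z in shric1 hsiv {
    foreach yy in 80 90 00 {
        * Same instrument every year; outcome and endogenous variable change
        quietly ivregress 2sls resgap`yy'2 `controls' ///
            (relshs`yy' = `z') [fweight = round(count90)]
        local b  = _b[relshs`yy']
        local se = _se[relshs`yy']
        display "`z', year suffix `yy': b = " %9.5f `b' "  se = " %9.5f `se'
    }
}
```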

High School

[Figures: six plots]

College

[Figures: six plots]

econisaac commented 5 years ago

This is great. I guess we'll talk in 38 minutes. I think this is what we want. We just need to clean up these figures.

vdquadros commented 5 years ago

Hi Isaac,

Below are the adjusted figures. I am not fixing the y-axis range across plots because some confidence intervals would get pretty much invisible. But once we agree on a scale I can fix them for better comparison across plots.

High School

[Figures: six plots]

College

[Figures: six plots]

econisaac commented 5 years ago

this is great.

On Thu, Apr 4, 2019 at 4:17 PM vdquadros notifications@github.com wrote:

Hi Isaac,

Below are the adjusted figures. I am not fixing the y-axis range across plots because some confidence intervals would get pretty much invisible. But once we agree on a scale I can fix them for better comparison across plots.

High School

[image: image] https://user-images.githubusercontent.com/37562407/55593061-6ef17080-56ef-11e9-86a1-4b878e3ea54a.png
[image: image] https://user-images.githubusercontent.com/37562407/55593089-829cd700-56ef-11e9-9be2-a2af886b9f3b.png
[image: image] https://user-images.githubusercontent.com/37562407/55593111-8f212f80-56ef-11e9-96b2-82799893b1f0.png
[image: image] https://user-images.githubusercontent.com/37562407/55593134-9c3e1e80-56ef-11e9-9d1d-38da2918895c.png
[image: image] https://user-images.githubusercontent.com/37562407/55593141-a5c78680-56ef-11e9-8cbf-d5d4eda1ed1a.png
[image: image] https://user-images.githubusercontent.com/37562407/55593162-c42d8200-56ef-11e9-94f9-b3bb8ce11da2.png

College

[image: image] https://user-images.githubusercontent.com/37562407/55594459-8e3ecc80-56f4-11e9-96a5-2ccfcf309ef9.png
[image: image] https://user-images.githubusercontent.com/37562407/55594494-af072200-56f4-11e9-899e-cb2832d3e518.png
[image: image] https://user-images.githubusercontent.com/37562407/55594450-841cce00-56f4-11e9-830c-3d95e0d17eba.png
[image: image] https://user-images.githubusercontent.com/37562407/55594469-9bf45200-56f4-11e9-8d90-76c3afa189f0.png
[image: image] https://user-images.githubusercontent.com/37562407/55594483-a4e52380-56f4-11e9-84fb-fb30e228e540.png
[image: image] https://user-images.githubusercontent.com/37562407/55594505-b9c1b700-56f4-11e9-92f0-0b262a737ad6.png


vdquadros commented 5 years ago

What do you think of adding Paul to this repo? Then he could also comment on the figures and things like that (since he cares about them).

econisaac commented 5 years ago

Before we let Paul at this let’s converge on a first draft of all this output (he has other aspects of the revision to work on...).


On Apr 4, 2019, at 4:37 PM, vdquadros notifications@github.com wrote:

What do you think of adding Paul to this repo? Then he could also comment on the figures and things like that.


econisaac commented 5 years ago

Hi Victoria,

One point about this: I seem to remember that the "top 5" Rotemberg weight sending countries differed between high school and college. But here the top 5 are the same 5 countries in both. So I think some of the college plots must show the "wrong" countries....

Another point: the standard errors look smaller than I would expect. What I think you should be plotting is 95% confidence intervals, which would be the coefficients +/- 1.96 * standard errors. Looking at Card's table 6, the standard errors should then be....much bigger....than what appears to be plotted (perhaps you are plotting coefficients +/- standard errors?).

thanks!

isaac

econisaac commented 5 years ago

Ok---that's great. Thank you for tracking down this issue.

Looking forward to chatting/seeing results (☺) tomorrow.

best, isaac

On Mon, Apr 8, 2019 at 11:39 PM vdquadros notifications@github.com wrote:

Hi Isaac,

Sorry I had missed your comment before. Thanks for pointing it out after the seminar today.

Okay. I don't have new plots right now, but I just figured out what is wrong with the confidence intervals. It was a more subtle issue about which sort of "weight" (pweight, fweight, etc) we should be using in the Stata regressions for the standard errors to be the same as in SAS.

They are now the same :)
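
To see how much the weight type matters, the same 2SLS can be rerun under each option. This is only a sketch: the thread does not say which weight type ended up matching SAS, and the dataset is assumed to be in memory. With `fweight`, Stata treats each city as `count90` identical observations, so the reported N is the sum of the weights and the standard errors shrink accordingly. `aweight` rescales the weights to the actual number of observations, and `pweight` reports robust standard errors.

```stata
* Sketch only: compare standard errors across Stata weight types.
local controls logsize80 logsize90 coll80 coll90 nres80 ires80 mfg80 mfg90

foreach w in fweight aweight pweight {
    * fweights must be integers, so round only in that case
    if "`w'" == "fweight" {
        local wexp round(count90)
    }
    else {
        local wexp count90
    }
    quietly ivregress 2sls resgap802 `controls' ///
        (relshs80 = hsiv) [`w' = `wexp']
    display "`w': b = " %9.6f _b[relshs80] ///
        "  se = " %9.6f _se[relshs80] "  N = " e(N)
}
```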

Will do plots in the morning. Just wanted to give you the good news in case you are still awake haha

Best, Victoria


vdquadros commented 5 years ago

Hi Isaac,

The plots are below.

Just to give you extra confirmation that these ones should be correct, I am adding the table below. It compares the coefficients and standard errors between original Card, SAS with our dataset and Stata with our dataset.

I also include the specific lines in the log files where you can find those coefficients.

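For original Card, the lst file is here: https://github.com/vdquadros/immigration_enclave/blob/master/card2009_code_untouched/table6.lst. For SAS, the lst file is here: https://github.com/vdquadros/immigration_enclave/blob/master/sas/code/table6.lst. For Stata, the log file is here: https://github.com/vdquadros/immigration_enclave/blob/master/MSAlevel/code/log_table6_sas.log.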

A quick note: In the Stata log file, you can see that we merge 37 files and a few of the "using" files have one observation missing (123 obs. instead of 124). This is only because all these files contain observations at the city level and a couple of the files being merged contain information by subgroup "education x experience". It happens that the rmsa == 600 doesn't have any immigrant worker with high school education and X years of experience, say.

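Log Relative Supply Imm/Native: coefficient and standard error by education group, estimator, and source

| Education group | Estimator | Source | Coefficient | Standard error | Where to find |
|---|---|---|---|---|---|
| High School | OLS | Card | -0.019 | 0.006 | Line 512 |
| High School | OLS | SAS | -0.02974 | 0.006140 | Line 516 |
| High School | OLS | Stata | -0.0297393 | 0.0061403 | Line 1035 |
| High School | IV | Card | -0.023 | 0.008 | Line 1279 |
| High School | IV | SAS | -0.03685 | 0.010411 | Line 1264 |
| High School | IV | Stata | -0.0368503 | 0.0099821 | Line 1184 |
| College | OLS | Card | -0.03564 | 0.010597 | Line 662 |
| College | OLS | SAS | -0.05758 | 0.008502 | Line 662 |
| College | OLS | Stata | -0.057598 | 0.008502 | Line 1095 |
| College | IV | Card | -0.05985 | 0.013718 | Line 1395 |
| College | IV | SAS | -0.07829 | 0.016003 | Line 1378 |
| College | IV | Stata | -0.0782939 | 0.0153436 | Line 1301 |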

Note that the Aggregate Bartik confidence intervals are now consistent with the IV estimates we found (in both SAS and Stata). For example, the High School IV standard error in the table above is about 0.01, and the confidence interval in the Aggregate Bartik - High School plot below looks like around ±0.02 (roughly 1.96 × 0.01).
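
As a quick check of the interval width, the 95% bounds can be computed from the Stata High School IV numbers in the table above. This is a sketch using locals only, so it runs without any dataset; the half-width works out to 1.96 × 0.0099821 ≈ 0.0196.

```stata
* Coefficient and standard error copied from the table (High School, IV, Stata)
local b  = -.0368503
local se = .0099821

* Half-width of a 95% confidence interval under the normal approximation
local half = 1.96 * `se'

display "half-width = " %8.5f `half'
display "95% CI = [" %8.5f (`b' - `half') ", " %8.5f (`b' + `half') "]"
```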

High School

[Figures: six plots]

College

[Figures: six plots]

econisaac commented 5 years ago

awesome.

On Tue, Apr 9, 2019 at 12:52 PM vdquadros notifications@github.com wrote:

Hi Isaac,

The plots are below.

Just to give you extra confirmation that these ones should be correct, I am adding the table below. It compares the coefficients and standard errors between original Card, SAS with our dataset and Stata with our dataset.

I also include the specific lines in the log files where you can find those coefficients.

For original Card, the lst file is here: https://github.com/vdquadros/immigration_enclave/blob/master/card2009_code_untouched/table6.lst. For SAS, the lst file is here: https://github.com/vdquadros/immigration_enclave/blob/master/sas/code/table6.lst. For Stata, the log file is here: https://github.com/vdquadros/immigration_enclave/blob/master/MSAlevel/code/log_table6_sas.log.

Log Relative Supply Imm/Native: coefficient and standard error by education group, estimator, and source

| Education group | Estimator | Source | Coefficient | Standard error | Where to find |
|---|---|---|---|---|---|
| High School | OLS | Card | -0.019 | 0.006 | Line 512 |
| High School | OLS | SAS | -0.02974 | 0.006140 | Line 516 |
| High School | OLS | Stata | -0.0297393 | 0.0061403 | Line 1035 |
| High School | IV | Card | -0.023 | 0.008 | Line 1279 |
| High School | IV | SAS | -0.03685 | 0.010411 | Line 1264 |
| High School | IV | Stata | -0.0368503 | 0.0099821 | Line 1184 |
| College | OLS | Card | -0.03564 | 0.010597 | Line 662 |
| College | OLS | SAS | -0.05758 | 0.008502 | Line 662 |
| College | OLS | Stata | -0.057598 | 0.008502 | Line 1095 |
| College | IV | Card | -0.05985 | 0.013718 | Line 1395 |
| College | IV | SAS | -0.07829 | 0.016003 | Line 1378 |
| College | IV | Stata | -0.0782939 | 0.0153436 | Line 1301 |

High School

[image: image] https://user-images.githubusercontent.com/37562407/55830319-82248780-5ac5-11e9-8046-e0adf45fd963.png
[image: image] https://user-images.githubusercontent.com/37562407/55830434-c4e65f80-5ac5-11e9-997a-61eda85324ed.png
[image: image] https://user-images.githubusercontent.com/37562407/55830444-cdd73100-5ac5-11e9-9561-3e433950d8e5.png
[image: image] https://user-images.githubusercontent.com/37562407/55830457-d596d580-5ac5-11e9-8436-fbe2beeaef44.png
[image: image] https://user-images.githubusercontent.com/37562407/55830480-dd567a00-5ac5-11e9-8596-9a548c59b2aa.png
[image: image] https://user-images.githubusercontent.com/37562407/55830499-e6dfe200-5ac5-11e9-971a-c1ae825ff0a9.png

College

[image: image] https://user-images.githubusercontent.com/37562407/55830555-0840ce00-5ac6-11e9-861f-b4eae0df356d.png
[image: image] https://user-images.githubusercontent.com/37562407/55830607-21e21580-5ac6-11e9-8003-e97d290c1720.png
[image: image] https://user-images.githubusercontent.com/37562407/55830540-fe1ecf80-5ac5-11e9-90c9-b375b3d30b3c.png
[image: image] https://user-images.githubusercontent.com/37562407/55830569-10990900-5ac6-11e9-9749-4486ae508ab8.png
[image: image] https://user-images.githubusercontent.com/37562407/55830587-17c01700-5ac6-11e9-9052-92fefb94a5c5.png
[image: image] https://user-images.githubusercontent.com/37562407/55830621-2b6b7d80-5ac6-11e9-837b-ba0ddbb3e8bb.png
