vdquadros / immigration_enclave


More comparisons #13

Open vdquadros opened 5 years ago

vdquadros commented 5 years ago

The main directories for this exercise are:
1) Card's original code and lst files: here
2) Our SAS code and lst files: here. Our code is a very slight modification of Card's code to adjust paths and things like that. We run the code using our dataset downloaded from ICPSR instead of Card's original dataset (which we don't have).
3) Our Stata code: here

To replicate Table 6, we need data from 1980-2000, but not from 2005/06. The complete list of scripts needed to replicate Table 6 is below.

What we already know, which I won't repeat at length:

New developments

The last two columns of Table 3 look different when run in SAS vs. Stata. In SAS, "PROC GLM" is used to first regress log wage on a bunch of covariates and get the residual. Then, the variances of the log wage and of the residuals are calculated across MSAs. The original code for 1980 can be found here (starting in line 176). In Stata, instead of using "PROC GLM", I just use "reg". This generates different results.
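To make the two-step calculation concrete, here is a minimal Python sketch of the idea: fit an OLS regression of log wage on covariates, take the residual, and compute the variance of both series within each MSA. It is not Card's SAS code or our Stata code. The covariates (`educ`, `exper`), the simulated data, the unweighted estimation, and the function name `residual_variances` are all assumptions made for illustration.

```python
import numpy as np

def residual_variances(log_wage, X, msa):
    """Return {msa: (variance of log wage, variance of OLS residual)}.

    Hypothetical helper: unweighted OLS with an intercept, fitted on the
    pooled sample, then sample variances (ddof=1) computed within each MSA.
    """
    log_wage = np.asarray(log_wage, dtype=float)
    X = np.asarray(X, dtype=float)
    msa = np.asarray(msa)

    # Step 1: regress log wage on the covariates and keep the residual.
    design = np.column_stack([np.ones(len(log_wage)), X])
    beta, *_ = np.linalg.lstsq(design, log_wage, rcond=None)
    resid = log_wage - design @ beta

    # Step 2: variance of the raw and residual series within each MSA.
    out = {}
    for m in np.unique(msa):
        mask = msa == m
        out[m] = (log_wage[mask].var(ddof=1), resid[mask].var(ddof=1))
    return out

# Simulated data (assumption: two covariates and three MSAs).
rng = np.random.default_rng(0)
n = 1000
educ = rng.integers(8, 18, size=n)
exper = rng.uniform(0, 40, size=n)
msa = rng.integers(0, 3, size=n)
log_wage = 1.0 + 0.08 * educ + 0.02 * exper + rng.normal(0, 0.5, size=n)

variances = residual_variances(log_wage, np.column_stack([educ, exper]), msa)
for m, (v_all, v_res) in variances.items():
    print(m, round(v_all, 3), round(v_res, 3))
```

Any difference in how the two packages handle the regression step (weights, categorical controls, dropped observations) would show up only in the residual column, which is the pattern in the table below.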

Note also that these Stata results for the variances look a bit different from the ones I reported in Issue #9. I fixed a couple of things since then, so the results are now closer to Card's.

                               Overall                  Residual
                        Card    SAS     Stata    Card    SAS     Stata
Native men       1980   0.385   0.387   0.386    0.288   0.288   0.288
                 1990   0.462   0.452   0.452    0.322   0.319   0.317
                 2000   0.487   0.486   0.486    0.353   0.358   0.358
Native women     1980   0.317   0.316   0.316    0.269   0.268   0.268
                 1990   0.382   0.381   0.381    0.295   0.294   0.294
                 2000   0.408   0.408   0.408    0.313   0.320   0.320
Immigrant men    1980   0.444   0.444   0.444    0.321   0.321   0.334
                 1990   0.517   0.513   0.513    0.347   0.342   0.364
                 2000   0.557   0.557   0.557    0.390   0.391   0.409
Immigrant women  1980   0.343   0.343   0.343    0.291   0.291   0.296
                 1990   0.414   0.413   0.413    0.318   0.317   0.330
                 2000   0.484   0.484   0.484    0.367   0.369   0.380

If I then export the SAS dataset that generated the variances above and use it for generating Table 6, I get a Table 6 that:

The Table 6 we get from both SAS and Stata can be found below. The tables were generated in Stata, but the equivalent SAS results can be found in this link:

------------------------------------------------------------------------------------
                                1               2               3               4   
                             b/se            b/se            b/se            b/se   
------------------------------------------------------------------------------------
Log rel supply imm~e       -0.030***       -0.030***       -0.036***       -0.036***
                           (0.00)          (0.00)          (0.00)          (0.00)   
Log msa size 1980          -0.095***       -0.094***       -0.104***       -0.106***
                           (0.00)          (0.00)          (0.00)          (0.00)   
Log msa size 1990           0.101***        0.100***        0.113***        0.114***
                           (0.00)          (0.00)          (0.00)          (0.00)   
College share 1980          0.099***        0.098***        0.121***        0.124***
                           (0.02)          (0.02)          (0.02)          (0.02)   
College share 1990         -0.007          -0.006          -0.014          -0.016   
                           (0.01)          (0.01)          (0.01)          (0.01)   
Wage res native 1980        0.135***        0.137***        0.141***        0.136***
                           (0.01)          (0.01)          (0.01)          (0.01)   
Wage res imm 1980          -0.160***       -0.162***       -0.169***       -0.164***
                           (0.00)          (0.00)          (0.00)          (0.00)   
Mfg share in 1980          -0.225***       -0.226***       -0.258***       -0.256***
                           (0.01)          (0.01)          (0.01)          (0.01)   
Mfg share in 1990           0.194***        0.195***        0.227***        0.223***
                           (0.01)          (0.01)          (0.01)          (0.01)   
Lagged dep var                              0.004                          -0.009***
                                           (0.00)                          (0.00)   
constant                   -0.128***       -0.127***       -0.159***       -0.160***
                           (0.00)          (0.00)          (0.00)          (0.00)   
------------------------------------------------------------------------------------
r2                          0.210           0.210           0.203           0.203   
------------------------------------------------------------------------------------
------------------------------------------------------------------------------------
                                5               6               7               8   
                             b/se            b/se            b/se            b/se   
------------------------------------------------------------------------------------
Log rel supply imm~e       -0.058***       -0.054***       -0.078***       -0.072***
                           (0.00)          (0.00)          (0.00)          (0.00)   
Log msa size 1980          -0.040***       -0.039***       -0.058***       -0.054***
                           (0.00)          (0.00)          (0.00)          (0.00)   
Log msa size 1990           0.031***        0.034***        0.053***        0.052***
                           (0.00)          (0.00)          (0.00)          (0.00)   
College share 1980         -0.055***       -0.114***       -0.003          -0.052** 
                           (0.02)          (0.02)          (0.02)          (0.02)   
College share 1990         -0.022           0.045***       -0.045***        0.005   
                           (0.01)          (0.01)          (0.01)          (0.01)   
Wage res native 1980        0.309***        0.363***        0.338***        0.371***
                           (0.00)          (0.01)          (0.01)          (0.01)   
Wage res imm 1980          -0.224***       -0.287***       -0.248***       -0.288***
                           (0.00)          (0.00)          (0.00)          (0.00)   
Mfg share in 1980          -0.373***       -0.422***       -0.377***       -0.410***
                           (0.01)          (0.01)          (0.01)          (0.01)   
Mfg share in 1990           0.499***        0.546***        0.484***        0.518***
                           (0.01)          (0.01)          (0.01)          (0.01)   
Lagged dep var                              0.137***                        0.095***
                                           (0.00)                          (0.00)   
constant                   -0.061***       -0.085***       -0.139***       -0.143***
                           (0.00)          (0.00)          (0.00)          (0.00)   
------------------------------------------------------------------------------------
r2                          0.386           0.401           0.356           0.379   
------------------------------------------------------------------------------------

Many data checks using the .lst files

When dealing with SAS, we have two important files: the .sas and the .lst files. The .sas files are the scripts. The .lst files store any results that were printed while running the script. Thus, a .lst file saves the results from "PROC MEANS", "PROC PRINT", "PROC GLM", etc. Luckily, we have (almost) all of Card's .lst files. Thus, we can check our results after running each script by comparing our .lst file with Card's.
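A comparison like this can be partly automated. The Python sketch below compares two listings line by line and flags lines whose numbers differ by more than a tolerance. It is only an illustration: the function name `numeric_mismatches` is hypothetical, the two listings are made-up lines that reuse the SAS and Card residual variances for native men from the table above, and real .lst files would first need their page headers and dates stripped so that the lines align.

```python
import re

# Matches decimals first, then integers, with an optional leading minus sign.
NUMBER = re.compile(r"-?\d+\.\d+|-?\d+")

def numeric_mismatches(ours, cards, tol=1e-3):
    """Return (line_no, our_line, card_line) for lines whose numbers differ.

    Assumes both listings have the same layout, so that line i of one
    corresponds to line i of the other. zip() stops at the shorter listing,
    so extra trailing lines are not reported.
    """
    mismatches = []
    pairs = zip(ours.splitlines(), cards.splitlines())
    for line_no, (a, b) in enumerate(pairs, start=1):
        nums_a = [float(x) for x in NUMBER.findall(a)]
        nums_b = [float(x) for x in NUMBER.findall(b)]
        differs = len(nums_a) != len(nums_b) or any(
            abs(x - y) > tol for x, y in zip(nums_a, nums_b)
        )
        if differs:
            mismatches.append((line_no, a, b))
    return mismatches

# Made-up listing lines (not real .lst output); values from the table above.
ours = "native men 1980 0.288\nnative men 1990 0.319\nnative men 2000 0.358"
cards = "native men 1980 0.288\nnative men 1990 0.322\nnative men 2000 0.353"

mismatches = numeric_mismatches(ours, cards)
for line_no, a, b in mismatches:
    print(f"line {line_no}: ours={a!r} card={b!r}")
```

The tolerance is a design choice: a small positive value ignores rounding noise in the printed output, while still flagging differences at the third decimal place.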

Takeaways

Data checks by year using the .lst files

          Us     Card
np2       link   link
allnp2    link   link
cell1     link   link
t1        link   link
supply1   link   link
imm1      link   link
indist    link   link

          Us     Card
np2       link   link
allnp2    link   link
cell1     link   link
t1        link   link
supply1   link   link
imm1      link   link
indist    link   link

          Us     Card
np2       link   link
allnp2    link   link
cell1     link   link
t1        link   link
supply1   link   link
imm3      link   link
imm2      link   link
inflow3   link   link

vdquadros commented 5 years ago

Hi @econisaac

Please see above and let me know if it makes sense.

Above, I am reporting Table 6 done in Stata using an intermediate dataset constructed in SAS. This table is the most similar to the Table 6 we obtain if we run everything in SAS.

The coefficients for HS-equivalent workers in this table are less similar to Card's than in our previous Table 6 (the one in Issue #9), but the coefficients for College-equivalent workers barely changed. Also notice that the R2 values are higher in this table than in our previous one.

After looking at this data a lot, my feeling is that this table is the most consistent with the way Card wrote his paper. I think the differences stem from using a different dataset, especially for 1990. As I mentioned above, 1990 is the only year where our number of immigrants is not exactly the same as Card's.

I will check how our Rotemberg weights and the remaining Bartik tables change if we use the dataset that generated this last Table 6.

I would like to know, however, if there's anything else I should do about Table 6 itself.

Best, Victoria

econisaac commented 5 years ago

Hi Victoria,

This is very detailed, precise, and, overall, pretty great!

I think that we should proceed with the version of Table 6 where we use the residuals generated from SAS. I think that you should then push on to re-generating the Rotemberg weights and generating the various other tables and figures that we had discussed.

Let me know if you have any other questions.

Thanks again!

best, isaac