nignatiadis / RegressionDiscontinuity.jl

Estimation for sharp regression discontinuity designs.
MIT License
15 stars 5 forks source link

Adding McCrary (2008) test for manipulation in the running variable #15

Closed pvalenzuelac closed 3 years ago

pvalenzuelac commented 3 years ago

Hi Nikos! I have added the McCrary density test for manipulation in the running variable. I include some tests, and also some comparisons to established packages in R and in Stata for three datasets, including Lee (2008). The results matches. I added a small example in the README, plus the plots for the comparisons, and a test of time.
The package seems to perform well in relatively small samples. However, when I increased the sample size to one million, the Julia implementation is ~4 times slower than the other two. I still need to figure this out and improve the speed here. Let me know if you have any comment.

nignatiadis commented 3 years ago

Hi Pablo! Happy holidays!

Thanks a lot for contributing this code! I will take a more careful look later to figure out the slowdown (it should not happen with Julia)!

I have a few initial remarks:

I would be happy to talk more about any of these points or provide guidance!

pvalenzuelac commented 3 years ago

Hi Nikos! Happy holidays too! Sorry for the delay in my response. I was out of town for the holidays and just come back. I will start working on your comments. Thank you for them :) I will let you know if I have any issue.

codecov-io commented 3 years ago

Codecov Report

Merging #15 (167e75f) into master (7405152) will increase coverage by 0.06%. The diff coverage is 63.69%.

Impacted file tree graph

@@            Coverage Diff             @@
##           master      #15      +/-   ##
==========================================
+ Coverage   61.78%   61.84%   +0.06%     
==========================================
  Files           7        8       +1     
  Lines         348      477     +129     
==========================================
+ Hits          215      295      +80     
- Misses        133      182      +49     
Impacted Files Coverage Δ
src/RegressionDiscontinuity.jl 100.00% <ø> (ø)
src/density_test.jl 61.36% <61.36%> (ø)
src/running_variable.jl 40.60% <85.71%> (+0.16%) :arrow_up:

Continue to review full report at Codecov.

Legend - Click here to learn more Δ = absolute <relative> (impact), ø = not affected, ? = missing data Powered by Codecov. Last update 7405152...167e75f. Read the comment docs.

pvalenzuelac commented 3 years ago

Hi Nikos! I hope this message finds you well. I added the points you made to the code. I created a McCraryTest type, with the default values to be those calculated by McCrary (2008). The fit(McCraryTest(), data::RunningVariable) returns FittedMcCraryTest.
Let me know if you have more comments! Best

nignatiadis commented 3 years ago

Thanks a lot @pvalenzuelac for contributing this and I apologize for the delay in getting back to you. The functionality looks great to me and I am going to merge it as soon as tests finish! I think some of my last remarks really pertain to the package more generally so I will try to think about how to resolve them from a broader perspective.

One last remark about the tests: they occasionally fail, because they are random (e.g. testing if a null p-value is >0.05 will fail 5% of the time :), that's why I commented these tests out for now). A fix is to 1) test the outcome over multiple replicates (to check for the statistical behavior of the p-values) and 2) to set a random simulation seed, say by writing:

using Random
Random.seed!(1)