jmboehm / GLFixedEffectModels.jl

Fast estimation of generalized linear models with high dimensional categorical variables in Julia

add feature: poisson regression point estimates bias correction #29

Closed by caibengbu 3 years ago

caibengbu commented 3 years ago

This pull request adds point-estimate bias correction for three-way network Poisson regressions. It produces the same results as the Stata package ppml_fe_bias. There are also some major modifications to the function in terms of formula parsing and data frame sorting.

TO-DO: add the standard error correction introduced in Weidner and Zylkin (2020).

codecov[bot] commented 3 years ago

Codecov Report

Merging #29 (d63388a) into master (7021ddf) will decrease coverage by 20.95%. The diff coverage is 0.00%.

@@             Coverage Diff             @@
##           master      #29       +/-   ##
===========================================
- Coverage   67.30%   46.35%   -20.96%     
===========================================
  Files          13       13               
  Lines         624      906      +282     
===========================================
  Hits          420      420               
- Misses        204      486      +282     
Impacted Files                 Coverage Δ
src/GLFixedEffectModels.jl     0.00% <ø> (ø)
src/utils/biascorr.jl          0.00% <0.00%> (ø)

Legend: Δ = absolute <relative> (impact), ø = not affected, ? = missing data. Last update 7021ddf...d63388a.

caibengbu commented 3 years ago

There are also some minor discrepancies between ppml_fe_bias, the Stata package by Weidner and Zylkin (2020), and their paper:

  1. In their Appendix A.2.1 (Analytical Bias Correction Formula), the construction of B hat and D hat always excludes the term where i == j. ppml_fe_bias includes that term.
  2. Also in A.2, there is a strange sign change: the term is implemented with a positive sign in the code but appears with a negative sign in the paper. Might be a typo.

I feel like the paper and the package are equally important references, and I can't decide which one to adhere to (especially for the first discrepancy). For testing purposes, I align with ppml_fe_bias for now. This version is pretty much a step-by-step implementation of the paper. There might be shortcuts we can take to simplify our code.

caibengbu commented 3 years ago

Update on this: Discrepancy 1 is kind of a silly question to ask, since in a trade gravity model a country doesn't import from or export to itself, so observations with i == j would never show up in the data. In their paper they exclude such observations, which follows from how the likelihood function is defined. They also said that it is okay to include them in other scenarios where observations with i == j are meaningful. Discrepancy 2 is a typo in the paper.
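
A toy sketch of why discrepancy 1 makes no difference for gravity data. It is written in Python for brevity. The helper `pair_sum` and both data sets are made up for illustration, and this is a plain sum over pairs, not the actual B hat / D hat formula from the paper or from ppml_fe_bias:

```python
from itertools import product


def pair_sum(values, include_diagonal):
    """Sum a per-pair quantity over (i, j), optionally skipping i == j.

    `values` maps (i, j) -> number; pairs absent from the data contribute nothing.
    """
    total = 0.0
    for (i, j), v in values.items():
        if i == j and not include_diagonal:
            continue  # drop the "self" term, as the paper's formula does
        total += v
    return total


# Hypothetical gravity-style data: no country trades with itself,
# so no (i, i) pair is ever observed.
gravity = {
    (i, j): 1.0 + i + 2 * j
    for i, j in product(range(3), repeat=2)
    if i != j
}

# Hypothetical data from a setting where i == j observations are meaningful.
with_self = {(i, j): 1.0 + i + 2 * j for i, j in product(range(3), repeat=2)}

# With gravity data the two conventions agree, because there is no
# diagonal term to include or exclude.
print(pair_sum(gravity, True) == pair_sum(gravity, False))

# With self-pairs present, the two conventions differ by the diagonal terms.
print(pair_sum(with_self, True) - pair_sum(with_self, False))
```

So the choice between the two conventions only matters when i == j observations actually appear in the data.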