jmboehm / GLFixedEffectModels.jl

Fast estimation of generalized linear models with high dimensional categorical variables in Julia

Add bias correction feature #26

Closed · caibengbu closed this 3 years ago

caibengbu commented 3 years ago

This pull request adds the bias correction feature to the GLFixedEffectModels package. It returns the same numerical results as alpaca::BiasCorr() in R.
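For reference, a rough usage sketch of what the feature enables. The exact exported name and keyword arguments of the bias-correction routine are not shown in this thread, so bias_correction, i_symb, and t_symb below are assumptions; nlreg is the package's estimation function mentioned further down.

using GLFixedEffectModels, DataFrames, CategoricalArrays, Random

Random.seed!(1234)
n = 1_000
df = DataFrame(x    = randn(n),
               firm = categorical(rand(1:50, n)),
               year = categorical(rand(1:20, n)))
df.y = Float64.(df.x .+ randn(n) .> 0)   # binary outcome

# Two-way fixed-effects logit, estimated with nlreg
m = nlreg(df, @formula(y ~ x + fe(firm) + fe(year)),
          Binomial(), LogitLink(), start = [0.2])

# Analytical bias correction added by this PR.
# NOTE: the function name and keywords below are illustrative assumptions,
# not the confirmed exported API; see src/utils/biascorr.jl in the diff
# for the actual interface.
m_bc = GLFixedEffectModels.bias_correction(m, df; i_symb = :firm, t_symb = :year)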

TO-DO:

jmboehm commented 3 years ago

Great! Could you also add the following:

codecov[bot] commented 3 years ago

Codecov Report

Merging #26 (115e138) into master (28b55d7) will increase coverage by 3.04%. The diff coverage is 100.00%.


@@            Coverage Diff             @@
##           master      #26      +/-   ##
==========================================
+ Coverage   76.95%   80.00%   +3.04%     
==========================================
  Files          12       13       +1     
  Lines         499      565      +66     
==========================================
+ Hits          384      452      +68     
+ Misses        115      113       -2     
Impacted Files               Coverage Δ
src/GLFixedEffectModels.jl     0.00% <ø>        (ø)
src/utils/biascorr.jl        100.00% <100.00%>  (ø)
src/fit.jl                    82.19% <0.00%>    (+1.04%) ↑

Continue to review full report at Codecov.

Legend: Δ = absolute <relative> (impact), ø = not affected, ? = missing data. Last update 28b55d7...115e138.

caibengbu commented 3 years ago

Thank you, Johannes. I updated the test script. The two-way Logit test was fine, but the coefficients didn't match for models that use a Probit link.

I used RCall to get the result from alpaca::feglm and compared it with the output of our nlreg function. The two already differ before bias correction at atol = 1e-4. It might be a matter of tolerance levels or other stopping criteria, since the gap is almost negligible. I'll look into it further; any suggestions would be very helpful as well.
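For what it's worth, a sketch of how that cross-check can be wired up with RCall (df, y, x, firm and year are placeholder names, and alpaca needs to be installed in the R library that RCall picks up):

using GLFixedEffectModels, DataFrames, RCall, Test

# Julia side: two-way fixed-effects probit via nlreg.
m_julia = nlreg(df, @formula(y ~ x + fe(firm) + fe(year)),
                Binomial(), ProbitLink(), start = [0.2])

# R side: the same model via alpaca::feglm, pulled back with RCall.
@rput df
R"""
library(alpaca)
m_r    <- feglm(y ~ x | firm + year, data = df,
                family = binomial(link = "probit"))
beta_r <- unname(coef(m_r))
"""
@rget beta_r

# The uncorrected coefficients should agree up to solver tolerance;
# currently they differ at atol = 1e-4, which is the gap described above.
@test isapprox(coef(m_julia), beta_r; atol = 1e-4)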

jmboehm commented 3 years ago

Thank you. That's concerning; yes, it would be great if you could look into what's causing the difference in the probit regressions (I really should have added these to the tests in the first place...). But that can go into a different PR, so feel free to comment out the probit tests for now.
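If it helps, rather than deleting the probit checks outright, the Test stdlib's @test_broken can keep them in the suite as a known failure without breaking CI (m_julia and beta_r as in the sketch above):

using Test

@testset "probit vs. alpaca::feglm (known discrepancy)" begin
    # Recorded as Broken while the coefficients still differ at atol = 1e-4;
    # switch back to @test once the source of the gap is found.
    @test_broken isapprox(coef(m_julia), beta_r; atol = 1e-4)
end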

By the way, to have biascorr_test.jl also run by the GitHub CI, you need to add the file to test/runtests.jl. If that script uses dependencies that are not already listed in Project.toml (like RCall), you'll also have to add them to the test target there, otherwise the tests will fail:

[targets]
test = ["Test", "Random", "Statistics", "RDatasets", "StableRNGs"]
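Concretely, the wiring in test/runtests.jl would look something like the sketch below; only the biascorr_test.jl file name is taken from this thread, the surrounding layout and the other include line are illustrative.

# test/runtests.jl
using Test, GLFixedEffectModels

@testset "GLFixedEffectModels.jl" begin
    include("nlreg_test.jl")      # existing tests (file name illustrative)
    include("biascorr_test.jl")   # new bias-correction tests from this PR
end

RCall would then also need an entry under [extras] in Project.toml and a spot in the test vector above, i.e. test = ["Test", "Random", "Statistics", "RDatasets", "StableRNGs", "RCall"].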

Feel free to message me on Slack if the tests keep failing.

jmboehm commented 3 years ago

Awesome, thank you!