jmboehm / GLFixedEffectModels.jl

Fast estimation of generalized linear models with high dimensional categorical variables in Julia

Add pseudo r-squared values #50

Closed junder873 closed 1 year ago

junder873 commented 1 year ago

This adds support for calculating pseudo-r2 values. To do so, functions for loglikelihood and nullloglikelihood are added and the output type needs to store a few extra variables (y, mu and a dof).
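For reference, the usual pseudo-R² variants depend only on the fitted log-likelihood $\ell$, the null log-likelihood $\ell_0$, the number of observations $n$, and (for the adjusted version) the number of estimated parameters $k$. The formulas below are the standard textbook definitions; which of them this PR exposes is not stated in the thread, so treat the list as an assumption.

$$
R^2_{\text{McFadden}} = 1 - \frac{\ell}{\ell_0}, \qquad
R^2_{\text{McFadden, adj}} = 1 - \frac{\ell - k}{\ell_0}
$$

$$
R^2_{\text{Cox-Snell}} = 1 - \exp\!\left(\frac{2(\ell_0 - \ell)}{n}\right), \qquad
R^2_{\text{Nagelkerke}} = \frac{1 - \exp\!\left(\frac{2(\ell_0 - \ell)}{n}\right)}{1 - \exp\!\left(\frac{2\ell_0}{n}\right)}
$$

None of these require y or mu once $\ell$ and $\ell_0$ are known.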

One additional change I considered but did not implement is that the biascorrection tests often need the y and/or mu values, expecting these to be in the augment_df argument. Saving them all the time in order to calculate pseudo-r2 might make it unnecessary to keep them in augment_df.

jmboehm commented 1 year ago

> One additional change I considered but did not implement is that the biascorrection tests often need the y and/or mu values, expecting these to be in the augment_df argument. Saving them all the time in order to calculate pseudo-r2 might make it unnecessary to keep those in the augment_df

Thanks. There are two disadvantages that I see: (1) we store much more data in the estimated model (I'd prefer models to be as light as possible, in contrast to e.g. stargazer); (2) we need to fix the element type (you coded y and mu as Vector{Float64}, whereas on lower-end GPUs you may prefer Float32). So I'd suggest that we return those statistics only if the necessary information has been saved in augmentdf.

junder873 commented 1 year ago

As an alternative (that I think alleviates both of those), what if loglikelihood and nullloglikelihood were calculated within nlreg and then saved (similar to deviance and nulldeviance)? This would have a pretty minimal effect on speed and mean very little additional information is stored. Thoughts?
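A minimal sketch of this idea, not the code from the PR: the two log-likelihoods are computed once while y and mu are still in scope, and only scalars are kept. `FitSummary`, `bernoulli_loglik`, `summarize_fit` and `mcfadden_r2` are hypothetical names invented for illustration. A Bernoulli likelihood and an intercept-only null model are assumed here for simplicity; the actual `nlreg` supports other families and may define the null model differently.

```julia
# Hypothetical container: only scalars are stored, so the fitted model stays light.
# The type parameter T means nothing is hard-coded to Float64.
struct FitSummary{T<:AbstractFloat}
    loglikelihood::T
    nullloglikelihood::T
    nobs::Int
end

# Bernoulli log-likelihood for 0/1 outcomes y and fitted means mu.
function bernoulli_loglik(y::AbstractVector{<:Real}, mu::AbstractVector{T}) where {T<:AbstractFloat}
    ll = zero(T)
    for (yi, mi) in zip(y, mu)
        ll += yi == 1 ? log(mi) : log(one(T) - mi)
    end
    return ll
end

# Compute both log-likelihoods while y and mu are available, then discard the vectors.
function summarize_fit(y::AbstractVector{<:Real}, mu::AbstractVector{T}) where {T<:AbstractFloat}
    ll = bernoulli_loglik(y, mu)
    mu0 = T(sum(y) / length(y))  # assumed null model: intercept only
    ll0 = bernoulli_loglik(y, fill(mu0, length(y)))
    return FitSummary{T}(ll, ll0, length(y))
end

# McFadden's pseudo R², computed from the stored scalars alone.
mcfadden_r2(s::FitSummary) = 1 - s.loglikelihood / s.nullloglikelihood

y  = [1, 0, 1, 1, 0, 0]
mu = Float32[0.8, 0.3, 0.7, 0.9, 0.2, 0.4]
s  = summarize_fit(y, mu)
println(mcfadden_r2(s))
```

Because the element type follows mu, a Float32 fit yields Float32 statistics, which speaks to the typing concern raised above as well as the storage one.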

jmboehm commented 1 year ago

Agreed, I think that would be a good solution.

codecov[bot] commented 1 year ago

Codecov Report

Patch coverage: 100.00% and project coverage change: +0.34 :tada:

Comparison is base (ee22701) 90.89% compared to head (7032b28) 91.24%.

Additional details and impacted files

```diff
@@            Coverage Diff             @@
##           master      #50      +/-   ##
==========================================
+ Coverage   90.89%   91.24%   +0.34%
==========================================
  Files           8        8
  Lines         901      914      +13
==========================================
+ Hits          819      834      +15
+ Misses         82       80       -2
```

| [Impacted Files](https://app.codecov.io/gh/jmboehm/GLFixedEffectModels.jl/pull/50?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=Johannes+Boehm) | Coverage Δ | |
|---|---|---|
| [src/GLFixedEffectModels.jl](https://app.codecov.io/gh/jmboehm/GLFixedEffectModels.jl/pull/50?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=Johannes+Boehm#diff-c3JjL0dMRml4ZWRFZmZlY3RNb2RlbHMuamw=) | `100.00% <ø> (ø)` | |
| [src/utils/biascorr.jl](https://app.codecov.io/gh/jmboehm/GLFixedEffectModels.jl/pull/50?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=Johannes+Boehm#diff-c3JjL3V0aWxzL2JpYXNjb3JyLmps) | `97.46% <ø> (ø)` | |
| [src/GLFixedEffectModel.jl](https://app.codecov.io/gh/jmboehm/GLFixedEffectModels.jl/pull/50?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=Johannes+Boehm#diff-c3JjL0dMRml4ZWRFZmZlY3RNb2RlbC5qbA==) | `82.83% <100.00%> (+2.21%)` | :arrow_up: |
| [src/fit.jl](https://app.codecov.io/gh/jmboehm/GLFixedEffectModels.jl/pull/50?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=Johannes+Boehm#diff-c3JjL2ZpdC5qbA==) | `85.76% <100.00%> (+0.45%)` | :arrow_up: |
