FixedEffects / FixedEffectModels.jl

Fast Estimation of Linear Models with IV and High Dimensional Categorical Variables
Other
225 stars 46 forks source link

Add drop_singletons option to partial_out() #263

Open droodman opened 5 months ago

droodman commented 5 months ago

Before handing things over to ivreg2, ivregdfe partials out the FE and identifies and drops singletons. I'm making reghdfejl optionally do the same thing now--call partial_out() and then use ivreg2 instead of FixedEffectsModels.jl for the core IV work--since it offers a lot more diagnostics and estimation options. So it would be good if partial_out() also optionally dropped singletons. I made the new drop_singletons option default to false so old results are not affected. Of course, this is inconsistent with the default for drop_singletons being true in fit().

codecov[bot] commented 5 months ago

Codecov Report

All modified and coverable lines are covered by tests :white_check_mark:

Project coverage is 96.49%. Comparing base (72ca0e0) to head (23688f0).

Additional details and impacted files ```diff @@ Coverage Diff @@ ## master #263 +/- ## ========================================== + Coverage 96.48% 96.49% +0.01% ========================================== Files 8 8 Lines 655 657 +2 ========================================== + Hits 632 634 +2 Misses 23 23 ```

:umbrella: View full report in Codecov by Sentry.
:loudspeaker: Have feedback on the report? Share it here.

matthieugomez commented 5 months ago

Thanks. the contract of `partial_out' is to return a dataframe with as many rows as the initial one, as long as align = true. Could you add a check that, when drop_singletons = true is specified, the function returns an error unless align is set to false? Thanks

matthieugomez commented 5 months ago

Another thing is that adding a fourth argument to the function return is breaking. I don't mind, i could tag a new version, but at this point, it'd be better to create a return type for partial_out.

droodman commented 3 months ago

I understand you want to be careful in changing the interface of partial_out() in a breaking way. But I'm not sure what "contract" means here. This package is not fully documented; in particular, partial_out() is not documented. And in general I think a mature API will have a functional interface so the caller need not interact directly with, and make assumptions about, the structure of the return object. E.g., as far as I know the StatsAPI() interface is all functional.

So I would propose adding new return values onto the end of the currently returned tuple, to be minimally breaking; documenting this function an and any others that are conceived of as making contractual commitments; and implementing the commitments through functions.

matthieugomez commented 2 months ago

partial_out is documented — have a look at all the documentation info at the top of the file.