py-econometrics / pyfixest

Fast High-Dimensional Fixed Effects Regression in Python following fixest-syntax
https://py-econometrics.github.io/pyfixest/pyfixest.html
MIT License
117 stars 27 forks source link

Add 1st stage regression in Feiv class #525

Closed Jayhyung closed 3 days ago

Jayhyung commented 1 week ago

Hi Alex @s3alfisc! More details with this PR will follow soon.

codecov[bot] commented 1 week ago

Codecov Report

All modified and coverable lines are covered by tests :white_check_mark:

Files Coverage Δ
pyfixest/estimation/FixestMulti_.py 82.54% <ø> (ø)
pyfixest/estimation/feiv_.py 97.61% <100.00%> (ø)
tests/test_iv.py 100.00% <100.00%> (ø)

... and 27 files with indirect coverage changes

s3alfisc commented 1 week ago

Very cool! Looking forward to it =)

Jayhyung commented 1 week ago

There are a few concerns that I have working on this issue.

  1. Does feiv class in the codebase allow multiple endogenous independent variables? I'm asking this question because the IV test cases in the test code(line 97 ~ 117 in test_vs_fixest.py) only allow the single case. The first stage in my code only allows the single case, but if you would like to generalize the code that allows the multiple cases, I'm happy to work more on this!
  2. I defined several attributes within feiv class related with the first stage regression(_pi_hat : estimated coefficient, X_hat : predicted values, _v_hat : residuals). Please, let me know if you want me to define more/less attributes related with the first stage.
  3. I would add a test code that evaluates whether the result(e.g. _pi_hat) of the first stage regression is the same with the result from feols. Do you think this is a reasonable way to test the 1st stage part of the get_fit method in feiv class? Also, where do i have to add the test code in test_vs_fixest.py exactly?

Thanks!

s3alfisc commented 6 days ago

Hi @Jayhyung - sorry for my delayed response, I have been swamped for work over tethe last days :/

Does feiv class in the codebase allow multiple endogenous independent variables? I'm asking this question because the IV test cases in the test code(line 97 ~ 117 in test_vs_fixest.py) only allow the single case. The first stage in my code only allows the single case, but if you would like to generalize the code that allows the multiple cases, I'm happy to work more on this!

Yes, only one endogenous variable is allowed. It would be quite cool to allow for more, but this would require to rework the FixestFormulaParser and it might not be a small endeavor. So for now, I would recommend to maybe keep this as a separate issue?

I defined several attributes within feiv class related with the first stage regression(_pi_hat : estimated coefficient, X_hat : predicted values, _v_hat : residuals). Please, let me know if you want me to define more/less attributes related with the first stage.

For now, I think that coefficient, predicted values and residuals is all that we need, right? So this sounds good to me =)

I would add a test code that evaluates whether the result(e.g. _pi_hat) of the first stage regression is the same with the result from feols. Do you think this is a reasonable way to test the 1st stage part of the get_fit method in feiv class? Also, where do i have to add the test code in test_vs_fixest.py exactly?

Sounds like a good strategy =) I think that you could just create a new test file, e.g. test_iv.py?

Jayhyung commented 6 days ago

Thanks for the feedback! I will work on my next iteration. Stay tuned. :)

Jayhyung commented 6 days ago

Just added the error testing file! Please let me know if anything extra should be done.

Jayhyung commented 5 days ago

I added endogvar as you suggested to the next iteration. Let me know if we need anything extra to be done!

Jayhyung commented 4 days ago

Thanks! Just fixed the formatting issue. Let me know if anything comes up!

s3alfisc commented 3 days ago

Perfect. It's merged =) thanks @Jayhyung!