py-econometrics / pyfixest

Fast High-Dimensional Fixed Effects Regression in Python following fixest-syntax
https://py-econometrics.github.io/pyfixest/pyfixest.html
MIT License
118 stars 27 forks source link

Add Notebook on `Stargazer` and `pymarginaleffects` support to the docs #473

Open s3alfisc opened 1 month ago

s3alfisc commented 1 month ago

Add jupyter notebooks that shows how to use Stargazer and marginaleffects with pyfixest.

juanitorduz commented 3 weeks ago

Cool PR!

The error from the CI looks weird.

____________________ ERROR collecting tests/test_confint.py ____________________
ImportError while importing test module '/home/runner/work/pyfixest/pyfixest/tests/test_confint.py'.
Hint: make sure your test modules/packages have valid Python names.
Traceback:
/opt/hostedtoolcache/Python/3.12.3/x64/lib/python3.12/importlib/__init__.py:90: in import_module
    return _bootstrap._gcd_import(name[level:], package, level)
tests/test_confint.py:1: in <module>
    import doubleml as dml
../../../.cache/pypoetry/virtualenvs/pyfixest-va_uSH9_-py3.12/lib/python3.12/site-packages/doubleml/__init__.py:1: in <module>
    from pkg_resources import get_distribution
E   ModuleNotFoundError: No module named 'pkg_resources'

I have seen DoubleML does not support python 3.12 https://github.com/DoubleML/doubleml-for-py/blob/main/setup.py

Maybe see https://stackoverflow.com/questions/7446187/no-module-named-pkg-resources

s3alfisc commented 3 weeks ago

Thanks for taking a look at the error! DoubelML is only used for testing simultaneous confidence intervals - I think I should just rework the tests to only run the DoubleML implementation once and then save it to a file that the tests can read from. This way, there's less room for error!

Btw, I am very excited about Stargazer support, I think that it produces very pretty tables =)

s3alfisc commented 3 weeks ago

This PR adds an internal Stargazer class that inherits from Stargazer, but customizes the regression output for the PyFixest use case by adding information on the employed fixed effects:

import pyfixest as pf

fit = pf.feols("Y ~ X1 | csw(f1, f2)"
Stargazer(fit.to_list())

will return a Stargazer table including info on the employed fixed effects.

image

@aeturrell, are there any other utility functions you'd like to see? E.g. we could add methods to easily add randomization inference / bootstrap p-values, but given that Stargazer.add_lines() is so convenient to use, I wonder if it is already overkill?

s3alfisc commented 3 weeks ago

Juan (@juanitorduz ) - is it considered bad practice to exclude dev dependencies from the pyproject.toml? If no, I would just drop doubleml from the dependencies. For testing against doubleml, I am now saving a static datafile in tests/data/dml_res.csv, and in principle, doubleml no longer has to be available when running the tests. But excluding it would mean that effectively, users couldn't replicate the entire test suite end to end in the default poetry environment of pyfixest.

juanitorduz commented 3 weeks ago

Juan (@juanitorduz ) - is it considered bad practice to exclude dev dependencies from the pyproject.toml? If no, I would just drop doubleml from the dependencies. For testing against doubleml, I am now saving a static datafile in tests/data/dml_res.csv, and in principle, doubleml no longer has to be available when running the tests. But excluding it would mean that effectively, users couldn't replicate the entire test suite end to end in the default poetry environment of pyfixest.

This sounds like a hack 😄. What about just keeping it in and skipping the tests for the specific Python version using skipif as described in https://docs.pytest.org/en/stable/how-to/skipping.html#id1 ?

juanitorduz commented 3 weeks ago

@s3alfisc did you remove the ReviewNB app?

aeturrell commented 3 weeks ago

This PR adds an internal Stargazer class that inherits from Stargazer, but customizes the regression output for the PyFixest use case by adding information on the employed fixed effects:

import pyfixest as pf

fit = pf.feols("Y ~ X1 | csw(f1, f2)"
Stargazer(fit.to_list())

will return a Stargazer table including info on the employed fixed effects.

image

@aeturrell, are there any other utility functions you'd like to see? E.g. we could add methods to easily add randomization inference / bootstrap p-values, but given that Stargazer.add_lines() is so convenient to use, I wonder if it is already overkill?

Oh wow, this is so cool! The only other metrics I can think of that you sometimes see on these tables is the BIC and the pseudo or adjusted R2. Though the latter sometimes replaces R2. But agree with your instinct that if it's easy to add extra lines then it's not something I'd worry too much about.

Once this is merged in and released, let me know, and I will make sure I (eventually!) update Coding for Economists to show how stargazer and pymarginaleffects can be used with pyfixest.

s3alfisc commented 3 weeks ago

@s3alfisc did you remove the ReviewNB app?

I was wondering about this as well - I think it didn't survive the migration to py-econometrics. Will fix this after work!

s3alfisc commented 3 weeks ago

I was wondering about this as well - I think it didn't survive the migration to py-econometrics. Will fix this after work!

Yes, that was it - you can take a look at the notebooks here.

The only other metrics I can think of that you sometimes see on these tables is the BIC and the pseudo or adjusted R2.

None of these are implemented yet, unfortunately 😅 I was thinking that it could be a nice add-on if pf.Stargazer allowed to easily add custom statistics in the right place - i.e. if randomization inference has been computed, to have a method add_line_ritest() that then adds the RI p-values in the right place. But not something for this PR I think =)

juanitorduz commented 3 weeks ago

@s3alfisc This week I am swamped with work 😢 . Fel free to merge this one as it overall looks good! I can take a better look next week and open a PR if necessary :) (when ReviewNB is back XD )

s3alfisc commented 3 weeks ago

This week I am swamped with work

No worries! I agree that both should be fine =)

Btw, the notebooks are back, it was indeed the package migration.

juanitorduz commented 1 day ago

@s3alfisc Apologies for the late review! I did not know about this integration with the py-marginal effects package! Super useful! I just added some suggestions to make the content more welcoming, especially for new users who are not familiar with the topic 💪

Other than that I think we should share this to the (Python) world :D

s3alfisc commented 1 day ago

@s3alfisc Apologies for the late review! I did not know about this integration with the py-marginal effects package! Super useful! I just added some suggestions to make the content more welcoming, especially for new users who are not familiar with the topic 💪

Thanks Juan! No worries at all - last week, it was me who was really swamped with work, so I can very much relate! I'll tackle your comments throughout the week =)

Other than that I think we should share this to the (Python) world :D

Yes, we should! I thing that py-marginaleffects is completely flying under the radar for some reason. I guess that after I polish the marginaleffects notebook, I could use it to promote the package a little bit =)

juanitorduz commented 13 hours ago

Yes! PLease do so! I want to explore the package integration further and most likely expand the examples :)