py-econometrics / pyfixest

Fast High-Dimensional Fixed Effects Regression in Python following fixest-syntax
https://py-econometrics.github.io/pyfixest/
MIT License

Add default IV Diagnostics to `pf.summary()` and `pf.etable()` #559

Open s3alfisc opened 2 months ago

s3alfisc commented 2 months ago

Context

With #556, @Jayhyung has implemented support for the effective F statistic (Olea & Pflueger) as well as a "standard" first-stage F test.

We should include these statistics by default in the output produced by pf.summary() and pf.etable().

For example, r-fixest returns the following information:

library("fixest")
data("SchoolingReturns", package = "ivreg")

m_iv <- feols(log(wage) ~ ethnicity + smsa + south |
                education ~ nearcollege ,
              data = SchoolingReturns)

summary(m_iv)
# TSLS estimation, Dep. Var.: log(wage), Endo.: education, Instr.: nearcollege
# Second stage: Dep. Var.: log(wage)
# Observations: 3,010 
# Standard-errors: IID 
#                Estimate Std. Error   t value   Pr(>|t|)    
# (Intercept)    4.478879   0.792769  5.649666 1.7579e-08 ***
# fit_education  0.134516   0.060379  2.227857 2.5964e-02 *  
# ethnicityafam -0.053741   0.091175 -0.589425 5.5562e-01    
# smsayes        0.062516   0.061017  1.024571 3.0565e-01    
# southyes      -0.082268   0.035854 -2.294503 2.1830e-02 *  
# ---
# Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
# RMSE: 0.471246   Adj. R2: -0.129401
# F-test (1st stage), education: stat = 9.60092, p = 0.001963, on 1 and 3,005 DoF.
# Wu-Hausman: stat = 3.96929, p = 0.046428, on 1 and 3,004 DoF.

etable(m_iv)
#                              m_iv
# Dependent Var.:         log(wage)
# 
# Constant        4.479*** (0.7928)
# education        0.1345* (0.0604)
# ethnicityafam    -0.0537 (0.0912)
# smsayes           0.0625 (0.0610)
# southyes        -0.0823* (0.0358)
# _______________ _________________
# S.E. type                     IID
# Observations                3,010
# R2                       -0.12790
# Adj. R2                  -0.12940
# ---
# Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

To Do

Add the first-stage F statistic and the effective F statistic to the output of pf.summary() and pf.etable().
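
As a rough illustration of what the pf.summary() footer could look like, here is a minimal sketch that formats a first-stage F test in the same style as the fixest output above. The helper name format_first_stage_f and its arguments are hypothetical and not part of pyfixest. The numbers in the usage line are copied from the R output in this issue.

```python
def format_first_stage_f(endog: str, stat: float, pvalue: float, df1: int, df2: int) -> str:
    """Format a first-stage F test as one footer line (hypothetical helper).

    The layout mimics the line that fixest's summary() prints for IV models;
    it is not taken from pyfixest itself.
    """
    return (
        f"F-test (1st stage), {endog}: stat = {stat:.5f}, "
        f"p = {pvalue:.6f}, on {df1:,} and {df2:,} DoF."
    )


# Values copied from the fixest summary shown in this issue.
line = format_first_stage_f("education", 9.60092, 0.001963, 1, 3005)
print(line)
```

A line for the effective F statistic could be built the same way, with its own label.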

s3alfisc commented 2 months ago

Something else that could be cool: adding an include_first_stage argument so that instead of having to type

%load_ext autoreload
%autoreload 2

import pyfixest as pf
data = pf.get_data()

fit = pf.feols("Y ~ X2 + f2  | X1 ~ Z1", data = data, vcov = "iid")
pf.etable([fit._model_1st_stage, fit])

users could just call

%load_ext autoreload
%autoreload 2

import pyfixest as pf
data = pf.get_data()

fit = pf.feols("Y ~ X2 + f2  | X1 ~ Z1", data = data, vcov = "iid")
pf.etable([fit], include_first_stage=True)  # proposed argument, not implemented yet

and get identical output.
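
One possible way to implement this, sketched under assumptions: a small helper expands the model list before it is handed to pf.etable(). The helper name expand_with_first_stage is hypothetical. The only pyfixest detail it relies on is the _model_1st_stage attribute used in the snippet above. Models without that attribute (for example, plain OLS fits) are passed through unchanged.

```python
from typing import Any, List


def expand_with_first_stage(models: List[Any], include_first_stage: bool = False) -> List[Any]:
    """Return the models to tabulate (hypothetical helper, not part of pyfixest).

    With include_first_stage=True, every IV model is preceded by its
    first-stage model, mirroring pf.etable([fit._model_1st_stage, fit]).
    """
    if not include_first_stage:
        return list(models)

    expanded = []
    for model in models:
        # Assumption: IV fits expose the first stage as `_model_1st_stage`,
        # as in the snippet above; other fits lack it or set it to None.
        first_stage = getattr(model, "_model_1st_stage", None)
        if first_stage is not None:
            expanded.append(first_stage)
        expanded.append(model)
    return expanded
```

Inside pf.etable() this would amount to calling expand_with_first_stage(models, include_first_stage) before building the table, so that pf.etable([fit], include_first_stage=True) tabulates the same models as pf.etable([fit._model_1st_stage, fit]).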