py-econometrics / pyfixest

Fast High-Dimensional Fixed Effects Regression in Python following fixest-syntax
https://py-econometrics.github.io/pyfixest/
MIT License

Coefplot: Plotting the same regression equation using different dataframes #720

Open escherpf opened 4 days ago

escherpf commented 4 days ago

I am trying to use coefplot() for a list of models that fit the same equation (with the same variable names) but on different data frames. For example, I would like to plot the coefficients of one (or two) variables for each of those models to get something like this:

example_coefplot

However, as far as I can tell, this only works if the model specification is not exactly identical across models. If it is, coefplot() generates what seems like a strange repeating pattern:

example_coefplot2

Hence, it seems like coefplot() gets confused if the models in the list use the same formula. Is it currently possible to plot the same variable from the same model specification estimated on different data frames as though they were different models?

Thank you.

s3alfisc commented 3 days ago

Thanks! The problem arises because coefplot() loops over all models and labels each one by its _model_name attribute. Since that attribute is identical for models that differ only in their estimation sample, you get the strange behavior above.

I see two options:

Option 1: You could work with the split argument:

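import pyfixest as pf

df = pf.get_data()
# Estimate the same formula separately for each level of f1 (here f1 = 1 and f1 = 2).
fit = pf.feols("Y ~ X1", split="f1", data=df[df.f1.isin([1, 2])])
# Plot only the X1 coefficient, one estimate per split sample.
fit.coefplot(coord_flip=False, keep="X1")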

image

Option 2: Alternatively, you could overwrite the _model_name attribute by hand:

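# Fit the same formula on two different subsamples.
fit1 = pf.feols("Y ~ X1", data=df[df.f1.isin([1])])
fit2 = pf.feols("Y ~ X1", data=df[df.f1.isin([2])])

# Both models carry the same name, which is what confuses coefplot().
fit1._model_name
# 'Y~X1'
fit2._model_name
# 'Y~X1'

# Make the names unique by appending the sample description.
fit1._model_name += ", sample f1 = 1"
fit2._model_name += ", sample f1 = 2"
pf.coefplot([fit1, fit2], coord_flip=False, keep="X1")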

which also produces

image
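For more than two samples, the renaming in Option 2 could be wrapped in a small helper. The following is only a minimal sketch and not part of pyfixest: label_models is a hypothetical name, and it relies solely on the private _model_name attribute shown above, which may change between pyfixest versions.

```python
def label_models(fits, labels):
    """Append a distinguishing label to each model's `_model_name`.

    `label_models` is a hypothetical helper, not a pyfixest function.
    It assumes every object in `fits` has a string `_model_name` attribute.
    """
    fits = list(fits)
    labels = list(labels)
    if len(fits) != len(labels):
        raise ValueError("need exactly one label per model")
    for fit, label in zip(fits, labels):
        # Same in-place edit as in Option 2, applied to each model in turn.
        fit._model_name = f"{fit._model_name}, {label}"
    return fits
```

With two freshly fitted models, this would presumably be used as pf.coefplot(label_models([fit1, fit2], ["sample f1 = 1", "sample f1 = 2"]), coord_flip=False, keep="X1"). Note that the helper modifies the models in place, so calling it twice on the same objects appends the label twice.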

Maybe it would be convenient to add a model_name argument to pf.coefplot() that would allow users to pass custom model names to avoid duplicates? Maybe we should even throw an error in case of duplicate model names?
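The duplicate check suggested here could look roughly like the following. This is a sketch of the idea only: check_unique_model_names is a hypothetical function, not an existing pyfixest API, and it assumes each model exposes the _model_name attribute used above.

```python
from collections import Counter


def check_unique_model_names(fits):
    """Raise a ValueError if two models share the same `_model_name`.

    Hypothetical helper illustrating the proposed check; not pyfixest code.
    """
    counts = Counter(fit._model_name for fit in fits)
    duplicates = sorted(name for name, n in counts.items() if n > 1)
    if duplicates:
        raise ValueError(
            f"Duplicate model names: {duplicates}. "
            "Rename the models so that each name is unique."
        )
```

A plotting function could run such a check on its list of models before building the plot, so that identical formulas estimated on different samples fail loudly instead of producing a confusing figure.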