tlamadon / pytwoway

Two way models in python
MIT License

Saving and displaying of individual and firm fixed effects #2

Closed: nreigl closed this issue 3 years ago

nreigl commented 3 years ago

Hi,

I was wondering whether there is a way to obtain individual fixed effects. My understanding is that the function tw.FEEstimator.get_fe_estimates(), described in the documentation, provides this functionality. Could you provide a working example of how to obtain the individual fixed effects with that function? I have tried to use the example from the user documentation to supply the necessary arguments, but I do not understand what the positional argument of that function refers to.

Minimal working example

import pytwoway as tw
import pandas as pd

df = pd.read_csv("twoway_sample_data.csv")
# Create TwoWay object
tw_net = tw.TwoWay(df)
# Clean data
tw_net.prep_data()

fe_params = {
    'ncore': 1, # Number of cores to use
    'batch': 1, # Batch size to send in parallel
    'ndraw_pii': 50, # Number of draws to use in approximation for leverages
    'levfile': '', # File to load precomputed leverages
    'ndraw_tr': 5, # Number of draws to use in approximation for traces
    'he': True, # If True, compute heteroskedastic correction
    'out': 'res_fe.json', # Outputfile where results are saved
    'statsonly': False, # If True, return only basic statistics
    'Q': 'cov(alpha, psi)' # Which Q matrix to consider. Options include 'cov(alpha, psi)' and 'cov(psi_t, psi_{t+1})'
}

fe_est = tw.FEEstimator(df, params=fe_params)

fe_est is a pytwoway.fe.FEEstimator object. How do I apply the function tw.FEEstimator.get_fe_estimates() on that object?
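
On the positional-argument question: in Python, a method looked up on the class (as in tw.FEEstimator.get_fe_estimates()) expects the instance as its first positional argument, conventionally named self, whereas calling the method on an instance supplies it automatically. The sketch below illustrates this with a made-up ToyEstimator class. It is not pytwoway code, and it assumes get_fe_estimates is an ordinary instance method.

```python
class ToyEstimator:
    """Hypothetical stand-in for an estimator class; not part of pytwoway."""

    def __init__(self, values):
        self.values = list(values)

    def get_estimates(self):
        # `self` is the instance the method is called on
        return [2 * v for v in self.values]


est = ToyEstimator([1, 2, 3])

# Calling on the instance: Python passes `est` as `self` automatically
from_instance = est.get_estimates()

# Calling on the class: the instance must be passed explicitly
from_class = ToyEstimator.get_estimates(est)

# Calling on the class without an instance raises a TypeError
try:
    ToyEstimator.get_estimates()
    missing_self_error = None
except TypeError as exc:
    missing_self_error = str(exc)
```

Under that assumption, the call would be written on the fitted object itself (fe_est.get_fe_estimates()) rather than on the class.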

adamoppenheimer commented 3 years ago

Here is a working example that builds on your code. I turned off the HE correction, since it doesn't alter the fixed effect estimates. I also advise you to update bipartitepandas and pytwoway again: I added a small feature that speeds up runtimes for collapsed data by collapsing the data before computing the connected set, and I removed a pointless warning that appeared when some columns were dropped.

import pytwoway as tw
import pandas as pd

df = pd.read_csv("twoway_sample_data.csv")
# Create TwoWay object
tw_net = tw.TwoWay(df)
# Clean data
tw_net.prep_data()

fe_params = {
    'ncore': 1, # Number of cores to use
    'batch': 1, # Batch size to send in parallel
    'ndraw_pii': 50, # Number of draws to use in approximation for leverages
    'levfile': '', # File to load precomputed leverages
    'ndraw_tr': 5, # Number of draws to use in approximation for traces
    'he': False, # If True, compute heteroskedastic correction
    'out': 'res_fe.json', # Outputfile where results are saved
    'statsonly': False, # If True, return only basic statistics
    'Q': 'cov(alpha, psi)' # Which Q matrix to consider. Options include 'cov(alpha, psi)' and 'cov(psi_t, psi_{t+1})'
}

# Notice that I don't use the original dataframe, I use the bipartite dataframe stored inside the TwoWay object
fe_solver = tw.FEEstimator(tw_net.data, params=fe_params)

# We need to fit the FE model first
fe_solver.fit_1()
fe_solver.construct_Q()
fe_solver.fit_2()

# Add columns with estimated parameters
hat_params = fe_solver.get_fe_estimates()
tw_net.data['alpha_hat'] = tw_net.data['i'].map(hat_params[1])
tw_net.data['psi_hat'] = tw_net.data['j'].map(hat_params[0])

Now your individual and firm fixed effects will be stored in the tw_net.data dataframe.
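
To see what the two .map calls do in isolation, here is a minimal pandas-only sketch. The toy dataframe and the estimate values are made up, and it assumes that hat_params[0] and hat_params[1] behave like dictionaries keyed by firm id and worker id, as the .map calls above imply.

```python
import pandas as pd

# Toy stand-in for tw_net.data: worker id 'i', firm id 'j', outcome 'y'
data = pd.DataFrame({
    'i': [0, 0, 1, 2],
    'j': [0, 1, 1, 0],
    'y': [1.0, 1.5, 2.0, 0.5],
})

# Made-up estimates keyed by id (assumed shape of the returned objects)
psi_hat = {0: 0.1, 1: 0.4}            # firm id -> firm effect
alpha_hat = {0: 0.9, 1: 1.6, 2: 0.4}  # worker id -> worker effect

# Look up each row's worker and firm id in the corresponding mapping
data['alpha_hat'] = data['i'].map(alpha_hat)
data['psi_hat'] = data['j'].map(psi_hat)
```

Every row for the same worker gets the same alpha_hat, and every row for the same firm gets the same psi_hat. Ids that are missing from a mapping come out as NaN.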

adamoppenheimer commented 3 years ago

I just uploaded a new version to PyPI that runs only the fixed-effects step of the estimator when the fixed effects are all you want. It also automatically adds the fixed effects as columns. This is the new code you can run:

import pytwoway as tw
import pandas as pd

df = pd.read_csv("twoway_sample_data.csv")
# Create TwoWay object
tw_net = tw.TwoWay(df)
# Clean data
tw_net.prep_data()

fe_params = {
    'ncore': 1, # Number of cores to use
    'batch': 1, # Batch size to send in parallel
    'ndraw_pii': 50, # Number of draws to use in approximation for leverages
    'levfile': '', # File to load precomputed leverages
    'ndraw_tr': 5, # Number of draws to use in approximation for traces
    'he': False, # If True, compute heteroskedastic correction
    'out': 'res_fe.json', # Outputfile where results are saved
    'statsonly': False, # If True, return only basic statistics
    'feonly': True, # If True, compute only fixed effects and not variances
    'Q': 'cov(alpha, psi)' # Which Q matrix to consider. Options include 'cov(alpha, psi)' and 'cov(psi_t, psi_{t+1})'
}

# Since we set 'feonly': True, we just run the estimator normally and it
# only estimates the fixed effects to save time
tw_net.fit_fe(fe_params)

# Now look at the data
new_data = tw_net.data
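
Since the issue is also about saving and displaying the effects, the following pandas-only sketch shows one way to reduce such a dataframe to one row per worker and one row per firm and export it. The toy data is made up, and the column names alpha_hat and psi_hat are an assumption carried over from the earlier example. Check new_data.columns for the names your version actually adds.

```python
import pandas as pd

# Toy stand-in for new_data, with the effect columns already attached
new_data = pd.DataFrame({
    'i': [0, 0, 1, 2],
    'j': [0, 1, 1, 0],
    'alpha_hat': [0.9, 0.9, 1.6, 0.4],  # assumed column name
    'psi_hat': [0.1, 0.4, 0.4, 0.1],    # assumed column name
})

# One row per worker and one row per firm
worker_fe = (
    new_data[['i', 'alpha_hat']]
    .drop_duplicates()
    .sort_values('i')
    .reset_index(drop=True)
)
firm_fe = (
    new_data[['j', 'psi_hat']]
    .drop_duplicates()
    .sort_values('j')
    .reset_index(drop=True)
)

# Display the tables
print(worker_fe)
print(firm_fe)

# With no path, to_csv returns the CSV text; pass a filename
# (for example 'worker_fe.csv') to write to disk instead
worker_csv = worker_fe.to_csv(index=False)
firm_csv = firm_fe.to_csv(index=False)
```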

nreigl commented 3 years ago

Great. That is what I needed.

I am not 100% sure I have the econometrics right here, but is it possible to extract individual effects for the CRE estimator as well? I suppose that is what the tw.cre.CREEstimator() function is designed for.