tlamadon / pytwoway

Two way models in python
MIT License

Retrieving the firm and worker identifiers (version 0.2.6) #10

Closed santiagohermo closed 2 years ago

santiagohermo commented 2 years ago

This is a similar question to #5, but for an updated version.

My goal is to extract the estimated fixed effects (FEs) from pytwoway and link them back to the original IDs. Following https://github.com/tlamadon/pytwoway/issues/2#issuecomment-901500838, I managed to extract the estimated FEs by setting 'attach_fe_estimates': True in the parameters of FEEstimator. However, I am having trouble linking them back to my original worker and firm IDs, because Bipartite resets the index when using the clean method (as you can see in the example here, after cell 5).

Is there a way to keep the original IDs in the Bipartite object so that I can do this? Thanks!

adamoppenheimer commented 2 years ago

Hi Santiago,

As in issue #5, you still want to keep track of id changes. However, the parameter has been renamed to track_id_changes=True (previously include_id_reference_dict=True); you pass it when you initialize your BipartitePandas DataFrame. Please make sure your BipartitePandas is version 1.0.19 or above so that the parameter name is correct.

Then, as you mentioned, set 'attach_fe_estimates': True. If you are interested only in the estimated fixed effects, I would suggest setting 'feonly': True as well, as this ensures that only the OLS is run.
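A quick way to check the installed version is sketched below. The helper names parse_version and meets_minimum are made up for this example, and the comparison assumes a plain numeric version string such as 1.0.19; anything else falls through to a manual check.

```python
from importlib.metadata import PackageNotFoundError, version


def parse_version(text):
    """Turn a plain 'X.Y.Z' string into a tuple of ints, e.g. '1.0.19' -> (1, 0, 19)."""
    return tuple(int(part) for part in text.split("."))


def meets_minimum(installed, minimum="1.0.19"):
    """Return True if `installed` is at least `minimum` (plain numeric versions only)."""
    return parse_version(installed) >= parse_version(minimum)


try:
    installed = version("bipartitepandas")
    if meets_minimum(installed):
        print(installed, "is recent enough")
    else:
        print(installed, "is older than 1.0.19; consider upgrading")
except PackageNotFoundError:
    print("bipartitepandas is not installed")
except ValueError:
    # Raised by int() for suffixes such as 'rc1' or 'dev0'.
    print("unrecognised version string; compare by hand")
```

Comparing tuples of integers rather than raw strings matters here: as strings, '1.0.9' would sort after '1.0.19'.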

Once the estimation is done, run df = bdf.original_ids(). This returns a Pandas DataFrame (not a BipartitePandas DataFrame) with your original ids.

Here is some example code:

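import bipartitepandas as bpd
import pytwoway as tw

# Simulate data and track id changes so the original ids can be restored later
clean_params = bpd.clean_params({'connectedness': 'connected'})
bdf = bpd.BipartiteDataFrame(bpd.SimBipartite().simulate(), track_id_changes=True)
bdf = bdf.clean(clean_params).collapse()

# Run only the FE OLS and attach the estimated fixed effects to the data
fe_params = tw.fe_params({'feonly': True, 'attach_fe_estimates': True})
fe_estimator = tw.FEEstimator(bdf, fe_params)
fe_estimator.fit()

# Pandas DataFrame (not a BipartitePandas DataFrame) with the original ids
df = bdf.original_ids()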
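
For intuition about why id changes have to be tracked, here is a minimal pandas-only sketch of the idea. It does not use bipartitepandas or pytwoway and is not how either library is implemented; the data, the fixed-effect values, and the column names (alpha_hat, psi_hat, original_i, original_j) are all made up for illustration.

```python
import pandas as pd

# Hypothetical raw data: original worker ids ('i') and firm ids ('j').
raw = pd.DataFrame({
    "i": [1001, 1001, 2047, 3310],
    "j": ["F-9", "F-2", "F-2", "F-9"],
})

# Relabel ids as contiguous integers (0, 1, 2, ...), as a cleaning step
# typically does, and keep the lookup from new id back to original id.
new_i, orig_i = pd.factorize(raw["i"])
new_j, orig_j = pd.factorize(raw["j"])
clean = pd.DataFrame({"i": new_i, "j": new_j})

# Made-up fixed-effect estimates, one per relabelled worker and firm.
alpha_hat = pd.Series([0.3, -0.1, 0.5])  # position = new worker id
psi_hat = pd.Series([0.2, -0.2])         # position = new firm id
clean["alpha_hat"] = clean["i"].map(alpha_hat)
clean["psi_hat"] = clean["j"].map(psi_hat)

# Restore the original ids by indexing the lookups with the new ids.
restored = clean.copy()
restored["original_i"] = orig_i.to_numpy()[clean["i"].to_numpy()]
restored["original_j"] = orig_j.to_numpy()[clean["j"].to_numpy()]
print(restored)
```

Without the saved lookups (orig_i and orig_j here), the relabelled integers alone could not be mapped back to the original ids, which is why id tracking has to be switched on before cleaning.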

To make this clearer for anyone who does not see this issue, I have also updated the FE notebook in the documentation (available here) to explain how to run the FE OLS and restore the original firm and worker ids.

Best, Adam