tlamadon / pytwoway

Two way models in python
MIT License

Extract effects from the tw.cre.CREEstimator() function #4

Open nreigl opened 2 years ago

nreigl commented 2 years ago

I have closed issue #2, as it concerned the Fixed Effects estimator. That issue also included a side note about individual effects for the CRE estimator.

I am not 100% sure I have the econometrics right here, but is it possible to extract individual effects for the CRE estimator as well? I suppose that is what the tw.cre.CREEstimator() function is designed for.

adamoppenheimer commented 1 year ago

Hi Nicolas,

Sorry it has taken so long for me to write a response.

In the almost a year since you originally asked this, I still haven't taken a deep look into the theory of the CRE estimator, and consequently hadn't looked at the CRE code. However, I have now taken a quick scan through the code.

From my minimal understanding of how the CRE estimator works, and from what I've gathered by looking through the code, the estimator computes something like a worker fixed effect by grouping together all workers who move between the same pair of firm clusters in consecutive periods. That is, it basically does groupby(['g1', 'g2']) and removes the mean wage at the cluster level; the average over this residual is what it estimates as a sort of worker effect.
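To make the grouping step concrete, here is a minimal pandas sketch of that kind of groupby. It is not the pytwoway code: the toy data, the wage columns y1 and y2, and the variable names pair_means and resid are assumptions made up for illustration. Only the g1 and g2 column names come from the description above.

```python
import pandas as pd

# Toy mover data (hypothetical): one row per worker, with the firm cluster
# in each period (g1, g2) and the wage in each period (y1, y2).
movers = pd.DataFrame({
    "g1": [0, 0, 0, 1, 1],
    "g2": [1, 1, 2, 0, 0],
    "y1": [1.0, 3.0, 2.0, 4.0, 6.0],
    "y2": [2.0, 4.0, 5.0, 3.0, 5.0],
})

# Mean wage in each period for every (g1, g2) cluster pair.
grouped = movers.groupby(["g1", "g2"])[["y1", "y2"]]
pair_means = grouped.mean()

# Wage residual after removing the cluster-pair mean from each row.
resid = movers[["y1", "y2"]] - grouped.transform("mean")

print(pair_means)
print(resid)
```

Here pair_means has one row per g1-g2 pair, and resid sums to zero within each pair by construction.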

This average residual worker effect at the cluster level is then saved in the class attribute .between_params under the key EEm for movers (so you'll have wages for each g1-g2 pair) and under the key Em for stayers (so you'll just have wages for a single firm cluster).

In a later step, the estimator takes a second residual, this time also removing this pseudo-worker effect. It then computes, for each observation, the average wage of workers at the same firm, excluding the wages of workers who move to the same cluster (e.g. if a worker is at firm 0 and moves to cluster 2, then this mean is taken over workers at firm 0 who move to a cluster other than cluster 2). These variables are called y1m1j_lo, y2m1j_lo, y1m2j_lo, and y2m2j_lo in the code (lo for leave-out). The way the code is currently written, I don't believe these are accessible after the estimator has run.
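The leave-out mean described above can be sketched in pandas as follows. Again, this is an illustration under assumptions, not the library's implementation: the column names j1 (origin firm), g2 (destination cluster), y1 (standing in for the residual wage) and y1_lo are hypothetical, and the data are made up.

```python
import pandas as pd

# Toy data (hypothetical): origin firm, destination cluster, residual wage.
movers = pd.DataFrame({
    "j1": [0, 0, 0, 0, 1, 1],
    "g2": [1, 1, 2, 3, 2, 2],
    "y1": [1.0, 3.0, 5.0, 7.0, 2.0, 4.0],
})

firm = movers.groupby("j1")["y1"]
firm_dest = movers.groupby(["j1", "g2"])["y1"]

# Sum and count over the same firm, minus the workers who move to the
# same destination cluster as the current row.
sum_lo = firm.transform("sum") - firm_dest.transform("sum")
n_lo = firm.transform("count") - firm_dest.transform("count")

# Leave-out mean; NaN where no other-cluster movers exist at the firm.
movers["y1_lo"] = sum_lo / n_lo.where(n_lo > 0)

print(movers)
```

For the first row (firm 0, moving to cluster 1), the mean is taken over the two firm-0 workers who move to clusters 2 and 3. Firm 1 has no movers to another cluster, so its leave-out mean is left missing.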

However, I believe the main purpose of the CRE estimator is to provide an alternative estimate of var(psi) and cov(psi, alpha), not to estimate fixed effects directly. So I don't think the values that are being constructed are really meant to be used outside of this specific purpose.
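For context, these two moments are terms in the usual variance decomposition of a two-way model. The following is a sketch in standard notation, assuming log wage y for worker i at firm j(i,t), worker effect alpha, firm effect psi, and a residual epsilon uncorrelated with both effects. It is not taken from the CRE code.

```latex
y_{it} = \alpha_i + \psi_{j(i,t)} + \varepsilon_{it}

\operatorname{Var}(y) = \operatorname{Var}(\alpha) + \operatorname{Var}(\psi)
  + 2\,\operatorname{Cov}(\psi, \alpha) + \operatorname{Var}(\varepsilon)
```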

I apologize if my understanding of this is completely wrong, but I suggest you take a look through the code yourself, since it's much easier to understand than I originally expected.

Please let me know if this helps answer your question, and if you have other questions.