I am replicating a paper and need to run a simple OLS regression with fixed effects. To do that, I am using the PanelOLS class from linearmodels. I'm estimating the effect of treatment on child mortality. Here is the raw data: AEJ2018_child_mortality_computation.zip
Here is what I do on the dataset:
import pandas as pd
from linearmodels.panel import PanelOLS

data_2 = pd.read_stata("AEJ2018_child_mortality_computation.dta")
Collapsing to the sum:
data_2 = data_2.groupby(['villageid', 'branchid', 'treatment'], as_index = True)[['death_under5','count_month_u5', 'death_under1', 'count_month_u1','death_under1m','count_month_u1m']].sum().reset_index()
Generating variable of interest:
data_2['count_month_u5'] = data_2.apply(lambda row: row.count_month_u5/12, axis = 1)
data_2['mrate_u5'] = (data_2['death_under5']/data_2['count_month_u5'])*1000
Indexing:
data_2 = data_2.set_index(['villageid', 'branchid'], drop = False)
Model:
model = PanelOLS(data_2.mrate_u5, data_2.treatment, entity_effects = True, drop_absorbed=True)
res = model.fit(cov_type = 'clustered', cluster_entity = True)
print(res)
When fitting the model, Python returns an error:
ZeroDivisionError Traceback (most recent call last)
1 model = PanelOLS(data_2.mrate_u5, data_2.treatment, entity_effects = True, drop_absorbed=True)
----> 2 res = model.fit(cov_type = 'clustered', cluster_entity = True)
3 #print(res)
4 res
~\anaconda3\lib\site-packages\linearmodels\panel\model.py in fit(self, use_lsdv, use_lsmr, low_memory, cov_type, debiased, auto_df, count_effects, **cov_config)
1722 mu = 0
1723 total_ss = float((y - mu).T @ (y - mu))
-> 1724 r2 = 1 - resid_ss / total_ss
1725
1726 root_w = np.sqrt(self.weights.values2d)
ZeroDivisionError: float division by zero
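One possible reading of this traceback, offered as a guess rather than a diagnosis: PanelOLS takes the entity from the first index level (villageid here), and if the groupby above leaves a single row per village, subtracting each entity's own mean sets every value to zero, so total_ss is 0. A minimal sketch of that arithmetic in plain Python, with made-up numbers; the labels, the rates and the helper within_total_ss are all hypothetical and are not part of linearmodels:

```python
from collections import defaultdict


def within_total_ss(entities, y):
    """Sum of squares of y after subtracting each entity's own mean."""
    groups = defaultdict(list)
    for entity, value in zip(entities, y):
        groups[entity].append(value)
    means = {entity: sum(vals) / len(vals) for entity, vals in groups.items()}
    return sum((value - means[entity]) ** 2 for entity, value in zip(entities, y))


# Hypothetical mortality rates, one per village (not the paper's data).
rates = [12.0, 30.5, 8.25, 19.0]

# One observation per entity: each value equals its own group mean.
one_row_per_village = within_total_ss(["v1", "v2", "v3", "v4"], rates)

# Two villages per branch: variation within each group survives demeaning.
grouped_by_branch = within_total_ss(["b1", "b1", "b2", "b2"], rates)

print(one_row_per_village)
print(grouped_by_branch)
```

With one row per entity the first value is zero, which would make `1 - resid_ss / total_ss` divide by zero; grouping at a coarser level leaves a positive sum of squares.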
At the same time, without fitting, i.e.:
`model = PanelOLS(data_2.mrate_u5, data_2.treatment, entity_effects = True, drop_absorbed=True)
print(res)`
everything works properly and produces valid regression output.
Equivalently, in R:
`library(plm)
model <- plm(mrate_u5~ treatment,
data = df, index = c("branchid"),
model = 'within')
summary(model)`
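Note that the R call indexes on branchid alone, so its fixed effects sit at the branch level, whereas the Python index above starts with villageid. A rough sketch of what a branch-level within estimator computes, in plain Python with made-up numbers; within_slope, demean_by_group and the values are hypothetical illustrations, not the paper's data and not the plm or linearmodels implementation:

```python
from collections import defaultdict


def demean_by_group(groups, values):
    """Subtract each group's mean from its members."""
    buckets = defaultdict(list)
    for group, value in zip(groups, values):
        buckets[group].append(value)
    means = {group: sum(vals) / len(vals) for group, vals in buckets.items()}
    return [value - means[group] for group, value in zip(groups, values)]


def within_slope(groups, y, x):
    """Single-regressor OLS slope after group demeaning (fixed effects)."""
    y_d = demean_by_group(groups, y)
    x_d = demean_by_group(groups, x)
    return sum(a * b for a, b in zip(x_d, y_d)) / sum(a * a for a in x_d)


# Hypothetical villages: two per branch, one treated and one control.
branch = ["b1", "b1", "b2", "b2"]
treatment = [0, 1, 0, 1]
mrate = [20.0, 15.0, 30.0, 26.0]

slope = within_slope(branch, mrate, treatment)
print(slope)
```

If this reading is right, the closer linearmodels analogue would put branchid in the first index level, since PanelOLS reads the entity from the outer level. That is an assumption and has not been checked against the actual dataset.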
I can manage without fitting, but I'm curious what I've done wrong this time. I've used _PanelOLS_ with fitting for other regressions and it worked fine. Thanks!
Edit: for attribution, here is the link to where I got all the data, which was provided by the authors of the paper I'm replicating: https://www.openicpsr.org/openicpsr/project/116355/version/V1/view
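Edit 2: after restarting the kernel, the version without fitting doesn't work either (presumably `res` was left over from an earlier run, since that snippet never defines it).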