py-econometrics / pyfixest

Fast High-Dimensional Fixed Effects Regression in Python following fixest-syntax
https://py-econometrics.github.io/pyfixest/
MIT License
172 stars, 34 forks

Add benchmarks against `linearmodels` and `fastreg` #558

Closed: s3alfisc closed this issue 3 weeks ago

s3alfisc commented 3 months ago

Context

It would be great to add benchmarks against the following two Python packages: `linearmodels` and `fastreg`.

@apoorvalal has benchmarks against fastreg here, showing equal performance to pyfixest.

To Do

s3alfisc commented 3 months ago

@rafimikail would you be interested in picking this up?

rafimikail commented 3 months ago

Hi @s3alfisc, so this one basically means adding two more lines (linearmodels and fastreg) to our performance benchmarking line plots, right?

s3alfisc commented 3 months ago

Yes, exactly! Maybe best to start with one of the two packages and divide this into two PRs? Is it ok if I assign you @rafimikail?
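
For readers following along, here is a minimal sketch of what "one extra line per package" could look like as a timing harness. It is not the code in run_benchmarks.ipynb. The function name `time_estimators`, the placeholder `_dummy_fit`, and the row layout are all assumptions for illustration, and the real fit calls for pyfixest, linearmodels, and fastreg would have to replace the placeholder.

```python
import time
from statistics import median


def time_estimators(estimators, datasets, repeats=3):
    """Return one row per (package, n_obs) holding the median wall-clock time."""
    rows = []
    for n_obs, data in datasets.items():
        for name, fit in estimators.items():
            timings = []
            for _ in range(repeats):
                start = time.perf_counter()
                fit(data)
                timings.append(time.perf_counter() - start)
            rows.append({"package": name, "n_obs": n_obs, "seconds": median(timings)})
    return rows


def _dummy_fit(data):
    # Hypothetical stand-in: a real benchmark would call each package's fit routine here.
    return sum(data) / len(data)


# Each key becomes one line in the plot; each dataset size becomes one x-axis point.
estimators = {"pyfixest": _dummy_fit, "linearmodels": _dummy_fit, "fastreg": _dummy_fit}
datasets = {n: list(range(n)) for n in (1_000, 10_000)}
rows = time_estimators(estimators, datasets)
```

Keeping each package behind a callable with the same signature means a second PR only has to add one dictionary entry, which matches the suggestion to split the work by package.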

rafimikail commented 3 months ago

Certainly @s3alfisc, you can assign this to me 👍

rafimikail commented 3 months ago

Hi @s3alfisc, I wanted to confirm: to run run_benchmarks.ipynb, I think I first need to retrieve the data used in the notebook. Do I need to run data_generation.r before running the notebook, or can I just get the data from https://github.com/lrberge/fixest/tree/master/_BENCHMARK?

Thanks!

s3alfisc commented 3 months ago

Oh, I completely overlooked this. You would have to run the data generation R script first. I can also do so quickly and send you the data as a CSV?

rafimikail commented 3 months ago

Hey @s3alfisc, I tried to run the data generation R file but I am getting an error, and I still need to find out why.

But if you already have the data as a CSV, that would be helpful.

Thanks

s3alfisc commented 3 months ago

Will send it in a moment :)

marcandre259 commented 4 weeks ago

I have been looking into running the benchmark with linearmodels. Its PanelOLS estimator, which handles fixed effects efficiently, only fits the benchmark scenario with 2 fixed effects (dum1 + dum2).
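
As background on what absorbing two fixed effects such as dum1 and dum2 involves, here is a minimal pure-Python sketch of alternating demeaning followed by a regression on the residualized variables. It is only an illustration of the general idea and is not how linearmodels or pyfixest implement it. The simulated data, the `demean` helper, and the group counts are assumptions.

```python
import random
from collections import defaultdict


def demean(values, group_ids_list, tol=1e-12, max_iter=10_000):
    """Subtract group means for each fixed effect in turn until the means vanish."""
    x = list(values)
    for _ in range(max_iter):
        largest_mean = 0.0
        for group_ids in group_ids_list:
            sums, counts = defaultdict(float), defaultdict(int)
            for value, g in zip(x, group_ids):
                sums[g] += value
                counts[g] += 1
            for i, g in enumerate(group_ids):
                mean = sums[g] / counts[g]
                x[i] -= mean
                largest_mean = max(largest_mean, abs(mean))
        if largest_mean < tol:
            break
    return x


random.seed(0)
n = 500
dum1 = [random.randrange(10) for _ in range(n)]  # first fixed effect (simulated)
dum2 = [random.randrange(8) for _ in range(n)]   # second fixed effect (simulated)
effect1 = [random.gauss(0, 1) for _ in range(10)]
effect2 = [random.gauss(0, 1) for _ in range(8)]
x = [random.gauss(0, 1) for _ in range(n)]
# Noise-free outcome with a true slope of 2, so the estimate can be checked directly.
y = [2.0 * x[i] + effect1[dum1[i]] + effect2[dum2[i]] for i in range(n)]

# Residualize both variables on the two fixed effects, then run a one-regressor OLS.
x_tilde = demean(x, [dum1, dum2])
y_tilde = demean(y, [dum1, dum2])
slope = sum(a * b for a, b in zip(x_tilde, y_tilde)) / sum(a * a for a in x_tilde)
```

Because the simulated outcome has no noise, `slope` should come out very close to the true value of 2. The same loop accepts more than two lists of group ids, which is the case the two-effect scenario does not cover.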

s3alfisc commented 4 weeks ago

Hi @marcandre259, super cool that you're looking at this! As far as I understand it, linearmodels has an AbsorbingLS estimator that can run pyhdfe under the hood, which should allow for multiple fixed effects and non-panel data.