py-econometrics / duckreg

Every big regression is a small regression with weights.
MIT License
17 stars 4 forks source link

Small Optimization of HC1 VCOV Computation #8

Closed s3alfisc closed 1 month ago

s3alfisc commented 1 month ago

Avoids creating a (potentially) big diagonal matrix.

Equivalence demonstration and benchmarks:

import pyfixest as pf
import numpy as np
import time

data = pf.get_data(N = 10_000)

fit = pf.feols("Y ~ C(f1) + C(f2)", data=data)
X = fit._X
uhat = fit._u_hat.reshape(-1,1)
print(np.allclose(X.T @ np.diag(uhat.flatten()) @ X, (X * uhat).T @ X))
# True

tic = time.time()
X.T @ np.diag(uhat.flatten()) @ X
toc = time.time()
print("first run", toc - tic)
# 0.517794132232666

tic = time.time()
(X * uhat).T @ X
toc = time.time()
print("second run", toc - tic)
# 0.010312080383300781

Will not matter for this package for 99% of use cases, but help with the occasional 1% =)

apoorvalal commented 1 month ago

this broke SE estimation on my end btw (i initially merged because pr checkout didn't work and but then reverted); i'll take a look at what might be going on later.

apoorvalal commented 1 month ago

this prompted me to add an SE test; maybe rebase on main and test against that

s3alfisc commented 1 month ago

Oh wow, sorry about that! Should have tested more thoroughly. Will rebase and test against the new set of tests. Thanks!