Open apoorvalal opened 1 month ago
To some degree this is already currently possible, but not documented as a) I am not directly testing the API and b) because it has some quirks and is far from user friendly.
But yes, in principle, there is a Feols()
class that only works on numpy arrays. In the current design, you would have to do the "demeaning" outside of Feols
, but this could easily be changed. The Feiv
and Fepois
classes inherit from Feols
.
This is how it works under the hood at the moment:
In a first step, we load all required functions and classes and create some data:
import numpy as np
from pyfixest.estimation.feols_ import Feols
from pyfixest.estimation.demean_ import demean
from pyfixest.estimation.feols_ import _get_vcov_type
N = 1000
X = np.random.normal(0, 1, N*2).reshape(N, 2)
f = np.random.choice(range(100), N, True).reshape(N, 1)
Y = X @ np.array([1,1]).reshape((2,1)) + f + np.random.normal(0, 1, N).reshape(N, 1)
# currently required inputs to Feols, should all be optional
weights = np.ones(N)
coefnames = ['x1', 'x2']
collin_tol = 1e-10
weights_name = None
weights_type = "fweights"
Now we need to demean Y
and X
:
# weights needs to be 1d for demean ...
Y_d, _ = demean(Y, f, weights)
X_d, _ = demean(X, f, weights)
# weights needs to be 2d for Feols ...
weights = weights.reshape((N,1))
We can then pass the demeaned data (and a lot of additional arguments) to the Feols
class:
Fit = Feols(Y = Y_d, X = X_d, weights = weights, coefnames = coefnames, collin_tol = 1e-10, weights_name = weights_name, weights_type = weights_type)
Fit
# <pyfixest.estimation.feols_.Feols at 0x1af06c7ae90>
With this at hand, we can fit the model via the get_fit()
method:
Fit.get_fit()
Fit._beta_hat
# array([0.9968034 , 1.01047411])
We can compute the vcov matrix and get CIs:
vcov_type = _get_vcov_type(vcov = "iid", fval = "f1")
fit.vcov(vcov = vcov_type)
fit.get_inference()
and can finally call .tidy()
:
fit.tidy()
I think I'd have to implement the following changes to really promote users to rely on Feols
:
Feols
except for Y
and X
optionalFeols
(this already happens for Fepois
)Feols
Feols
internal methods, e.g. get_fit()
to fit()
.np.arrays
as arguments instead of column names, e.g. for ritest()
, it should be possible to either provide the resampvar as string, or a resampvar array. All of this sounds certainly doable. Given that you've basically implemented the Gelbach-D already, I think I might be able to tackle this soon =)
Would be good to have a fixest analogue of
sm.OLS
that lets one use raw matrices and sidestep the need to construct data frames. This would not do the current (admittedly quite user-friendly but not very OOP) thing of fitting under the hood when initialized withpf.feols
but would instead have modular .fit(), .summary() etc methods.Is this currently possible and not documented?