Interface with numpy arrays

To some degree this is already possible, but it is not documented because a) I am not directly testing the API and b) it has some quirks and is far from user-friendly.

But yes, in principle, there is a Feols() class that only works on numpy arrays. In the current design, you would have to do the "demeaning" outside of Feols, but this could easily be changed. The Feiv and Fepois classes inherit from Feols.

This is how it works under the hood at the moment:

In a first step, we load all required functions and classes and create some data:

import numpy as np
from pyfixest.estimation.feols_ import Feols
from pyfixest.estimation.demean_ import demean
from pyfixest.estimation.feols_ import _get_vcov_type

N = 1000
# two regressors, one fixed effect with 100 levels, true coefficients (1, 1)
X = np.random.normal(0, 1, N * 2).reshape(N, 2)
f = np.random.choice(range(100), N, True).reshape(N, 1)
Y = X @ np.array([1, 1]).reshape((2, 1)) + f + np.random.normal(0, 1, N).reshape(N, 1)

# currently required inputs to Feols, should all be optional
weights = np.ones(N)
coefnames = ['x1', 'x2']
collin_tol = 1e-10
weights_name = None
weights_type = "fweights"

Now we need to demean Y and X:

# weights needs to be 1d for demean ... 
Y_d, _ = demean(Y, f, weights)
X_d, _ = demean(X, f, weights)

# weights needs to be 2d for Feols ... 
weights = weights.reshape((N,1))
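
To make the "demeaning" step concrete, here is a minimal numpy-only sketch of what it amounts to for a single fixed effect with unit weights: subtract the group mean from every column. The helper name demean_one_way is made up for this illustration; it is not pyfixest's demean, which also handles weights and multiple fixed effects.

```python
import numpy as np

def demean_one_way(x, groups):
    """Subtract group means from each column of x (unweighted, one fixed effect)."""
    x = np.asarray(x, dtype=float)
    groups = np.asarray(groups).ravel()
    # map arbitrary group labels to integer codes 0..G-1
    _, codes = np.unique(groups, return_inverse=True)
    counts = np.bincount(codes)
    out = np.empty_like(x)
    for j in range(x.shape[1]):
        sums = np.bincount(codes, weights=x[:, j])
        out[:, j] = x[:, j] - (sums / counts)[codes]
    return out

rng = np.random.default_rng(0)
N = 1000
X = rng.normal(size=(N, 2))
f = rng.integers(0, 100, size=(N, 1))

X_d = demean_one_way(X, f)  # every group now has mean zero in each column
```

After this transformation the fixed effect no longer needs to appear in the regression, which is why the demeaned arrays can be handed to a plain OLS routine.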

We can then pass the demeaned data (and a lot of additional arguments) to the Feols class:

fit = Feols(
    Y=Y_d, X=X_d, weights=weights, coefnames=coefnames, collin_tol=collin_tol,
    weights_name=weights_name, weights_type=weights_type,
)
fit
# <pyfixest.estimation.feols_.Feols at 0x1af06c7ae90>

With this at hand, we can fit the model via the get_fit() method:

fit.get_fit()
fit._beta_hat
# array([0.9968034 , 1.01047411])
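
Why fitting on demeaned data recovers the coefficients: by the Frisch-Waugh-Lovell theorem, OLS on the demeaned arrays gives the same slopes as a regression that includes one dummy per fixed effect level. The following numpy-only sketch checks this on simulated data; it does not use pyfixest, and all names in it are chosen for the illustration.

```python
import numpy as np

rng = np.random.default_rng(42)
N, G = 1000, 100
X = rng.normal(size=(N, 2))
f = rng.permutation(np.arange(N) % G)  # every level appears exactly 10 times
Y = X @ np.array([1.0, 1.0]) + f + rng.normal(size=N)

# one-hot dummy matrix for the fixed effect, shape (N, G)
D = np.zeros((N, G))
D[np.arange(N), f] = 1.0

def residualize(A):
    """Project out the dummies, i.e. subtract group means."""
    return A - D @ np.linalg.lstsq(D, A, rcond=None)[0]

X_d, Y_d = residualize(X), residualize(Y)

# slopes from the demeaned regression ...
beta_within = np.linalg.lstsq(X_d, Y_d, rcond=None)[0]
# ... and from the regression with explicit dummies
beta_dummies = np.linalg.lstsq(np.hstack([X, D]), Y, rcond=None)[0][:2]

print(np.allclose(beta_within, beta_dummies))
```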

We can compute the vcov matrix and get CIs:

vcov_type = _get_vcov_type(vcov = "iid", fval = "f1")
fit.vcov(vcov = vcov_type)
fit.get_inference()
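
For reference, the textbook "iid" variance estimator and normal-approximation confidence intervals can be sketched in plain numpy as below. This is only an illustration of the formulas: the function name iid_inference and the n_absorbed argument are assumptions of this sketch, and pyfixest's own small-sample and degrees-of-freedom adjustments may differ.

```python
import numpy as np

def iid_inference(X_d, Y_d, n_absorbed=0, z=1.96):
    """OLS with homoskedastic vcov; n_absorbed = number of absorbed fixed effect levels."""
    X_d = np.asarray(X_d, dtype=float)
    Y_d = np.asarray(Y_d, dtype=float).reshape(-1)
    n, k = X_d.shape
    XtX_inv = np.linalg.inv(X_d.T @ X_d)
    beta = XtX_inv @ X_d.T @ Y_d
    resid = Y_d - X_d @ beta
    # residual variance with a degrees-of-freedom correction
    sigma2 = resid @ resid / (n - k - n_absorbed)
    vcov = sigma2 * XtX_inv
    se = np.sqrt(np.diag(vcov))
    ci = np.column_stack([beta - z * se, beta + z * se])  # approx. 95% intervals
    return beta, vcov, se, ci

# stand-in arrays playing the role of already demeaned data
rng = np.random.default_rng(1)
N = 500
X_d = rng.normal(size=(N, 2))
Y_d = X_d @ np.array([1.0, 1.0]) + rng.normal(size=N)

beta, vcov, se, ci = iid_inference(X_d, Y_d)
```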

and can finally call .tidy():

fit.tidy()

I think I'd have to implement the following changes before really encouraging users to rely on Feols directly:

make all arguments of Feols except for Y and X optional
add a fval argument to Feols & handle the fixed effect demeaning within instead of outside of Feols (this already happens for Fepois)
exclude functionality that depends on the availability of coef names from Feols
rename some of Feols' internal methods, e.g. get_fit() to fit().
update multiple post-estimation methods to accept numpy arrays as arguments instead of column names, e.g. for ritest(), it should be possible to provide resampvar either as a string or as an array.
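
To give a rough idea of what the first few items could look like from the user's side, here is a purely hypothetical numpy-only sketch. The class ArrayFeolsSketch, its fval and coefnames handling, and the fit() method are all assumptions made for this illustration; none of this exists in pyfixest, and it only covers one unweighted fixed effect.

```python
import numpy as np

class ArrayFeolsSketch:
    """Hypothetical array-first interface; NOT part of pyfixest."""

    def __init__(self, Y, X, fval=None, coefnames=None):
        # only Y and X are required, everything else is optional
        self.Y = np.asarray(Y, dtype=float).reshape(-1, 1)
        self.X = np.asarray(X, dtype=float)
        self.fval = None if fval is None else np.asarray(fval).reshape(-1)
        k = self.X.shape[1]
        self.coefnames = coefnames or [f"x{j + 1}" for j in range(k)]
        self.beta_hat = None

    @staticmethod
    def _demean(A, codes, counts):
        out = A.copy()
        for j in range(A.shape[1]):
            sums = np.bincount(codes, weights=A[:, j])
            out[:, j] -= (sums / counts)[codes]
        return out

    def fit(self):
        Y, X = self.Y, self.X
        if self.fval is not None:
            # demeaning happens inside the class instead of outside
            _, codes = np.unique(self.fval, return_inverse=True)
            counts = np.bincount(codes)
            Y = self._demean(Y, codes, counts)
            X = self._demean(X, codes, counts)
        self.beta_hat = np.linalg.lstsq(X, Y, rcond=None)[0].ravel()
        return self

rng = np.random.default_rng(0)
N = 1000
X = rng.normal(size=(N, 2))
f = rng.integers(0, 100, size=N)
Y = X @ np.array([1.0, 1.0]) + f + rng.normal(size=N)

model = ArrayFeolsSketch(Y, X, fval=f).fit()
print(dict(zip(model.coefnames, model.beta_hat)))
```

The point of the sketch is the call signature: two arrays plus an optional fixed effect array, with default coefficient names generated when none are given.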

All of this certainly sounds doable. Given that you've basically implemented the Gelbach-D already, I think I might be able to tackle this soon =)
