py-why / EconML

ALICE (Automated Learning and Intelligence for Causation and Economics) is a Microsoft Research project aimed at applying Artificial Intelligence concepts to economic decision making. One of its goals is to build a toolkit that combines state-of-the-art machine learning techniques with econometrics in order to bring automation to complex causal inference problems. To date, the ALICE Python SDK (econml) implements orthogonal machine learning algorithms such as the double machine learning work of Chernozhukov et al. This toolkit is designed to measure the causal effect of some treatment variable(s) t on an outcome variable y, controlling for a set of features x.
https://www.microsoft.com/en-us/research/project/alice/

Bug when passing featurizer that requires more than 1 row in SparseLinearDrLearner #905

Open itamarfaran opened 1 month ago

itamarfaran commented 1 month ago

In econml/utilities.py:75 (function check_high_dimensional):

d_x = clone(featurizer, safe=False).fit_transform(X[[0], :]).shape[1]

When the featurizer is a transformer such as SplineTransformer, this line raises an error:

ValueError: Found array with 1 sample(s) (shape=(1, 1)) while a minimum of 2 is required by SplineTransformer.

From what I understand, this line runs only to infer the number of output columns, but it causes the whole fit to fail. It would also break for transformers such as OneHotEncoder, since the number of output columns depends on the data.

itamarfaran commented 1 month ago

https://github.com/py-why/EconML/pull/906