ufunc 'isnan' not supported for the input types in DML "effect()" function

jaydeepchakraborty commented 1 year ago

Thank you for the package and such a huge effort. I am trying to run the estimation below.

The variables are:
features: ['X1', 'X2', 'X3', 'X4', 'X5']
output: ['Y']
treatment: ['T_1', 'T_2']
Here, Type is categorical and takes the values (0, 1, 2).

test_seg = dml_test_X.iloc[[2,4]]  # third and fifth rows
print(test_seg)
dml_est.effect(test_seg, T0=0, T1=1)

    X1  X2     X3     X4          X5
6   27   1  77.99  4.193  131.126667
10  60   1  76.65  3.717  223.173417

X1 - continuous
X2 - categorical
X3 - continuous
X4 - continuous
X5 - continuous
treatment - categorical
Y - continuous

ERROR: TypeError: ufunc 'isnan' not supported for the input types, and the inputs could not be safely coerced to any supported types according to the casting rule ''safe''

Possible reason: X2 and the treatment are categorical (pandas dtype('O')), and np.isnan() throws this error for non-numeric dtypes such as object.

Possible Solution: replace np.isnan with pd.isna, which supports category dtypes?
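A minimal sketch of the suspected cause, independent of EconML: the series `x2` is a hypothetical stand-in for the X2 column, holding integers with object dtype. It only illustrates how np.isnan and pd.isna differ on such data. It does not show where the library makes the call.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for X2: integer codes stored with object dtype.
x2 = pd.Series([1, 0, 1], dtype=object)

# np.isnan has no loop for object arrays, so it raises TypeError.
try:
    np.isnan(x2.to_numpy())
    isnan_failed = False
except TypeError as exc:
    isnan_failed = True
    print(f"np.isnan raised TypeError: {exc}")

# pd.isna accepts both object and category dtypes and returns a boolean mask.
mask_object = pd.isna(x2)
mask_category = pd.isna(x2.astype("category"))
print(mask_object.tolist(), mask_category.tolist())
```

If this matches what happens inside effect(), the error comes from the dtype of the column, not from the values in it.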

kbattocchi commented 1 year ago

If you can provide a fully self-contained repro, that would help. However, internally we're using sklearn's OneHotEncoder to transform the treatment when it is discrete, and I suspect that this failure is a known issue with how that class interacts with pandas.

jaydeepchakraborty commented 1 year ago

@kbattocchi Thank you for replying.

I have this notebook; I hope it helps. Please let me know if you need any more information.

https://github.com/jaydeepchakraborty/NLP/blob/36dc367c2d84d39830a253e8c0e9629ca997e882/CI_test.ipynb

If we convert the object-type column to an int type, then it runs.
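The workaround can be sketched roughly as follows. The frame `X` is a hypothetical miniature of the feature data, with X2 stored as object dtype. The cast assumes every value in those columns is a valid integer.

```python
import numpy as np
import pandas as pd

# Hypothetical frame shaped like the report: X2 is stored as object dtype.
X = pd.DataFrame({
    "X1": [27, 60],
    "X2": pd.Series([1, 1], dtype=object),
    "X3": [77.99, 76.65],
})

# Find the object-dtype columns and cast each one to a 64-bit integer.
object_cols = X.select_dtypes(include="object").columns
X_fixed = X.astype({col: "int64" for col in object_cols})

# All columns are numeric now, so np.isnan works on the underlying array.
nan_mask = np.isnan(X_fixed.to_numpy())
print(X_fixed.dtypes)
```

The same cast would presumably be needed for the treatment column before fitting, since it is also reported as object dtype.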

py-why / EconML, issue #745