Open winston-zillow opened 1 year ago
Note: `CausalRandomForestRegressor` with `standard_mse` predicts fine on the same data.
More debug info: one of the trained trees seems to be bad:
print(rforest2.estimators_[10].feature_importances_)
=> [ 0. nan 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0.]
After these trees are removed, the predictions are no longer `inf`, but some can still be extreme negative values (-3e+13).
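A minimal sketch of how such trees could be located, assuming each fitted tree exposes `feature_importances_` as in the `print` call above. The helper name `split_trees_by_importance` and the stand-in tree objects are hypothetical and only illustrate the check; they are not part of causalml.

```python
import math
from types import SimpleNamespace


def split_trees_by_importance(estimators):
    """Return (good_trees, bad_indices), where a tree counts as bad if any
    of its feature importances is nan or inf."""
    good_trees, bad_indices = [], []
    for i, tree in enumerate(estimators):
        if all(math.isfinite(float(v)) for v in tree.feature_importances_):
            good_trees.append(tree)
        else:
            bad_indices.append(i)
    return good_trees, bad_indices


# Hypothetical stand-ins for fitted trees; with a real forest one would
# pass rforest2.estimators_ instead.
trees = [
    SimpleNamespace(feature_importances_=[0.5, 0.5, 0.0]),
    SimpleNamespace(feature_importances_=[0.0, float("nan"), 0.0]),
]
good_trees, bad_indices = split_trees_by_importance(trees)
print(bad_indices)
```

Whether the forest can safely predict after its `estimators_` list is replaced with the filtered trees is an assumption that should be checked against the installed causalml version.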
Hi, thanks for the report. The issue has been fixed recently in https://github.com/uber/causalml/pull/583.
Please reinstall the package from source.
You can also generate the desired type of synthetic data by changing the `mode` parameter:
y, X, w, tau, b, e = synthetic_data(mode=1, n=10000, p=20, sigma=5.0)
In causal_trees_with_synthetic_data.ipynb you will get the following result:
Thanks. Reinstalling from source fixes the problem!
This still happens with my real-world data. Some predictions come out as `nan` (rather than `inf`). Maybe there is still an issue?
Hi. Could you please plot each tree from your fitted `CausalRandomForestRegressor` using `plot_causal_tree` in `causalml.inference.tree.plot` and attach the images?
You can also attach a small dataset that reproduces the `nan` issue.
Hi, I encounter the same `nan` issue when predicting with `CausalRandomForestRegressor`.
With 'causal_mse' the nan ratio is around 10%. 'standard_mse' is better, but still gives around 2% nan.
BTW, it seems `plot_causal_tree` only works for `CausalTreeRegressor`, not for `CausalRandomForestRegressor`.
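For comparing criteria, the nan ratio quoted above can be measured with a small helper. This is only a sketch: `non_finite_ratio` is a hypothetical name, and it assumes the predictions form a flat sequence of floats (for example a 1-D array returned by `predict`).

```python
import math


def non_finite_ratio(predictions):
    """Fraction of predictions that are nan or +/-inf (0.0 for empty input)."""
    values = [float(v) for v in predictions]
    if not values:
        return 0.0
    bad = sum(1 for v in values if not math.isfinite(v))
    return bad / len(values)


# Made-up values purely for illustration, not output from any model.
example_preds = [0.3, float("nan"), -1.2, float("inf")]
ratio = non_finite_ratio(example_preds)
print(ratio)
```

Running it once per criterion on the same data gives directly comparable numbers.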
Describe the bug
After training the `CausalRandomForestRegressor` with criterion `causal_mse` on data with nuisance, many of the predicted ITE values are `inf`.
To Reproduce
I changed the causal trees with synthetic data notebook to use data generated by `simulate_nuisance_and_easy_treatment`. After training the `CausalRandomForestRegressor` with criterion `causal_mse` using the same code, many of the predicted ITE values are `inf`.
This is the case even if I change the nuisance to something simpler:
Expected behavior
The model should predict valid values.
Environment (please complete the following information):
pandas==1.5.2, scikit-learn==1.0.2