Open londumas opened 3 years ago
@londumas do some samples in the query_instance have missing values? I think it is OK if dice-ml doesn't handle this scenario. Shouldn't the user worry about supplying a legitimate value for every column in query_instance? How would dice-ml go about generating CFs if the value in a column is not known a priori? We should probably raise an exception stating that the query_instance has missing values, rather than erroring out during generation of counterfactuals.
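Such a check could look roughly like the sketch below. It assumes query_instance is a pandas DataFrame; `validate_query_instance` is a hypothetical helper name used for illustration, not an existing dice-ml function.

```python
import pandas as pd


def validate_query_instance(query_instance: pd.DataFrame) -> None:
    """Hypothetical helper: raise ValueError if any column holds a missing value."""
    # Collect the names of all columns that contain at least one NaN/None.
    missing = [
        col for col in query_instance.columns
        if query_instance[col].isna().any()
    ]
    if missing:
        raise ValueError(
            f"query_instance has missing values in columns: {missing}"
        )


# Example: a Titanic-style row where 'Age' is unknown.
query = pd.DataFrame({"Age": [float("nan")], "Pclass": [3], "Sex": ["male"]})
try:
    validate_query_instance(query)
    message = None
except ValueError as err:
    message = str(err)
```

Failing early like this would at least give the user a clear message before counterfactual generation starts.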
Regards,
@gaugup I think a missing value is a legitimate value. Packages such as xgboost and lightgbm support features with missing values. I think DiCE should also handle missing values without having to impute them.
I came across this issue as well. In many cases missing values are very informative (just as much as "real" values), and since DiCE can handle models that accept missing values (such as LightGBM, XGBoost, CatBoost), it would be great if it were capable of handling missing values.
There does not currently seem to be support for missing values. For example, dealing with the 'Age' feature in the Titanic dataset: