Open londumas opened 3 years ago
Thanks. The error is caused by the use of LabelEncoder from sklearn, which expects strings. Having categoricals as numeric values is possible, but it raises the risk of confusion when a user does not explicitly provide the data type and actually wanted the column to be treated as numerical.
Therefore, it might be safer to pre-process the categorical columns to be non-numeric before passing them to DiCE. That said, sometimes categorical variables can be integers. Will look to support this in a future release.
Hi!!
I got the same error when running the function exp_genetic.generate_counterfactuals. However, when I use exp_random.generate_counterfactuals, I don't get this error. Can you explain why this error is only raised for the function exp_genetic.generate_counterfactuals?
Furthermore, I tried to fix the error using the comments from @amit-sharma and @londumas, but still didn't succeed in running the function. Could you provide a more detailed solution?
Thank you!!
When using a dataset with categorical data, if some of the categorical values are not strings, then the following line will raise an error.
The solution is then to convert all the data to strings with what follows.
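A minimal sketch of such a conversion, assuming the data lives in a pandas DataFrame. The helper name stringify_categoricals and the toy column names are hypothetical and not part of DiCE:

```python
import pandas as pd


def stringify_categoricals(df, categorical_columns):
    """Return a copy of df with the given columns cast to str.

    Hypothetical helper for illustration; it is not part of DiCE.
    """
    out = df.copy()
    for col in categorical_columns:
        out[col] = out[col].astype(str)
    return out


# Toy data: "gender" is categorical but stored as integers.
df = pd.DataFrame({"age": [25, 40, 31], "gender": [0, 1, 0], "income": [0, 1, 1]})

# Only the categorical column is converted; "age" and "income" stay numeric.
df_fixed = stringify_categoricals(df, ["gender"])
```

The converted frame would then be passed to dice_ml.Data in place of the original one, with only the truly numerical columns listed in continuous_features.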
One can simply reproduce this bug with the Jupyter notebook https://github.com/interpretml/DiCE/blob/master/docs/source/notebooks/DiCE_model_agnostic_CFs.ipynb by replacing the values of the binary feature gender with integers:
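A minimal sketch of such a replacement, shown on a toy frame rather than the notebook's actual dataset. The column values ("Female", "Male") and the 0/1 encoding are assumptions made for illustration:

```python
import pandas as pd

# Toy stand-in for the notebook's dataset (assumed layout, not the real data).
dataset = pd.DataFrame({
    "age": [28, 45, 36],
    "gender": ["Female", "Male", "Female"],
    "income": [0, 1, 0],
})

# Assumed encoding, chosen for illustration only.
gender_codes = {"Female": 0, "Male": 1}

# After this line, "gender" is a categorical feature stored as integers,
# which is the situation described in this issue.
dataset["gender"] = dataset["gender"].map(gender_codes)
```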