How to handle categorical features

cohortshapley / cohortintgrad

Integrated Gradient Cohort Shapley

MIT License

Thank you for your comments. IGCS relies only on the similarity between feature values, so when a categorical feature is encoded numerically and has few enough levels, IGCS works directly: it distinguishes the levels and treats a data point as similar only if its level matches exactly. Note, however, that in the Cohort Shapley case the results can differ depending on whether the feature is one-hot encoded or not, as shown in Section 7.1 of https://arxiv.org/abs/2205.15750 . IGCS takes its integral over the indicator space (with coordinates z in our paper), which is always a continuous variable, so even if the feature x is discrete, the integral does not reduce to a summation.
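To make the exact-matching idea concrete, here is a minimal sketch in plain NumPy. It is not the cohortintgrad API: the function names `exact_match_similarity` and `cohort_mean` and the toy data are hypothetical, chosen only to show how an integer-encoded categorical feature with a few levels can be handled by treating "similar" as "same level".

```python
import numpy as np

def exact_match_similarity(column, target_value):
    """Boolean mask: True where the encoded level equals the target's level."""
    return np.asarray(column) == target_value

def cohort_mean(y, X, target, features):
    """Mean of y over the rows that match `target` exactly on every
    feature index listed in `features` (hypothetical helper)."""
    mask = np.ones(len(X), dtype=bool)
    for j in features:
        mask &= exact_match_similarity(X[:, j], target[j])
    return y[mask].mean()

# Toy data (assumed): one categorical feature with three levels,
# integer-encoded as 0, 1, 2.
X = np.array([[0], [1], [1], [2], [0], [1]])
y = np.array([1.0, 3.0, 5.0, 10.0, 3.0, 4.0])
target = X[1]  # the target subject has level 1

# No feature constrained: the cohort is the whole data set.
grand_mean = cohort_mean(y, X, target, features=[])
# Feature 0 constrained: the cohort is only the rows with the same level.
level_mean = cohort_mean(y, X, target, features=[0])
```

With a single feature, the difference `level_mean - grand_mean` is the whole effect attributed to that feature. The sketch only covers the discrete cohort selection; it does not reproduce the integral over the indicator space that IGCS uses.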

How to handle categorical features #4