Open AlexisMignon opened 3 years ago
Hi @AlexisMignon
Great write-up, and thank you for the example. I'll find some time this upcoming week to run your code and reproduce the error to make sure I understand.
I agree with your suggestions. The short-term band-aid here is better error handling, so the user has a better idea of what went wrong. The longer-term fix would be to support unseen categories and adapt. Let me know if you have any ideas there. Otherwise I'll look into it!
Thanks, Joe
I'm having issues with this as well. Missing categories in the validation set make it impossible for me to use this library at the moment.
In some cases, when there are categorical predictors, imputation fails. I give an example here with the stochastic imputer, but my guess is that it comes from an improper encoding of categorical variables and probably affects all predictive imputers. The most probable cause is that the category encoding is not robust to unseen categories.
While it might be a deliberate choice not to handle this case directly, it would be nice to detect it and give users a clear error message.
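To make the suggestion concrete, here is a minimal sketch of the kind of check I have in mind, assuming the data are pandas DataFrames. `find_unseen_categories` and `check_unseen_categories` are hypothetical helpers, not part of autoimpute, and treating every non-numeric column as categorical is my own assumption.

```python
import pandas as pd
from pandas.api.types import is_numeric_dtype


def find_unseen_categories(train: pd.DataFrame, new: pd.DataFrame) -> dict:
    """Map each categorical column to the values in `new` that are absent from `train`.

    Hypothetical helper: non-numeric columns are assumed to be categorical.
    """
    unseen = {}
    for col in train.columns:
        # Skip numeric columns and columns missing from the new data.
        if is_numeric_dtype(train[col]) or col not in new.columns:
            continue
        known = set(train[col].dropna().unique())
        extra = set(new[col].dropna().unique()) - known
        if extra:
            unseen[col] = extra
    return unseen


def check_unseen_categories(train: pd.DataFrame, new: pd.DataFrame) -> None:
    """Raise a descriptive ValueError if `new` contains categories unseen in `train`."""
    unseen = find_unseen_categories(train, new)
    if unseen:
        details = "; ".join(
            f"{col}: {sorted(map(str, vals))}" for col, vals in unseen.items()
        )
        raise ValueError(f"Categories not seen during fit: {details}")


train = pd.DataFrame({"color": ["red", "blue", "red"], "x": [1.0, 2.0, 3.0]})
new = pd.DataFrame({"color": ["red", "green"], "x": [4.0, 5.0]})

try:
    check_unseen_categories(train, new)
    error_message = None
except ValueError as err:
    error_message = str(err)
```

A check like this could run before the data are encoded, so the user sees which column and which values are the problem instead of a failure deep inside the model.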
This is probably linked to: https://github.com/kearnz/autoimpute/issues/11#issue-411732513
The example below triggers a ValueError: