kirane61 opened this issue 4 months ago (status: Open)
The list of classes that you have is missing the number 8. You said you labeled them from 0 to 9, but there is also a 10 in the list. Are the classes correctly labeled before you pass them into TPOT?
If they are, one possibility is that the cross-validation split doesn't have enough samples of each class to distribute across the folds, leaving some folds without a class that xgboost expects.
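Both questions above can be checked quickly before calling TPOT. Here is a minimal sketch using only the standard library; `check_labels` and `n_splits` are hypothetical names for illustration, and the sample `y` just repeats the class list from the error message. It assumes integer labels and reports which labels in `0..n_classes-1` are missing, which labels fall outside that range, and which classes have fewer samples than the number of folds.

```python
from collections import Counter


def check_labels(y, n_splits=5):
    """Return (missing, unexpected, rare) for integer class labels in y.

    missing:    labels in 0..n_classes-1 that never occur
    unexpected: labels that occur but fall outside 0..n_classes-1
    rare:       classes with fewer samples than the number of CV folds
    """
    counts = Counter(int(label) for label in y)
    expected = set(range(len(counts)))
    present = set(counts)
    missing = sorted(expected - present)
    unexpected = sorted(present - expected)
    rare = {label: n for label, n in sorted(counts.items()) if n < n_splits}
    return missing, unexpected, rare


# Illustrative data only: the class list from the error, six samples per class.
y = [0, 1, 2, 3, 4, 5, 6, 7, 9, 10] * 6
missing, unexpected, rare = check_labels(y, n_splits=5)
print("missing:", missing)
print("unexpected:", unexpected)
print("rare:", rare)
```

If `missing` or `unexpected` is non-empty, the labels are not a contiguous `0..n_classes-1` range. If `rare` is non-empty, some folds may end up without those classes.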
Context of the issue
I'm encountering an issue while using TPOTClassifier with XGBoost where I receive a ValueError indicating invalid classes inferred from the unique values of y. The expected classes are [0 1 2 3 4 5 6 7 8 9], but the classes I have are [0 1 2 3 4 5 6 7 9 10]. Despite using stratified K-fold splits for cross-validation, one of the classes is missing.
I am using TPOT version 0.12.1. The dataset has 10 classes. Since XGBoost requires the labels to be encoded, I have labeled them from 0 to 9. When I pass this data to TPOT, I get the ValueError described above. It appears only with somewhat larger datasets; with a smaller number of rows (~2k), it works fine.
Is there any possible workaround to overcome this issue?
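One possible workaround, assuming the root cause is that the labels are not a contiguous range, is to re-encode them to `0..n_classes-1` before fitting and map predictions back afterwards. Below is a minimal sketch in plain Python; `encode_labels` is a hypothetical helper, and the sample labels are illustrative. scikit-learn's `LabelEncoder` performs the same kind of mapping via `fit_transform` and `inverse_transform`.

```python
def encode_labels(y):
    """Map arbitrary integer labels to contiguous codes 0..n_classes-1.

    Returns the encoded labels and the sorted list of original classes,
    so that classes[code] recovers the original label.
    """
    classes = sorted(set(int(label) for label in y))
    to_code = {label: code for code, label in enumerate(classes)}
    return [to_code[int(label)] for label in y], classes


# Illustrative data only: labels with a gap at 8, as in the error message.
y_raw = [0, 1, 2, 3, 4, 5, 6, 7, 9, 10, 10, 9]
y_encoded, classes = encode_labels(y_raw)
print(sorted(set(y_encoded)))  # contiguous codes to pass to the classifier
print(classes)                 # original labels, indexed by code

# Map encoded values (for example, predictions) back to the original labels.
original = [classes[code] for code in y_encoded]
```

This does not help if a class is genuinely absent from a training fold, so it is worth checking the per-class sample counts as well.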