Closed tahalpara closed 7 months ago
@arima-tsukasa This PR relates to your work. Please watch this PR as long as possible.
All modified and coverable lines are covered by tests :white_check_mark:
:exclamation: No coverage uploaded for pull request base (main@cf5bcba).
Description: This PR attempts to implement Label Encoder for categorical features in multi-target classification/regression scenarios. (This is an ongoing issue.)
Changes Made: Three files were edited.
In the first case, I added logic so that whenever a categorical object is found in the features, Label Encoder is applied, ensuring that all categorical features are encoded. This solution works for the multi-target scenario, but it fails when XGBClassifier is used.
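The feature-encoding step described above can be sketched roughly as follows. This is not the code from this PR: it is a dependency-free illustration that assumes the features are held as a dict of column lists and treats any all-string column as categorical. The helper names `label_encode_column` and `encode_categorical_features` are hypothetical; in the actual change the encoding is done with Label Encoder.

```python
def label_encode_column(values):
    """Map each distinct label to an integer code, in sorted label order."""
    classes = sorted(set(values))
    mapping = {label: code for code, label in enumerate(classes)}
    return [mapping[v] for v in values], mapping


def encode_categorical_features(columns):
    """Label-encode every all-string column; copy other columns unchanged.

    `columns` is assumed to be a dict of column name -> list of values.
    """
    encoded = {}
    mappings = {}
    for name, values in columns.items():
        if values and all(isinstance(v, str) for v in values):
            encoded[name], mappings[name] = label_encode_column(values)
        else:
            encoded[name] = list(values)
    return encoded, mappings


# Hypothetical toy data: one categorical column, one numeric column.
features = {
    "color": ["red", "green", "red", "blue"],
    "size": [1.0, 2.5, 1.0, 3.0],
}
encoded, mappings = encode_categorical_features(features)
```

Keeping the per-column mapping around makes it possible to apply the same codes to test data later.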
I have created a test experiment script as below
Upon running this I get the below error
ValueError: Invalid classes inferred from unique values of `y`. Expected: [0 1 2], got [1 2 3]
The above error is specific to XGBClassifier. Upon further investigation, I found that Label Encoder does not work well with the latest version of XGBClassifier, hence the issue. In this case, the solution would be to use another encoding type, such as One Hot Encoder, when the selected model is XGBClassifier.
Below is the reference link of the version issue: https://stackoverflow.com/questions/71996617/invalid-classes-inferred-from-unique-values-of-y-expected-0-1-2-3-4-5-got
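The error message says that class labels [0 1 2] were expected and [1 2 3] were received, that is, labels numbered consecutively from 0. One possible workaround, offered here only as an option for discussion and not as what this PR does, is to remap the target values to 0-based integers before fitting and to map predictions back afterwards. The sketch below is plain Python with hypothetical helper names (`fit_label_mapping`, `encode`, `decode`); applying Label Encoder to the target column performs the same kind of remapping.

```python
def fit_label_mapping(values):
    """Build a label -> 0-based code mapping from the distinct target values."""
    classes = sorted(set(values))
    return {label: index for index, label in enumerate(classes)}


def encode(values, mapping):
    """Replace each original label by its 0-based code."""
    return [mapping[v] for v in values]


def decode(codes, mapping):
    """Map 0-based codes (e.g. model predictions) back to the original labels."""
    inverse = {index: label for label, index in mapping.items()}
    return [inverse[c] for c in codes]


# Target labels as in the error message: classes 1, 2, 3 instead of 0, 1, 2.
y = [1, 2, 3, 2, 1]
mapping = fit_label_mapping(y)
y_encoded = encode(y, mapping)           # what the classifier would be fitted on
y_restored = decode(y_encoded, mapping)  # predictions mapped back for reporting
```

In a multi-target setting, one such mapping would be needed per target column.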
In the second case, when no evaluation metric is specified, SapientML uses the F1 score as the default metric. To support multiple targets, the F1 evaluation had to change: I used a for loop to go through the individual target columns and calculate the F1 score of each. This resolved the evaluation metric error.
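A minimal sketch of such a per-target loop is shown here. It is an illustration under assumptions, not the code in this PR: the targets are assumed to be a dict of column name -> list of labels, F1 is macro-averaged within each target, and the per-target scores are combined by a plain mean. The PR does not state which averaging it uses, and all function names are hypothetical. With scikit-learn available, `sklearn.metrics.f1_score(..., average="macro")` could replace the hand-written `macro_f1`.

```python
def f1_for_label(y_true, y_pred, positive):
    """F1 score treating `positive` as the positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def macro_f1(y_true, y_pred):
    """Unweighted mean of the per-label F1 scores."""
    labels = sorted(set(y_true) | set(y_pred))
    return sum(f1_for_label(y_true, y_pred, label) for label in labels) / len(labels)


def multi_target_f1(y_true_by_target, y_pred_by_target):
    """Loop over target columns, score each one, and average the results."""
    scores = {}
    for target, y_true in y_true_by_target.items():
        scores[target] = macro_f1(y_true, y_pred_by_target[target])
    mean_score = sum(scores.values()) / len(scores)
    return scores, mean_score


# Hypothetical two-target example.
y_true = {"t1": [0, 1, 1, 0], "t2": ["a", "b", "a", "a"]}
y_pred = {"t1": [0, 1, 1, 0], "t2": ["a", "a", "a", "a"]}
scores, mean_score = multi_target_f1(y_true, y_pred)
```

Returning the per-target scores alongside the mean keeps it visible when one target is predicted much worse than the others.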
Further discussion is needed to decide on the course of action.