autogluon / autogluon

Fast and Accurate ML in 3 Lines of Code
https://auto.gluon.ai/
Apache License 2.0
7.79k stars, 910 forks

[BUG] TabularPredictor for multiclass classification is not working when fitting with preset 'interpretable' #3841

Closed dreyes17 closed 3 months ago

dreyes17 commented 9 months ago

Describe the bug

When TabularPredictor is created with problem_type='multiclass' and fit with presets='interpretable', fitting fails with the following error: AssertionError: Unknown y_pred_proba format for problem_type="multiclass".

Expected behavior

Either fitting should complete successfully, or a clear error message should tell the user that the 'interpretable' preset is not available for problem_type='multiclass'.
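
The second option, failing early with a readable message, could look roughly like the sketch below. `check_preset_supported` and `SUPPORTED_PROBLEM_TYPES` are hypothetical names, not part of AutoGluon, and the idea that 'interpretable' covers only 'binary' and 'regression' is an assumption made for illustration, not something confirmed in this report.

```python
# Hypothetical pre-flight check; none of these names exist in AutoGluon.
# ASSUMPTION: the 'interpretable' preset handles only these problem types.
SUPPORTED_PROBLEM_TYPES = {
    "interpretable": {"binary", "regression"},
}


def check_preset_supported(presets: str, problem_type: str) -> None:
    """Raise a readable error if the preset is known not to cover problem_type."""
    supported = SUPPORTED_PROBLEM_TYPES.get(presets)
    # Presets without an entry in the table are treated as unrestricted.
    if supported is not None and problem_type not in supported:
        raise ValueError(
            f"presets={presets!r} is not available for "
            f"problem_type={problem_type!r}; supported: {sorted(supported)}"
        )
```

Calling `check_preset_supported('interpretable', 'multiclass')` before fitting would then raise a ValueError that names the preset and the problem type, instead of the AssertionError about y_pred_proba.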

To Reproduce

```python
from autogluon.tabular import TabularPredictor, TabularDataset
import pandas as pd

N_SUBSAMPLE = 500  # subsample datasets for faster demo
N_TEST = 50

train_data = TabularDataset('https://autogluon.s3.amazonaws.com/datasets/Inc/train.csv')  # can be local CSV file as well, returns Pandas DataFrame
train_data = train_data.sample(N_SUBSAMPLE, random_state=0)
test_data = TabularDataset('https://autogluon.s3.amazonaws.com/datasets/Inc/test.csv')  # another Pandas DataFrame
test_data = test_data.sample(N_TEST, random_state=0)

label = 'relationship'

y_train = train_data[label]
y_test = test_data[label]
X_train = pd.DataFrame(train_data.drop(columns=[label]))
X_test = pd.DataFrame(test_data.drop(columns=[label]))

# Fails with: AssertionError: Unknown y_pred_proba format for problem_type="multiclass".
predictor_multi = TabularPredictor(label=label, problem_type='multiclass').fit(train_data, presets='interpretable', time_limit=20)
```
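
Until the failure is resolved, one possible stopgap is to retry with another preset when the AssertionError is raised. The sketch below is only an illustration: `fit_with_fallback` is a hypothetical helper, not an AutoGluon function, and it takes the fitting call as a plain callable so the retry logic stays independent of the library.

```python
from typing import Any, Callable, Sequence, Tuple


def fit_with_fallback(
    fit_fn: Callable[[str], Any], presets_order: Sequence[str]
) -> Tuple[str, Any]:
    """Try each preset in order; return (preset_used, fitted_object).

    Hypothetical helper. Only AssertionError is caught, because that is
    the error type reported above; any other exception propagates.
    """
    last_error = None
    for presets in presets_order:
        try:
            return presets, fit_fn(presets)
        except AssertionError as err:
            last_error = err  # remember the failure and try the next preset
    raise RuntimeError(
        f"no preset in {list(presets_order)} could be fitted"
    ) from last_error


# Usage with the reproduction above (requires autogluon; not executed here):
# used, predictor_multi = fit_with_fallback(
#     lambda p: TabularPredictor(label=label, problem_type='multiclass')
#         .fit(train_data, presets=p, time_limit=20),
#     ['interpretable', 'medium_quality'],
# )
```

Catching AssertionError this broadly can hide unrelated failures, so this is a workaround for experimentation rather than a fix.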

Screenshots / Logs

Installed Versions

```
INSTALLED VERSIONS
------------------
date                   : 2023-12-29
time                   : 14:02:43.069604
python                 : 3.9.6.final.0
OS                     : Darwin
OS-release             : 23.1.0
Version                : Darwin Kernel Version 23.1.0: Mon Oct 9 21:28:12 PDT 2023; root:xnu-10002.41.9~6/RELEASE_ARM64_T8103
machine                : arm64
processor              : arm
num_cores              : 8
cpu_ram_mb             : 8192.0
cuda version           : None
num_gpus               : 0
gpu_ram_mb             : []
avail_disk_size_mb     : 7850

accelerate             : 0.21.0
async-timeout          : 4.0.3
autogluon              : 1.0.0
autogluon.common       : 1.0.0
autogluon.core         : 1.0.0
autogluon.features     : 1.0.0
autogluon.multimodal   : 1.0.0
autogluon.tabular      : 1.0.0
autogluon.timeseries   : 1.0.0
boto3                  : 1.34.4
catboost               : 1.1.1
defusedxml             : 0.7.1
evaluate               : 0.4.1
fastai                 : 2.7.13
gluonts                : 0.14.3
hyperopt               : 0.2.7
imodels                : 1.4.1
jinja2                 : 3.1.2
joblib                 : 1.3.2
jsonschema             : 4.17.3
lightgbm               : 4.1.0
lightning              : 2.0.9.post0
matplotlib             : 3.7.1
mlforecast             : 0.10.0
networkx               : 3.2.1
nlpaug                 : 1.1.11
nltk                   : 3.8.1
nptyping               : 2.4.1
numpy                  : 1.24.2
nvidia-ml-py3          : 7.352.0
omegaconf              : 2.2.3
onnxruntime-gpu        : None
openmim                : 0.3.9
orjson                 : 3.9.10
pandas                 : 2.1.4
Pillow                 : 10.1.0
psutil                 : 5.9.4
PyMuPDF                : None
pytesseract            : 0.3.10
pytorch-lightning      : 2.0.9.post0
pytorch-metric-learning: 1.7.3
ray                    : 2.6.3
requests               : 2.28.2
scikit-image           : 0.20.0
scikit-learn           : 1.3.2
scikit-learn-intelex   : None
scipy                  : 1.9.1
seqeval                : 1.2.2
setuptools             : 60.2.0
skl2onnx               : None
statsforecast          : 1.4.0
statsmodels            : 0.14.1
tabpfn                 : None
tensorboard            : 2.15.1
text-unidecode         : 1.3
timm                   : 0.9.12
torch                  : 2.0.1
torchmetrics           : 1.1.2
torchvision            : 0.15.2
tqdm                   : 4.65.2
transformers           : 4.31.0
utilsforecast          : 0.0.10
vowpalwabbit           : None
xgboost                : 2.0.3
```

Innixma commented 3 months ago

Refer to answer in #3860