AxeldeRomblay / MLBox

MLBox is a powerful Automated Machine Learning python library.
https://mlbox.readthedocs.io/en/latest/

How to force classification ? #87

Closed brunosez closed 4 years ago

brunosez commented 4 years ago

Hi, on this simple dataset the target type is detected as float. I tried to force it to int during preprocessing, but that had no effect. Do you have an idea? Thanks, Bruno

https://www.kaggle.com/c/learn-together/data

train[['Cover_Type']] = train[['Cover_Type']].astype(int)
train.to_csv("../input/train2.csv",index=False)

AxeldeRomblay commented 4 years ago

Hello @brunosez, could you please re-send me the link? It does not work... Thanks!

brunosez commented 4 years ago

Hello Axel, here it is: https://www.kaggle.com/c/learn-together Have a nice day!

brunosez commented 4 years ago

In fact, this is the same issue as 2 years ago:

        task = "regression"
        count = y_train.nunique()

        if (count <= 2):
            task = "classification"

        else:
            if (y_train.dtype == object):
                task = "classification"

I will force the target column to be an object... Regards
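For readers following along, the heuristic quoted in this comment can be sketched as a standalone function. This is only an illustration of the snippet as quoted: `infer_task` is a hypothetical name, not an MLBox function, and the sample targets are made up.

```python
import pandas as pd


def infer_task(y_train: pd.Series) -> str:
    """Hypothetical helper mirroring the quoted snippet (not part of MLBox)."""
    task = "regression"
    count = y_train.nunique()
    if count <= 2:
        # Two or fewer distinct values: always treated as classification.
        task = "classification"
    elif y_train.dtype == object:
        # More than two values: classification only if the column is non-numeric.
        task = "classification"
    return task


# Made-up targets: seven integer classes versus the same classes as text labels.
y_int = pd.Series([1, 2, 3, 4, 5, 6, 7])
y_obj = pd.Series(["C1", "C2", "C3", "C4", "C5", "C6", "C7"], dtype=object)

print(infer_task(y_int))  # integer multi-class target falls through to regression
print(infer_task(y_obj))  # object-typed target is picked up as classification
```

Under this logic an integer target with more than two classes stays on the regression branch, which matches the behaviour reported above.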

AxeldeRomblay commented 4 years ago

Indeed, if your target looks like this: y = ["2", "1", "10", ... ,"5"], it will be treated as a regression problem (which is often relevant...). Otherwise, try to cast it like this: y = ["C2", "C1", "C10", ... ,"C5"].
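A minimal sketch of this workaround with pandas, assuming the data goes through a CSV file as in the snippet from the first comment. The column values are made up, and an in-memory buffer stands in for the file on disk. It also shows why casting to int or str alone is not enough: purely numeric labels are parsed as numbers again when the CSV is re-read.

```python
import io

import pandas as pd
from pandas.api.types import is_numeric_dtype

# Made-up sample standing in for the Kaggle training file.
train = pd.DataFrame({"Elevation": [2596, 2590, 2804], "Cover_Type": [5, 2, 1]})


def csv_round_trip(df: pd.DataFrame) -> pd.DataFrame:
    """Write df to an in-memory CSV and read it back, as a file on disk would behave."""
    buf = io.StringIO()
    df.to_csv(buf, index=False)
    buf.seek(0)
    return pd.read_csv(buf)


# Numeric-looking labels are inferred as numbers again on read.
plain = csv_round_trip(train)
print(is_numeric_dtype(plain["Cover_Type"]))

# Prefixing each label with a letter makes the column non-numeric after the round trip.
labelled = train.copy()
labelled["Cover_Type"] = "C" + train["Cover_Type"].astype(str)
prefixed = csv_round_trip(labelled)
print(is_numeric_dtype(prefixed["Cover_Type"]))
print(prefixed["Cover_Type"].tolist())
```

The prefix itself is arbitrary; any non-numeric label should have the same effect, and the predictions can be mapped back by stripping it.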