AxeldeRomblay / MLBox

MLBox is a powerful Automated Machine Learning python library.
https://mlbox.readthedocs.io/en/latest/

some questions about multi-class #60

Closed: Twinkle123321 closed this issue 5 years ago

Twinkle123321 commented 6 years ago

Hi, this is such a wonderful tool. However, I wonder how it distinguishes classification from regression: is it by the type of the train label? When I run a multi-class task, the model seems to convert all the data to float64 (my train label's type is int64), so it performs a regression task, which is not correct. Should I set some parameters in the model? Thank you!

AxeldeRomblay commented 6 years ago

Hello! Thanks! Unfortunately, if your target is int64, it is treated as regression. See issue #54.

Twinkle123321 commented 6 years ago

Thank you for your reply. However, when I change the data's type to int32 or int64, it doesn't work. What type should I change it to?

brunosez commented 6 years ago

Hi Axel, following the meilleurdatascientistdefrance challenge :-) Multi-class detection does not seem to work out of the box: the line `if (y_train.nunique() <= 2): task = "classification"` in reader.py limits classification to 2 classes. For xgboost, the objective and num_class params also need to be defined. On local validation, my log_loss of 0.96 seems to be in the upper part of the LB; the winners are at 0.90. https://www.meilleurdatascientistdefrance.com/

Regards Bruno
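To illustrate the point above, here is a minimal sketch of a cardinality-based check like the one quoted from reader.py. The helper names `detect_task` and `xgb_multiclass_params` are hypothetical and not part of MLBox, and the fallback to "regression" is an assumption based on the behaviour described in this thread, not a copy of the library's code. The `objective` and `num_class` keys are standard XGBoost parameters for multi-class problems.

```python
import pandas as pd


def detect_task(y_train: pd.Series) -> str:
    """Hypothetical helper mirroring the quoted check; not MLBox's actual code."""
    if y_train.nunique() <= 2:
        return "classification"
    # Assumption: every other target falls through to regression,
    # which is the behaviour reported in this thread.
    return "regression"


def xgb_multiclass_params(y_train: pd.Series) -> dict:
    """Hypothetical helper: XGBoost settings a multi-class target would need."""
    return {
        "objective": "multi:softprob",          # per-class probabilities
        "num_class": int(y_train.nunique()),    # number of distinct labels
    }


binary = pd.Series([0, 1, 1, 0])
three_classes = pd.Series([0, 1, 2, 1, 0, 2])

print(detect_task(binary))                   # classification
print(detect_task(three_classes))            # regression
print(xgb_multiclass_params(three_classes))  # {'objective': 'multi:softprob', 'num_class': 3}
```

With a check of this shape, an integer target with three labels is never routed to classification, which would match the behaviour reported in the first comment.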

AxeldeRomblay commented 6 years ago

Nice work @brunosez! Do you know how many data scientists tried MLBox during the challenge?

AxeldeRomblay commented 5 years ago

Hello, thanks for reporting this issue. In an upcoming release, we will improve the automatic task detection and also add the option to override it in case the task is mis-detected.