Multilabel classification methods give the same results

DariuszMajerek commented 8 months ago

Contribution guidelines

[X] I've read the contribution guidelines.
[X] The documentation does not mention anything about my problem.
[X] There are no open or closed issues that are related to my problem.

Description

I tried to compare three methods of multilabel classification with Random Forest. I wanted to check which would perform best: MultiOutputClassifier, ClassifierChain, or the native multilabel RandomForestClassifier. To my surprise, all the results were identical. What is going wrong? When I run the same calculations with sklearn directly, I get different results. Could you help me?

test.pdf
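
For reference, the three-way comparison described above can be sketched in plain scikit-learn, without ATOM. This is a minimal sketch under stated assumptions: the dataset mirrors the make_multilabel_classification call used later in this thread, while the train/test split, the n_estimators value, and the micro-averaged F1 metric are illustrative choices and not taken from the attached PDF.

```python
from sklearn.datasets import make_multilabel_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import f1_score
from sklearn.model_selection import train_test_split
from sklearn.multioutput import ClassifierChain, MultiOutputClassifier

# Synthetic multilabel data (same generator call as later in the thread).
X, y = make_multilabel_classification(n_samples=300, n_classes=3, random_state=1)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=1
)


def make_rf():
    # Fresh forest per strategy; hyperparameters are an assumption.
    return RandomForestClassifier(n_estimators=50, random_state=1)


# Three distinct multilabel strategies built on the same base learner.
models = {
    "native": make_rf(),                          # one forest, all labels at once
    "multioutput": MultiOutputClassifier(make_rf()),  # one forest per label
    "chain": ClassifierChain(make_rf()),          # each label sees earlier predictions
}

scores = {}
for name, model in models.items():
    model.fit(X_train, y_train)
    # Cast to int because ClassifierChain returns a float array.
    pred = model.predict(X_test).astype(int)
    scores[name] = f1_score(y_test, pred, average="micro", zero_division=0)

for name, score in scores.items():
    print(f"{name}: micro-F1 = {score:.4f}")
```

Because the three objects are genuinely different estimators here, their scores are not expected to coincide in general, which is the behaviour the report contrasts with the identical ATOM results.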

Expected behaviour

No response

Actual behaviour

No response

Steps to reproduce

No response

Python and package version

Python: import sys; sys.version
ATOM: import atom; atom.__version__

tvdboom commented 8 months ago

Which version of atom are you using? I believe that functionality was deprecated in 5.1.0. I see that the documentation was not updated accordingly, sorry about that. In the latest version, the multioutput meta-estimator is assigned by default, and setting atom.multioutput = ... has no effect. Identical results therefore make sense, because you are using the same estimator each time (you can check this by printing atom.rf.estimator). You can either downgrade to the previous version or pass the three estimators directly to the run method, which also keeps all three models in the same atom instance:

atom.run(["RF", MultiOutputClassifier(RandomForestClassifier()), ClassifierChain(RandomForestClassifier())])

dax44 commented 8 months ago

My version is 5.2.0. Unfortunately, your example doesn't work for me. When I run your command, I get:

Training ========================= >>
Models: RF, MOC, CC
Metric: average_precision

Results for RandomForest:
Fit ---------------------------------------------
Train evaluation --> average_precision: 1.0
Test evaluation --> average_precision: 0.6468
Time elapsed: 0.155s
-------------------------------------------------
Total time: 0.155s

Results for MultiOutputClassifier:
Fit ---------------------------------------------

Exception encountered while running the MOC model.
TypeError: MultiOutputClassifier.__init__() got an unexpected keyword argument 'estimator__bootstrap'

Results for ClassifierChain:
Fit ---------------------------------------------

Exception encountered while running the CC model.
TypeError: _BaseChain.__init__() got an unexpected keyword argument 'base_estimator__bootstrap'

Final results ==================== >>
Total time: 0.160s
-------------------------------------
RandomForest --> average_precision: 0.6468 ~
Consecutive runs of model RF. The former model has been overwritten.
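
The two TypeErrors above have the shape of a nested parameter name (estimator__bootstrap) reaching a wrapper's constructor. The following sketch reproduces that kind of error in plain scikit-learn and shows that the same nested name is accepted by set_params. It only illustrates the scikit-learn behaviour and is not a statement about how ATOM builds its models internally.

```python
from sklearn.ensemble import RandomForestClassifier
from sklearn.multioutput import MultiOutputClassifier

# Nested "<component>__<param>" names are not constructor arguments,
# so passing one to __init__ raises a TypeError.
try:
    MultiOutputClassifier(RandomForestClassifier(), estimator__bootstrap=True)
    constructor_error = None
except TypeError as exc:
    constructor_error = str(exc)

print(constructor_error)

# The same nested name is valid for set_params, which routes the value
# to the wrapped RandomForestClassifier.
wrapper = MultiOutputClassifier(RandomForestClassifier())
wrapper.set_params(estimator__bootstrap=False)
inner_bootstrap = wrapper.get_params()["estimator__bootstrap"]

print(inner_bootstrap)
```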

tvdboom commented 8 months ago

I made a mistake. You have to specify in the custom model that the class doesn't need a multilabel wrapper.

from sklearn.datasets import make_multilabel_classification
from sklearn.multioutput import ClassifierChain, MultiOutputClassifier
from sklearn.ensemble import RandomForestClassifier
from atom import ATOMClassifier, ATOMModel

X, y = make_multilabel_classification(n_samples=300, n_classes=3, random_state=1)

atom = ATOMClassifier(X, y=y, verbose=2, random_state=1)

chain = ATOMModel(ClassifierChain(RandomForestClassifier()), native_multilabel=True)
multi = ATOMModel(MultiOutputClassifier(RandomForestClassifier()), native_multilabel=True)

atom.run(["rf", chain, multi])

dax44 commented 8 months ago

Thanks for the quick reply. Unfortunately, this still doesn't work. ATOMModel has no native_multilabel parameter. I get the following error:

TypeError: ATOMModel() got an unexpected keyword argument 'native_multilabel'

tvdboom commented 8 months ago

You are right. That functionality is part of the development branch and has not been released yet. The dev branch also contains a fix for the error you showed before (TypeError: _BaseChain.__init__() got an unexpected keyword argument 'base_estimator__bootstrap').

You can install atom from that branch using pip install git+https://github.com/tvdboom/ATOM.git@development. Then it should work.

dax44 commented 8 months ago

Yes, it works :) Thank you for your help.
