Open DariuszMajerek opened 8 months ago
What version of atom are you using? That functionality was deprecated in 5.1.0, I believe. I see that the documentation was not updated accordingly, sorry for that. In the latest version the multioutput meta-estimator is assigned by default, so doing atom.multioutput = ... doesn't do anything. The identical results therefore make sense: you are using the same estimator every time (you can check this by printing atom.rf.estimator). You can either downgrade to the previous version or pass the three estimators directly to the run method (that way you also have all three models in the same atom instance).
atom.run(["RF", MultiOutputClassifier(RandomForestClassifier()), ClassifierChain(RandomForestClassifier())])
My version is 5.2.0. Unfortunately your example doesn't work for me. When I use your command, I get:
Training ========================= >>
Models: RF, MOC, CC
Metric: average_precision
Results for RandomForest:
Fit ---------------------------------------------
Train evaluation --> average_precision: 1.0
Test evaluation --> average_precision: 0.6468
Time elapsed: 0.155s
-------------------------------------------------
Total time: 0.155s
Results for MultiOutputClassifier:
Fit ---------------------------------------------
Exception encountered while running the MOC model.
TypeError: MultiOutputClassifier.__init__() got an unexpected keyword argument 'estimator__bootstrap'
Results for ClassifierChain:
Fit ---------------------------------------------
Exception encountered while running the CC model.
TypeError: _BaseChain.__init__() got an unexpected keyword argument 'base_estimator__bootstrap'
Final results ==================== >>
Total time: 0.160s
-------------------------------------
RandomForest --> average_precision: 0.6468 ~
Consecutive runs of model RF. The former model has been overwritten.
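The TypeError in the log above can be reproduced with scikit-learn alone. The sketch below is only an illustration of that message, not a description of atom's internals: it assumes (this is not confirmed anywhere in the thread) that a nested parameter such as estimator__bootstrap reached the meta-estimator's constructor instead of set_params, which is the scikit-learn method that understands the double-underscore syntax.

```python
from sklearn.ensemble import RandomForestClassifier
from sklearn.multioutput import MultiOutputClassifier

# Passing a nested parameter to the constructor fails, because __init__
# only accepts the meta-estimator's own arguments.
error_message = None
try:
    MultiOutputClassifier(RandomForestClassifier(), estimator__bootstrap=True)
except TypeError as exc:
    error_message = str(exc)
    print(error_message)

# The same nested parameter is accepted by set_params, which routes
# "estimator__bootstrap" to the wrapped RandomForestClassifier.
wrapped = MultiOutputClassifier(RandomForestClassifier())
wrapped.set_params(estimator__bootstrap=False)
print(wrapped.get_params()["estimator__bootstrap"])
```

If the assumption holds, the failure lies in how the parameters are forwarded, not in the wrapped estimators themselves.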
I made a mistake. You have to specify in the custom model that the class doesn't need a multilabel wrapper.
from sklearn.datasets import make_multilabel_classification
from sklearn.multioutput import ClassifierChain, MultiOutputClassifier
from sklearn.ensemble import RandomForestClassifier
from atom import ATOMClassifier, ATOMModel
X, y = make_multilabel_classification(n_samples=300, n_classes=3, random_state=1)
atom = ATOMClassifier(X, y=y, verbose=2, random_state=1)
chain = ATOMModel(ClassifierChain(RandomForestClassifier()), native_multilabel=True)
multi = ATOMModel(MultiOutputClassifier(RandomForestClassifier()), native_multilabel=True)
atom.run(["rf", chain, multi])
Thanks for the quick reply. Unfortunately this still doesn't work. There is no native_multilabel parameter in ATOMModel. I get the following error:
TypeError: ATOMModel() got an unexpected keyword argument 'native_multilabel'
You are right. That's functionality of the development branch, which is not yet released. The dev branch also contains a fix for the error you showed before (TypeError: _BaseChain.__init__() got an unexpected keyword argument 'base_estimator__bootstrap'). You can install atom from that branch using pip install git+https://github.com/tvdboom/ATOM.git@development. Then it should work.
Yes, it works :) Thank you for your help.
Description
I tried to compare three methods of multilabel classification with Random Forest. I wanted to check which method performs best: MultiOutputClassifier, ClassifierChain or the native multilabel RandomForestClassifier. To my surprise, all the results were identical. What is wrong, given that I get different results when I do the same calculations using sklearn? Could you help me?
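For reference, the sklearn-only comparison described here could look like the minimal sketch below. It is not the code from the attached test.pdf. The dataset parameters are borrowed from the example posted elsewhere in this thread; the 80/20 split, the macro-averaged average_precision metric and the helper name label_probabilities are assumptions made for illustration.

```python
import numpy as np
from sklearn.datasets import make_multilabel_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import average_precision_score
from sklearn.model_selection import train_test_split
from sklearn.multioutput import ClassifierChain, MultiOutputClassifier

X, y = make_multilabel_classification(n_samples=300, n_classes=3, random_state=1)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=1
)


def label_probabilities(model, X):
    """Return an (n_samples, n_labels) array of positive-class probabilities."""
    proba = model.predict_proba(X)
    if isinstance(proba, list):
        # Native RF and MultiOutputClassifier return one (n_samples, 2)
        # array per label; keep the column for the positive class.
        return np.column_stack([p[:, 1] for p in proba])
    # ClassifierChain already returns an (n_samples, n_labels) array.
    return np.asarray(proba)


models = {
    "native RF": RandomForestClassifier(random_state=1),
    "MultiOutputClassifier": MultiOutputClassifier(
        RandomForestClassifier(random_state=1)
    ),
    "ClassifierChain": ClassifierChain(RandomForestClassifier(random_state=1)),
}

results = {}
for name, model in models.items():
    model.fit(X_train, y_train)
    scores = label_probabilities(model, X_test)
    results[name] = average_precision_score(y_test, scores, average="macro")
    print(f"{name}: {results[name]:.4f}")
```

Three genuinely different strategies are fitted here, so identical scores across all three in another tool would suggest that the same underlying estimator was trained each time.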
test.pdf
Expected behaviour
No response
Actual behaviour
No response
Steps to reproduce
No response
Python and package version
import sys; sys.version
import atom; atom.__version__