dswah / pyGAM

[HELP REQUESTED] Generalized Additive Models in Python
https://pygam.readthedocs.io
Apache License 2.0

Does pyGAM support multi-class classification? #196

Open zhangxz1123 opened 6 years ago

dswah commented 6 years ago

@zhangxz1123 wow you read my mind! i was just thinking about this yesterday!

no. currently pygam does not support multiclass classification.

however you can quickly extend pygam to do so.

if you have M classes, then you can train M one-vs-all models. when you predict, keep the label for the model that outputs the highest probability.

this is a little less efficient than a single model with a softmax activation, but it does about the same job.
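A minimal sketch of the one-vs-all recipe described above, not an official pyGAM feature. The helper names (`fit_one_vs_rest`, `predict_one_vs_rest`, `make_logistic_gam`) are made up for illustration, and the sketch assumes each binary model has `fit(X, y)` returning the model and `predict_proba(X)` returning one probability per sample, as `LogisticGAM` does.

```python
import numpy as np


def fit_one_vs_rest(X, y, make_model):
    """Fit one binary model per class; `make_model` returns a fresh, unfitted model."""
    y = np.asarray(y)
    classes = np.unique(y)
    models = []
    for label in classes:
        # Relabel: 1 for the current class, 0 for every other class.
        binary_target = (y == label).astype(int)
        models.append(make_model().fit(X, binary_target))
    return classes, models


def predict_one_vs_rest(X, classes, models):
    """Return, per sample, the label whose binary model gives the highest probability."""
    # Shape (n_samples, n_classes); column j holds P(class j vs. rest).
    scores = np.column_stack([np.ravel(m.predict_proba(X)) for m in models])
    return classes[np.argmax(scores, axis=1)]


def make_logistic_gam():
    # Hypothetical factory; imported lazily so the helpers above do not require pygam.
    from pygam import LogisticGAM
    return LogisticGAM()
```

With pygam installed, usage would look like `classes, models = fit_one_vs_rest(X, y, make_logistic_gam)` followed by `predict_one_vs_rest(X_new, classes, models)`. The per-class probabilities come from separately fitted models, so they are not guaranteed to sum to 1.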

jolespin commented 5 years ago

I was thinking about this as well!

Would this work? http://scikit-learn.org/stable/modules/generated/sklearn.multiclass.OneVsRestClassifier.html

from pygam import LogisticGAM
from sklearn.multiclass import OneVsRestClassifier

model__multiclass = OneVsRestClassifier(LogisticGAM())

model__multiclass
# OneVsRestClassifier(estimator=LogisticGAM(callbacks=['deviance', 'diffs', 'accuracy'],
#    fit_intercept=True, max_iter=100, terms='auto', tol=0.0001,
#    verbose=False),
#           n_jobs=1)

dswah commented 5 years ago

@jolespin oh wow, i forgot about that class!

right, does that work? i haven't tried it out.

jolespin commented 5 years ago

@dswah I couldn't get it to work:

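from pygam import LogisticGAM
from sklearn import datasets, model_selection
from sklearn.multiclass import OneVsRestClassifier  # lives in sklearn.multiclass, not sklearn.ensemble

# Assuming X_iris, y_iris are the standard scikit-learn iris data (3 classes)
X_iris, y_iris = datasets.load_iris(return_X_y=True)

base_estimator = LogisticGAM(n_splines=20)
ensemble = OneVsRestClassifier(base_estimator, n_jobs=1)
ensemble.fit(X_iris, y_iris)
model_selection.cross_val_score(ensemble, X=X_iris, y=y_iris, cv=10)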
# ---------------------------------------------------------------------------
# AttributeError                            Traceback (most recent call last)
# ~/anaconda/envs/python3/lib/python3.6/site-packages/sklearn/multiclass.py in _predict_binary(estimator, X)
#      94     try:
# ---> 95         score = np.ravel(estimator.decision_function(X))
#      96     except (AttributeError, NotImplementedError):

# ~/anaconda/envs/python3/lib/python3.6/site-packages/pygam/terms.py in __getattr__(self, name)
#     978 
# --> 979         return self._super_get(name)
#     980 

# ~/anaconda/envs/python3/lib/python3.6/site-packages/pygam/terms.py in _super_get(self, name)
#     899     def _super_get(self, name):
# --> 900         return super(MetaTermMixin, self).__getattribute__(name)
#     901 

# AttributeError: 'LogisticGAM' object has no attribute 'decision_function'

dswah commented 5 years ago

blah, well that's annoying.

@jolespin thank you for trying!

dswah commented 5 years ago

this PR (https://github.com/dswah/pyGAM/pull/213) appears to do the trick

jolespin commented 5 years ago

Should this be the default for the multiclass classification case?

dswah commented 5 years ago

@jolespin do you mean that pygam should import scikit-learn?

i think pygam should NOT import sklearn, because it is such a big library with many dependencies of its own...

but then perhaps it doesn't make sense to add the decision_function method, since it assumes that users will use sklearn?
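One way to look at the `decision_function` question is to keep it on the user's side. The sketch below is an illustration only, not what the PR does: `LogOddsAdapter` is a made-up wrapper that derives a decision score (the log-odds) from any model whose `predict_proba(X)` returns one probability per sample. It does not import sklearn, and on its own it is not a full scikit-learn estimator (it has no `get_params`/`set_params`), so it is not claimed to plug straight into `OneVsRestClassifier`.

```python
import numpy as np


class LogOddsAdapter:
    """Hypothetical wrapper that adds decision_function on top of predict_proba."""

    def __init__(self, model, eps=1e-12):
        self.model = model
        self.eps = eps  # keeps probabilities away from exactly 0 and 1

    def fit(self, X, y):
        self.model.fit(X, y)
        return self

    def predict_proba(self, X):
        # Flatten to one probability of the positive class per sample.
        return np.ravel(self.model.predict_proba(X))

    def decision_function(self, X):
        # Log-odds: positive when P(class) > 0.5, negative when below.
        p = np.clip(self.predict_proba(X), self.eps, 1 - self.eps)
        return np.log(p / (1 - p))
```

Because the log-odds is a monotone function of the probability, ranking models by this score picks the same class as ranking them by probability.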

kra268 commented 10 months ago

Since this is still open: is it a reasonable approach to split the target manually into 'n' one-vs-all binary targets and fit 'n' GAMs? I understand this can be tedious for a large number of classes. But, for example, if you have 3 classes = [1,2,3], you can fit 3 GAMs, where GAM 'i' is a binary classifier of class 'i' (=1) vs. the rest (=0). This assumes that splitting your target is simple enough. You could do this manually instead of using OneVsRestClassifier from sklearn.
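For the 3-class example, the manual target split could look like the sketch below. The label array `y` is made up for illustration; each resulting 0/1 array would then be the target for one binary GAM.

```python
import numpy as np

y = np.array([1, 2, 3, 2, 1, 3])  # made-up labels for illustration

# One binary target per class: 1 where y equals the class, 0 for the rest.
binary_targets = {label: (y == label).astype(int) for label in np.unique(y)}

for label, target in binary_targets.items():
    print(label, target.tolist())
# 1 [1, 0, 0, 0, 1, 0]
# 2 [0, 1, 0, 1, 0, 0]
# 3 [0, 0, 1, 0, 0, 1]
```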