paris-saclay-cds / ramp-workflow

Toolkit for building predictive workflows on top of pydata (pandas, scikit-learn, pytorch, keras, etc.).
https://paris-saclay-cds.github.io/ramp-docs/
BSD 3-Clause "New" or "Revised" License

"_estimator_type" Definition #260

Open BadrAlpha07 opened 3 years ago

BadrAlpha07 commented 3 years ago

While developing a new classifier class with fit, predict, and predict_proba methods, I found that it worked fine locally in Jupyter but failed in the ramp tests. The error is shown in the attached image. I checked the ramp source code and understood that it uses predict_proba for the CV, which is why it asks for a 2D array, even though my classifier's predict_proba works well and returns a 2D array. Prof. @tomMoral suggested adding estimator._estimator_type = 'classifier', which solved the problem. As an enhancement, ramp could either add this estimator type to the estimator automatically, or check the ._estimator_type attribute before running the predictions and raise an error saying that it expected ._estimator_type = 'classifier' but found..., as the scikit-learn library does.

image

agramfort commented 3 years ago

Yes, the problem is that sklearn's is_classifier function returns False for the KerasClassifier class.

You should report this issue on the Keras issue tracker.

BadrAlpha07 commented 3 years ago

Thank you for your response. I thought there might be a problem with the Keras classifier, but it works fine in ramp. In our case, I tried other tree-based classifiers as the estimator in my classifier class below, and the same error was raised. So I think the issue in this case is with the class that I built, which works fine in Jupyter.

from sklearn.base import BaseEstimator


class Classifier(BaseEstimator):
    def __init__(self, estimator):
        self.estimator = estimator

    def fit(self, X, y=None, **kwargs):
        self.estimator.fit(X, y)
        return self

    def predict(self, X, y=None):
        return self.estimator.predict(X)

    def predict_proba(self, X):
        return self.estimator.predict_proba(X)

    def score(self, X, y):
        return self.estimator.score(X, y)

Thank you again for this amazing tool ramp-workflow that helped me a lot with evaluating my models.
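
For reference, here is a minimal sketch of why a wrapper shaped like the one above is not detected as a classifier, and one possible way to declare it as such. It assumes scikit-learn is installed. The names PlainWrapper and MixinWrapper are hypothetical and only serve the illustration. Inheriting from sklearn.base.ClassifierMixin is an alternative to setting _estimator_type by hand, not necessarily the fix ramp-workflow would adopt.

```python
from sklearn.base import BaseEstimator, ClassifierMixin, is_classifier
from sklearn.tree import DecisionTreeClassifier


class PlainWrapper(BaseEstimator):
    """Hypothetical wrapper that inherits only from BaseEstimator."""

    def __init__(self, estimator):
        self.estimator = estimator

    def fit(self, X, y=None):
        self.estimator.fit(X, y)
        return self

    def predict(self, X):
        return self.estimator.predict(X)

    def predict_proba(self, X):
        return self.estimator.predict_proba(X)


class MixinWrapper(ClassifierMixin, PlainWrapper):
    """Same behaviour, but ClassifierMixin declares it as a classifier."""


plain = PlainWrapper(DecisionTreeClassifier())
mixin = MixinWrapper(DecisionTreeClassifier())

# Only the wrapper that declares itself a classifier is detected as one.
plain_detected = is_classifier(plain)
mixin_detected = is_classifier(mixin)
print(plain_detected, mixin_detected)

# Both wrappers still return a 2D probability array once fitted.
X = [[0.0], [1.0], [0.0], [1.0]]
y = [0, 1, 0, 1]
proba = mixin.fit(X, y).predict_proba(X)
print(proba.shape)
```

ClassifierMixin also provides a default accuracy-based score method, so the wrapper would not need to define its own.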

tomMoral commented 3 years ago

I asked @BadrAlpha07 to open this issue because several students had trouble understanding what was wrong with their submissions when classifiers were not detected as such.

I think a more comprehensible error (for instance, a warning alongside the error suggesting to check _estimator_type) or an automated way to deal with this would be nice.
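
The kind of early check suggested here could look roughly like the sketch below. It is not ramp-workflow's actual code. The helper name check_is_classifier and the classes Unmarked and Marked are hypothetical, and the sketch assumes the _estimator_type convention discussed in this thread.

```python
def check_is_classifier(estimator):
    """Hypothetical helper: fail early with an explicit message.

    Assumes the `_estimator_type` convention discussed in this thread.
    """
    found = getattr(estimator, "_estimator_type", None)
    if found != "classifier":
        raise TypeError(
            f"{type(estimator).__name__} is not recognised as a classifier: "
            f"expected _estimator_type == 'classifier' but found {found!r}. "
            "Inheriting from sklearn.base.ClassifierMixin, or setting "
            "_estimator_type = 'classifier', is one way to declare it."
        )
    return estimator


class Unmarked:
    """Stand-in for a submission that never declares its estimator type."""

    def predict_proba(self, X):
        return [[0.5, 0.5] for _ in X]


class Marked(Unmarked):
    # Explicit declaration, as in the workaround from this thread
    _estimator_type = "classifier"


marked = Marked()
checked = check_is_classifier(marked)  # passes and returns the estimator

try:
    check_is_classifier(Unmarked())
except TypeError as exc:
    message = str(exc)
    print(message)
```

Raising before any prediction is attempted points the user at the missing declaration, instead of surfacing later as a complaint about array dimensions.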