estimator compatibility issues with sklearn

NeroHin commented 1 year ago

Hello, I tried to use NGBoost as a base estimator in sklearn's ensemble VotingClassifier, but it raises ValueError: The estimator NGBClassifier should be a classifier.

In the same ensemble, I also used XGBoost and LightGBM as base estimators.

When I checked their class types, I saw that they are <class 'xgboost.sklearn.XGBClassifier'> and <class 'lightgbm.sklearn.LGBMClassifier'>.

But NGBoost shows <class 'ngboost.api.NGBClassifier'>. Could anyone make the API class compatible with the sklearn estimator interface and test it?

The test code is:

from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import train_test_split
from xgboost import XGBClassifier
import lightgbm as lgb
from ngboost import NGBClassifier

from sklearn.metrics import accuracy_score
from sklearn.ensemble import VotingClassifier

X, y = load_breast_cancer(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

xgb_clf = XGBClassifier()
lgb_clf = lgb.LGBMClassifier()  # separate name so the lgb module is not shadowed
ngb_clf = NGBClassifier()

voting_clf = VotingClassifier(
    estimators=[('xgb', xgb_clf), ('lgb', lgb_clf), ('ngb', ngb_clf)],
    voting='soft').fit(X_train, y_train)

y_pred_voting = voting_clf.predict(X_test)
print(accuracy_score(y_test, y_pred_voting))

References: Estimator Error discussions, Sklearn development guideline

Thanks!

NeroHin commented 1 year ago

The same problem occurs with the sklearn ensemble stacking model: ValueError: 'final_estimator' parameter should be a classifier. Got NGBClassifier(random_state=RandomState(MT19937) at 0x7F007D5E8840)

NeroHin commented 1 year ago

With the sklearn ensemble voting classifier, setting model._estimator_type = "classifier" on the model fixes the type error. I'll try to add this attribute in a new pull request.
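For anyone wondering why that one attribute is enough, here is a minimal sketch using only the standard library. It assumes the ensemble check at the time boiled down to comparing the estimator's `_estimator_type` attribute against `"classifier"`. The helper `looks_like_classifier` and the class `ToyModel` are hypothetical stand-ins, not sklearn or NGBoost code, and newer sklearn releases may determine the estimator type differently.

```python
def looks_like_classifier(estimator):
    # Assumed, simplified version of the type check: a missing attribute
    # is treated the same as "not a classifier".
    return getattr(estimator, "_estimator_type", None) == "classifier"


class ToyModel:
    """Hypothetical estimator with fit/predict but no _estimator_type."""

    def fit(self, X, y):
        # Remember the most frequent label seen during training.
        labels = list(y)
        self.majority_ = max(set(labels), key=labels.count)
        return self

    def predict(self, X):
        # Predict the majority label for every row.
        return [self.majority_ for _ in X]


model = ToyModel()
before = looks_like_classifier(model)   # attribute is missing

model._estimator_type = "classifier"    # the workaround from this thread
after = looks_like_classifier(model)

print(before, after)  # False True
```

The model's fit and predict behaviour is untouched. Only the label that the check reads has changed, which is why this works as a stopgap until the class declares the type itself.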

stanfordmlgroup / ngboost

estimator compatibility issues with sklearn #324