flennerhag / mlens

ML-Ensemble – high performance ensemble learning
http://ml-ensemble.com
MIT License

getting zero score accuracy on test data #136

Open jboverio opened 3 years ago

jboverio commented 3 years ago

Hi.

I am playing with ML-Ensemble on Kaggle, but I keep getting an accuracy score of 0 on submission. I can't figure out what is wrong!

import pandas as pd
from catboost import CatBoostClassifier
from mlens.ensemble import BlendEnsemble
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score
from sklearn.svm import SVC

seed = 3

def build_ensemble(proba, **kwargs):
    """Return an ensemble."""
    estimators = [
        SVC(probability=proba),
        CatBoostClassifier(iterations=300, logging_level='Silent', learning_rate=0.03, depth=5),
    ]

    ensemble = BlendEnsemble(**kwargs)
    ensemble.add(estimators, proba=proba)   # Specify 'proba' here
    ensemble.add_meta(LogisticRegression())

    return ensemble

ensemble_false = build_ensemble(proba=False)
ensemble_false.fit(X, Y)

preds_false = ensemble_false.predict(X)

print("Accuracy:\n%r" % accuracy_score(preds_false, Y))

ensemble = build_ensemble(proba=True)
ensemble.fit(X, Y)

preds = ensemble.predict(X)

print("\nAccuracy:\n%r" % accuracy_score(preds, Y))

Xtest = pd.read_csv('../input/labdata-churn-challenge-2020/test.csv')

Xtest = feat_engineering_types(Xtest)
X_test_2 = trata_na(Xtest)
X_test_3 = coloca_dummy(X_test_2)

predicted_prices_false = ensemble_false.predict(X_test_3)
my_submission_false = pd.DataFrame({'id': Xtest.id, 'Churn': predicted_prices_false})
my_submission_false.to_csv('submission.csv', index=False)

predicted_prices = ensemble.predict(X_test_3)
my_submission2 = pd.DataFrame({'id': Xtest.id, 'Churn': predicted_prices})
my_submission2.to_csv('submission2.csv', index=False)
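
One thing that may be worth ruling out (an assumption, not a confirmed cause of the zero score) is the dtype of the `Churn` column. If `predict` returns floats such as `1.0`, the CSV will contain `1.0` rather than `1`, and a grader that compares labels exactly could count every row as wrong. The sketch below uses a made-up prediction array and made-up ids in place of the real ensemble output. `to_int_labels` is a hypothetical helper, not part of mlens or scikit-learn.

```python
import numpy as np
import pandas as pd


def to_int_labels(preds):
    """Convert float class predictions (e.g. 0.0/1.0) to integer labels."""
    preds = np.asarray(preds).ravel()
    # Refuse to cast probabilities by accident: every value must already be whole.
    if not np.allclose(preds, np.round(preds)):
        raise ValueError("predictions are not whole-number class labels")
    return np.round(preds).astype(int)


# Stand-in for the output of ensemble.predict(...); the float32 dtype is an assumption.
example_preds = np.array([0.0, 1.0, 1.0, 0.0], dtype=np.float32)
example_ids = [101, 102, 103, 104]  # hypothetical ids

submission = pd.DataFrame({"id": example_ids, "Churn": to_int_labels(example_preds)})
csv_text = submission.to_csv(index=False)
print(csv_text)
```

If the competition expects labels in some other form (for example strings), the same check applies: compare the dtype and values of the `Churn` column against the sample submission file before uploading.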