cognoma / machine-learning

Machine learning for Project Cognoma

Create a table classifying the different algorithms #5

Closed hhummel closed 8 years ago

hhummel commented 8 years ago

Columns should include whether the algorithm handles binary classification, how hard it is to implement, and an estimate of its quality.

gwaybio commented 8 years ago

@dhimmel this sounds like a task to me :)

yl565 commented 8 years ago

Some popular algorithms that have already been implemented in sklearn:

| Type | Algorithm | Implementation | Quality |
|---|---|---|---|
| Discriminant Analysis | LDA/QDA | sklearn.discriminant_analysis | |
| Generalized Linear Models | Logistic regression with L1/L2 regularization | sklearn.linear_model.LogisticRegressionCV | |
| Generalized Linear Models | OLS | sklearn.linear_model.LinearRegression | |
| Generalized Linear Models | Ridge | sklearn.linear_model.RidgeCV | |
| Generalized Linear Models | LASSO | sklearn.linear_model.LassoCV | |
| Generalized Linear Models | Elastic Net | sklearn.linear_model.ElasticNetCV | |
| Robust | RANSAC | sklearn.linear_model.RANSACRegressor | |
| Support Vector Machine | SVM | sklearn.svm.SVC (libsvm) | |
| Support Vector Machine | Linear SVM with regularization | sklearn.svm.LinearSVC (liblinear) | |
| Decision Trees | CART | sklearn.tree.DecisionTreeClassifier | |
| Ensemble | Bagging/Random Subspace | sklearn.ensemble.BaggingClassifier | |
| Ensemble | RandomForest | sklearn.ensemble.RandomForestClassifier | |
| Ensemble | Adaboost | sklearn.ensemble.AdaBoostClassifier | |
| Ensemble | Voting | sklearn.ensemble.VotingClassifier | |
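
To help fill in the empty Quality column, something like the following could be a starting point. It is only a rough sketch: the data is a synthetic stand-in from `make_classification` (not Cognoma data), the subset of classifiers and their settings are my own picks, and AUROC as the metric is an assumption rather than anything decided in this thread.

```python
from sklearn.datasets import make_classification
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis
from sklearn.ensemble import AdaBoostClassifier, RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score
from sklearn.svm import LinearSVC
from sklearn.tree import DecisionTreeClassifier

# Synthetic binary classification data (an assumption; swap in real data)
X, y = make_classification(
    n_samples=300, n_features=20, n_informative=5, random_state=0
)

# A handful of the classifiers listed in the table, mostly default settings
classifiers = {
    "LDA": LinearDiscriminantAnalysis(),
    "Logistic regression (L2)": LogisticRegression(max_iter=1000),
    "Linear SVM": LinearSVC(max_iter=10000),
    "CART": DecisionTreeClassifier(random_state=0),
    "Random forest": RandomForestClassifier(n_estimators=100, random_state=0),
    "AdaBoost": AdaBoostClassifier(random_state=0),
}

# Mean 5-fold cross-validated AUROC per classifier
scores = {}
for name, clf in classifiers.items():
    scores[name] = cross_val_score(clf, X, y, cv=5, scoring="roc_auc").mean()
    print(f"{name}: mean AUROC = {scores[name]:.3f}")
```

The numbers on synthetic data say little about how these would rank on the real problem, so this is more useful as a template than as an answer.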

Algorithms that need implementation

| Type | Algorithm | Implementation | Quality |
|---|---|---|---|
| Ensemble | Stacking | Easy | |

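
For reference, a hand-rolled stacking ensemble could look roughly like the sketch below. The choice of base models, the logistic regression meta-model, and the synthetic data are all assumptions for illustration. The key point is that the meta-model is trained on out-of-fold predictions, so it never sees predictions made on data the base models were fit on. (scikit-learn 0.22 and later also provide `sklearn.ensemble.StackingClassifier`, which covers the same idea.)

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_predict, train_test_split
from sklearn.tree import DecisionTreeClassifier

# Synthetic binary classification data (an assumption; swap in real data)
X, y = make_classification(
    n_samples=400, n_features=20, n_informative=5, random_state=0
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0, stratify=y
)

# Level-0 (base) models; these particular picks are illustrative only
base_models = [
    LogisticRegression(max_iter=1000),
    DecisionTreeClassifier(max_depth=5, random_state=0),
    RandomForestClassifier(n_estimators=100, random_state=0),
]

# Level-1 training features: out-of-fold P(class 1) from each base model
meta_train = np.column_stack([
    cross_val_predict(m, X_train, y_train, cv=5, method="predict_proba")[:, 1]
    for m in base_models
])

# Refit each base model on the full training set to produce test features
for m in base_models:
    m.fit(X_train, y_train)
meta_test = np.column_stack(
    [m.predict_proba(X_test)[:, 1] for m in base_models]
)

# Level-1 (meta) model combines the base model probabilities
meta_model = LogisticRegression()
meta_model.fit(meta_train, y_train)
stacked_accuracy = meta_model.score(meta_test, y_test)
print(f"Stacked accuracy on held-out data: {stacked_accuracy:.3f}")
```
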
hhummel commented 8 years ago

I went through the supervised learning classifiers in the scikit-learn User's Guide, capturing the highlights of each blurb and the URL of its documentation.

Classifiers.xlsx

hhummel commented 8 years ago

I don't have a good enough feel for the problem we are solving or the characteristics of the data, but I like what they say about linear and quadratic discriminant analysis: "These classifiers are attractive because they have closed-form solutions that can be easily computed, are inherently multiclass, have proven to work well in practice and have no hyperparameters to tune."
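
As a quick illustration of the "no hyperparameters to tune" and "inherently multiclass" points, here is a minimal sketch that fits both estimators with default settings. It uses the three-class iris dataset bundled with scikit-learn as a stand-in, which is an assumption and has nothing to do with the project's data.

```python
from sklearn.datasets import load_iris
from sklearn.discriminant_analysis import (
    LinearDiscriminantAnalysis,
    QuadraticDiscriminantAnalysis,
)
from sklearn.model_selection import cross_val_score

# Three-class toy dataset shipped with scikit-learn (illustrative only)
X, y = load_iris(return_X_y=True)

# Both estimators are used with defaults: no grid search, no tuning
results = {}
for name, model in [
    ("LDA", LinearDiscriminantAnalysis()),
    ("QDA", QuadraticDiscriminantAnalysis()),
]:
    results[name] = cross_val_score(model, X, y, cv=5).mean()
    print(f"{name}: mean 5-fold accuracy = {results[name]:.3f}")
```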

dhimmel commented 8 years ago

I think we now have a good handle on what's available. Closing this issue, but feel free to continue discussion.