cognoma / machine-learning

Machine learning for Project Cognoma

Claim an sklearn algorithm to implement and troubleshoot #27

Open dhimmel opened 8 years ago

dhimmel commented 8 years ago

In the August 26 meetup, we discussed having each team member in the machine learning group claim an algorithm. We've made lots of progress on the example notebook (1.TCGA-MLexample.ipynb) since then (see #18 & #25). Currently, 1.TCGA-MLexample.ipynb uses elastic net logistic regression implemented in SGDClassifier.

The goal of this repository is for people to:

  1. Claim an algorithm. See the list of classifiers at https://github.com/cognoma/machine-learning/issues/5#issuecomment-235069679. The main requirement is that the algorithm uses the sklearn API so we can use it in the pipeline. Make a comment here once you've chosen an algorithm.
  2. Create a modified version of 1.TCGA-MLexample.ipynb in an algorithms directory. So if I took the SVM classifier, I would copy 1.TCGA-MLexample.ipynb to algorithms/SVC-dhimmel.ipynb. Then I would make my edits to algorithms/SVC-dhimmel.ipynb to switch to an SVC classifier.
  3. Your goal should be to pick a good set of parameters for grid search. It would also be great if you could document what seems to work well about the algorithm (or if it doesn't seem to work well).
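
To make steps 2 and 3 concrete, here is a rough sketch of swapping in a different classifier and defining a parameter grid for it. It assumes synthetic data and a two-step pipeline with hypothetical step names ("scale", "classify"). The grid values are placeholders to show the mechanics, not recommended settings.

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# Synthetic stand-in for the real data (assumption: not TCGA data).
X, y = make_classification(
    n_samples=200, n_features=20, n_informative=5, random_state=0
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0, stratify=y
)

# Any estimator that follows the sklearn API can fill the "classify" step.
pipeline = Pipeline([
    ("scale", StandardScaler()),
    ("classify", SVC()),
])

# Keys follow sklearn's "<step name>__<parameter>" convention.
param_grid = {
    "classify__C": [0.1, 1, 10],
    "classify__kernel": ["linear", "rbf"],
}

# Cross-validated search over every combination in the grid.
search = GridSearchCV(pipeline, param_grid, scoring="roc_auc", cv=3)
search.fit(X_train, y_train)

best_params = search.best_params_
test_auroc = search.score(X_test, y_test)  # uses the roc_auc scorer
print(best_params, round(test_auroc, 3))
```

Because only the "classify" step and its grid change, the rest of a copied notebook can stay as it is when you switch algorithms.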

Best of luck! If you can work on this before the August 9 meetup, great! Otherwise, make sure to bring a laptop with the cognoma-machine-learning environment installed.

dhimmel commented 8 years ago

Tagging everyone who said they were interested in contributing to the machine learning part of the project in the introduction thread: @htcai, @loucru1 @bmcgeehan @umeshiso @swbiggs4 @danrieman @rramyr @Inquisitive-Geek @brankaj @yl565 @Ramaa-Nathan @ejsegall @FadiAlnabolsi @sameertipnis @VijYadav @ctipnis.

Update: also tagging @yigalron.

htcai commented 8 years ago

Thanks for the sample notebook! I would like to implement Linear SVM with regularization.

dhimmel commented 8 years ago

@htcai awesome. Can you specify which sklearn function(s) you plan to use for the model?

brankaj commented 8 years ago

Thanks for setting this up. I would like to try LASSO, implementation sklearn.linear_model.LassoCV.

htcai commented 8 years ago

I consulted the chart posted by @yl565 . I plan to try sklearn.svm.LinearSVC.

yigalron commented 8 years ago

I plan to implement the Nearest Neighbors Classification

dhimmel commented 8 years ago

@yigalron did you want to claim KNeighborsClassifier, RadiusNeighborsClassifier, or both?

yigalron commented 8 years ago

I'm planning to start with the KNeighborsClassifier

VijYadav commented 8 years ago

I plan to test the Decision Tree (CART) algorithm (sklearn.tree.DecisionTreeClassifier).

VijYadav commented 8 years ago

Hi Daniel, has the "algorithms" directory been created already? I can't see it. Sorry, I am still learning about GitHub.

dhimmel commented 8 years ago

@VijYadav, you'll have to create the directory, since it currently doesn't exist. I'll submit a pull request as an example.
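
In the meantime, creating the directory and copying the example notebook could look roughly like the sketch below. The helper name copy_example_notebook is hypothetical, and the demo runs in a temporary directory with a placeholder file so it doesn't touch a real checkout.

```python
import shutil
import tempfile
from pathlib import Path


def copy_example_notebook(repo_root, target_name):
    """Copy 1.TCGA-MLexample.ipynb into algorithms/, creating the directory if needed."""
    repo_root = Path(repo_root)
    source = repo_root / "1.TCGA-MLexample.ipynb"
    algorithms_dir = repo_root / "algorithms"
    algorithms_dir.mkdir(exist_ok=True)  # no error if it already exists
    destination = algorithms_dir / target_name
    shutil.copyfile(source, destination)
    return destination


# Demonstrate on a throwaway directory containing a placeholder notebook file.
with tempfile.TemporaryDirectory() as tmp:
    Path(tmp, "1.TCGA-MLexample.ipynb").write_text("{}")
    created = copy_example_notebook(tmp, "SVC-dhimmel.ipynb")
    copy_exists = created.exists()
    relative_path = created.relative_to(tmp).as_posix()
    print(relative_path)  # algorithms/SVC-dhimmel.ipynb
```

In a real clone you would pass the repository root instead of the temporary directory, then commit the new file on your own branch.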

yigalron commented 8 years ago

When trying to set up the conda environment on Windows (conda env create --quiet --force --file environment.yml), I get an error:

yaml.scanner.ScannerError: mapping values are not allowed here in "", line 7, column 19: <head prefix="og: http://ogp.me/ns# fb: http://o ... ^

Any ideas what I did wrong?

dhimmel commented 8 years ago

@yigalron can you file a new issue or comment on #15 with the conda installation issue? It's best to keep issues focused and uncluttered.

yigalron commented 8 years ago

OK; anyway it seems to have been a user error; I'm trying again

beelze-b commented 8 years ago

Hi all, I will claim AdaBoost.

mans2singh commented 8 years ago

Hey Folks - I will give RandomForestClassifier a shot. Mans

yigalron commented 8 years ago

I have an initial version of the K-Nearest neighbor algorithm, and I issued a pull request; not sure if it got into the master. I won't be able to join the next meeting, but will continue to work on this remotely.

George-Zipperlen commented 8 years ago

I would like to claim spectral clustering, and give it a try

KT12 commented 8 years ago

I would like to work on the multi-layer perceptron classifier.

sklearn.neural_network.MLPClassifier

davidrichardsteinmetz commented 8 years ago

I'd like to claim LDA/QDA and give it a shot

KT12 commented 8 years ago

I'll also take a look at the Passive Aggressive Classifier