cognoma / machine-learning

Machine learning for Project Cognoma

Passive Aggressive Classifier #69

Closed KT12 closed 7 years ago

KT12 commented 8 years ago

Followed the template provided by @dhimmel to implement classification with a multilayer perceptron. All of the preprocessing steps remain the same.

The AUROC score worked for the CV violin plot and heat map but not for the ROC curve, so I have replaced AUROC with the micro-averaged F1 score for all CV analysis.

I have also provided a manual log of best parameters. This can be found in the section following model fitting.

Thank you for reviewing.

dhimmel commented 8 years ago

Nice! Can't wait to take a look.

First, can you rename WIP-MultilayerPerceptron-KT12 (F1micro).ipynb so that it doesn't contain any spaces or parentheses? We like to keep our file names shell-friendly.

Also export a script file as described here.

dhimmel commented 8 years ago

Make sure to also remove the now outdated algorithms/WIP-MultilayerPerceptron-KT12 (F1micro).ipynb.

KT12 commented 8 years ago

Renamed the file, deleted the old one, and converted it to a script for diff viewing.

Sorry I missed the part about converting to a script. Thank you!

KT12 commented 7 years ago

I have updated the method for importing packages and expanded the grid search for 'C' values in the passive aggressive classifier.
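For readers following along, a grid search over C for this classifier might look roughly like the sketch below. It is not the notebook's code: the synthetic data, the log-spaced C values, the pipeline with StandardScaler, the 3-fold CV, and the roc_auc scoring are all assumptions made for illustration (the scoring string could be swapped for "f1_micro").

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import PassiveAggressiveClassifier
from sklearn.model_selection import GridSearchCV
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in data (assumption: NOT the Cognoma expression matrix).
X, y = make_classification(
    n_samples=300, n_features=20, n_informative=5, random_state=0
)

# Scale features, then fit the linear passive aggressive model.
pipeline = make_pipeline(
    StandardScaler(),
    PassiveAggressiveClassifier(max_iter=1000, tol=1e-3, random_state=0),
)

# Hypothetical log-spaced grid for the aggressiveness parameter C.
c_values = np.logspace(-4, 1, 6)
param_grid = {"passiveaggressiveclassifier__C": c_values}

# roc_auc works here because the classifier exposes decision_function.
grid = GridSearchCV(pipeline, param_grid, scoring="roc_auc", cv=3)
grid.fit(X, y)

best_C = grid.best_params_["passiveaggressiveclassifier__C"]
print("best C:", best_C, "mean CV score:", grid.best_score_)
```

Smaller C values make each update less aggressive, so a wider log-spaced grid is a cheap way to check whether the best value sits at the edge of the range.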

For the MLP, should I continue to troubleshoot? Thank you.

KT12 commented 7 years ago

The Passive Aggressive Classifier is ready.

I completely removed the multilayer perceptron (for now). Maybe I'll come back to it later.

Had to resort to git reset --hard to get my remote and master in line with each other.

dhimmel commented 7 years ago

Really nice job cleaning this pull request up. Unless you have any more changes you want to make, I'm going to go ahead and merge.

Also let me know if I'm on the right track with my thinking: PassiveAggressiveClassifier is fitting a linear model, just like when we use SGDClassifier. Although PassiveAggressiveClassifier doesn't support predict_proba, it does support decision_function so we can get ranked predictions. Like elastic net logistic regression using SGDClassifier, we can get feature weights, which help us understand which features are influential. The main advantage compared to SGDClassifier is that PassiveAggressiveClassifier models are much quicker to train.
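A minimal sketch of the points above, assuming synthetic data and arbitrary hyperparameters (C=0.01 and the train/test split are illustrative, not values from this pull request): the model is scored through decision_function rather than predict_proba, and the fitted coef_ array gives one weight per feature.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import PassiveAggressiveClassifier
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import train_test_split

# Synthetic stand-in data (assumption: not the project's real features).
X, y = make_classification(
    n_samples=400, n_features=20, n_informative=5, random_state=0
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, stratify=y, random_state=0
)

# Illustrative hyperparameters; C would normally come from a grid search.
clf = PassiveAggressiveClassifier(C=0.01, max_iter=1000, tol=1e-3, random_state=0)
clf.fit(X_train, y_train)

# No predict_proba, so rank samples by signed distance to the hyperplane.
scores = clf.decision_function(X_test)
auroc = roc_auc_score(y_test, scores)

# One weight per feature; larger magnitude suggests more influence,
# provided the features are on comparable scales.
weights = clf.coef_.ravel()
top_features = np.argsort(np.abs(weights))[::-1][:5]

print("AUROC:", auroc)
print("indices of the 5 largest-magnitude weights:", top_features)
```

Because AUROC only depends on the ranking of the scores, the uncalibrated decision_function output is enough for ROC analysis.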

KT12 commented 7 years ago

Thank you for reviewing. I think your intuition about the PA classifier is correct.

I thought I had already posted this video and slide deck; they provide a good overview.