Closed htcai closed 8 years ago
Besides the tabulate import, everything looks good to me.
I would like to explore other kernel methods if you guys think it is worth a try.
Sure, why don't we merge this pull request first, and then you can submit an additional one which varies the kernel. I think that could be really valuable.
@dhimmel Thanks for your suggestion! That is very reasonable. I have revised the cell. Also, I am happy to try other kernels, which are very interesting to me.
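As a rough sketch of what varying the kernel could look like, the snippet below compares scikit-learn's SVC kernels by cross-validated AUROC. The dataset here is synthetic placeholder data, not the cognoma expression data, and the hyper-parameters are defaults rather than tuned values.

```python
# Hypothetical sketch: comparing SVC kernels on synthetic stand-in data.
from sklearn.datasets import make_classification
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

X, y = make_classification(n_samples=500, n_features=20, random_state=0)

scores = {}
for kernel in ["linear", "poly", "rbf", "sigmoid"]:
    # Scaling matters for SVMs, so standardize before fitting.
    model = make_pipeline(StandardScaler(), SVC(kernel=kernel, random_state=0))
    # 5-fold cross-validated AUROC for each kernel.
    scores[kernel] = cross_val_score(model, X, y, scoring="roc_auc", cv=5).mean()

for kernel, auc in sorted(scores.items()):
    print(f"{kernel}: {auc:.3f}")
```

SVC (unlike LinearSVC) accepts a kernel argument, so swapping kernels is a one-line change in the loop above.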
I was struggling to push my latest commit for some time. It appeared that the data files were included in the commits, even though I cannot find them in my local directory. Therefore, I reset the local repo again and then pasted my revised notebook back into the algorithms folder.
Okay, if you do a push --force with the new (clean) repo to your LinearSVC branch, the changes will show up in this pull request.
Yes, I have already forced the push with my clean local repo. I can see the latest commit in this pull request.
Cool & congrats on the pull request. Will merge now.
Thanks for your feedback and suggestions! I have already learnt a lot by reading and manipulating the notebook.
Sorry that it took me a couple of days before I re-sent the pull request. I spent some time getting familiar with some mathematical details about the optimization of SVMs.
I mainly manipulated two hyper-parameters: the penalty parameter C and the intercept scaling intercept_scaling. Both range from 1e-5 to 10. The former reaches its optimal result at 0.01, whereas the latter does not seem to have much effect. The testing AUROC appears to be fine (89.2%) compared with the training AUROC (92.0%).
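A sweep like the one described could be sketched with GridSearchCV as below. The grid bounds (1e-5 to 10) come from the comment; the dataset is a synthetic stand-in, and the fold count and grid resolution are illustrative choices, not the notebook's actual settings.

```python
# Hedged sketch of the hyper-parameter sweep over C and intercept_scaling.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.svm import LinearSVC

# Placeholder data standing in for the expression matrix.
X, y = make_classification(n_samples=600, n_features=30, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

param_grid = {
    "C": np.logspace(-5, 1, 7),                  # 1e-5 ... 10
    "intercept_scaling": np.logspace(-5, 1, 7),  # 1e-5 ... 10
}
# AUROC uses LinearSVC's decision_function under the hood.
search = GridSearchCV(LinearSVC(), param_grid, scoring="roc_auc", cv=3)
search.fit(X_train, y_train)

print("best params:", search.best_params_)
print("cross-validated AUROC: %.3f" % search.best_score_)
```

Plotting best_score_ against each grid axis would show the reported pattern: a clear optimum in C and a flat response in intercept_scaling.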
Since the predictions of SVMs are always in {0, 1}, they might not be as informative as those of models providing probabilistic predictions, especially when cognoma users are querying a single case. For the same reason, I removed the last cell, which plotted the distribution of predicted probabilities. However, I added a confusion matrix for the testing data and calculated the F1 score. I think the F1 score may also serve as an indication of prediction confidence.
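The evaluation described could be sketched as follows: LinearSVC has no predict_proba, but decision_function still yields a continuous score for AUROC, while the hard {0, 1} predictions feed the confusion matrix and F1 score. The dataset is a synthetic placeholder; C=0.01 is the optimum reported above.

```python
# Sketch of the added evaluation cells for a LinearSVC classifier.
from sklearn.datasets import make_classification
from sklearn.metrics import confusion_matrix, f1_score, roc_auc_score
from sklearn.model_selection import train_test_split
from sklearn.svm import LinearSVC

# Placeholder data standing in for the real features and labels.
X, y = make_classification(n_samples=600, n_features=30, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

clf = LinearSVC(C=0.01).fit(X_train, y_train)

y_pred = clf.predict(X_test)             # hard 0/1 labels
y_score = clf.decision_function(X_test)  # signed distance to the hyperplane

print("testing AUROC:", roc_auc_score(y_test, y_score))
print("confusion matrix:\n", confusion_matrix(y_test, y_pred))
print("F1 score:", f1_score(y_test, y_pred))
```

Note that AUROC is computed from the continuous margin (y_score), not the hard labels, which is why it remains meaningful even without probabilities.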