Rerun mutation prediction + compare optimizers

In #68, in addition to what I listed in the PR description, I also tried running MSI prediction with a different sklearn interface/optimizer. So far we've run most experiments using SGDClassifier, which optimizes the logistic loss using stochastic gradient descent. Instead, I tried LogisticRegression with an L1 penalty and the liblinear solver, which uses a coordinate descent algorithm that's supposed to converge quickly but can scale worse to datasets with many samples.

Performance on MSI prediction was generally better with LogisticRegression, though not by much, so in this PR I reran the mutation prediction experiments from #65 using LogisticRegression, and compared the results between the two optimizers in the notebooks 02_cancer_type_classification/lasso_range_analysis/compare_optimizers_all.ipynb and 02_cancer_type_classification/lasso_range_analysis/compare_optimizers_gene.ipynb.

In general, it does seem like the liblinear optimizer results in a better fit for almost every gene:

In this plot, each point is a gene/cancer type combination, and a positive value means liblinear performed better than sgd when comparing the best-performing LASSO parameter for each optimizer.
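The comparison in the plot can be sketched roughly like this. This is not the notebook code: the record layout, the identifiers `GENE_A`/`GENE_B`, and the metric values are all made up for illustration. The idea is just to take the best metric over LASSO parameters separately for each optimizer, then subtract.

```python
from collections import defaultdict

# Hypothetical records: (identifier, optimizer, lasso_param, metric)
results = [
    ("GENE_A", "sgd", 0.001, 0.61),
    ("GENE_A", "sgd", 0.01, 0.64),
    ("GENE_A", "liblinear", 0.1, 0.70),
    ("GENE_A", "liblinear", 1.0, 0.66),
    ("GENE_B", "sgd", 0.001, 0.80),
    ("GENE_B", "sgd", 0.01, 0.75),
    ("GENE_B", "liblinear", 0.1, 0.78),
    ("GENE_B", "liblinear", 1.0, 0.79),
]


def best_per_optimizer(records):
    """Return {identifier: {optimizer: best metric over LASSO parameters}}."""
    best = defaultdict(dict)
    for identifier, optimizer, _lasso_param, metric in records:
        previous = best[identifier].get(optimizer)
        if previous is None or metric > previous:
            best[identifier][optimizer] = metric
    return best


def optimizer_differences(records):
    """Return liblinear minus sgd, each at its own best LASSO parameter."""
    best = best_per_optimizer(records)
    return {
        identifier: by_optimizer["liblinear"] - by_optimizer["sgd"]
        for identifier, by_optimizer in best.items()
        if "liblinear" in by_optimizer and "sgd" in by_optimizer
    }


differences = optimizer_differences(results)
for identifier, diff in differences.items():
    # Positive means liblinear did better for this identifier
    print(f"{identifier}: {diff:+.2f}")
```

Since the best parameter is chosen independently per optimizer, the difference compares each optimizer at its own optimum rather than at matched regularization strengths.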
