PyTorch logistic regression - implementation and comparison against scikit-learn

This PR includes:

- Implementation of L1-regularized logistic regression, with hyperparameter search, in PyTorch (see utilities/classify_pytorch.py).
- Comparison of the PyTorch LR implementation against the previous sklearn implementation for TP53 mutation detection (see pytorch_vs_sklearn.py and pytorch_analysis.ipynb).
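
For readers unfamiliar with the approach, here is a minimal sketch of L1-regularized logistic regression in PyTorch. It is not the code in utilities/classify_pytorch.py: the function name `fit_l1_logistic`, its parameters, the choice of full-batch SGD, and the synthetic data are all assumptions made for illustration.

```python
import torch


def fit_l1_logistic(X, y, l1_penalty=0.01, lr=0.1, n_epochs=200, seed=0):
    """Hypothetical helper: fit logistic regression with an L1 penalty.

    The penalty is applied to the weights only, not to the bias.
    """
    torch.manual_seed(seed)
    # A single linear layer producing one logit per sample.
    model = torch.nn.Linear(X.shape[1], 1)
    # Binary cross-entropy on raw logits (applies the sigmoid internally).
    loss_fn = torch.nn.BCEWithLogitsLoss()
    optimizer = torch.optim.SGD(model.parameters(), lr=lr)
    for _ in range(n_epochs):
        optimizer.zero_grad()
        logits = model(X).squeeze(1)
        # Data loss plus the L1 norm of the weight vector.
        loss = loss_fn(logits, y) + l1_penalty * model.weight.abs().sum()
        loss.backward()
        optimizer.step()
    return model


# Synthetic data (assumption): the label depends only on the first feature.
torch.manual_seed(0)
X = torch.randn(200, 5)
y = (X[:, 0] > 0).float()

model = fit_l1_logistic(X, y)
with torch.no_grad():
    # A logit above 0 corresponds to a predicted probability above 0.5.
    preds = (model(X).squeeze(1) > 0).float()
    train_accuracy = (preds == y).float().mean().item()
```

A hyperparameter search in this setting could loop over candidate values of `l1_penalty` and `lr` and compare held-out performance for each.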

This implementation lays the groundwork for future PRs that will extend these analyses to more complex models (multi-layer NNs and/or custom loss functions incorporating pathway and network priors).

Results indicate that prediction quality (as measured by accuracy, AUROC, and AUPRC) is approximately the same for both models, but the scikit-learn models are much sparser (many more coefficients are exactly 0). I plan to look into this difference in future PRs, and to run the same experiments for genes other than TP53.
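
One way to quantify the sparsity difference is the fraction of coefficients at or near zero. The sketch below is only an illustration: the `sparsity` helper and the tolerance are assumptions, and the two coefficient arrays are made-up values, not results from this PR. A tolerance may be useful because weights trained with gradient-based optimizers can end up very small without being exactly zero.

```python
import numpy as np


def sparsity(coefs, tol=0.0):
    """Hypothetical helper: fraction of coefficients with |value| <= tol."""
    coefs = np.asarray(coefs, dtype=float).ravel()
    return float(np.mean(np.abs(coefs) <= tol))


# Made-up coefficient vectors for illustration only (not results from this PR).
exact_zero_coefs = np.array([0.0, 0.0, 1.2, 0.0, -0.7])
near_zero_coefs = np.array([1e-4, -3e-5, 1.1, 2e-4, -0.6])

exact_fraction = sparsity(exact_zero_coefs)           # counts exact zeros
strict_fraction = sparsity(near_zero_coefs)           # no exact zeros here
loose_fraction = sparsity(near_zero_coefs, tol=1e-3)  # counts near-zero values
```

Reporting both the exact-zero fraction and a thresholded fraction makes it easier to tell whether two models differ in which features they use or only in how close to zero the unused coefficients are.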

greenelab / netscape
