This PR includes:

- An implementation of L1-regularized logistic regression (LR) with hyperparameter search in PyTorch (see `utilities/classify_pytorch.py`).
- A comparison of the PyTorch LR implementation against the previous scikit-learn implementation for TP53 mutation detection (see `pytorch_vs_sklearn.py` and `pytorch_analysis.ipynb`).

This implementation lays the groundwork for future PRs, which will extend these analyses to more complex models (multi-layer neural networks and/or custom loss functions incorporating pathway and network priors).
Results indicate that prediction quality (as measured by accuracy, AUROC, and AUPRC) is approximately the same for both implementations, but the scikit-learn models are much sparser (many more coefficients equal to 0). I plan to look into this in future PRs, and to run the same experiments for genes other than TP53.
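
For readers who want a concrete picture of the kind of model being compared, here is a minimal sketch of L1-penalized logistic regression in PyTorch, plus a helper for counting zero or near-zero coefficients. It is not the code in `utilities/classify_pytorch.py`: the names `fit_l1_logistic`, `count_small_coefs`, and `l1_penalty`, the choice of Adam, and the synthetic data are all assumptions made for illustration. One thing worth keeping in mind when reading the sparsity numbers is that gradient-based optimizers tend to shrink L1-penalized weights toward zero without setting them exactly to zero, so counting with a small tolerance may be more informative than counting exact zeros. Whether that explains the gap seen here has not been verified.

```python
import torch


def fit_l1_logistic(X, y, l1_penalty=0.01, lr=0.05, n_epochs=200, seed=0):
    """Fit logistic regression with an L1 penalty on the weights.

    Hypothetical helper for illustration only; hyperparameter values
    are arbitrary, not the ones used in this PR.
    """
    torch.manual_seed(seed)
    model = torch.nn.Linear(X.shape[1], 1)  # single linear layer = LR on logits
    loss_fn = torch.nn.BCEWithLogitsLoss()
    optimizer = torch.optim.Adam(model.parameters(), lr=lr)
    for _ in range(n_epochs):
        optimizer.zero_grad()
        logits = model(X).squeeze(1)
        # Cross-entropy plus L1 penalty on the weights (bias is not penalized)
        loss = loss_fn(logits, y) + l1_penalty * model.weight.abs().sum()
        loss.backward()
        optimizer.step()
    return model


def count_small_coefs(model, tol=0.0):
    """Count weights with absolute value <= tol (tol=0.0 counts exact zeros)."""
    return int((model.weight.detach().abs() <= tol).sum())


# Synthetic data (assumption): only the first 2 of 20 features carry signal.
torch.manual_seed(0)
X = torch.randn(500, 20)
y = (X[:, 0] - X[:, 1] > 0).float()

model = fit_l1_logistic(X, y)
with torch.no_grad():
    accuracy = ((model(X).squeeze(1) > 0).float() == y).float().mean().item()
n_exact_zero = count_small_coefs(model, tol=0.0)
n_near_zero = count_small_coefs(model, tol=1e-2)
```

Comparing `n_exact_zero` with `n_near_zero` on the real TP53 models would be a quick way to check whether the PyTorch coefficients are small but nonzero, or genuinely dense.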