wfondrie / mokapot

Fast and flexible semi-supervised learning for peptide detection in Python
https://mokapot.readthedocs.io
Apache License 2.0

mokapot is fundamentally a Python implementation of the semi-supervised learning algorithm first introduced by Percolator. We developed mokapot to add flexibility to our analyses, whether to try something experimental (such as swapping Percolator's linear support vector machine classifier for a non-linear, gradient boosting classifier) or to train a joint model across experiments while retaining valid, per-experiment confidence estimates. We designed mokapot to be extensible and to support the analysis of additional types of proteomics data, such as cross-linked peptides from cross-linking mass spectrometry experiments. mokapot offers basic functionality from the command line, but using it as a Python package provides the most flexibility.

For more information, check out our documentation at https://mokapot.readthedocs.io.

Citing

If you use mokapot in your work, please cite:

Fondrie W. E. & Noble W. S. mokapot: Fast and Flexible Semisupervised Learning for Peptide Detection. J Proteome Res (2021) doi: 10.1021/acs.jproteome.0c01010. PMID: 33596079.

Installation

mokapot requires Python 3.6+ and can be installed with pip or conda.

Using conda:

$ conda install -c bioconda mokapot

Using pip:

$ pip3 install mokapot

Additionally, you can install the development version directly from GitHub:

$ pip3 install git+https://github.com/wfondrie/mokapot

Basic Usage

Before you can use mokapot, you need PSMs assigned by a search engine available in the Percolator tab-delimited file format (often referred to as the Percolator input, or "PIN", file format) or as a PepXML file.
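To make the PIN layout concrete, here is a minimal sketch that writes and re-reads a tiny PIN-style file using only the Python standard library. It assumes the usual Percolator convention of a tab-delimited table with SpecId, Label (1 for targets, -1 for decoys), and ScanNr columns first, feature columns in the middle, and Peptide and Proteins columns last. The PSMs, the "Score" and "Charge2" feature names, and the helper functions are hypothetical and only for illustration; real files written by a search engine contain many more PSMs and features.

```python
import csv
import os
import tempfile

# Hypothetical PSMs for illustration: Label is 1 for targets and -1 for decoys.
# "Score" and "Charge2" stand in for the search-engine features of a real file.
HEADER = ["SpecId", "Label", "ScanNr", "Score", "Charge2", "Peptide", "Proteins"]
PSMS = [
    ["target_101", 1, 101, 3.2, 1, "K.LESLIEK.A", "protein_A"],
    ["target_102", 1, 102, 1.1, 0, "R.ELVISK.L", "protein_B"],
    ["decoy_103", -1, 103, 0.9, 1, "K.KEILSEL.A", "decoy_protein_A"],
]


def write_pin(path, header, rows):
    """Write a header and rows as a tab-delimited, PIN-style text file."""
    with open(path, "w", newline="") as handle:
        writer = csv.writer(handle, delimiter="\t", lineterminator="\n")
        writer.writerow(header)
        writer.writerows(rows)


def read_labels(path):
    """Return the target/decoy label of every PSM in a PIN-style file."""
    with open(path, newline="") as handle:
        reader = csv.DictReader(handle, delimiter="\t")
        return [int(row["Label"]) for row in reader]


# Round-trip the toy PSMs through a temporary file.
with tempfile.TemporaryDirectory() as tmpdir:
    pin_path = os.path.join(tmpdir, "example.pin")
    write_pin(pin_path, HEADER, PSMS)
    labels = read_labels(pin_path)
```

The target/decoy labels are what the semi-supervised algorithm learns from, so every PSM needs one.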

Simple mokapot analyses can be performed at the command line:

$ mokapot psms.pin

Alternatively, the Python API can be used to perform analyses in the Python interpreter and affords greater flexibility:

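import mokapot

# Read PSMs from a Percolator input (PIN) file.
psms = mokapot.read_pin("psms.pin")
# Train models and assign confidence estimates to the PSMs.
results, models = mokapot.brew(psms)
# Save the results as tab-delimited text files.
results.to_txt()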
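The Python API is also where the flexibility described earlier comes in. The sketch below combines a gradient boosting classifier with joint training across experiments. It assumes, based on our reading of the mokapot documentation, that mokapot.Model can wrap a scikit-learn classifier and that mokapot.brew accepts a list of PSM collections and a model argument; the helper name brew_with_gradient_boosting is hypothetical, and the classifier uses untuned default hyperparameters.

```python
def brew_with_gradient_boosting(pin_files):
    """Train one gradient boosting model jointly across several PIN files.

    Returns whatever ``mokapot.brew`` returns: the confidence estimates
    and the trained models.
    """
    # Imported inside the function so that defining this helper does not
    # itself require mokapot to be installed.
    import mokapot
    from sklearn.ensemble import GradientBoostingClassifier

    # One PSM collection per experiment.
    datasets = [mokapot.read_pin(path) for path in pin_files]

    # Assumption: mokapot.Model wraps a scikit-learn classifier, here
    # replacing the default linear support vector machine.
    model = mokapot.Model(GradientBoostingClassifier())

    # Assumption: given a list, brew trains on all PSMs together but
    # estimates confidence separately for each collection.
    return mokapot.brew(datasets, model=model)
```

A call such as brew_with_gradient_boosting(["exp1.pin", "exp2.pin"]), with the file names standing in for your own experiments, should then yield one set of confidence estimates per input file.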

Check out our documentation for more details and examples of mokapot in action.