Triqler is a probabilistic graphical model that propagates error information through all steps from MS1 feature to protein level, employing distributions in favor of point estimates, most notably for missing value imputation. The model outputs posterior probabilities for fold changes between treatment groups, highlighting uncertainty rather than hiding it.
For a detailed explanation of how to install and run Triqler (stand-alone or in combination with MaxQuant, Quandenser or Dinosaur) as well as how to interpret the results, please read our Triqler user manual.
Brief instructions for installing and running Triqler as well as descriptions of the input and output formats can be found below. Instructions for running the converters to the Triqler input format are available in our wiki.
The, M. & Käll, L. (2019). Integrated identification and quantification error probabilities for shotgun proteomics. Molecular & Cellular Proteomics, 18 (3), 561-570. https://doi.org/10.1074/mcp.RA118.001018
Truong, P., The, M., & Käll, L. (2023). Triqler for Protein Summarization of Data from Data-Independent Acquisition Mass Spectrometry. Journal of Proteome Research, 22 (4), 1359-1366. https://doi.org/10.1021/acs.jproteome.2c00607
Install from PyPI with pip:
pip install triqler
Or install from source:
git clone https://github.com/statisticalbiotechnology/triqler.git
cd triqler
pip install .
usage: triqler [-h] [--out_file OUT] [--fold_change_eval F]
               [--decoy_pattern P] [--missing_value_prior D] [--min_samples N]
               [--num_threads N] [--ttest] [--write_spectrum_quants]
               [--write_protein_posteriors P_OUT]
               [--write_group_posteriors G_OUT]
               [--write_fold_change_posteriors F_OUT]
               [--csv-field-size-limit CSV_FIELD_SIZE_LIMIT]
               IN_FILE

positional arguments:
  IN_FILE               List of PSMs with abundances (not log transformed!)
                        and search engine score. See README for a detailed
                        description of the columns.

optional arguments:
  -h, --help            show this help message and exit
  --out_file OUT        Path to output file (writing in TSV format). N.B. if
                        more than 2 treatment groups are present, suffixes
                        will be added before the file extension. (default:
                        proteins.tsv)
  --fold_change_eval F  log2 fold change evaluation threshold. (default: 1.0)
  --decoy_pattern P     Prefix for decoy proteins. (default: decoy_)
  --missing_value_prior D
                        Distribution to fit for missing value prior. Use "DIA"
                        for using means of NaNs to fit the censored normal
                        distribution. The "default" option fits the censored
                        normal distribution with all observed XIC values.
                        (default: default)
  --min_samples N       Minimum number of samples a peptide needed to be
                        quantified in. (default: 2)
  --num_threads N       Number of threads, by default this is equal to the
                        number of CPU cores available on the device. (default:
                        6)
  --ttest               Use t-test for evaluating differential expression
                        instead of posterior probabilities. (default: False)
  --write_spectrum_quants
                        Write quantifications for consensus spectra. Only
                        works if consensus spectrum index are given in input.
                        (default: False)
  --write_protein_posteriors P_OUT
                        Write raw data of protein posteriors to the specified
                        file in TSV format. (default: )
  --write_group_posteriors G_OUT
                        Write raw data of treatment group posteriors to the
                        specified file in TSV format. (default: )
  --write_fold_change_posteriors F_OUT
                        Write raw data of fold change posteriors to the
                        specified file in TSV format. (default: )
  --csv-field-size-limit CSV_FIELD_SIZE_LIMIT
                        Set a new maximum CSV field size (default: None)
A sample file, iPRG2016.tsv, is provided in the example folder. You can run Triqler on this file with the following command:
python -m triqler --fold_change_eval 0.8 example/iPRG2016.tsv
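For scripted workflows, the same call can be assembled from Python. The following is a minimal sketch, assuming Triqler is installed for the current interpreter; `build_triqler_command` is a hypothetical helper and not part of Triqler. It only uses flags listed in the usage text above.

```python
import shlex
import sys


def build_triqler_command(in_file, fold_change_eval=1.0, out_file="proteins.tsv"):
    """Assemble the argument list for one Triqler run (hypothetical helper)."""
    return [
        sys.executable, "-m", "triqler",
        "--fold_change_eval", str(fold_change_eval),
        "--out_file", out_file,
        in_file,
    ]


command = build_triqler_command("example/iPRG2016.tsv", fold_change_eval=0.8)
# Print the command for inspection; to execute it, pass the list to
# subprocess.run(command, check=True) from the repository root.
print(shlex.join(command))
```

Passing the arguments as a list avoids shell quoting problems when file paths contain spaces.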
A detailed example of the different levels of Triqler output can be found in Supplementary Note 2 of the Quandenser publication.
The simplest input format is a tab-separated file consisting of a header line followed by one PSM per line in the following format:
run <tab> condition <tab> charge <tab> searchScore <tab> intensity <tab> peptide <tab> proteins
r1 <tab> 1 <tab> 2 <tab> 1.345 <tab> 21359.123 <tab> A.PEPTIDE.A <tab> proteinA <tab> proteinB
r2 <tab> 1 <tab> 2 <tab> 1.945 <tab> 24837.398 <tab> A.PEPTIDE.A <tab> proteinA <tab> proteinB
r3 <tab> 2 <tab> 2 <tab> 1.684 <tab> 25498.869 <tab> A.PEPTIDE.A <tab> proteinA <tab> proteinB
...
r1 <tab> 1 <tab> 3 <tab> 0.452 <tab> 13642.232 <tab> A.NTPEPTIDE.- <tab> decoy_proteinA
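A file in this layout can be produced with Python's standard `csv` module. The sketch below is an illustration, not one of the official converters; `write_triqler_input` and the dict-based PSM representation are assumptions made for the example. It writes each protein as its own trailing tab-separated field, as in the rows above, and keeps intensities on their original (not log-transformed) scale.

```python
import csv
import os
import tempfile

HEADER = ["run", "condition", "charge", "searchScore", "intensity", "peptide", "proteins"]


def write_triqler_input(path, psms):
    """Write PSM dicts to a tab-separated file in the simple input layout."""
    with open(path, "w", newline="") as handle:
        writer = csv.writer(handle, delimiter="\t", lineterminator="\n")
        writer.writerow(HEADER)
        for psm in psms:
            fixed = [psm[column] for column in HEADER[:-1]]
            # Each protein becomes its own trailing field.
            writer.writerow(fixed + list(psm["proteins"]))


psms = [
    {"run": "r1", "condition": 1, "charge": 2, "searchScore": 1.345,
     "intensity": 21359.123, "peptide": "A.PEPTIDE.A",
     "proteins": ["proteinA", "proteinB"]},
    {"run": "r1", "condition": 1, "charge": 3, "searchScore": 0.452,
     "intensity": 13642.232, "peptide": "A.NTPEPTIDE.-",
     "proteins": ["decoy_proteinA"]},
]

with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "triqler_input.tsv")
    write_triqler_input(path, psms)
    with open(path) as handle:
        lines = handle.read().splitlines()

print(lines[1])
```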
Alternatively, if you have match-between-run probabilities, you can use a slightly more detailed input format:
run <tab> condition <tab> charge <tab> searchScore <tab> spectrumId <tab> linkPEP <tab> featureClusterId <tab> intensity <tab> peptide <tab> proteins
r1 <tab> 1 <tab> 2 <tab> 1.345 <tab> 3 <tab> 0.0 <tab> 1 <tab> 21359.123 <tab> A.PEPTIDE.A <tab> proteinA <tab> proteinB
r2 <tab> 1 <tab> 2 <tab> 1.345 <tab> 3 <tab> 0.021 <tab> 1 <tab> 24837.398 <tab> A.PEPTIDE.A <tab> proteinA <tab> proteinB
r3 <tab> 2 <tab> 2 <tab> 1.684 <tab> 4 <tab> 0.0 <tab> 1 <tab> 25498.869 <tab> A.PEPTIDE.A <tab> proteinA <tab> proteinB
...
r1 <tab> 1 <tab> 3 <tab> 0.452 <tab> 6568 <tab> 0.15 <tab> 9845 <tab> 13642.232 <tab> A.NTPEPTIDE.- <tab> decoy_proteinA
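Because the protein list occupies a variable number of trailing fields, a reader for this layout has to treat every field past the last fixed column as a protein. The sketch below shows one way to do that; `read_triqler_input` is a hypothetical helper, and flagging a PSM as a decoy only when all of its proteins carry the `decoy_` prefix is an assumption of this example, not a description of Triqler's internal logic. Since it is driven by the header line, the same function also reads the simpler format.

```python
import csv
import io

EXAMPLE = (
    "run\tcondition\tcharge\tsearchScore\tspectrumId\tlinkPEP\t"
    "featureClusterId\tintensity\tpeptide\tproteins\n"
    "r1\t1\t2\t1.345\t3\t0.0\t1\t21359.123\tA.PEPTIDE.A\tproteinA\tproteinB\n"
    "r2\t1\t2\t1.345\t3\t0.021\t1\t24837.398\tA.PEPTIDE.A\tproteinA\tproteinB\n"
    "r1\t1\t3\t0.452\t6568\t0.15\t9845\t13642.232\tA.NTPEPTIDE.-\tdecoy_proteinA\n"
)


def read_triqler_input(handle, decoy_prefix="decoy_"):
    """Yield one dict per PSM; trailing fields are collected as proteins."""
    reader = csv.reader(handle, delimiter="\t")
    header = next(reader)
    n_fixed = len(header) - 1  # every column except the trailing "proteins"
    for fields in reader:
        psm = dict(zip(header[:n_fixed], fields[:n_fixed]))
        proteins = fields[n_fixed:]
        psm["proteins"] = proteins
        # Assumption of this sketch: a PSM is a decoy if all proteins are decoys.
        psm["is_decoy"] = bool(proteins) and all(
            protein.startswith(decoy_prefix) for protein in proteins
        )
        yield psm


psms = list(read_triqler_input(io.StringIO(EXAMPLE)))
for psm in psms:
    print(psm["run"], psm["linkPEP"], psm["proteins"], psm["is_decoy"])
```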
Some remarks:
The output format is a tab-separated file consisting of a header line followed by one protein per line in the following format:
q_value <tab> posterior_error_prob <tab> protein <tab> num_peptides <tab> protein_id_PEP <tab> log2_fold_change <tab> diff_exp_prob_<FC> <tab> <condition1>:<run1> <tab> <condition1>:<run2> <tab> ... <tab> <conditionM>:<runN> <tab> peptides
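The output can be post-processed with a few lines of Python. The sketch below is illustrative only: the example rows contain invented placeholder values rather than real Triqler results, `read_triqler_output` is a hypothetical helper, and the 0.05 q-value cutoff is an arbitrary choice for the example. It assumes that the `<condition>:<run>` columns can be recognized by the colon in their header name and that any fields beyond the header belong to the peptide list.

```python
import csv
import io

# Invented placeholder values, for illustration only; not real Triqler output.
EXAMPLE_OUTPUT = (
    "q_value\tposterior_error_prob\tprotein\tnum_peptides\tprotein_id_PEP\t"
    "log2_fold_change\tdiff_exp_prob_0.8\t1:r1\t1:r2\t2:r3\tpeptides\n"
    "0.001\t0.001\tproteinA\t2\t0.0005\t-1.5\t0.999\t1.4\t1.5\t0.5\tPEPTIDEA\tPEPTIDEB\n"
    "0.2\t0.4\tproteinB\t1\t0.01\t0.1\t0.6\t1.0\t1.1\t1.05\tPEPTIDEC\n"
)


def read_triqler_output(handle):
    """Return one dict per protein, with per-run quants and peptides split out."""
    reader = csv.reader(handle, delimiter="\t")
    header = next(reader)
    n_fixed = len(header) - 1  # every column except the trailing "peptides"
    # Columns named <condition>:<run> hold the per-run quantifications.
    quant_columns = [name for name in header[:n_fixed] if ":" in name]
    rows = []
    for fields in reader:
        row = dict(zip(header[:n_fixed], fields[:n_fixed]))
        row["quants"] = {name: float(row.pop(name)) for name in quant_columns}
        row["peptides"] = fields[n_fixed:]
        rows.append(row)
    return rows


rows = read_triqler_output(io.StringIO(EXAMPLE_OUTPUT))
accepted = [row["protein"] for row in rows if float(row["q_value"]) <= 0.05]
print(accepted)
```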
Some remarks: