PyProphet / pyprophet

PyProphet: Semi-supervised learning and scoring of OpenSWATH results.
http://www.openswath.org
BSD 3-Clause "New" or "Revised" License

Feature/filter osw #108

Closed singjc closed 1 year ago

singjc commented 1 year ago

I extended the filter module so that it can also filter OSW files. This is useful for DIAlignR, because a filtered file is smaller on disk and consumes less memory. OSW files are filtered on the q-value (QVALUE) rather than on the PEP, mainly because DIAlignR itself filters on the q-value.

Extended filter Module

$ pyprophet filter --help
Usage: pyprophet filter [OPTIONS] [SQLDBFILES]...

  Filter sqMass files or osw files

Options:
  --in PATH                       PyProphet input file.
  --max_precursor_pep FLOAT       Maximum PEP to retain scored precursors in
                                  sqMass.  [default: 0.7]
  --max_peakgroup_pep FLOAT       Maximum PEP to retain scored peak groups in
                                  sqMass.  [default: 0.7]
  --max_transition_pep FLOAT      Maximum PEP to retain scored transitions in
                                  sqMass.  [default: 0.7]
  --remove_decoys / --no-remove_decoys
                                  Remove Decoys from OSW file.  [default:
                                  remove_decoys]
  --omit_tables TEXT              Tables in the database you do not want to
                                  copy over to filtered file. i.e.
                                  `--omit_tables '["FEATURE_TRANSITION",
                                  "SCORE_TRANSITION"]'`  [default: []]
  --max_gene_fdr FLOAT            Maximum QVALUE to retain scored genes in
                                  OSW.  [default: None]
  --max_protein_fdr FLOAT         Maximum QVALUE to retain scored proteins in
                                  OSW.  [default: None]
  --max_peptide_fdr FLOAT         Maximum QVALUE to retain scored peptides in
                                  OSW.  [default: None]
  --max_ms2_fdr FLOAT             Maximum QVALUE to retain scored MS2 Features
                                  in OSW.  [default: None]
  --help                          Show this message and exit.
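
The `--omit_tables` value shown in the help text looks like a JSON list of table names. As a minimal sketch of that idea (not PyProphet's actual implementation), the list could be parsed and used to decide which tables get copied into the filtered file. The function name `tables_to_copy` and the toy table set below are hypothetical, and parsing the option with `json.loads` is an assumption.

```python
import json
import sqlite3


def tables_to_copy(con, omit_tables_arg="[]"):
    """Return the table names in `con`, minus those listed in the JSON string.

    Hypothetical helper: assumes the option value is a JSON list of names.
    """
    omit = set(json.loads(omit_tables_arg))
    rows = con.execute(
        "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
    )
    return [name for (name,) in rows if name not in omit]


# Toy database with a handful of empty tables (illustrative only).
con = sqlite3.connect(":memory:")
for table in ("FEATURE", "FEATURE_TRANSITION", "SCORE_MS2", "SCORE_TRANSITION"):
    con.execute(f"CREATE TABLE {table} (ID INTEGER)")

print(tables_to_copy(con, '["FEATURE_TRANSITION", "SCORE_TRANSITION"]'))
# ['FEATURE', 'SCORE_MS2']
```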

Example

$ pyprophet filter --max_peptide_fdr 0.05 --max_ms2_fdr 0.1 *.osw
[2022-11-26 05:22:59] INFO: Begin filtering merged.osw to merged_filtered.osw...
[2022-11-26 05:22:59] INFO: Filtering for 968 peptide ids with peptide score q-value <= 0.05 with decoy removal = True...
[2022-11-26 05:23:00] INFO: Filtering for 12239 feature ids across 1444 unique precursor ids with ms2 score q-value <= 0.1 with decoy removal = True...
[2022-11-26 05:23:19] INFO: Filtering for  1188444 transition ids for 968 peptides ids and 1444 precursor ids...
[2022-11-26 05:23:56] INFO: Finished filtering merged.osw to merged_filtered.osw...
$ ls -ltrh *.osw
-rw-r--r-- 1 root root     5.2G Nov 24 21:29 merged.osw
-rw-r--r-- 1 root root 1022M Nov 26 00:23 merged_filtered.osw
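
The log above suggests a cascade: peptides are selected by q-value first, then MS2 features are selected by their own q-value among the precursors of those peptides, with decoys removed. The following is a minimal sketch of that cascade on a toy SQLite database. The schema is a simplified assumption loosely modeled on OSW table names, the helper names `build_toy_db` and `passing_ids` are hypothetical, and the queries are not the ones used in this PR.

```python
import sqlite3

# Peptides that pass the peptide-level q-value cutoff, decoys excluded.
PEPTIDE_SQL = """
    SELECT DISTINCT sp.PEPTIDE_ID
    FROM SCORE_PEPTIDE sp
    JOIN PEPTIDE p ON p.ID = sp.PEPTIDE_ID
    WHERE sp.QVALUE <= ? AND p.DECOY = 0
"""


def build_toy_db():
    """Create an in-memory database with a simplified, assumed schema."""
    con = sqlite3.connect(":memory:")
    con.executescript("""
        CREATE TABLE PEPTIDE (ID INTEGER PRIMARY KEY, DECOY INTEGER);
        CREATE TABLE SCORE_PEPTIDE (PEPTIDE_ID INTEGER, QVALUE REAL);
        CREATE TABLE PRECURSOR_PEPTIDE_MAPPING (PRECURSOR_ID INTEGER, PEPTIDE_ID INTEGER);
        CREATE TABLE FEATURE (ID INTEGER PRIMARY KEY, PRECURSOR_ID INTEGER);
        CREATE TABLE SCORE_MS2 (FEATURE_ID INTEGER, QVALUE REAL);
        INSERT INTO PEPTIDE VALUES (1, 0), (2, 0), (3, 1);
        INSERT INTO SCORE_PEPTIDE VALUES (1, 0.01), (2, 0.20), (3, 0.01);
        INSERT INTO PRECURSOR_PEPTIDE_MAPPING VALUES (10, 1), (20, 2), (30, 3);
        INSERT INTO FEATURE VALUES (100, 10), (101, 10), (200, 20), (300, 30);
        INSERT INTO SCORE_MS2 VALUES (100, 0.05), (101, 0.50), (200, 0.01), (300, 0.01);
    """)
    return con


def passing_ids(con, max_peptide_fdr, max_ms2_fdr):
    """Return (peptide_ids, feature_ids) that survive both q-value cutoffs."""
    peptide_ids = [
        row[0]
        for row in con.execute(
            PEPTIDE_SQL + " ORDER BY sp.PEPTIDE_ID", (max_peptide_fdr,)
        )
    ]
    # Keep only features whose precursor maps to a retained peptide.
    feature_ids = [
        row[0]
        for row in con.execute(
            f"""
            SELECT DISTINCT f.ID
            FROM FEATURE f
            JOIN SCORE_MS2 s ON s.FEATURE_ID = f.ID
            JOIN PRECURSOR_PEPTIDE_MAPPING m ON m.PRECURSOR_ID = f.PRECURSOR_ID
            WHERE s.QVALUE <= ?
              AND m.PEPTIDE_ID IN ({PEPTIDE_SQL})
            ORDER BY f.ID
            """,
            (max_ms2_fdr, max_peptide_fdr),
        )
    ]
    return peptide_ids, feature_ids


print(passing_ids(build_toy_db(), 0.05, 0.1))  # ([1], [100])
```

In the toy data, peptide 2 fails the peptide cutoff and peptide 3 is a decoy, so only features belonging to peptide 1 are considered, and of those only feature 100 passes the MS2 cutoff.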