I extended the filter module so that it can also filter OSW files, not only sqMass files. This is useful for DIAlignR: the filtered file is smaller, so it is easier to handle and takes less memory to load. For OSW files I filter on the QVALUE instead of the PEP, mainly because DIAlignR itself filters on the QVALUE.
Extended filter Module
$ pyprophet filter --help
Usage: pyprophet filter [OPTIONS] [SQLDBFILES]...

  Filter sqMass files or osw files

Options:
  --in PATH                       PyProphet input file.
  --max_precursor_pep FLOAT       Maximum PEP to retain scored precursors in
                                  sqMass. [default: 0.7]
  --max_peakgroup_pep FLOAT       Maximum PEP to retain scored peak groups in
                                  sqMass. [default: 0.7]
  --max_transition_pep FLOAT      Maximum PEP to retain scored transitions in
                                  sqMass. [default: 0.7]
  --remove_decoys / --no-remove_decoys
                                  Remove Decoys from OSW file. [default:
                                  remove_decoys]
  --omit_tables TEXT              Tables in the database you do not want to
                                  copy over to filtered file. i.e.
                                  `--omit_tables '["FEATURE_TRANSITION",
                                  "SCORE_TRANSITION"]'` [default: []]
  --max_gene_fdr FLOAT            Maximum QVALUE to retain scored genes in
                                  OSW. [default: None]
  --max_protein_fdr FLOAT         Maximum QVALUE to retain scored proteins in
                                  OSW. [default: None]
  --max_peptide_fdr FLOAT         Maximum QVALUE to retain scored peptides in
                                  OSW. [default: None]
  --max_ms2_fdr FLOAT             Maximum QVALUE to retain scored MS2 Features
                                  in OSW. [default: None]
  --help                          Show this message and exit.
Example
$ pyprophet filter --max_peptide_fdr 0.05 --max_ms2_fdr 0.1 *.osw
[2022-11-26 05:22:59] INFO: Begin filtering merged.osw to merged_filtered.osw...
[2022-11-26 05:22:59] INFO: Filtering for 968 peptide ids with peptide score q-value <= 0.05 with decoy removal = True...
[2022-11-26 05:23:00] INFO: Filtering for 12239 feature ids across 1444 unique precursor ids with ms2 score q-value <= 0.1 with decoy removal = True...
[2022-11-26 05:23:19] INFO: Filtering for 1188444 transition ids for 968 peptides ids and 1444 precursor ids...
[2022-11-26 05:23:56] INFO: Finished filtering merged.osw to merged_filtered.osw...
$ ls -ltrh *.osw
-rw-r--r-- 1 root root 5.2G Nov 24 21:29 merged.osw
-rw-r--r-- 1 root root 1022M Nov 26 00:23 merged_filtered.osw
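For reviewers who want to see the idea behind the peptide and MS2 steps in the log above, here is a minimal sketch of q-value filtering with decoy removal against a toy in-memory SQLite database. It is not the code in this PR: the helper names `build_toy_db` and `filter_ids` are hypothetical, and the table and column names (`PEPTIDE`, `SCORE_PEPTIDE`, `PRECURSOR`, `FEATURE`, `SCORE_MS2`, `QVALUE`, `DECOY`) are assumed to mirror the OSW layout. The two cutoffs are applied independently here, which may differ from how the actual implementation combines them.

```python
import sqlite3


def build_toy_db():
    """Create a tiny in-memory database with an assumed OSW-like layout."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE PEPTIDE (ID INTEGER PRIMARY KEY, DECOY INTEGER);
        CREATE TABLE SCORE_PEPTIDE (PEPTIDE_ID INTEGER, QVALUE REAL);
        CREATE TABLE PRECURSOR (ID INTEGER PRIMARY KEY, DECOY INTEGER);
        CREATE TABLE FEATURE (ID INTEGER PRIMARY KEY, PRECURSOR_ID INTEGER);
        CREATE TABLE SCORE_MS2 (FEATURE_ID INTEGER, QVALUE REAL);
        INSERT INTO PEPTIDE VALUES (1, 0), (2, 0), (3, 1);
        INSERT INTO SCORE_PEPTIDE VALUES (1, 0.01), (2, 0.20), (3, 0.02);
        INSERT INTO PRECURSOR VALUES (10, 0), (20, 0), (30, 1);
        INSERT INTO FEATURE VALUES (100, 10), (101, 10), (200, 20), (300, 30);
        INSERT INTO SCORE_MS2 VALUES (100, 0.05), (101, 0.50), (200, 0.08), (300, 0.01);
        """
    )
    return conn


def filter_ids(conn, max_peptide_fdr, max_ms2_fdr, remove_decoys=True):
    """Return (peptide_ids, feature_ids, precursor_ids) that pass the cutoffs."""
    # Optional decoy removal is expressed as an extra WHERE condition.
    pep_decoy = "AND PEPTIDE.DECOY = 0" if remove_decoys else ""
    prec_decoy = "AND PRECURSOR.DECOY = 0" if remove_decoys else ""

    # Peptides whose peptide-level q-value is at or below the cutoff.
    peptide_ids = {
        row[0]
        for row in conn.execute(
            f"""
            SELECT DISTINCT PEPTIDE.ID
            FROM PEPTIDE
            INNER JOIN SCORE_PEPTIDE ON SCORE_PEPTIDE.PEPTIDE_ID = PEPTIDE.ID
            WHERE SCORE_PEPTIDE.QVALUE <= ? {pep_decoy}
            """,
            (max_peptide_fdr,),
        )
    }

    # Features whose MS2-level q-value is at or below the cutoff,
    # together with the precursor each feature belongs to.
    rows = conn.execute(
        f"""
        SELECT FEATURE.ID, FEATURE.PRECURSOR_ID
        FROM FEATURE
        INNER JOIN SCORE_MS2 ON SCORE_MS2.FEATURE_ID = FEATURE.ID
        INNER JOIN PRECURSOR ON PRECURSOR.ID = FEATURE.PRECURSOR_ID
        WHERE SCORE_MS2.QVALUE <= ? {prec_decoy}
        """,
        (max_ms2_fdr,),
    ).fetchall()
    feature_ids = {feature_id for feature_id, _ in rows}
    precursor_ids = {precursor_id for _, precursor_id in rows}
    return peptide_ids, feature_ids, precursor_ids


if __name__ == "__main__":
    conn = build_toy_db()
    peptides, features, precursors = filter_ids(conn, 0.05, 0.1)
    print(f"{len(peptides)} peptide ids, {len(features)} feature ids "
          f"across {len(precursors)} precursor ids")
```

In the toy data, the decoy peptide and the decoy precursor both have low q-values, so they only drop out because `remove_decoys` is on. Passing `remove_decoys=False` keeps them, which corresponds to what `--no-remove_decoys` is meant to do.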