levitsky / pyteomics

Pyteomics is a collection of lightweight and handy tools for Python that help to handle various sorts of proteomics data. Pyteomics provides a growing set of modules to facilitate the most common tasks in proteomics data analysis.
http://pyteomics.readthedocs.io
Apache License 2.0

tandem is_decoy function #2

Closed: tivdnbos closed this issue 4 years ago

tivdnbos commented 4 years ago

Dear all,

I am using Tandem XML files where the decoy proteins are indicated as follows: protein: [{ (...), 'label': 'tr|B0MJG7_REVERSED|B0MJG7_REVERSED_9FIRM Selenium-dependent molybdenum hydroxylase...', 'note': 'tr|B0MJG7_REVERSED|B0MJG7_REVERSED_9FIRM Selenium-dependent molybdenum hydroxylase 1 OS=Anaerostipes caccae DSM 14662 GN=ANACAC_03779 PE=4 SV=1' , (...)}].

Therefore, I would like to write an is_decoy function, but I'm not sure what to pass. Could this be a lambda expression or a boolean expression?

Thanks for helping me out!

levitsky commented 4 years ago

Hi,

If you are using tandem.filter or tandem.qvalues, then your is_decoy function (or lambda) can look like this:

result = tandem.filter(..., is_decoy=lambda psm: all('_REVERSED_' in prot['label'] for prot in psm['protein']))
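The same check can also be written as a named function, which is easier to test on its own before handing it to the filter. The sketch below is only an illustration: the name `is_reversed_decoy`, its `tag` parameter, and the target label are made up for this example, and the PSM dicts are hand-built to mimic the structure quoted in the question rather than read from a real file.

```python
def is_reversed_decoy(psm, tag='_REVERSED_'):
    """Return True only if every protein assigned to this PSM carries the decoy tag.

    `psm` is assumed to be a dict with a 'protein' list whose items have a
    'label' string, as in the structure quoted in the question.
    """
    return all(tag in prot['label'] for prot in psm['protein'])


# Hand-made examples (hypothetical data, not parsed from a file).
decoy_psm = {'protein': [
    {'label': 'tr|B0MJG7_REVERSED|B0MJG7_REVERSED_9FIRM Selenium-dependent molybdenum hydroxylase...'},
]}
# Hypothetical target label, obtained by dropping the decoy tag.
target_psm = {'protein': [
    {'label': 'tr|B0MJG7|B0MJG7_9FIRM Selenium-dependent molybdenum hydroxylase...'},
]}
# A PSM whose peptide matches both a decoy and a target protein.
mixed_psm = {'protein': decoy_psm['protein'] + target_psm['protein']}

print(is_reversed_decoy(decoy_psm))   # True
print(is_reversed_decoy(target_psm))  # False
print(is_reversed_decoy(mixed_psm))   # False: not all proteins are decoys
```

Such a function could then be passed as is_decoy=is_reversed_decoy in place of the lambda. Using all() means a PSM counts as a decoy only when every matched protein is a decoy; switching to any() would be the stricter choice if shared peptides should count as decoys too.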

tivdnbos commented 4 years ago

Thanks a lot for the quick response!

All the best, Tim