Closed Dmorgen closed 6 years ago
Delightful idea. We'll huddle the team and discuss. I love it.
I'm not too familiar with spectral libraries so I'm going to ask some rather ignorant questions to get started.
https://www.biorxiv.org/content/early/2018/03/07/277822
Comprehensive peptide quantification for data independent acquisition mass spectrometry using chromatogram libraries. Brian C Searle, Lindsay K Pino, Jarrett D Egertson, Ying S Ting, Robert T Lawrence, Judit Villen, Michael J MacCoss
Hi,
I think I might be as ignorant :)
My suggestion is to run a spectral library search first, followed by GPTMD and a regular search on those ions that were not identified by the spectral library search. I think that for model animals, the percentage of PSMs that have already been ID'd in the past will become dominant quickly. This should allow more complex GPTMD searches on the smaller number of MS2 scans that were not ID'd in the first pass. I think spectral libraries should improve several things: 1. search speed in general; 2. search speed on complex data, such as glycoproteomics and top-down, where the gain should be significant; 3. confidence of the ID, by building consensus spectra from multiple IDs (using fragment intensities too). All of this matters more for larger peptides, glycopeptides and other labile and difficult modifications. I'm actually not interested in DIA.
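To make the two-pass idea concrete, here is a minimal Python sketch of the first pass only. It assumes a plain binned cosine score and made-up thresholds; it is not how MetaMorpheus or SpectraST score spectra, and every name in it (`bin_spectrum`, `library_first_pass`, `precursor_tol`, `min_score`) is hypothetical. Scans that match a library entry are reported, and everything else is handed on to GPTMD and the regular search.

```python
import math
from collections import defaultdict


def bin_spectrum(peaks, bin_width=1.0):
    """Sum intensities into fixed-width m/z bins and scale to unit length."""
    bins = defaultdict(float)
    for mz, intensity in peaks:
        bins[int(mz / bin_width)] += intensity
    norm = math.sqrt(sum(v * v for v in bins.values()))
    if norm == 0:
        return {}
    return {k: v / norm for k, v in bins.items()}


def cosine(a, b):
    """Dot product of two unit-length binned spectra."""
    return sum(v * b.get(k, 0.0) for k, v in a.items())


def library_first_pass(scans, library, precursor_tol=0.02, min_score=0.7):
    """Split scans into library hits and leftovers for the database search.

    scans:   {scan_id: (precursor_mz, [(mz, intensity), ...])}
    library: {peptide: (precursor_mz, [(mz, intensity), ...])}
    The tolerance and score cutoff are illustrative assumptions, not tuned values.
    """
    binned_library = [
        (pep, mz, bin_spectrum(peaks)) for pep, (mz, peaks) in library.items()
    ]
    identified, leftover = {}, []
    for scan_id, (precursor_mz, peaks) in scans.items():
        query = bin_spectrum(peaks)
        best_pep, best_score = None, 0.0
        for pep, lib_mz, lib_vec in binned_library:
            if abs(lib_mz - precursor_mz) > precursor_tol:
                continue  # precursor mass does not match this entry
            score = cosine(query, lib_vec)
            if score > best_score:
                best_pep, best_score = pep, score
        if best_pep is not None and best_score >= min_score:
            identified[scan_id] = (best_pep, best_score)
        else:
            leftover.append(scan_id)  # goes on to GPTMD / regular search
    return identified, leftover
```

A real implementation would also need FDR control on the library hits (for example with decoy spectra), which this sketch leaves out.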
Ideally, I would like to have an option to concatenate the results into a consensus spectral library. For example, if I search human data, I would like to have the strong IDs incorporated back into the spectral library automatically. Currently only SpectraST does this, and not automatically but via the command line. I am currently playing a bit with SpectraST, but it is not very easy for me, mainly because generating libraries is difficult, performing the search (and FDR) is difficult, and data visualization is difficult... I was actually trying to generate libraries for analysis in PD2.2, combining MSPepSearch with MS Amanda for sequential analysis. It doesn't work (yet...). The output of interest, assuming you can combine it into the FDR and quantification pipeline, is both TSV and a format that can be viewed somehow (another sore point, I know...). My interest in pepXML was mainly because it can be viewed in Scaffold and BatMass (http://www.batmass.org/tutorial/overlay-peptide-ids-on-map2d/). If you plan on making a viewer of your own... :) I would say that the lack of a viewing option is a deterrent to some degree, especially with mods.
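As a rough illustration of feeding strong IDs back into a consensus library, here is a small Python sketch. It assumes a simple running average of base-peak-normalised, binned intensities and a q-value cutoff; it is not SpectraST's consensus algorithm, and the names (`normalise`, `add_to_library`, `max_q`) are hypothetical.

```python
from collections import defaultdict


def normalise(peaks, bin_width=1.0):
    """Bin peaks by m/z and scale so the base peak is 1.0."""
    bins = defaultdict(float)
    for mz, intensity in peaks:
        bins[int(mz / bin_width)] += intensity
    top = max(bins.values(), default=0.0)
    return {k: v / top for k, v in bins.items()} if top > 0 else {}


def add_to_library(library, peptide, peaks, q_value, max_q=0.01):
    """Fold one PSM into the consensus entry; return True if it was accepted.

    library maps peptide -> (number_of_spectra, {bin: mean_intensity}).
    Only PSMs at or below the (assumed) q-value cutoff are incorporated.
    """
    if q_value > max_q:
        return False
    new = normalise(peaks)
    count, consensus = library.get(peptide, (0, {}))
    merged = {}
    for k in set(consensus) | set(new):
        # running mean; a bin missing from one spectrum counts as zero
        merged[k] = (consensus.get(k, 0.0) * count + new.get(k, 0.0)) / (count + 1)
    library[peptide] = (count + 1, merged)
    return True
```

Calling `add_to_library` once per confident PSM after each search would grow the library automatically; keeping the spectrum count per entry also makes it possible to require a minimum number of replicates before an entry is used for searching.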
You can check with NIST: http://chemdata.nist.gov/dokuwiki/doku.php?id=peptidew:cdownload. The libraries there are divided by instrument and organism. I think most of the Orbitrap HCD data should be HiHi, but I'm not sure.
I like this idea and proteomics spectral library searching may be an intermediate stage on the way to MetaMorpheus-metabolomics. To be honest it will not likely happen soon, because we're focusing on fixing crashes/bugs and other stability issues, followed by making MM more user-friendly, in addition to our other non-MetaMorpheus projects and this is a fairly major change. It is a very interesting new feature, though. I'm especially interested in matching fragment intensities, though again this is dependent on fragmentation type (e.g., HCD vs CID), collision energy, etc. At the very least we can aid spectral library generation through outputting pepXML. I'm hoping to have that done relatively soon (1-2 months).
see #1124
How about hyphenating a spectral library search as a first pass before going on to database searching? With so much data available today, it's almost a crime not to combine it into a search pipeline... yet there is none that offers good LFQ with a search engine coupled to it...
Cheers, David.