bigbio / proteomics-sample-metadata

The Proteomics sample metadata: Standard for experimental design annotation in proteomics datasets
GNU General Public License v2.0

Annotation of the spectral ProteomeTools datasets #266

Open ypriverol opened 4 years ago

ypriverol commented 4 years ago

http://www.proteometools.org/index.php?id=52

ypriverol commented 4 years ago

@patroklossam I have seen two projects annotated. We have 3 other projects that would be great to have annotated. Let me know if you can find some time to perform the annotations.

patroklossam commented 4 years ago

@ypriverol Yup, I am a bit busy these days, but I will annotate them. Just FYI, PXD005336 is not a ProteomeTools dataset. I will annotate it as well, though.

levitsky commented 4 years ago

I've been meaning to annotate 5336 but haven't quite got around to it yet. I did some preliminary work to extract the annotations from ProteomicsDB (couldn't find a proper way, so I ended up scraping the site). But if @patroklossam has all this info in hand, then I'll be happy to collaborate on a draft annotation from you.

patroklossam commented 4 years ago

@levitsky All the relevant information is indeed in ProteomicsDB but not accessible via the API... I am currently working on an API that will export any project in SDRF format, but I was waiting until the format is stable :) (and we need to map ProteomicsDB projects to PXD identifiers). Once I have something ready (especially for this one), I will create a PR so that we can check and discuss anything that needs to be changed or added! Thanks for the help!!

veitveit commented 3 years ago

Isn't there something wrong with the annotation of the fragmentations?

For example, I can see that some files were run with ETD, ETciD and EThcD at the same time? Somehow the description does not fit the annotations, but I do not see how one could get the full information.

patroklossam commented 3 years ago

Hi Veit!

ProteomeTools data was acquired on a flexible tribrid instrument, with multiple fragmentation modes/energies triggered on every precursor to generate a comprehensive characterization of every peptide.

There are usually 4 different runs per pool of 1000 synthetic peptides; the modes used are somewhat annotated in the rawfile name:

  1. Files named "DDA" contain 2 different MS2 fragmentation events for every precursor: one HCD (or "beam-type CID") event at collision energy 28 with readout in the Orbitrap and one CID (or "resonance-type CID") event at collision energy 35 with readout in the Iontrap.
  2. Files named "3xHCD" contain 3 different MS2 fragmentation events for every precursor: one HCD (or "beam-type CID") event at collision energy 25 with readout in the Orbitrap, one HCD (or "beam-type CID") event at collision energy 30 with readout in the Orbitrap and one HCD (or "beam-type CID") event at collision energy 35 with readout in the Orbitrap.
  3. Files named "ETD" contain 3 different MS2 fragmentation events for every precursor: one ETD event with readout in the Orbitrap, one EThcD (ETD with supplemental HCD activation at collision energy 28) event with readout in the Orbitrap and one ETciD (ETD with supplemental CID activation at collision energy 35) event with readout in the Orbitrap.
  4. Files named "2xIT_2x_HCD" contain 4 different MS2 fragmentation events for every precursor: one HCD (or "beam-type CID") event at collision energy 20 with readout in the Orbitrap, one HCD (or "beam-type CID") event at collision energy 23 with readout in the Orbitrap, one HCD (or "beam-type CID") event at collision energy 28 with readout in the Iontrap and one CID (or "resonance-type CID") event at collision energy 35 with readout in the Iontrap.

You can also check Supplementary Figure 3a from the paper: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5868332/bin/NIHMS951387-supplement-Supplemental_Figures.pdf
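
For anyone annotating these files, the naming convention listed above could be captured as a small lookup table. This is only a rough sketch, not code from ProteomeTools or ProteomicsDB: the names `Ms2Event`, `EVENTS_BY_TAG` and `events_for_rawfile` are made up for illustration, and it assumes the tag ("DDA", "3xHCD", "ETD", "2xIT_2x_HCD") appears verbatim somewhere in the rawfile name. The methods, energies and analyzers are copied from the list as written.

```python
from typing import List, NamedTuple, Optional


class Ms2Event(NamedTuple):
    """One MS2 fragmentation event triggered on a precursor (hypothetical type)."""
    method: str                      # HCD, CID, ETD, EThcD or ETciD
    collision_energy: Optional[int]  # None where the list gives no energy (plain ETD)
    analyzer: str                    # where the fragments are read out


# Rawfile-name tag -> MS2 events per precursor, transcribed from the list above.
# For EThcD/ETciD the energy is that of the supplemental activation.
EVENTS_BY_TAG = {
    "DDA": [
        Ms2Event("HCD", 28, "Orbitrap"),
        Ms2Event("CID", 35, "Iontrap"),
    ],
    "3xHCD": [
        Ms2Event("HCD", 25, "Orbitrap"),
        Ms2Event("HCD", 30, "Orbitrap"),
        Ms2Event("HCD", 35, "Orbitrap"),
    ],
    "ETD": [
        Ms2Event("ETD", None, "Orbitrap"),
        Ms2Event("EThcD", 28, "Orbitrap"),
        Ms2Event("ETciD", 35, "Orbitrap"),
    ],
    "2xIT_2x_HCD": [
        Ms2Event("HCD", 20, "Orbitrap"),
        Ms2Event("HCD", 23, "Orbitrap"),
        Ms2Event("HCD", 28, "Iontrap"),  # analyzer as stated in the list above
        Ms2Event("CID", 35, "Iontrap"),
    ],
}


def events_for_rawfile(name: str) -> List[Ms2Event]:
    """Return the MS2 events implied by the tag found in a rawfile name.

    Assumption: the tag is a plain substring of the file name. Longer tags
    are tried first so that a short tag cannot shadow a longer one.
    """
    for tag in sorted(EVENTS_BY_TAG, key=len, reverse=True):
        if tag in name:
            return EVENTS_BY_TAG[tag]
    raise ValueError(f"no known fragmentation tag in rawfile name: {name!r}")
```

With a made-up file name such as `pool_1_ETD.raw`, `events_for_rawfile` would return the three ETD-type events, which is why a single "ETD" file ends up annotated with ETD, EThcD and ETciD at the same time.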

veitveit commented 3 years ago

Hi Patroklos, Thanks a lot for the quick and informative reply!

We now understand the setup.