compomics / peptide-shaker

Interpretation of proteomics identification results
http://compomics.github.io/projects/peptide-shaker.html

Missing data in mzIdentML 1.2 export for Phosphorylation studies #270

Closed david-bouyssie closed 7 years ago

david-bouyssie commented 7 years ago

Hi guys,

I tried to export a dataset in mzIdentML 1.2 format. My dataset contains phosphorylations on S, T and Y, so I expect score values for these amino acids at each candidate position of the peptide sequence. However, some of the scores are not exported, although they are present in the Excel spreadsheet exported by PeptideShaker.

Here is an .mzid XML chunk for the peptide sequence ICDFGSASHVADNDITPYIVSR, modified by multiple phosphorylations, where we only have cvParams for position 18 but not for positions 6, 8, 16 and 21:

<cvParam cvRef="PSI-MS" accession="MS:1001969" name="phosphoRS score" value="3:31.926722969429843:18:false"/>
<cvParam cvRef="PSI-MS" accession="MS:1002550" name="peptide:phosphoRS score" value="3:49.948937867955685:18:false"/>
<SpectrumIdentificationResult spectraData_ref="OFMLP161222_11.raw.mzDB.mgf" spectrumID="index=39065" id="SIR_34951">
                <SpectrumIdentificationItem passThreshold="false" rank="1" peptide_ref="ICDFGSASHVADNDITPYIVSR_79.96633052074999" calculatedMassToCharge="916.0963552208455" experimentalMassToCharge="916.0981" chargeState="3" id="SII_34951_1">
                    <PeptideEvidenceRef peptideEvidence_ref="PepEv_35705"/>
                    <Fragmentation>
                        <IonType charge="1" index="3 4 5 6 7 8 9 11 14">
                            <FragmentArray measure_ref="Measure_MZ" values="600.3 747.2 804.4 891.5 962.4 1049.5 1186.7 1356.6 1700.9"/>
                            <FragmentArray measure_ref="Measure_Int" values="13.0 166.0 96.0 200.0 118.0 193.0 148.0 42.0 18.0"/>
                            <FragmentArray measure_ref="Measure_Error" values="-0.0012994240721582173 -0.16971333706214864 0.008822942367828546 0.07679453809782899 -0.0603192466122664 0.0076523491177340475 0.14874049066770567 -0.056787207032584774 0.14639930416751668"/>
                            <cvParam cvRef="PSI-MS" accession="MS:1001224" name="frag: b ion"/>
                            <cvParam cvRef="PSI-MS" accession="MS:1000336" name="neutral loss" value="H(2)O"/>
                        </IonType>
                        <IonType charge="2" index="4 6 7 8 9 10 11 12 13 14 15 16 17 18 19">
                            <FragmentArray measure_ref="Measure_MZ" values="374.3 446.7 482.2 525.2 594.0 643.7 678.8 736.7 793.4 851.1 907.2 958.2 1006.4 1127.9 1185.0"/>
                            <FragmentArray measure_ref="Measure_Int" values="28.0 52.0 14.0 188.0 643.0 99.0 70.0 58.0 87.0 442.0 141.0 486.0 58.0 59.0 99.0"/>
                            <FragmentArray measure_ref="Measure_Error" values="0.11150509806293485 0.48475903564292366 0.46620214328788734 -0.04981205884712381 0.2207320119277938 0.3865250554327986 -0.032031836922328694 0.3544966511627763 0.033032930592639786 0.2195614186776993 -0.222470569887264 0.25369019590766584 -0.07269172851738404 -0.08752125516730302 0.47044675626762"/>
                            <cvParam cvRef="PSI-MS" accession="MS:1001224" name="frag: b ion"/>
                            <cvParam cvRef="PSI-MS" accession="MS:1000336" name="neutral loss" value="H(2)O"/>
                        </IonType>
                        <IonType charge="2" index="18 20 21">
                            <FragmentArray measure_ref="Measure_MZ" values="1097.2 1202.9 1246.2"/>
                            <FragmentArray measure_ref="Measure_Int" values="55.0 15.0 32.0"/>
                            <FragmentArray measure_ref="Measure_Error" values="0.19036166335763482 -0.18587728170223272 -0.40189148383728934"/>
                            <cvParam cvRef="PSI-MS" accession="MS:1001224" name="frag: b ion"/>
                            <cvParam cvRef="PSI-MS" accession="MS:1000336" name="neutral loss" value="HO(3)P"/>
                        </IonType>
                        <IonType charge="1" index="1 3 4 5 6 7 8 9 10 11 13 16">
                            <FragmentArray measure_ref="Measure_MZ" values="343.2 618.3 765.6 822.5 909.6 980.5 1067.3 1204.5 1303.6 1374.6 1604.0 1933.0"/>
                            <FragmentArray measure_ref="Measure_Int" values="24.0 783.0 87.0 110.0 28.0 181.0 472.0 35.0 59.0 223.0 34.0 29.0"/>
                            <FragmentArray measure_ref="Measure_Error" values="-0.05427257866210766 -0.01186410777211222 0.21972197923787462 0.09825825866789728 0.16622985439789772 0.029116069687802337 -0.20291233458237912 -0.06182419303240749 -0.030238106022579814 -0.06735189073265246 0.26277764429732997 0.10409217492724565"/>
                            <cvParam cvRef="PSI-MS" accession="MS:1001224" name="frag: b ion"/>
                        </IonType>
                        <IonType charge="2" index="4 6 7 9 10 11 12 13 14 16 17 19 20 21">
                            <FragmentArray measure_ref="Measure_MZ" values="383.1 455.1 490.3 603.1 652.3 688.0 745.2 802.8 859.7 966.8 1015.2 1193.3 1243.1 1287.0"/>
                            <FragmentArray measure_ref="Measure_Int" values="24.0 22.0 24.0 1228.0 433.0 478.0 515.0 343.0 292.0 127.0 179.0 65.0 155.0 36.0"/>
                            <FragmentArray measure_ref="Measure_Error" values="-0.09377724378703078 -0.12052330620701923 -0.4390801985620669 0.3154496700777827 -0.018757286417326213 0.16268582122768294 -0.15078569068725756 0.4277505887425832 -0.1857209231723118 -0.15159214594245896 -0.27797407036734967 -0.23483558558245932 0.03095745792256821 0.414943255787648"/>
                            <cvParam cvRef="PSI-MS" accession="MS:1001224" name="frag: b ion"/>
                        </IonType>
                        <IonType charge="1" index="14 16">
                            <FragmentArray measure_ref="Measure_MZ" values="1702.1 1915.8"/>
                            <FragmentArray measure_ref="Measure_Int" values="21.0 47.0"/>
                            <FragmentArray measure_ref="Measure_Error" values="0.36238372147727205 -0.06935872406279486"/>
                            <cvParam cvRef="PSI-MS" accession="MS:1001224" name="frag: b ion"/>
                            <cvParam cvRef="PSI-MS" accession="MS:1000336" name="neutral loss" value="H(3)N"/>
                        </IonType>
                        <IonType charge="2" index="13 14 15 16 18 19 20">
                            <FragmentArray measure_ref="Measure_MZ" values="793.4 851.1 908.1 958.2 1128.7 1185.0 1234.6"/>
                            <FragmentArray measure_ref="Measure_Int" values="87.0 442.0 279.0 486.0 16.0 99.0 43.0"/>
                            <FragmentArray measure_ref="Measure_Error" values="-0.4589748607523916 -0.27244637266733207 0.1855216387676819 -0.23831759543736553 0.22047095348762014 -0.02156103507741136 0.04423200842757069"/>
                            <cvParam cvRef="PSI-MS" accession="MS:1001224" name="frag: b ion"/>
                            <cvParam cvRef="PSI-MS" accession="MS:1000336" name="neutral loss" value="H(3)N"/>
                        </IonType>
                        <IonType charge="1" index="6 7 10 12">
                            <FragmentArray measure_ref="Measure_MZ" values="734.4 835.3 1178.1 1363.8"/>
                            <FragmentArray measure_ref="Measure_Int" values="24.0 186.0 46.0 17.0"/>
                            <FragmentArray measure_ref="Measure_Error" values="-0.01955084990220257 -0.16722931831225196 0.4788362395877357 0.11477943104773658"/>
                            <cvParam cvRef="PSI-MS" accession="MS:1001220" name="frag: y ion"/>
                            <cvParam cvRef="PSI-MS" accession="MS:1000336" name="neutral loss" value="HO(3)P"/>
                        </IonType>
                        <IonType charge="2" index="10 12 14 16 17 18 20">
                            <FragmentArray measure_ref="Measure_MZ" values="589.6 682.3 800.5 879.5 922.5 951.3 1082.2"/>
                            <FragmentArray measure_ref="Measure_Int" values="290.0 45.0 13.0 154.0 43.0 59.0 191.0"/>
                            <FragmentArray measure_ref="Measure_Error" values="0.28577988638789975 -0.046248517882190754 0.09008859639777711 0.05551750190772964 -0.4604967002272815 -0.17122856051230428 -0.31890702892224"/>
                            <cvParam cvRef="PSI-MS" accession="MS:1001220" name="frag: y ion"/>
                            <cvParam cvRef="PSI-MS" accession="MS:1000336" name="neutral loss" value="HO(3)P"/>
                        </IonType>
                        <IonType charge="1" index="3 4 5 6 8 9 10 11 12 13">
                            <FragmentArray measure_ref="Measure_MZ" values="361.2 474.2 717.4 814.4 1028.6 1143.7 1257.8 1372.6 1443.7 1542.3"/>
                            <FragmentArray measure_ref="Measure_Int" values="48.0 177.0 73.0 776.0 403.0 317.0 271.0 274.0 462.0 75.0"/>
                            <FragmentArray measure_ref="Measure_Error" values="-0.019394491372111133 -0.10345846850214002 0.0668824781978401 0.01411862934776309 0.08237618380758249 0.15543315997774698 0.21250571883774683 -0.014437304992270583 0.04844891029779319 -0.41996500269237913"/>
                            <cvParam cvRef="PSI-MS" accession="MS:1001220" name="frag: y ion"/>
                        </IonType>
                        <IonType charge="2" index="6 9 11 12 14 15 18 19 21">
                            <FragmentArray measure_ref="Measure_MZ" values="407.4 572.5 686.4 722.5 840.5 884.3 991.8 1065.3 1202.9"/>
                            <FragmentArray measure_ref="Measure_Int" values="13.0 55.0 54.0 33.0 56.0 860.0 41.0 85.0 15.0"/>
                            <FragmentArray measure_ref="Measure_Error" values="-0.2965789187321093 0.22407834658281445 -0.41085688590214886 0.17058622174283755 0.10692333602275994 0.3909091338877033 0.34560617911267855 0.31139922261763786 0.3826034580627038"/>
                            <cvParam cvRef="PSI-MS" accession="MS:1001220" name="frag: y ion"/>
                        </IonType>
                        <IonType charge="1" index="4 6 8 9 10 11 12 13">
                            <FragmentArray measure_ref="Measure_MZ" values="457.5 797.4 1011.7 1126.4 1240.6 1355.7 1426.7 1525.6"/>
                            <FragmentArray measure_ref="Measure_Int" values="51.0 136.0 189.0 224.0 91.0 66.0 170.0 24.0"/>
                            <FragmentArray measure_ref="Measure_Error" values="0.2230906325078763 0.04066773035776805 0.20892528481783756 -0.11801773901220258 0.03905481984770631 0.1121117960178708 0.07499801130779815 -0.09341590168241964"/>
                            <cvParam cvRef="PSI-MS" accession="MS:1001220" name="frag: y ion"/>
                            <cvParam cvRef="PSI-MS" accession="MS:1000336" name="neutral loss" value="H(3)N"/>
                        </IonType>
                        <IonType charge="2" index="10 12 13 14 15 17 19 21">
                            <FragmentArray measure_ref="Measure_MZ" values="620.3 714.3 763.3 831.5 875.3 954.1 1056.9 1194.5"/>
                            <FragmentArray measure_ref="Measure_Int" values="71.0 158.0 20.0 36.0 248.0 70.0 167.0 155.0"/>
                            <FragmentArray measure_ref="Measure_Error" values="-0.48411082348218315 0.48386077224779456 -0.05034618424724613 -0.3798021134722376 -0.0958163156072942 -0.33038741009727346 0.42467377312277677 0.4958780085676153"/>
                            <cvParam cvRef="PSI-MS" accession="MS:1001220" name="frag: y ion"/>
                            <cvParam cvRef="PSI-MS" accession="MS:1000336" name="neutral loss" value="H(3)N"/>
                        </IonType>
                        <IonType charge="1" index="3 4 5 6 7 8 9 10 11 12">
                            <FragmentArray measure_ref="Measure_MZ" values="343.2 456.0 699.4 796.4 897.0 1010.4 1125.4 1239.9 1354.7 1425.7"/>
                            <FragmentArray measure_ref="Measure_Int" values="24.0 55.0 70.0 126.0 489.0 127.0 89.0 67.0 23.0 22.0"/>
                            <FragmentArray measure_ref="Measure_Error" values="-0.008829807672100287 -0.2928937848021178 0.0774471618977941 0.024683313047717093 -0.4229951553622868 -0.1070591324922816 -0.13400215632213985 0.32307040253795094 0.09612737870793353 0.05901359399786088"/>
                            <cvParam cvRef="PSI-MS" accession="MS:1001220" name="frag: y ion"/>
                            <cvParam cvRef="PSI-MS" accession="MS:1000336" name="neutral loss" value="H(2)O"/>
                        </IonType>
                        <IonType charge="2" index="10 13 14 15 17 18 19 21">
                            <FragmentArray measure_ref="Measure_MZ" values="620.3 763.3 831.5 875.3 954.1 982.4 1055.7 1193.3"/>
                            <FragmentArray measure_ref="Measure_Int" values="71.0 20.0 36.0 248.0 70.0 96.0 40.0 65.0"/>
                            <FragmentArray measure_ref="Measure_Error" values="0.007896967862848214 0.44166160709778524 0.11220567787279379 0.39619147573773716 0.1616203812477579 -0.04911147903726487 -0.28331843553223734 -0.21211420008739879"/>
                            <cvParam cvRef="PSI-MS" accession="MS:1001220" name="frag: y ion"/>
                            <cvParam cvRef="PSI-MS" accession="MS:1000336" name="neutral loss" value="H(2)O"/>
                        </IonType>
                    </Fragmentation>
                    <cvParam cvRef="PSI-MS" accession="MS:1002466" name="PeptideShaker PSM score" value="7.982"/>
                    <cvParam cvRef="PSI-MS" accession="MS:1002467" name="PeptideShaker PSM confidence" value="70.3755"/>
                    <cvParam cvRef="PSI-MS" accession="MS:1001969" name="phosphoRS score" value="3:31.926722969429843:18:false"/>
                    <cvParam cvRef="PSI-MS" accession="MS:1002536" name="D-Score" value="3:47.57446808510638:18:false"/>
                    <cvParam cvRef="PSI-MS" accession="MS:1002469" name="PeptideShaker peptide confidence" value="99.37369519832986"/>
                    <cvParam cvRef="PSI-MS" accession="MS:1002468" name="PeptideShaker peptide score" value="26.81248883337892"/>
                    <cvParam cvRef="PSI-MS" accession="MS:1002500" name="peptide passes threshold" value="true"/>
                    <cvParam cvRef="PSI-MS" accession="MS:1002520" name="peptide group ID" value="ICDFGSASHVADNDITPYIVSR_79.96633052074999"/>
                    <cvParam cvRef="PSI-MS" accession="MS:1002550" name="peptide:phosphoRS score" value="3:49.948937867955685:18:false"/>
                    <cvParam cvRef="PSI-MS" accession="MS:1002553" name="peptide:D-Score" value="3:47.57446808510638:18:false"/>
                    <userParam name="MS Amanda e-value" value="2.61358816900267E-8" />
                    <cvParam cvRef="PSI-MS" accession="MS:1002319" name="Amanda:AmandaScore" value="75.8276284446026"/>
                    <cvParam cvRef="PSI-MS" accession="MS:1001117" name="theoretical mass" value="2745.2672362621" unitCvRef="UO" unitAccession="UO:0000221" unitName="dalton"/>
                    <cvParam cvRef="PSI-MS" accession="MS:1002540" name="PeptideShaker PSM confidence type" value="Not Validated"/>
                </SpectrumIdentificationItem>
                <cvParam cvRef="PSI-MS" accession="MS:1000796" name="spectrum title" value="first_cycle:8431;last_cycle:8431;first_scan:86454;last_scan:86454;first_time:246.616;last_time:246.616;raw_file_identifier:OFMLP161222_11;"/>
                <cvParam cvRef="PSI-MS" accession="MS:1000894" name="retention time" value="14796.95" unitCvRef="UO" unitAccession="UO:0000010" unitName="second"/>
            </SpectrumIdentificationResult>

hbarsnes commented 7 years ago

Hi David,

Could you share the corresponding row in the Excel export as well so that we can compare?

The scores are only exported if they are larger than zero. As far as I can see that is the only limitation in the mzid export of these scores.

But if I remember correctly, the three phosphorylations are scored individually, and the one indicated above is PTM number 3 in the PTM list in the mzid file, which I would assume is phosphorylation of Y? Hence there are no other options for this PTM in the given sequence.

@mvaudel I hope you can provide a bit more background here? :)

Best regards, Harald
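
For readers following along, the localization score values in the chunk above (for example `3:31.926722969429843:18:false`) can be split into their parts. The following is a minimal Python sketch, assuming the four colon-separated fields are the PTM index in the file's PTM list, the score, the residue position(s), and a pass-threshold flag, which matches Harald's reading of the leading `3`. The class and function names are hypothetical, and splitting positions on `|` is an assumption for ambiguous sites.

```python
from typing import List, NamedTuple


class LocalizationScore(NamedTuple):
    ptm_index: int          # assumed: index of the PTM in the mzid PTM list
    score: float
    positions: List[int]    # 1-based residue positions
    passes_threshold: bool


def parse_localization_score(value: str) -> LocalizationScore:
    """Split a value such as '3:31.926722969429843:18:false' into four fields."""
    ptm_index, score, positions, passed = value.split(":")
    return LocalizationScore(
        ptm_index=int(ptm_index),
        score=float(score),
        # Assumption: several candidate positions would be separated by '|'.
        positions=[int(p) for p in positions.split("|")],
        passes_threshold=(passed == "true"),
    )


SEQUENCE = "ICDFGSASHVADNDITPYIVSR"
parsed = parse_localization_score("3:31.926722969429843:18:false")
# Positions are 1-based, so subtract 1 to index the sequence string.
residue = SEQUENCE[parsed.positions[0] - 1]
print(parsed.ptm_index, parsed.positions, residue)
```

With the sequence from the report, position 18 is the tyrosine, which is consistent with the reported score belonging to phosphorylation of Y.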

hbarsnes commented 7 years ago

Hi David,

We came across some errors in our export code where PTMs with the same modification mass resulted in only one of them being included in the mzid file. We're now testing a fix and will hopefully release a new version soon.

Best regards, Harald
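
As an illustration only of the kind of bug described above, and not PeptideShaker's actual export code, the Python sketch below shows how PTMs that share a modification mass can collapse into a single entry when a lookup is keyed by mass alone. The PTM names and the dictionary-based lookup are assumptions; the mass value is the one that appears in the `peptide_ref` in the mzid chunk.

```python
# Hypothetical PTM list: three phosphorylations sharing one mass shift.
PHOSPHO_MASS = 79.96633052074999
ptms = [
    ("Phosphorylation of S", PHOSPHO_MASS),
    ("Phosphorylation of T", PHOSPHO_MASS),
    ("Phosphorylation of Y", PHOSPHO_MASS),
]

# Keyed by mass alone: each later PTM overwrites the previous one.
by_mass = {}
for name, mass in ptms:
    by_mass[mass] = name

# Keyed by (name, mass): every PTM keeps its own entry.
by_name_and_mass = {}
for name, mass in ptms:
    by_name_and_mass[(name, mass)] = name

print(len(by_mass))           # 1
print(len(by_name_and_mass))  # 3
```

Under this assumed mechanism only the last PTM with a given mass survives, which would match the symptom of a single phosphorylation being scored in the export.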

mlocardpaulet commented 7 years ago

Hi, thanks for your help. Here is an example spectrum where the phosphoRS score is reported for the serine (50%) but not for the threonine (50%). Below are the row from the PeptideShaker PSM report and the corresponding portion of the .mzid.
Cheers, Marie (works with David).

In the report table:

RGSTPWGPAPPLHR first_cycle:6435;last_cycle:6435;first_scan:49926;last_scan:49926;first_time:151.422;last_time:151.422;raw_file_identifier:OFMLP161222_11; Phosphorylation of S (3: 50.0), Phosphorylation of T (4: 50.0) Phosphorylation of S (3: 0.425531914893617) 98.9821883 Confident

In the .mzid: ForHarald_20170607.txt
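
A comparison like this one can be scripted. The Python sketch below parses the localization column of the report row above and lists positions that have a report score but no score in the mzid. The set of mzid positions is an assumption taken from the description in this comment (only the serine at position 3 is reported), since the attached file is not reproduced here, and the variable names are hypothetical.

```python
import re

# Localization column copied from the report row above.
report_cell = "Phosphorylation of S (3: 50.0), Phosphorylation of T (4: 50.0)"

# Each entry looks like "<PTM name> (<position>: <score>)".
pattern = re.compile(r"\s*([^,(]+?)\s*\((\d+):\s*([\d.]+)\)")
report_scores = {
    int(pos): (name, float(score))
    for name, pos, score in pattern.findall(report_cell)
}

# Assumption: positions that carry a phosphoRS score in the mzid export.
mzid_positions = {3}

missing = sorted(set(report_scores) - mzid_positions)
print(missing)  # [4]
```

Position 4 is the threonine in RGSTPWGPAPPLHR, the site whose score is absent from the export.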

hbarsnes commented 7 years ago

Hi David, hi Marie,

Here's a beta version of PeptideShaker that should solve the problem: https://www.dropbox.com/s/xnh6xihamfuluuk/PeptideShaker-1.16.10-beta.zip?dl=0

It would be great if you could test it on your data and let us know.

Best regards, Harald

mlocardpaulet commented 7 years ago

Hi Harald,

Sure, will do. I'll keep you posted.

Thanks again, Marie.

david-bouyssie commented 7 years ago

Thank you guys! You rock!!!

mlocardpaulet commented 7 years ago

Hi,

We have tested an export with the beta version. All good.

Thanks, Marie.

hbarsnes commented 7 years ago

Hi Marie,

Great! Thanks for making us aware of this issue! We'll release a new official version as soon as possible.

Best regards, Harald

hbarsnes commented 7 years ago

Hi Marie, hi David,

We just released the official PeptideShaker v1.16.10, so I'm closing this issue. But please open a new one if you come across other problems with the mzid 1.2 export. As you can imagine, this part of the code has not been tested as thoroughly as the rest of the code, yet... :)

Best regards, Harald