LucaCappelletti94 / mascot-rs

A Rust package to validate and process Mascot Generic Format (MGF) files for fragmentation spectra.
MIT License

Absence of second level fragmentation in Sirius document #2

Open LucaCappelletti94 opened 1 year ago

LucaCappelletti94 commented 1 year ago

The Sirius document mapp_batch_000052_sirius.mgf is missing a second-level fragmentation entry. More peculiarly, the non-Sirius document with the same batch number (000052) does appear to contain a second-level fragmentation.

Is this an error? @oolonek

oolonek commented 1 year ago

Can you point me to the feature in question?

LucaCappelletti94 commented 1 year ago

Features 740 and 741.

LucaCappelletti94 commented 1 year ago

The relevant part of the file is the following:


BEGIN IONS
FEATURE_ID=740
PEPMASS=71.02918243408203
CHARGE=2
RTINSECONDS=42.98
SPECTYPE=CORRELATED MS
MSLEVEL=1
FILENAME=Pool_K+_GB7_01_19912_modified.mzXML;Pool_all+_RA2_01_19811_modified.mzXML;Pool_all+_RA2_01_19746_modified.mzXML;S2C3+_GA6_01_19831_modified.mzXML;C2M5+_BB4_01_19905_modified.mzXML;Pool_all+_RA2_01_19853_modified.mzXML;C2M2+_RE5_01_19810_modified.mzXML;A2M2+_RD1_01_19792_modified.mzXML;C2M1+_RC5_01_19782_modified.mzXML;M2M2+_RD8_01_19805_modified.mzXML;Pool_all+_RA2_01_19797_modified.mzXML;Pool_C+_BA4_01_19872_modified.mzXML;Ccont2+_RE7_01_19818_modified.mzXML;Pool_C+_BA4_01_19774_modified.mzXML;Pool_S+_RA4_01_19910_modified.mzXML;A2M1+_RA7_01_19762_modified.mzXML;A2M5+_GE6_01_19880_modified.mzXML;S2A2+_RD4_01_19795_modified.mzXML;Pool_S+_RA4_01_19827_modified.mzXML;A2A2+_RC8_01_19791_modified.mzXML;C2C3+_GC1_01_19847_modified.mzXML;S2A3+_GA4_01_19823_modified.mzXML;S2A5+_BA1_01_19889_modified.mzXML;Pool_C+_BA4_01_19830_modified.mzXML;Pool_all+_RA2_01_19839_modified.mzXML;Pool_all+_RA2_01_19895_modified.mzXML;M2C4+_GD4_01_19864_modified.mzXML;Mcont4+_GD5_01_19865_modified.mzXML;Mcont2+_GA7_01_19832_modified.mzXML;Pool_M+_RA5_01_19814_modified.mzXML;K2A5+_BA8_01_19901_modified.mzXML;Pool_K+_GB7_01_19829_modified.mzXML;K2M5+_BB1_01_19902_modified.mzXML;S2C1+_RB4_01_19767_modified.mzXML;Pool_C+_BA4_01_19913_modified.mzXML;S2A1+_RB2_01_19765_modified.mzXML;A2M4+_GC4_01_19850_modified.mzXML;Pool_M+_RA5_01_19842_modified.mzXML;Pool_A+_RA3_01_19756_modified.mzXML;S2C4+_GD1_01_19861_modified.mzXML;Pool_A+_RA3_01_19812_modified.mzXML;M2A2+_RD7_01_19804_modified.mzXML;Pool_M+_RA5_01_19898_modified.mzXML;Ccont4+_GE4_01_19878_modified.mzXML;Pool_K+_GB7_01_19885_modified.mzXML;Pool_A+_RA3_01_19784_modified.mzXML;M2A4+_GD2_01_19862_modified.mzXML;Pool_K+_GB7_01_19857_modified.mzXML;Pool_M+_RA5_01_19870_modified.mzXML;Pool_A+_RA3_01_19896_modified.mzXML;Pool_S+_RA4_01_19841_modified.mzXML;Pool_A+_RA3_01_19840_modified.mzXML;Pool_K+_GB7_01_19801_modified.mzXML;Pool_S+_RA4_01_19785_modified.mzXML;Pool_K+_GB7_01_19787_modified.mzXML;S2C2+_RD6_01_19803_modified.mzXML;A2A4+_GC3
_01_19849_modified.mzXML;C2A1+_RC4_01_19781_modified.mzXML;A2C1+_RA8_01_19763_modified.mzXML;bLANK MeOH_Eau+_85_15+_RA1_01_19914_modified.mzXML;Pool_K+_GB7_01_19759_modified.mzXML;Pool_C+_BA4_01_19802_modified.mzXML;M2C1+_RB7_01_19776_modified.mzXML;C2C5+_BB5_01_19906_modified.mzXML;C2M3+_GB8_01_19846_modified.mzXML;K2A2+_RE1_01_19806_modified.mzXML;A2M3+_GA1_01_19820_modified.mzXML;Pool_S+_RA4_01_19897_modified.mzXML;S2M2+_RD5_01_19796_modified.mzXML;Pool_all+_RA2_01_19881_modified.mzXML;M2M1+_RB6_01_19775_modified.mzXML;Pool_S+_RA4_01_19869_modified.mzXML;C2C4+_GE3_01_19877_modified.mzXML;Pool_all+_RA2_01_19755_modified.mzXML;Acont1+_RB1_01_19764_modified.mzXML;Pool_S+_RA4_01_19813_modified.mzXML;Mcont3+_GB3_01_19836_modified.mzXML;A2C5+_GE7_01_19887_modified.mzXML;Pool_A+_RA3_01_19868_modified.mzXML;Pool_C+_BA4_01_19858_modified.mzXML;Pool_M+_RA5_01_19911_modified.mzXML;A2C3+_GA2_01_19821_modified.mzXML;K2A3+_GB4_01_19837_modified.mzXML;Cont5+_BB6_01_19907_modified.mzXML;C2C2+_RE6_01_19817_modified.mzXML;Acont5+_GE8_01_19888_modified.mzXML;C2M4+_GE2_01_19876_modified.mzXML;Pool_C+_BA4_01_19760_modified.mzXML;Acont4+_GC6_01_19852_modified.mzXML;M2M4+_GD3_01_19863_modified.mzXML;Acont3+_GA3_01_19822_modified.mzXML;Pool_A+_RA3_01_19909_modified.mzXML;K2C5+_BB2_01_19903_modified.mzXML;M2C2+_GD6_01_19866_modified.mzXML;C2C1+_RC6_01_19789_modified.mzXML;Mcont5+_BA7_01_19894_modified.mzXML;M2C5+_BA6_01_19893_modified.mzXML;Pool_all+_RA2_01_19783_modified.mzXML;Pool_C+_BA4_01_19886_modified.mzXML;Pool_A+_RA3_01_19882_modified.mzXML;A2C2+_RD2_01_19793_modified.mzXML;Pool_S+_RA4_01_19799_modified.mzXML;Pool_M+_RA5_01_19828_modified.mzXML;K2M1+_RC2_01_19779_modified.mzXML;Ccont3+_GC2_01_19848_modified.mzXML;Pool_C+_BA4_01_19900_modified.mzXML;K2M2+_RE2_01_19807_modified.mzXML;Pool_C+_BA4_01_19788_modified.mzXML;K2C2+_RE3_01_19808_modified.mzXML;A2A3+_RE8_01_19819_modified.mzXML;S2C5+_BA3_01_19891_modified.mzXML;Pool_all+_RA2_01_19825_modified.mzXML;K2C3+_GB6_01_19845_modifi
ed.mzXML;Pool_S+_RA4_01_19757_modified.mzXML;C2A2+_RE4_01_19809_modified.mzXML;K2A1+_RC1_01_19778_modified.mzXML;Pool_K+_GB7_01_19815_modified.mzXML;S2M5+_BA2_01_19890_modified.mzXML;A2C4+_GC5_01_19851_modified.mzXML;Pool_all+_RA2_01_19867_modified.mzXML;Pool_K+_GB7_01_19773_modified.mzXML;Pool_M+_RA5_01_19772_modified.mzXML;M2C3+_GB2_01_19835_modified.mzXML;Pool_K+_GB7_01_19843_modified.mzXML;Pool_M+_RA5_01_19786_modified.mzXML;Pool_A+_RA3_01_19854_modified.mzXML;BLANK MeOH_Eau+_85_15_RA1_01_19745_modified.mzXML;Pool_A+_RA3_01_19798_modified.mzXML;S2M3+_GA5_01_19824_modified.mzXML;S2A4+_GC7_01_19859_modified.mzXML;K2M3+_GB5_01_19838_modified.mzXML;Pool_M+_RA5_01_19856_modified.mzXML;S2M1+_RB3_01_19766_modified.mzXML;Pool_S+_RA4_01_19771_modified.mzXML;Pool_M+_RA5_01_19800_modified.mzXML;S2M4+_GC8_01_19860_modified.mzXML;Pool_S+_RA4_01_19883_modified.mzXML;C2A5+_BB3_01_19904_modified.mzXML;K2M4+_GD7_01_19873_modified.mzXML;C2A4+_GE1_01_19875_modified.mzXML;M2M5+_BA5_01_19892_modified.mzXML;Acont2+_RD3_01_19794_modified.mzXML;Pool_all+_RA2_01_19908_modified.mzXML;Pool_K+_GB7_01_19899_modified.mzXML;Pool_S+_RA4_01_19855_modified.mzXML;Mcont1+_RB8_01_19777_modified.mzXML;Pool_A+_RA3_01_19770_modified.mzXML;Pool_M+_RA5_01_19758_modified.mzXML;K2C4+_GD8_01_19874_modified.mzXML;K2C1+_RC3_01_19780_modified.mzXML;M2A3+_GA8_01_19833_modified.mzXML;Pool_C+_BA4_01_19844_modified.mzXML;Pool_C+_BA4_01_19816_modified.mzXML;Ccont1+_RC7_01_19790_modified.mzXML;Pool_all+_RA2_01_19769_modified.mzXML;Pool_K+_GB7_01_19871_modified.mzXML;M2A1+_RB5_01_19768_modified.mzXML;A2A1+_RA6_01_19761_modified.mzXML;Pool_A+_RA3_01_19826_modified.mzXML;M2M3+_GB1_01_19834_modified.mzXML;Pool_M+_RA5_01_19884_modified.mzXML;A2A5+_GE5_01_19879_modified.mzXML
SCANS=-1
71.02918243408203 100.0
END IONS

BEGIN IONS
FEATURE_ID=741
PEPMASS=293.1227986810843
CHARGE=1
RTINSECONDS=139.853
SPECTYPE=CORRELATED MS
MSLEVEL=1
FILENAME=Pool_K+_GB7_01_19912_modified.mzXML;Pool_all+_RA2_01_19811_modified.mzXML;Pool_all+_RA2_01_19746_modified.mzXML;S2C3+_GA6_01_19831_modified.mzXML;C2M5+_BB4_01_19905_modified.mzXML;Pool_all+_RA2_01_19853_modified.mzXML;C2M2+_RE5_01_19810_modified.mzXML;A2M2+_RD1_01_19792_modified.mzXML;C2M1+_RC5_01_19782_modified.mzXML;M2M2+_RD8_01_19805_modified.mzXML;Pool_all+_RA2_01_19797_modified.mzXML;Pool_C+_BA4_01_19872_modified.mzXML;Ccont2+_RE7_01_19818_modified.mzXML;Pool_C+_BA4_01_19774_modified.mzXML;Pool_S+_RA4_01_19910_modified.mzXML;A2M1+_RA7_01_19762_modified.mzXML;A2M5+_GE6_01_19880_modified.mzXML;S2A2+_RD4_01_19795_modified.mzXML;Pool_S+_RA4_01_19827_modified.mzXML;A2A2+_RC8_01_19791_modified.mzXML;S2A3+_GA4_01_19823_modified.mzXML;S2A5+_BA1_01_19889_modified.mzXML;Pool_C+_BA4_01_19830_modified.mzXML;Pool_all+_RA2_01_19839_modified.mzXML;Pool_all+_RA2_01_19895_modified.mzXML;M2C4+_GD4_01_19864_modified.mzXML;Mcont4+_GD5_01_19865_modified.mzXML;Mcont2+_GA7_01_19832_modified.mzXML;Pool_M+_RA5_01_19814_modified.mzXML;K2A5+_BA8_01_19901_modified.mzXML;Pool_K+_GB7_01_19829_modified.mzXML;K2M5+_BB1_01_19902_modified.mzXML;S2C1+_RB4_01_19767_modified.mzXML;Pool_C+_BA4_01_19913_modified.mzXML;S2A1+_RB2_01_19765_modified.mzXML;A2M4+_GC4_01_19850_modified.mzXML;Pool_M+_RA5_01_19842_modified.mzXML;Pool_A+_RA3_01_19756_modified.mzXML;S2C4+_GD1_01_19861_modified.mzXML;Pool_A+_RA3_01_19812_modified.mzXML;M2A2+_RD7_01_19804_modified.mzXML;Pool_M+_RA5_01_19898_modified.mzXML;Ccont4+_GE4_01_19878_modified.mzXML;Pool_K+_GB7_01_19885_modified.mzXML;Pool_A+_RA3_01_19784_modified.mzXML;M2A4+_GD2_01_19862_modified.mzXML;Pool_K+_GB7_01_19857_modified.mzXML;Pool_M+_RA5_01_19870_modified.mzXML;Pool_A+_RA3_01_19896_modified.mzXML;Pool_S+_RA4_01_19841_modified.mzXML;Pool_A+_RA3_01_19840_modified.mzXML;Pool_K+_GB7_01_19801_modified.mzXML;Pool_S+_RA4_01_19785_modified.mzXML;Pool_K+_GB7_01_19787_modified.mzXML;S2C2+_RD6_01_19803_modified.mzXML;A2A4+_GC3_01_19849_modified.mzXML;C2A1+_RC4
_01_19781_modified.mzXML;A2C1+_RA8_01_19763_modified.mzXML;bLANK MeOH_Eau+_85_15+_RA1_01_19914_modified.mzXML;Pool_K+_GB7_01_19759_modified.mzXML;Pool_C+_BA4_01_19802_modified.mzXML;M2C1+_RB7_01_19776_modified.mzXML;C2C5+_BB5_01_19906_modified.mzXML;C2M3+_GB8_01_19846_modified.mzXML;K2A2+_RE1_01_19806_modified.mzXML;A2M3+_GA1_01_19820_modified.mzXML;Pool_S+_RA4_01_19897_modified.mzXML;S2M2+_RD5_01_19796_modified.mzXML;Pool_all+_RA2_01_19881_modified.mzXML;M2M1+_RB6_01_19775_modified.mzXML;Pool_S+_RA4_01_19869_modified.mzXML;C2C4+_GE3_01_19877_modified.mzXML;Pool_all+_RA2_01_19755_modified.mzXML;Acont1+_RB1_01_19764_modified.mzXML;Pool_S+_RA4_01_19813_modified.mzXML;Mcont3+_GB3_01_19836_modified.mzXML;A2C5+_GE7_01_19887_modified.mzXML;Pool_A+_RA3_01_19868_modified.mzXML;Pool_C+_BA4_01_19858_modified.mzXML;Pool_M+_RA5_01_19911_modified.mzXML;A2C3+_GA2_01_19821_modified.mzXML;K2A3+_GB4_01_19837_modified.mzXML;Cont5+_BB6_01_19907_modified.mzXML;C2C2+_RE6_01_19817_modified.mzXML;Acont5+_GE8_01_19888_modified.mzXML;C2M4+_GE2_01_19876_modified.mzXML;Pool_C+_BA4_01_19760_modified.mzXML;Acont4+_GC6_01_19852_modified.mzXML;M2M4+_GD3_01_19863_modified.mzXML;Acont3+_GA3_01_19822_modified.mzXML;Pool_A+_RA3_01_19909_modified.mzXML;K2C5+_BB2_01_19903_modified.mzXML;M2C2+_GD6_01_19866_modified.mzXML;C2C1+_RC6_01_19789_modified.mzXML;Mcont5+_BA7_01_19894_modified.mzXML;M2C5+_BA6_01_19893_modified.mzXML;Pool_all+_RA2_01_19783_modified.mzXML;Pool_C+_BA4_01_19886_modified.mzXML;Pool_A+_RA3_01_19882_modified.mzXML;A2C2+_RD2_01_19793_modified.mzXML;Pool_S+_RA4_01_19799_modified.mzXML;Pool_M+_RA5_01_19828_modified.mzXML;K2M1+_RC2_01_19779_modified.mzXML;Ccont3+_GC2_01_19848_modified.mzXML;Pool_C+_BA4_01_19900_modified.mzXML;K2M2+_RE2_01_19807_modified.mzXML;Pool_C+_BA4_01_19788_modified.mzXML;K2C2+_RE3_01_19808_modified.mzXML;A2A3+_RE8_01_19819_modified.mzXML;S2C5+_BA3_01_19891_modified.mzXML;Pool_all+_RA2_01_19825_modified.mzXML;K2C3+_GB6_01_19845_modified.mzXML;Pool_S+_RA4_01_19757_modi
fied.mzXML;C2A2+_RE4_01_19809_modified.mzXML;K2A1+_RC1_01_19778_modified.mzXML;Pool_K+_GB7_01_19815_modified.mzXML;S2M5+_BA2_01_19890_modified.mzXML;A2C4+_GC5_01_19851_modified.mzXML;Pool_all+_RA2_01_19867_modified.mzXML;Pool_K+_GB7_01_19773_modified.mzXML;Pool_M+_RA5_01_19772_modified.mzXML;M2C3+_GB2_01_19835_modified.mzXML;Pool_K+_GB7_01_19843_modified.mzXML;Pool_M+_RA5_01_19786_modified.mzXML;Pool_A+_RA3_01_19854_modified.mzXML;BLANK MeOH_Eau+_85_15_RA1_01_19745_modified.mzXML;Pool_A+_RA3_01_19798_modified.mzXML;S2M3+_GA5_01_19824_modified.mzXML;S2A4+_GC7_01_19859_modified.mzXML;K2M3+_GB5_01_19838_modified.mzXML;Pool_M+_RA5_01_19856_modified.mzXML;S2M1+_RB3_01_19766_modified.mzXML;Pool_S+_RA4_01_19771_modified.mzXML;Pool_M+_RA5_01_19800_modified.mzXML;S2M4+_GC8_01_19860_modified.mzXML;Pool_S+_RA4_01_19883_modified.mzXML;C2A5+_BB3_01_19904_modified.mzXML;K2M4+_GD7_01_19873_modified.mzXML;C2A4+_GE1_01_19875_modified.mzXML;M2M5+_BA5_01_19892_modified.mzXML;Acont2+_RD3_01_19794_modified.mzXML;Pool_all+_RA2_01_19908_modified.mzXML;Pool_K+_GB7_01_19899_modified.mzXML;Pool_S+_RA4_01_19855_modified.mzXML;Mcont1+_RB8_01_19777_modified.mzXML;Pool_A+_RA3_01_19770_modified.mzXML;Pool_M+_RA5_01_19758_modified.mzXML;K2C4+_GD8_01_19874_modified.mzXML;K2C1+_RC3_01_19780_modified.mzXML;M2A3+_GA8_01_19833_modified.mzXML;Pool_C+_BA4_01_19844_modified.mzXML;Pool_C+_BA4_01_19816_modified.mzXML;Ccont1+_RC7_01_19790_modified.mzXML;Pool_all+_RA2_01_19769_modified.mzXML;Pool_K+_GB7_01_19871_modified.mzXML;M2A1+_RB5_01_19768_modified.mzXML;A2A1+_RA6_01_19761_modified.mzXML;Pool_A+_RA3_01_19826_modified.mzXML;M2M3+_GB1_01_19834_modified.mzXML;Pool_M+_RA5_01_19884_modified.mzXML;A2A5+_GE5_01_19879_modified.mzXML
SCANS=-1
293.1227986810843 6.5E3
294.12558475611837 1.0E3
END IONS

BEGIN IONS
FEATURE_ID=741
PEPMASS=293.1227986810843
CHARGE=1
RTINSECONDS=139.853
MSLEVEL=2
FILENAME=M2M2+_RD8_01_19805_modified.mzXML
SCANS=741
MERGED_SCANS=551,555
MERGED_STATS=2 / 2 (0 removed due to low quality, 0 removed due to low cosine).
57.072547912597656 5.8E1
69.03232138497489 1.3E2
69.06891632080078 4.2E1
71.01319122314453 6.8E1
71.04978783450909 5.4E2
72.31999969482422 3.6E1
79.04064178466797 3.6E1
80.94597625732422 3.4E1
81.0339126586914 1.3E2
85.02936436012747 4.1E2
85.05850219726562 2.6E1
85.06507873535156 2.8E1
86.05908966064453 8.6E1
88.93335723876953 2.6E1
89.02365447253716 4.1E2
90.02677154541016 4.0E1
90.94739532470703 4.8E1
93.07049560546875 1.3E2
95.04878288929856 2.5E2
95.08738827705383 6.4E1
95.9764404296875 9.4E1
96.04037475585938 6.2E1
97.02774762092753 2.8E2
97.10071563720703 2.6E1
97.59175109863281 6.8E1
103.0386962890625 5.8E1
106.14248657226562 4.4E1
107.06947326660156 9.6E1
108.95759582519531 2.8E1
109.03044891357422 5.6E1
113.01676177978516 2.6E1
113.0601105247439 2.9E3
113.07242584228516 6.8E1
113.09166717529297 4.4E1
113.51309967041016 3.2E1
113.98345947265625 2.8E1
114.06482517408288 2.3E2

oolonek commented 1 year ago

I just checked: this happened because the spectrum in question didn't pass the parameters specified in the Merge MS/MS function (MZmine Sirius export module). If the peak list is exported without checking Merge MS/MS, then the MS/MS-level spectrum is present. In this case, FEATURE_ID=740 should be dropped from the processed MGF. In general, for Sirius-type MGF files, I think that any FEATURE_ID=X, MSLEVEL=1 entry without an associated FEATURE_ID=X, MSLEVEL=2 entry should be dropped.
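
The rule proposed above could be sketched roughly as follows in Rust. This is a minimal illustration, not the mascot-rs API: `IonBlock`, `parse_blocks`, `drop_ms1_without_ms2`, and `render` are hypothetical names, and the parser assumes each entry is delimited by `BEGIN IONS` / `END IONS` and carries `FEATURE_ID=` and `MSLEVEL=` lines as in the excerpt.

```rust
use std::collections::HashSet;

/// Minimal view of one `BEGIN IONS` ... `END IONS` entry.
/// Hypothetical type for illustration; not part of mascot-rs.
#[derive(Debug, Clone, PartialEq)]
struct IonBlock {
    feature_id: Option<u32>,
    ms_level: Option<u8>,
    /// Raw lines of the entry, including the BEGIN/END markers.
    lines: Vec<String>,
}

/// Splits MGF text into entries. Entries without a closing `END IONS`
/// (for example a truncated file) are silently skipped in this sketch.
fn parse_blocks(mgf: &str) -> Vec<IonBlock> {
    let mut blocks = Vec::new();
    let mut current: Option<IonBlock> = None;
    for line in mgf.lines() {
        let trimmed = line.trim();
        if trimmed == "BEGIN IONS" {
            current = Some(IonBlock {
                feature_id: None,
                ms_level: None,
                lines: vec![trimmed.to_string()],
            });
        } else if trimmed == "END IONS" {
            if let Some(mut block) = current.take() {
                block.lines.push(trimmed.to_string());
                blocks.push(block);
            }
        } else if let Some(block) = current.as_mut() {
            block.lines.push(trimmed.to_string());
            if let Some(value) = trimmed.strip_prefix("FEATURE_ID=") {
                block.feature_id = value.trim().parse().ok();
            } else if let Some(value) = trimmed.strip_prefix("MSLEVEL=") {
                block.ms_level = value.trim().parse().ok();
            }
        }
    }
    blocks
}

/// Drops every MSLEVEL=1 entry whose FEATURE_ID has no MSLEVEL=2 entry.
/// Entries with a missing feature ID or level are kept unchanged, since
/// the rule gives no basis for deciding about them.
fn drop_ms1_without_ms2(blocks: Vec<IonBlock>) -> Vec<IonBlock> {
    // First pass: collect the feature IDs that have an MS2 entry.
    let with_ms2: HashSet<u32> = blocks
        .iter()
        .filter(|b| b.ms_level == Some(2))
        .filter_map(|b| b.feature_id)
        .collect();
    // Second pass: keep MS1 entries only when their feature ID is in that set.
    blocks
        .into_iter()
        .filter(|b| match (b.ms_level, b.feature_id) {
            (Some(1), Some(id)) => with_ms2.contains(&id),
            _ => true,
        })
        .collect()
}

/// Writes the remaining entries back out, separated by blank lines.
fn render(blocks: &[IonBlock]) -> String {
    blocks
        .iter()
        .map(|b| b.lines.join("\n"))
        .collect::<Vec<_>>()
        .join("\n\n")
}
```

Applied to the excerpt in this thread, `render(&drop_ms1_without_ms2(parse_blocks(text)))` would be expected to drop the FEATURE_ID=740 entry and keep both FEATURE_ID=741 entries. Two passes are used because an MSLEVEL=2 entry may appear after its MSLEVEL=1 counterpart, so the set of feature IDs with MS2 data has to be known before filtering.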