Open meowcat opened 3 years ago
Quick summary:

- 5 records Atsushi FFF: precursor type is in title only
- 45 records JEOL Ltd JEL: precursor type is in title only
- 438 records Takahashi KNA: just precursor m/z without ion information
- 53 records Maoka MSJ: using `MS$FOCUSED_ION: ION_TYPE`
- 121 records Parejo et al. PM...: using `MS$FOCUSED_ION: ION_TYPE`
- 261 records from RIKEN PR...: using `MS$FOCUSED_ION: ION_TYPE`
- 3604 records from RIKEN PS...: using `MS$FOCUSED_ION: ION_TYPE`
- 917 records from RIKEN PT...: using `MS$FOCUSED_ION: ION_TYPE`
- 4 records SMI00106..00164 from CASMI 2012: just precursor m/z without ion information
- 45 records Tanaka TY...: precursor type is in title only

`msms_spectrum` view: we can probably fix this in the view directly without changing the data. But it would probably help to specify a single tag to use in the record format.
In addition, there is also a significant number of records (3881) where `MS$FOCUSED_ION: PRECURSOR_TYPE` is correct, but `MS$FOCUSED_ION: PRECURSOR_M/Z` is not set:
`MS$FOCUSED_ION: FULL_SCAN_FRAGMENT_ION_PEAK`
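The categories in the summary can be detected mechanically from the `MS$FOCUSED_ION` lines of each record. The following is only a minimal sketch, not the script used for the counts above: the function names, the category labels, and the sample record are made up for illustration, and it assumes each subtag sits on its own `MS$FOCUSED_ION: SUBTAG value` line.

```python
import re

# One "MS$FOCUSED_ION: SUBTAG value" line; SUBTAG is a single token such as
# PRECURSOR_TYPE, PRECURSOR_M/Z or ION_TYPE.
FOCUSED_ION = re.compile(r"^MS\$FOCUSED_ION:\s+(\S+)\s+(.*\S)\s*$")


def focused_ion_subtags(record_text):
    """Return {subtag: value} for every MS$FOCUSED_ION line in a record."""
    subtags = {}
    for line in record_text.splitlines():
        match = FOCUSED_ION.match(line)
        if match:
            subtags[match.group(1)] = match.group(2)
    return subtags


def classify(record_text):
    """Sort a record into one of the (hypothetical) categories of the summary."""
    tags = focused_ion_subtags(record_text)
    has_type = "PRECURSOR_TYPE" in tags
    has_mz = "PRECURSOR_M/Z" in tags
    if has_type and has_mz:
        return "ok"
    if has_type:
        return "precursor m/z missing"
    if "ION_TYPE" in tags:
        return "ION_TYPE instead of PRECURSOR_TYPE"
    if has_mz:
        return "precursor m/z without ion information"
    return "no precursor information in MS$FOCUSED_ION"


# Hypothetical record fragment, not taken from MassBank-data.
sample = "MS$FOCUSED_ION: ION_TYPE [M+H]+\nMS$FOCUSED_ION: PRECURSOR_M/Z 182.08"
print(classify(sample))
```

Records where the precursor type appears in the title only end up in one of the last two categories here; telling them apart needs a look at the title as well.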
meowcat-san,
Thank you very much for your efforts to find and list the records that lack "MS$FOCUSED_ION: ION_TYPE" and other necessary terms. Those were submitted by Japanese contributors. I am very sorry that so much defective data was submitted to MassBank.
I would like to get permission from the corresponding authors to fill in the missing values. If I can neither contact the authors nor find the values of the missing terms, I will ask you to remove these records from the MassBank repository.
Could you send me the list of accession numbers and their missing values, if possible?
Sincerely yours,
Takaaki Nishioka
From: meowcat @.> Sent: March 30, 2021 15:38 To: MassBank/MassBank-data @.> CC: Subscribed @.***> Subject: Re: [MassBank/MassBank-data] Some records with unclear ionization and precursor (#164)
Dear Takaaki-san and all, I suggest a trial of automated curation of the records. If there is no reliable result of automated curation, we could try it manually. If this is not possible, the authors could check the records. I don't know the records in detail, but many precursors could be derived from context and checked by the masses. If we cannot assign reliable precursor ions, we will deprecate those records.
Best wishes, Tobias
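For the records where the precursor type is in the title only, a first automated pass could pull the adduct out of the title text. This is a rough sketch under assumptions: the regular expression, the function name, and the example titles are hypothetical, and any hit would still need to be checked against the masses before it is written into a record.

```python
import re

# Adduct-like token such as [M+H]+, [M-H]- or [2M+3Na-4H]+.
ADDUCT_IN_TEXT = re.compile(r"\[\d*M(?:[+-][0-9A-Za-z]+)*\]\d*[+-]")


def precursor_type_from_title(title):
    """Return the adduct named in a title, or None if absent or ambiguous."""
    unique = sorted(set(ADDUCT_IN_TEXT.findall(title)))
    # Only accept the title when it names exactly one adduct.
    return unique[0] if len(unique) == 1 else None


# Hypothetical title, written in the style of a MassBank RECORD_TITLE.
print(precursor_type_from_title("Tyrosine; LC-ESI-QTOF; MS2; CE 20 eV; [M+H]+"))
```

Returning `None` for titles that name more than one adduct keeps ambiguous records out of the automated pass, so they can go to manual curation instead.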
A number of records have no `MS$FOCUSED_ION: PRECURSOR_TYPE`. At least a block of NAIST records have this issue.

For some it is quite clear that they're [M+H]+ or [M-H]-, for others an adduct can be extrapolated. For some, I haven't come up with an explanation. E.g. KNA00172:
https://massbank.eu/MassBank/RecordDisplay?id=KNA00172
Molecule mass is 181.0738, and precursor m/z is 284.10.
If my (still WIP) calculator is correct, this is none of the 113 adducts/ions specified in `RMassBank:::getAdductInformation("")`. It is also none of the adducts from the Fiehn table https://fiehnlab.ucdavis.edu/images/files/software/ESI-MS-adducts-2020.xls. A hypothetical `[M+CF3CO2H+H]+` adduct would be at 284.074. But the authors claim formate, not TFA, as a modifier.

Should I try to find and flag these records? Should I try and annotate the adducts where they can be inferred with some confidence?
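To make the flagging idea concrete, here is a small sketch of the check for singly charged, single-molecule adducts. It is not the WIP calculator mentioned above: the candidate list is a tiny illustrative subset with commonly tabulated mass shifts, and the function names and the 0.02 tolerance are assumptions.

```python
# Mass shift (Da) between the neutral monoisotopic mass M and the ion m/z,
# for a few common singly charged adducts. Illustrative subset only.
CANDIDATE_SHIFTS = {
    "[M+H]+": 1.007276,
    "[M+NH4]+": 18.033823,
    "[M+Na]+": 22.989218,
    "[M+K]+": 38.963158,
    "[M-H]-": -1.007276,
    "[M+HCOO]-": 44.998201,
}


def matching_adducts(neutral_mass, precursor_mz, tol=0.02):
    """Return the candidate adducts whose expected m/z lies within tol."""
    return [
        label
        for label, shift in CANDIDATE_SHIFTS.items()
        if abs(neutral_mass + shift - precursor_mz) <= tol
    ]


def unexplained_shift(neutral_mass, precursor_mz):
    """Mass difference that an adduct would have to account for."""
    return precursor_mz - neutral_mass


# KNA00172: molecule mass 181.0738, precursor m/z 284.10.
print(matching_adducts(181.0738, 284.10))
print(round(unexplained_shift(181.0738, 284.10), 4))
```

A record with an empty match list could be flagged for review, while a record with exactly one match is a candidate for automatic annotation.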
(Note that I don't use the `addition` and `charge` from the RMassBank table, because I think base mass multiplication and charge multiplication isn't working properly in RMassBank right now. See https://github.com/MassBank/RMassBank/issues/284. Instead, I am parsing and processing the `adductString`s; this does take into account stuff like `[2M+3Na-4H]+` and the trivial name for `ACN`.)
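For illustration, parsing an adduct string could look roughly like the sketch below. This is not the calculator described above and does not use RMassBank: the grammar, the small element table, and all names are assumptions. The charge is taken from the suffix after the bracket as written, with no check that it is chemically consistent with the added and removed species.

```python
import re

# Monoisotopic masses in Da; extend the table as needed.
MONOISOTOPIC = {
    "H": 1.00782503, "C": 12.0, "N": 14.00307401,
    "O": 15.99491462, "Na": 22.98976928, "K": 38.96370649,
}
TRIVIAL_NAMES = {"ACN": "C2H3N"}  # acetonitrile
ELECTRON = 0.00054858

# [<n>M<terms>]<z><sign>, e.g. [2M+3Na-4H]+
ADDUCT = re.compile(r"^\[(\d*)M((?:[+-]\d*[A-Za-z][A-Za-z0-9]*)*)\](\d*)([+-])$")
TERM = re.compile(r"([+-])(\d*)([A-Za-z][A-Za-z0-9]*)")
ELEMENT = re.compile(r"([A-Z][a-z]?)(\d*)")
FORMULA = re.compile(r"(?:[A-Z][a-z]?\d*)+")


def formula_mass(formula):
    """Monoisotopic mass of a formula or a known trivial name."""
    formula = TRIVIAL_NAMES.get(formula, formula)
    if not FORMULA.fullmatch(formula):
        raise ValueError(f"cannot parse formula: {formula!r}")
    # Summing per token means a repeated element (as in CH3CN) adds up.
    # Elements missing from MONOISOTOPIC raise KeyError.
    return sum(MONOISOTOPIC[el] * int(n or 1) for el, n in ELEMENT.findall(formula))


def adduct_mz(adduct_string, neutral_mass):
    """Expected m/z of an adduct string for a given neutral mass."""
    match = ADDUCT.match(adduct_string)
    if not match:
        raise ValueError(f"cannot parse adduct: {adduct_string!r}")
    n_mol, terms, n_charge, sign = match.groups()
    mass = int(n_mol or 1) * neutral_mass  # base mass multiplication
    for term_sign, count, formula in TERM.findall(terms):
        delta = int(count or 1) * formula_mass(formula)
        mass += delta if term_sign == "+" else -delta
    charge = int(n_charge or 1) * (1 if sign == "+" else -1)
    # Remove (or add) electron masses, then divide by the charge count.
    return (mass - charge * ELECTRON) / abs(charge)


print(adduct_mz("[M+ACN+H]+", 181.0738))
```

Working from the string keeps the molecule multiplier and the charge multiplier in one place, which is the part the note above says cannot be trusted in the table columns.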