MassBank / MassBank-data

Official repository of open data MassBank records

Some records with unclear ionization and precursor #164

Open meowcat opened 3 years ago

meowcat commented 3 years ago

A number of records have no MS$FOCUSED_ION: PRECURSOR_TYPE. At least a block of NAIST records have this issue.

For some it is quite clear that they're [M+H]+ or [M-H]-, for others an adduct can be extrapolated. For some, I haven't come up with an explanation. E.g. KNA00172:

https://massbank.eu/MassBank/RecordDisplay?id=KNA00172

Molecule mass is 181.0738, and precursor m/z is 284.10.

If my (still WIP) calculator is correct, this matches none of the 113 adducts/ions specified in RMassBank:::getAdductInformation(""). It also matches none of the adducts from the Fiehn table https://fiehnlab.ucdavis.edu/images/files/software/ESI-MS-adducts-2020.xls .
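The check described above can be sketched roughly as follows. This is only an illustration, not the WIP calculator itself: the `ADDUCT_SHIFTS` table is a small hand-picked subset with approximate monoisotopic shifts (an assumption, not the RMassBank or Fiehn table), and `match_adducts` and the 0.05 Da tolerance are hypothetical choices.

```python
# Approximate monoisotopic mass shifts (Da) for a few singly charged ions.
# Illustrative subset only; NOT the full RMassBank or Fiehn adduct tables.
ADDUCT_SHIFTS = {
    "[M+H]+": 1.007276,
    "[M+NH4]+": 18.033823,
    "[M+Na]+": 22.989218,
    "[M+K]+": 38.963158,
    "[M-H]-": -1.007276,
    "[M+Cl]-": 34.969402,
    "[M+FA-H]-": 44.998201,
}


def match_adducts(neutral_mass, precursor_mz, tol=0.05):
    """Return the adducts whose expected m/z lies within tol of precursor_mz."""
    hits = []
    for name, shift in ADDUCT_SHIFTS.items():
        expected = neutral_mass + shift
        if abs(expected - precursor_mz) <= tol:
            hits.append((name, round(expected, 4)))
    return hits


# KNA00172: molecule mass 181.0738, reported precursor m/z 284.10
neutral_mass, precursor_mz = 181.0738, 284.10
print("unexplained mass difference:", round(precursor_mz - neutral_mass, 4))
print("candidate adducts:", match_adducts(neutral_mass, precursor_mz))
```

With this small table the candidate list stays empty for KNA00172, which is consistent with the observation above; a real check would of course use the complete adduct lists.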

A hypothetical [M+CF3CO2H+H]+ adduct would be at 284.074. But the authors claim formate, not TFA as a modifier.

Should I try to find and flag these records? Should I try and annotate the adducts where they can be inferred with some confidence?

(Note that I don't use the addition and charge from the RMassBank table, because I think base mass multiplication and charge multiplication aren't working properly in RMassBank right now. See https://github.com/MassBank/RMassBank/issues/284. Instead, I am parsing and processing the adductStrings; this takes into account things like [2M+3Na-4H]+ and the trivial name for ACN.)
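To make the idea of parsing adduct strings concrete, here is a minimal sketch in Python. It is not the calculator mentioned above and not RMassBank code: the names `parse_adduct`, `adduct_mz` and `GROUP_MASS`, the regular expressions, and the short list of supported groups are all assumptions made for illustration. The declared charge is taken from the string as written, and the electron mass is corrected accordingly.

```python
import re

ELECTRON = 0.00054858  # electron mass in Da

# Neutral monoisotopic masses (Da) of a few groups; illustrative subset only.
GROUP_MASS = {
    "H": 1.007825,
    "Na": 22.989769,
    "K": 38.963706,
    "NH4": 18.034374,
    "H2O": 18.010565,
    "ACN": 41.026549,  # trivial name for acetonitrile, C2H3N
}

ADDUCT_RE = re.compile(r"\[(\d*)M((?:[+-]\d*[A-Za-z][A-Za-z0-9]*)*)\](\d*)([+-])")
TERM_RE = re.compile(r"([+-])(\d*)([A-Za-z][A-Za-z0-9]*)")


def parse_adduct(adduct):
    """Split e.g. '[2M+3Na-4H]+' into (n_molecules, [(count, group), ...], charge)."""
    m = ADDUCT_RE.fullmatch(adduct.strip())
    if m is None:
        raise ValueError(f"cannot parse adduct string: {adduct!r}")
    n_mol = int(m.group(1) or 1)
    charge = int(m.group(3) or 1) * (1 if m.group(4) == "+" else -1)
    terms = [
        (int(count or 1) * (1 if sign == "+" else -1), name)
        for sign, count, name in TERM_RE.findall(m.group(2))
    ]
    return n_mol, terms, charge


def adduct_mz(neutral_mass, adduct):
    """Expected m/z of the ion described by the adduct string."""
    n_mol, terms, charge = parse_adduct(adduct)
    mass = n_mol * neutral_mass + sum(count * GROUP_MASS[name] for count, name in terms)
    mass -= charge * ELECTRON  # remove electrons for cations, add them for anions
    return mass / abs(charge)


print(parse_adduct("[2M+3Na-4H]+"))
print(round(adduct_mz(181.0738, "[M+ACN+H]+"), 4))
```

Groups that are missing from `GROUP_MASS` raise a `KeyError`, which seems preferable to silently ignoring an unknown part of the adduct string.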

meowcat commented 3 years ago

Quick summary:

- 5 records Atsushi FFF: precursor type is in title only
- 45 records JEOL Ltd JEL: precursor type is in title only
- 438 records Takahashi KNA: just precursor m/z without ion information
- 53 records Maoka MSJ: using MS$FOCUSED_ION: ION_TYPE
- 121 records Parejo et al. PM...: using MS$FOCUSED_ION: ION_TYPE
- 261 records from RIKEN PR...: using MS$FOCUSED_ION: ION_TYPE
- 3604 records from RIKEN PS...: using MS$FOCUSED_ION: ION_TYPE
- 917 records from RIKEN PT...: using MS$FOCUSED_ION: ION_TYPE
- 4 records SMI00106..00164 from CASMI 2012: just precursor m/z without ion information
- 45 records Tanaka TY...: precursor type is in title only
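A rough sketch of how records could be sorted into the categories of this summary is given here. It is a hypothetical helper, not the script that produced the counts: `focused_ion_fields`, `classify` and the title regex are assumptions, and the sample record is an abbreviated, made-up excerpt that only reuses the accession and precursor m/z mentioned earlier in this issue.

```python
import re

PREFIX = "MS$FOCUSED_ION:"
# Something that looks like an adduct, e.g. [M+H]+ or [2M-H]-
TITLE_ADDUCT_RE = re.compile(r"\[\d*M[^\]]*\]\d*[+-]")


def focused_ion_fields(record_text):
    """Collect the MS$FOCUSED_ION subtags of one record as {subtag: value}."""
    fields = {}
    for line in record_text.splitlines():
        if line.startswith(PREFIX):
            parts = line[len(PREFIX):].split(None, 1)
            if parts:
                fields[parts[0]] = parts[1].strip() if len(parts) > 1 else ""
    return fields


def classify(record_text):
    """Assign a record to one of the categories used in the summary."""
    fields = focused_ion_fields(record_text)
    title = next(
        (l for l in record_text.splitlines() if l.startswith("RECORD_TITLE:")), ""
    )
    if "PRECURSOR_TYPE" in fields:
        return "ok" if "PRECURSOR_M/Z" in fields else "precursor m/z not set"
    if "ION_TYPE" in fields:
        return "using ION_TYPE"
    if TITLE_ADDUCT_RE.search(title):
        return "precursor type in title only"
    if "PRECURSOR_M/Z" in fields:
        return "precursor m/z without ion information"
    return "no focused-ion information"


# Abbreviated, hypothetical excerpt (not the full record content)
sample = "ACCESSION: KNA00172\nMS$FOCUSED_ION: PRECURSOR_M/Z 284.10"
print(classify(sample))
```

Running `classify` over every record file and counting the results per accession prefix would give a table comparable to the one above.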

meowcat commented 3 years ago

In addition, there is also a significant number of records (3881) where MS$FOCUSED_ION: PRECURSOR_TYPE is correct, but MS$FOCUSED_ION: PRECURSOR_M/Z is not set:

takaakin commented 3 years ago

meowcat-san,

Thank you very much for your efforts to find and list the records that lack "MS$FOCUSED_ION: ION_TYPE" and other necessary terms. Those were submitted by Japanese contributors. I am very sorry that so many defective records were submitted to MassBank.

I would like to get permission from the corresponding authors to fill in the missing values. If I can neither contact the authors nor find the values of the missing terms, I will ask you to remove these records from the MassBank repository.

Could you send me the list of Accession numbers and their missing values, if possible?

Sincerely yours,

Takaaki Nishioka


From: meowcat
Sent: 30 March 2021, 15:38
To: MassBank/MassBank-data
CC: Subscribed
Subject: Re: [MassBank/MassBank-data] Some records with unclear ionization and precursor (#164)



tsufz commented 3 years ago

Dear Takaaki-san and all, I suggest a trial of automated curation of the records. If there is no reliable result of automated curation, we could try it manually. If this is not possible, the authors could check the records. I don't know the records in detail, but many precursors could be derived from context and checked by the masses. If we cannot assign reliable precursor ions, we will deprecate those records.

Best wishes, Tobias