MassBank / MassBank-data

Official repository of open data MassBank records

Validator ION_MODE not found #162

Open ksjewell opened 3 years ago

ksjewell commented 3 years ago

Greetings, I have the following error from Validator, which I do not understand:

Jewell@r:~/MassBankEurope/MassBank-data$ ./.scripts/validate.sh BfG_Koblenz/
Validator version: 2.1.8
07:12:39.175 ERROR massbank.cli.Validator - ION_MODE  expected
07:12:39.179 ERROR massbank.cli.Validator - AC$MASS_SPECTROMETRY: IONIZATION ESI
07:12:39.184 ERROR massbank.cli.Validator -                       ^
07:12:39.184 ERROR massbank.cli.Validator - Error in 'BfG_Koblenz/BFG00001.txt'.

This seems odd to me, since the file I want to upload contains the ION_MODE tag.

ACCESSION: BFG00001
RECORD_TITLE: Carbamazepine; LC-ESI-QTOF; MS2; CE=35 V +/- 15; [M+H]+
DATE: 2021.02.25
AUTHORS: Kevin Jewell, Björn Ehlig, Franziska Thron, Michael Schlüsener, Arne Wick, Aquatic Chemistry, Federal Institute of Hydrology (BfG)
LICENSE: CC BY NC SA
COPYRIGHT: Copyright (C) 2020, BfG, Koblenz, Germany
COMMENT: CONFIDENCE standard compound
COMMENT: BfG_01 111
CH$NAME: Carbamazepine
CH$COMPOUND_CLASS: N/A; Environmental Standard
CH$FORMULA: C15H12N2O
CH$EXACT_MASS: 236.095
CH$SMILES: NC(=O)N1c2ccccc2C=Cc3ccccc13
CH$IUPAC: InChI=1S/C15H12N2O/c16-15(18)17-13-7-3-1-5-11(13)9-10-12-6-2-4-8-14(12)17/h1-10H,(H2,16,18)
CH$LINK: CAS 298-46-4
CH$LINK: CHEBI 3387
CH$LINK: KEGG C06868
CH$LINK: PUBCHEM CID:2554
CH$LINK: INCHIKEY FFGPTBGBLSHEPO-UHFFFAOYSA-N
CH$LINK: CHEMSPIDER 2457
AC$INSTRUMENT: TripleTOF 5600 SCIEX
AC$INSTRUMENT_TYPE: LC-ESI-QTOF
AC$MASS_SPECTROMETRY: MS_TYPE MS2
AC$MASS_SPECTROMETRY: IONIZATION ESI
AC$MASS_SPECTROMETRY: ION_MODE POSITIVE
AC$MASS_SPECTROMETRY: FRAGMENTATION_MODE Q
AC$MASS_SPECTROMETRY: COLLISION_ENERGY 35 V +/- 15
AC$CHROMATOGRAPHY: COLUMN_NAME Zorbax Eclipse C18, 3.5 um, 150 mm x 2.1 mm, Agilent
AC$CHROMATOGRAPHY: FLOW_GRADIENT 98/2 at 0 min, 98/2 at 1 min, 80/20 at 2 min, 0/100 at 16.5 min, 0/100 at 22 min, 98/2 at 22.1 min, 98/2 at 27 min
AC$CHROMATOGRAPHY: FLOW_RATE 300 ul/min
AC$CHROMATOGRAPHY: RETENTION_TIME 8.870 min
AC$CHROMATOGRAPHY: SOLVENT A water with 0.1% formic acid
AC$CHROMATOGRAPHY: SOLVENT B acetonitrile with 0.1% formic acid
MS$FOCUSED_ION: BASE_PEAK 237.1024
MS$FOCUSED_ION: PRECURSOR_M/Z 237.1022
MS$FOCUSED_ION: PRECURSOR_TYPE [M+H]+
MS$DATA_PROCESSING: RECALIBRATE loess on assigned fragments and MS1
MS$DATA_PROCESSING: REANALYZE Peaks with additional N2/O included
MS$DATA_PROCESSING: WHOLE RMassBank 2.3.1
PK$SPLASH: splash10-0z00000000-b0610ffbd5df0b1b7a95
PK$ANNOTATION: m/z tentative_formula formula_count mass error(ppm)
  89.0405 C7H5+ 1 89.0386 21.98
  91.0548 C7H7+ 1 91.0542 6.74
  116.0482 C8H6N+ 3 116.0495 -10.95
  117.0562 C8H7N+ 3 117.0573 -9.45
  139.0545 C11H7+ 2 139.0542 2.3
  151.0543 C12H7+ 2 151.0542 0.38
  152.0613 C12H8+ 2 152.0621 -4.7
  164.0614 C13H8+ 2 164.0621 -4.23

Any ideas what I am not seeing?

Best regards, Kevin

tsufz commented 3 years ago

Good morning, the FRAGMENTATION_MODE Q is incorrect. In a Q-TOF, the Q and the TOF are the analysers that do the job of ion selection (Q) and registration (TOF). The collision cell is located between the Q and the TOF. There is a nice video showing the principles of Q-TOF systems.

On a Q-TOF, the correct FRAGMENTATION_MODE is CID. See the Record Format. The 5600 uses a high-energy CID, but we summarize all CID technologies as CID. HCD applies only to Orbitraps, as it stands for high-energy C-trap dissociation.

Best, Tobias

ksjewell commented 3 years ago

Okay thanks Tobias. I changed it, but the error still persists.

ACCESSION: BFG00001
RECORD_TITLE: Carbamazepine; LC-ESI-QTOF; MS2; CE=35 V +/- 15; [M+H]+
DATE: 2021.02.25
AUTHORS: Kevin Jewell, Björn Ehlig, Franziska Thron, Michael Schlüsener, Arne Wick, Aquatic Chemistry, Federal Institute of Hydrology (BfG)
LICENSE: CC BY NC SA
COPYRIGHT: Copyright (C) 2020, BfG, Koblenz, Germany
COMMENT: CONFIDENCE standard compound
COMMENT: BfG_01 111
CH$NAME: Carbamazepine
CH$COMPOUND_CLASS: N/A; Environmental Standard
CH$FORMULA: C15H12N2O
CH$EXACT_MASS: 236.095
CH$SMILES: NC(=O)N1c2ccccc2C=Cc3ccccc13
CH$IUPAC: InChI=1S/C15H12N2O/c16-15(18)17-13-7-3-1-5-11(13)9-10-12-6-2-4-8-14(12)17/h1-10H,(H2,16,18)
CH$LINK: CAS 298-46-4
CH$LINK: CHEBI 3387
CH$LINK: KEGG C06868
CH$LINK: PUBCHEM CID:2554
CH$LINK: INCHIKEY FFGPTBGBLSHEPO-UHFFFAOYSA-N
CH$LINK: CHEMSPIDER 2457
AC$INSTRUMENT: TripleTOF 5600 SCIEX
AC$INSTRUMENT_TYPE: LC-ESI-QTOF
AC$MASS_SPECTROMETRY: MS_TYPE MS2
AC$MASS_SPECTROMETRY: IONIZATION ESI
AC$MASS_SPECTROMETRY: ION_MODE POSITIVE
AC$MASS_SPECTROMETRY: FRAGMENTATION_MODE CID
AC$MASS_SPECTROMETRY: COLLISION_ENERGY 35 V +/- 15
AC$CHROMATOGRAPHY: COLUMN_NAME Zorbax Eclipse C18, 3.5 um, 150 mm x 2.1 mm, Agilent
AC$CHROMATOGRAPHY: FLOW_GRADIENT 98/2 at 0 min, 98/2 at 1 min, 80/20 at 2 min, 0/100 at 16.5 min, 0/100 at 22 min, 98/2 at 22.1 min, 98/2 at 27 min
AC$CHROMATOGRAPHY: FLOW_RATE 300 ul/min
AC$CHROMATOGRAPHY: RETENTION_TIME 8.870 min
AC$CHROMATOGRAPHY: SOLVENT A water with 0.1% formic acid
AC$CHROMATOGRAPHY: SOLVENT B acetonitrile with 0.1% formic acid
MS$FOCUSED_ION: BASE_PEAK 237.1024
MS$FOCUSED_ION: PRECURSOR_M/Z 237.1022
MS$FOCUSED_ION: PRECURSOR_TYPE [M+H]+
MS$DATA_PROCESSING: RECALIBRATE loess on assigned fragments and MS1
MS$DATA_PROCESSING: REANALYZE Peaks with additional N2/O included
MS$DATA_PROCESSING: WHOLE RMassBank 2.3.1
PK$SPLASH: splash10-0z00000000-b0610ffbd5df0b1b7a95
PK$ANNOTATION: m/z tentative_formula formula_count mass error(ppm)
  89.0405 C7H5+ 1 89.0386 21.98
  91.0548 C7H7+ 1 91.0542 6.74
  116.0482 C8H6N+ 3 116.0495 -10.95
  117.0562 C8H7N+ 3 117.0573 -9.45
  139.0545 C11H7+ 2 139.0542 2.3
  151.0543 C12H7+ 2 151.0542 0.38
  152.0613 C12H8+ 2 152.0621 -4.7
  164.0614 C13H8+ 2 164.0621 -4.23
  165.0697 C13H9+ 2 165.0699 -1.24
  166.0706 C12H8N+ 3 166.0651 33.13
  167.0757 C12H9N+ 3 167.073 16.64
  176.0607 C14H8+ 2 176.0621 -7.59
  177.0643 C14H9+ 3 177.0699 -31.34
  178.0683 C13H8N+ 3 178.0651 18.01
  179.0727 C13H9N+ 3 179.073 -1.2
  190.0648 C14H8N+ 3 190.0651 -1.73
  191.0725 C14H9N+ 3 191.073 -2.17

meowcat commented 3 years ago

The issue is that ION_MODE needs to come before IONIZATION; the validator is picky with entry order. The wrong order was a bug we had in an old version of RMassBank, when I didn't realize the order was so important. Can you try using the newest version?

schymane commented 3 years ago

This is what I thought too ... but I had not managed to check the specs yet ;-)

meier-rene commented 3 years ago

Hi, in AC$MASS_SPECTROMETRY we have two mandatory subtags, MS_TYPE and ION_MODE, which need to be the first two entries. The two subtags have separate sections in the record spec (https://github.com/MassBank/MassBank-web/blob/main/Documentation/MassBankRecordFormat.md#243-acmass_spectrometry-ms_type), followed by the other subtags of this section. It's easier to code and organise things if the two mandatory fields come first. Do you think this needs to be changed, or do I need to make the order of tags clearer in the record spec document?
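
For anyone who has records written by an older RMassBank and cannot regenerate them right away, here is a minimal sketch of how the ordering could be fixed after the fact. It is not part of the Validator or of RMassBank; the function name `reorder_mass_spectrometry` and the constants are made up for illustration. It only assumes what is said above, namely that MS_TYPE and ION_MODE must be the first two AC$MASS_SPECTROMETRY entries, and it leaves the relative order of the other subtags untouched.

```python
# Hypothetical helper, not part of the MassBank Validator or RMassBank.
PREFIX = "AC$MASS_SPECTROMETRY: "
# Assumption taken from the thread: these two subtags must come first, in this order.
MANDATORY_FIRST = ("MS_TYPE", "ION_MODE")


def reorder_mass_spectrometry(lines):
    """Return the record lines with MS_TYPE and ION_MODE moved to the front
    of the AC$MASS_SPECTROMETRY entries; all other lines keep their position."""
    block = [line for line in lines if line.startswith(PREFIX)]

    def rank(line):
        subtag = line[len(PREFIX):].split(" ", 1)[0]
        if subtag in MANDATORY_FIRST:
            return MANDATORY_FIRST.index(subtag)
        return len(MANDATORY_FIRST)

    # sorted() is stable, so the remaining subtags keep their original order.
    reordered = iter(sorted(block, key=rank))
    # Refill the AC$MASS_SPECTROMETRY slots in place, leave everything else alone.
    return [next(reordered) if line.startswith(PREFIX) else line for line in lines]


# Excerpt of the record from this issue, in the order that fails validation.
record = [
    "AC$INSTRUMENT_TYPE: LC-ESI-QTOF",
    "AC$MASS_SPECTROMETRY: MS_TYPE MS2",
    "AC$MASS_SPECTROMETRY: IONIZATION ESI",
    "AC$MASS_SPECTROMETRY: ION_MODE POSITIVE",
    "AC$MASS_SPECTROMETRY: FRAGMENTATION_MODE CID",
    "AC$MASS_SPECTROMETRY: COLLISION_ENERGY 35 V +/- 15",
    "AC$CHROMATOGRAPHY: FLOW_RATE 300 ul/min",
]
fixed = reorder_mass_spectrometry(record)
print("\n".join(fixed))
```

In the printed excerpt, ION_MODE POSITIVE ends up directly after MS_TYPE MS2 and before IONIZATION ESI. Whether the validator also cares about the order of the remaining subtags is not covered here, so it is still worth running validate.sh on the result.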

ksjewell commented 3 years ago

Thanks everyone for your help. I will try the new version of RMassBank. We use the old one because I made some changes to it, and at the time I didn't know how to use git.

sneumann commented 3 years ago

Ah, which changes would those be? Are they upstream now? Would you like to open a separate issue to discuss how to get your requirements into RMassBank? Yours, Steffen

ksjewell commented 3 years ago

Hi Steffen, no, I didn't upstream the changes. I didn't know how. I will try the new version and open an issue there if necessary. Thanks, Kevin