Some records use "malformed" tags.
Examples:
Specific
For all the examples, the authors' actual intent seems obvious (for (1) "CH$LINK: INCHIKEY value", for (2) "PK$ANNOTATION", for (3) "COMMENT"), so I would suggest correcting these obvious mistakes in the record files.
General
The MassBankRecord Format does not explicitly state the format of tags. I would suggest agreeing on a format and adding it to the specification (ideally with a regular expression). I would suggest the following format:
^[A-Z]+(_[A-Z]+)*(\$[A-Z]+(_[A-Z]+)*)?$
Each tag consists of an optional prefix and a mandatory suffix. If both a prefix and a suffix are present, they are separated by "$". Prefix and suffix can each be made up of multiple capital-letter words separated by "_".
legal examples:
X (single word suffix)
XX (single word suffix)
X_X (two word suffix)
X_X_X (three word suffix)
X$X (one word prefix, one word suffix)
X_X$X (two word prefix, one word suffix)
X$X_X (one word prefix, two word suffix)
X_X$X_X (two word prefix, two word suffix)
...
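
To make the proposal easier to try out, here is a minimal Python sketch that checks tags against the regular expression suggested above and splits a valid tag into prefix and suffix. The names TAG_PATTERN, is_valid_tag and split_tag are made up for this illustration and are not part of any existing MassBank tooling.

```python
import re

# The pattern proposed above, copied verbatim.
TAG_PATTERN = re.compile(r"^[A-Z]+(_[A-Z]+)*(\$[A-Z]+(_[A-Z]+)*)?$")


def is_valid_tag(tag: str) -> bool:
    """Return True if the whole string matches the proposed tag format."""
    return TAG_PATTERN.fullmatch(tag) is not None


def split_tag(tag: str):
    """Split a valid tag into (prefix, suffix); prefix is None if there is no "$".

    Hypothetical helper, only meant to illustrate the prefix/suffix wording.
    """
    if not is_valid_tag(tag):
        raise ValueError(f"malformed tag: {tag!r}")
    prefix, sep, suffix = tag.rpartition("$")
    return (prefix if sep else None), suffix


if __name__ == "__main__":
    for candidate in ["COMMENT", "CH$LINK", "PK$ANNOTATION", "X_X$X_X", "ch$link", "X$", "X__X"]:
        print(candidate, is_valid_tag(candidate))
```

With this pattern, tags such as "CH$LINK" and "COMMENT" are accepted, while lower-case tags, a trailing "$", more than one "$" or a doubled "_" are rejected. Whether tags should also be allowed to contain digits is left open here and would need to be settled when the format is agreed on.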