MassBank / MassBank-web

The web server application and directly connected components for a MassBank web server
Refinement of CH$COMPOUND_CLASS #87

Open Treutler opened 7 years ago

Treutler commented 7 years ago

CH$COMPOUND_CLASS should be refined. My proposal is as follows:

E.g.:
ChEBI CHEBI:28527
KEGG Flavonols
ChemOnt CHEMONTID:0003531
ChemOnt Organic compounds / Phenylpropanoids and polyketides / Flavonoids / Flavonoid glycosides / Flavonoid O-glycosides / Flavonoid-3-O-glycosides

schymane commented 7 years ago

I agree that the Natural/Non-Natural Product distinction is not ideal. It is almost impossible to classify automatically, and our environmental standards, for example, are certainly not all “non-natural”, hence we set all of ours to “NA”. There are some good new methods around, but I have not yet tested them extensively to see how they work on our datasets. However, the definition of this field is set quite clearly in the record format, and the current specification does allow multiple entries after this first field, see:

2.2.2 CH$COMPOUND_CLASS *
・ Category of Chemical Compound. Mandatory
・ Example CH$COMPOUND_CLASS: Natural Product; Carotenoid; Terpenoid; Lipid
・ Either Natural Product or Non-Natural Product should be precedes the other class names.
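To make the quoted rule concrete, here is a minimal Python sketch of how a CH$COMPOUND_CLASS line could be split and checked. It assumes only what the spec excerpt shows: entries are separated by semicolons and the first entry is expected to be Natural Product or Non-Natural Product. The function names are hypothetical and are not part of any MassBank tool.

```python
# Allowed first entries according to the quoted record format rule.
NATURALNESS = {"Natural Product", "Non-Natural Product"}


def parse_compound_class(line):
    """Split a CH$COMPOUND_CLASS line into (first_entry, other_entries)."""
    # Split on the first colon only, so values such as "CHEBI:28527" survive.
    tag, sep, value = line.partition(":")
    if tag.strip() != "CH$COMPOUND_CLASS" or not sep:
        raise ValueError("not a CH$COMPOUND_CLASS line")
    entries = [e.strip() for e in value.split(";") if e.strip()]
    if not entries:
        raise ValueError("CH$COMPOUND_CLASS has no entries")
    return entries[0], entries[1:]


def starts_with_naturalness(first_entry):
    """True if the first entry satisfies the quoted spec rule."""
    return first_entry in NATURALNESS


first, rest = parse_compound_class(
    "CH$COMPOUND_CLASS: Natural Product; Carotenoid; Terpenoid; Lipid"
)
print(first, rest, starts_with_naturalness(first))
```

A record with the first field set to “NA”, as described above, would parse without error but fail the `starts_with_naturalness` check.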

I feel we will either have to enhance the CH$COMPOUND_CLASS field in a way that is compatible with the current specifications, or consider introducing a new field, e.g. CH$ONTOLOGY or similar. @m-arita has a valid point regarding overwriting in #81: here, too, we risk overwriting original knowledge if we are not careful, and that is dangerous.
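As a rough illustration of the "compatible" option, the Python sketch below appends ontology entries after whatever is already in the field instead of replacing it. The entry format (database name followed by an identifier, as in the example at the top of this issue) and the function name `append_classes` are assumptions for illustration only, not an agreed format.

```python
def append_classes(line, new_entries):
    """Return the line with new entries appended; existing entries are kept."""
    # Split on the first colon only, so identifiers like "CHEBI:28527" survive.
    tag, sep, value = line.partition(":")
    if tag.strip() != "CH$COMPOUND_CLASS" or not sep:
        raise ValueError("not a CH$COMPOUND_CLASS line")
    existing = [e.strip() for e in value.split(";") if e.strip()]
    merged = list(existing)  # original entries stay first, in original order
    for entry in new_entries:
        entry = entry.strip()
        if entry and entry not in merged:  # skip blanks and duplicates
            merged.append(entry)
    return "CH$COMPOUND_CLASS: " + "; ".join(merged)


updated = append_classes(
    "CH$COMPOUND_CLASS: NA",
    ["ChEBI CHEBI:28527", "KEGG Flavonols"],
)
print(updated)
```

Because the original entries are never removed or reordered, the curated information stays in place and the appended entries can be stripped again later if a separate field is introduced.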