glytoucan derived entries are confusing - WURCS string should not be treated as a synonym

cmungall commented 2 years ago

https://www.ebi.ac.uk/ols/ontologies/chebi/terms?iri=http%3A%2F%2Fpurl.obolibrary.org%2Fobo%2FCHEBI_146251

The WURCS string should be treated like InChI strings: don't overload the synonym field, use a new field.

(if you would like to collaborate on the IRIs for the properties I am working on a schema here: https://cmungall.github.io/chem-schema/wurcs_representation.html)

I also think the labels should be something other than "GlyTouCan IDENTIFIER", but I don't have any suggestions, other than not including these in CHEBI in the first place. It's not clear why a subset of 69 terms have been added.

Maybe provide some easy way for people to filter these computationally, and to indicate that they have not been fully curated.

amalik01 commented 2 years ago

Last year, the GlyGen database submitted around 8,000 glycans to ChEBI. Some of these were partially defined glycans, where only the composition is known and the stereochemistry and connectivity are unknown. Such compounds are difficult to name because the structure is not fully defined, so we decided to use the GlyTouCan identifiers as the ChEBI names for these structures.

The WURCS identifier was also provided as a synonym for these glycans, since our current infrastructure does not allow WURCS to be added to ChEBI. This is something we will need to fix when we redevelop the ageing ChEBI infrastructure in the next few years.

cmungall commented 2 years ago

It should be a simple post-processing step on the OWL to fix the WURCS ID. I can contribute code if you like.
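
For illustration only, here is a minimal sketch of what such a post-processing step might look like. It is not the code offered above and not part of any ChEBI tooling. It assumes the OWL has already been parsed into plain `(subject, predicate, object)` tuples, that the synonyms use the standard oboInOwl synonym properties, and that a WURCS string can be recognised by its `WURCS=` prefix. The target property IRI `WURCS_PREDICATE` is a hypothetical placeholder, since the actual IRI has not been agreed in this thread, and the subject IRI and WURCS value in the usage example are made up.

```python
# Sketch: re-home WURCS strings from synonym annotations to a dedicated property.
# Works on plain (subject, predicate, object) tuples; parsing and serialising
# the OWL file is left out.

OBOINOWL = "http://www.geneontology.org/formats/oboInOwl#"

# Standard oboInOwl synonym properties (which one ChEBI uses here is an assumption).
SYNONYM_PREDICATES = {
    OBOINOWL + "hasExactSynonym",
    OBOINOWL + "hasRelatedSynonym",
    OBOINOWL + "hasBroadSynonym",
    OBOINOWL + "hasNarrowSynonym",
}

# Hypothetical placeholder IRI; replace with whatever property is agreed on.
WURCS_PREDICATE = "http://example.org/chem-schema/wurcs_representation"


def is_wurcs(value):
    """Return True if the literal looks like a WURCS string (starts with 'WURCS=')."""
    return isinstance(value, str) and value.startswith("WURCS=")


def move_wurcs_synonyms(triples):
    """Return a new list in which WURCS-valued synonym triples use WURCS_PREDICATE.

    All other triples are passed through unchanged, in the original order.
    """
    result = []
    for subject, predicate, obj in triples:
        if predicate in SYNONYM_PREDICATES and is_wurcs(obj):
            result.append((subject, WURCS_PREDICATE, obj))
        else:
            result.append((subject, predicate, obj))
    return result


if __name__ == "__main__":
    # Made-up example data, for demonstration only.
    term = "http://example.org/term/1"
    example = [
        (term, OBOINOWL + "hasRelatedSynonym", "WURCS=2.0/1,1,0/[a2122h-1b_1-5]/1/"),
        (term, OBOINOWL + "hasRelatedSynonym", "some ordinary synonym"),
    ]
    for triple in move_wurcs_synonyms(example):
        print(triple)
```

Only synonym triples whose value starts with `WURCS=` are rewritten, so ordinary synonyms on the same term are left alone.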

ebi-chebi / ChEBI

glytoucan derived entries are confusing - WURCS string should not be treated as a synonym #4152