biopragmatics / bioregistry

📮 An integrative registry of biological databases, ontologies, and nomenclatures.
https://bioregistry.io
MIT License

ClassyFire: consider making `CHEMONTID` the primary ID space #1237

Open cmungall opened 3 weeks ago

cmungall commented 3 weeks ago

Currently, the CURIEs in the Bioregistry look like `classyfire:0004828`.

But the entries in the OBO file look like this:

[Term]
id: CHEMONTID:0004828
name: A-type proanthocyanidins
def: "Proanthocyanidins, which are characterized by the presence of two or more flavonoid units linked to each other by both a 2->7 and a 4->b (or 4->8) interflavonoid CC-bonds." [DOI:10.1039/b101061l, PMID:12430722]
synonym: "proanthocyanidin" BROAD ChEBI_TERM [CHEBI:26267]
xref: DOI:10.1039/b101061l "Karamali Khanbabaee and Teunis van Ree (2001). Tannins: Classification and Definition. Nat. Prod. Rep., 2001, 18, 641–649"
xref: PMID:12430722 "Ferreira D and Slade D (2002). Oligomeric proanthocyanidins: naturally occurring O-heterocycles. Nat Prod Rep., 19(5):517-41."
is_a: CHEMONTID:0000379 ! Proanthocyanidins
created_by: yandj
creation_date: 2015-10-31T18:10:28Z

The CURIE should resemble the ID used in the source as closely as possible, which would argue for making `CHEMONTID` the primary prefix.

(Yes, including "ID" in the ID space is not ideal, but this is what is there...)
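
To make the proposal concrete, here is a minimal Python sketch of synonym-aware CURIE normalization. The `PREFIX_SYNONYMS` table and the `normalize_curie` helper are hypothetical names invented for illustration; they are not the Bioregistry's actual data or API. The sketch only assumes that `classyfire` and `CHEMONTID` would be recorded as two spellings of the same prefix, with `CHEMONTID` preferred.

```python
# Hypothetical synonym table: each known spelling (lowercased) maps to the
# preferred prefix. CHEMONTID is treated as preferred, as proposed here.
PREFIX_SYNONYMS = {
    "classyfire": "CHEMONTID",
    "chemontid": "CHEMONTID",
}


def normalize_curie(curie: str) -> str:
    """Rewrite a CURIE to use the preferred prefix when one is known."""
    prefix, sep, local_id = curie.partition(":")
    if not sep:
        raise ValueError(f"not a CURIE: {curie!r}")
    # Unknown prefixes pass through unchanged.
    preferred = PREFIX_SYNONYMS.get(prefix.lower(), prefix)
    return f"{preferred}:{local_id}"


print(normalize_curie("classyfire:0004828"))
print(normalize_curie("CHEMONTID:0004828"))
```

Both calls print `CHEMONTID:0004828`, so existing `classyfire:` CURIEs would keep resolving under this scheme while matching the IDs in the OBO file.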

However this brings up a governance issue. Does anyone get to come to bioregistry and bypass OBO and claim to be the ontology for X? This argues for sticking with classyfire....

An additional potential source of confusion is that the ClassyFire database includes both structures and classes.

Structures have URLs like: http://classyfire.wishartlab.com/entities/OSWPWNLULQWKAH-UHFFFAOYSA-N

Structures are not in the ontology. This would argue against my proposal and in favor of something like `classyfire.chemontid` vs. `classyfire.entity`, but IMO the most important thing about ClassyFire is the classification.

cthoyt commented 3 weeks ago

If I recall correctly, the Wishart group hadn’t posted an ontology artifact when this first ended up in the Bioregistry, which could explain the discrepancy.

> However this brings up a governance issue. Does anyone get to come to bioregistry and bypass OBO and claim to be the ontology for X? This argues for sticking with classyfire....

I'm not quite sure what you're asking here, but there's a relevant discussion on #1212 related to the question about Bioregistry being prescriptive vs. descriptive.

> Structures have URLs like: http://classyfire.wishartlab.com/entities/OSWPWNLULQWKAH-UHFFFAOYSA-N
>
> Structures are not in the ontology. This would argue against my proposal and in favor of something like `classyfire.chemontid` vs. `classyfire.entity`, but IMO the most important thing about ClassyFire is the classification.

Luckily, the identifier in those structure URLs is an InChIKey, so we don't need to make a new prefix for structures. See https://bioregistry.io/registry/inchikey
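
As a rough sketch of that point, the last path segment of a ClassyFire entity URL can be checked against the standard InChIKey layout (14 letters, 10 letters, 1 letter, separated by hyphens) and emitted as an `inchikey:` CURIE. The `entity_url_to_curie` helper is a hypothetical name for illustration, and the assumption that every entity URL ends in an InChIKey comes only from the single example in this thread.

```python
import re
from urllib.parse import urlparse

# Standard InChIKey layout: 14 + 10 + 1 uppercase letters, hyphen-separated.
INCHIKEY_RE = re.compile(r"^[A-Z]{14}-[A-Z]{10}-[A-Z]$")


def entity_url_to_curie(url: str) -> str:
    """Turn a ClassyFire entity URL into an inchikey CURIE (illustrative only)."""
    # Take the final path segment, ignoring any trailing slash.
    last_segment = urlparse(url).path.rstrip("/").rsplit("/", 1)[-1]
    if not INCHIKEY_RE.match(last_segment):
        raise ValueError(f"last path segment is not an InChIKey: {last_segment!r}")
    return f"inchikey:{last_segment}"


print(entity_url_to_curie(
    "http://classyfire.wishartlab.com/entities/OSWPWNLULQWKAH-UHFFFAOYSA-N"
))
```

For the URL quoted above, this prints `inchikey:OSWPWNLULQWKAH-UHFFFAOYSA-N`.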