Closed (matentzn closed this 2 years ago)
The Bioregistry already has two related prefixes that are not the same: orphanet and orphanet.ordo. In addition, ordo is already tagged as a synonym of orphanet.ordo.
It looks like this request is mixing parts of each, using the prefix from the main vocabulary but the provider format URL for the ORDO vocabulary. I'm not really sure where the confusion comes from, but maybe it's a systematic mistake propagated from the Biolink Model's Biocontext (see the "Metaregistry" heading in http://bioregistry.io/registry/orphanet.ordo). I manually curated the mappings from that resource and noted that they were mixing and matching before.
I don't care so much about the prefix here, but the URL prefix needs to be added somewhere so I can run conversions correctly!
I mentioned in that other issue (https://github.com/mapping-commons/sssom-py/issues/161) that I recently added this URI prefix in biopragmatics/bioregistry@21906c3. Is that sufficient?
It is! That's the important bit.
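To illustrate why the URI prefix matters for conversions, here is a minimal sketch of compressing a URI into a CURIE with a small prefix map. The map and the function name `compress` are hypothetical, written only for this example; the only value taken from this thread is the ORDO URI prefix, which is assumed here to belong to orphanet.ordo. This is not how the Bioregistry or sssom-py implement it.

```python
from typing import Optional

# Hypothetical, hand-written prefix map. Only the ORDO URI prefix is taken
# from this thread; assigning it to orphanet.ordo is an assumption.
URI_PREFIXES = {
    "http://www.orpha.net/ORDO/Orphanet_": "orphanet.ordo",
}


def compress(uri: str) -> Optional[str]:
    """Return a CURIE for ``uri``, or None when no known URI prefix matches."""
    # Try longer URI prefixes first so the most specific one wins.
    for uri_prefix in sorted(URI_PREFIXES, key=len, reverse=True):
        if uri.startswith(uri_prefix):
            local_id = uri[len(uri_prefix):]
            return f"{URI_PREFIXES[uri_prefix]}:{local_id}"
    return None


print(compress("http://www.orpha.net/ORDO/Orphanet_C023"))
```

Without the URI prefix registered somewhere, `compress` returns None for every ORDO URI, which is the conversion failure described above.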
> The Bioregistry already has two related prefixes, that are not the same
I'm confused about the difference between ORPHA and ORDO terms, so I'm coming here for help, and possibly to find some disambiguating description that we can add to Bioregistry to help users.
On ORDO, there's a 2014 publication, only on ResearchGate, titled "ORDO: An Ontology Connecting Rare Disease, Epidemiology and Genetic Data". My understanding of this paper is that the authors took Orphanet and converted it to an OWL ontology. I don't find any discussion of whether the identifiers carry over or new identifiers are assigned. The https://github.com/Orphanet/Orpha2Ordo repo mentioned in the paper is gone.
The official page for ORDO appears to be at https://www.orphadata.com/ordo/. Any more insights into how Orphanet (ORPHA) terms differ from ORDO terms? Does every ORDO term have a corresponding Orphanet term, but with different identifiers?
I arrived here when we noticed EFO has some cross-references to orphanet and some to orphanet.ordo post Bioregistry normalization.
It is my understanding that the identifier space for the two is one and the same. I can't say for certain that there are not some IDs in Orphanet that are not in ORDO, but for the last five years I have been working under the assumption that the semantic space is one and the same. I will check with the Monarch folks and see if I can find out more.
Please take a look at the terms that are prefixed with a C; these appear to work in ordo but not in Orphanet (https://www.ebi.ac.uk/ols/ontologies/ordo/terms?iri=http://www.orpha.net/ORDO/Orphanet_C023).
There are definitely two semantic spaces here, with distinct patterns. It is hard to tell which resources can resolve one or both. This feels reminiscent of the omim vs omim.ps discussion.
So orphanet.ordo:C023 for "age of onset" exists only in ORDO because it's a property key that can be applied to diseases in Orphanet, but is not a disease itself?
Identifiers do appear to be shared between both. For example, both of the following will get you to "Rare dyslipidemia":
Is the identifier set for orphanet.ordo a strict superset of the identifier set of orphanet? @Orphanet / Marc Hanauer, any guidance you could provide here would be much appreciated.
With respect to Bioregistry, we use it to normalize prefixes to enable more comprehensive mapping between resources. I see why having two separate Orphanet namespaces is technically correct here, but it would be good to have a recommendation to improve data linking. For example, perhaps all orphanet IDs should be converted to orphanet.ordo.
Looking at EFO OTAR Slim v3.57.0 (a version of EFO with a disease focus), we observe the following counts of cross-references (xrefs after bioregistry normalization):
orphanet
orphanet.ordo
All of the 55 orphanet.ordo xrefs appear to be valid orphanet IDs (table below). So for our use case, I'm tempted to convert these all to orphanet prefixes. Note that some EFO terms imported from Orphanet like Orphanet:107 actually xref themselves via orphanet.ordo:98702.
Another option would be to convert all orphanet prefixes to orphanet.ordo. However, it looks like EFO uses Orphanet and not ORDO as the prefix for the terms that they include from ORDO/Orphanet.
Prefix
Orphanet
Name
Orphanet
Homepage
https://www.orpha.net
Description
Orphanet is a unique resource, gathering and improving knowledge on rare diseases so as to improve the diagnosis, care and treatment of patients with rare diseases.
Example Identifier
79154
Regular Expression Pattern
[A-Z0-9][0-9]+
Redundant Prefix in Regular Expression Pattern
No response
Provider Format URL
http://www.orpha.net/ORDO/Orphanet_
Contributor Name
Nico Matentzoglu
Contributor ORCiD
0000-0002-7356-1779
Additional Comments
There is already a lowercase orphanet namespace, but it uses the wrong URL prefix, and we prefer this to be upper case due to convention.
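For reference, the pattern and provider format URL from the request can be exercised with a short sketch like the one below. The regular expression and URL are copied from the form above; the function name `expand` is hypothetical, and treating the provider format URL as a plain string prefix is an assumption of this example.

```python
import re

# Values copied from the request form above.
PATTERN = re.compile(r"[A-Z0-9][0-9]+")
URI_FORMAT = "http://www.orpha.net/ORDO/Orphanet_"


def expand(local_id: str) -> str:
    """Validate ``local_id`` against the pattern and build its ORDO URI."""
    if PATTERN.fullmatch(local_id) is None:
        raise ValueError(f"{local_id!r} does not match {PATTERN.pattern}")
    return URI_FORMAT + local_id


print(expand("79154"))
print(expand("C023"))
```

The pattern accepts both plain numeric identifiers and the C-prefixed ones discussed in the thread. As written, it requires at least two characters, so a one-digit identifier would be rejected.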