monarch-initiative / mondo

Mondo Disease Ontology
http://obofoundry.org/ontology/mondo
Creative Commons Attribution 4.0 International

How to represent linkouts vs cross-references vs mappings in Mondo #7071

Open matentzn opened 8 months ago

matentzn commented 8 months ago

In the huddle this week we talked a bit about the differences between mappings, cross-references and linkouts.

image: https://github.com/monarch-initiative/mondo/assets/7070631/4994475f-26f4-4035-b154-9b9d0c7ee1eb

The time has come to develop an air-tight datamodel for this, and align it with biolink, cc @sierra-moxon.

Right now, we have roughly these structures:

  1. Mappings:

Mappings are shipped using skos:exactMatch.

property_value: exactMatch DOID:4

Things to consider:

  1. We do not currently publish the mapping inside the ontology with SSSOM metadata, but we probably could

  2. We redundantly publish "mappings" both as xrefs (see "Cross-references" below) and as skos:exactMatch. This may be good (German saying: "Doppelt hält besser", literally "double holds better"), but it could cause confusion about which one to use.
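
To make the first consideration concrete, here is a minimal sketch of how a single exactMatch line could be turned into an SSSOM-style row. It only assumes the four slots SSSOM requires for a mapping (subject_id, predicate_id, object_id, mapping_justification). The subject ID `MONDO:0000001`, the placeholder justification `semapv:UnspecifiedMatching` and the helper names are illustrative assumptions, not what Mondo actually publishes.

```python
import csv
import io

# The four slots SSSOM requires for every mapping row.
SSSOM_COLUMNS = ["subject_id", "predicate_id", "object_id", "mapping_justification"]


def exact_match_to_sssom(subject_id, obo_line, justification="semapv:UnspecifiedMatching"):
    """Turn one 'property_value: exactMatch X' line into an SSSOM-style row.

    The default justification is a placeholder (assumption); a real pipeline
    would record how each mapping was actually established.
    """
    tag, _, value = obo_line.partition(":")
    parts = value.split()
    if tag.strip() != "property_value" or len(parts) < 2 or parts[0] != "exactMatch":
        raise ValueError(f"not an exactMatch property_value line: {obo_line!r}")
    return {
        "subject_id": subject_id,
        "predicate_id": "skos:exactMatch",
        "object_id": parts[1],
        "mapping_justification": justification,
    }


def rows_to_tsv(rows):
    """Serialise rows as tab-separated text with a header line."""
    buf = io.StringIO()
    writer = csv.DictWriter(buf, fieldnames=SSSOM_COLUMNS, delimiter="\t", lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
    return buf.getvalue()


# Illustrative subject ID; the OBO line is the example from this issue.
row = exact_match_to_sssom("MONDO:0000001", "property_value: exactMatch DOID:4")
print(rows_to_tsv([row]))
```

Whether such rows would be embedded in the ontology or shipped as a separate file is exactly the open question here; the sketch only shows that the information in the property_value line is enough to fill the required slots apart from the justification.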

  2. Cross-references

Cross-references use the xref property in OBO format, followed by a list of "source" properties that denote both "confirmatory evidence" (who said this?) and custom provenance such as "what semantic precision is this?". The fact that both of these use the same property ("source") is a bit unfortunate tbh and will make it harder for UIs to display the "right provenance information".

xref: UMLS:C4310752 {source="MONDO:equivalentTo", source="MONDO:ncbi_mim2gene_medline"}

  3. Linkouts

Linkouts use "see also" (seeAlso) as a property value, with an xsd:anyURI annotation and a "source" property to denote the source.

property_value: seeAlso "https://rarediseases.info.nih.gov/diseases/10691/duane-syndrome-type-3" xsd:anyURI {source="GARD:0010691"}

@sabrinatoro was floating the idea of using a new property that is more specific, which is an option, but could prove difficult to align with other stacks.

This issue is purely about the representation of these three datatypes. For more on how to display these links see: https://github.com/monarch-initiative/monarch-app/issues/521.
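
As a rough illustration of the three structures described above, the following sketch classifies an OBO tag-value line as a mapping, a cross-reference or a linkout, and splits the overloaded "source" qualifiers on xrefs into "semantic precision" and "evidence". The function name `parse_line`, the result keys and the `PRECISION_VALUES` set are assumptions made for this example; only `MONDO:equivalentTo` is taken from the issue text, and a real consumer would need the full list of precision values.

```python
import re

# Trailing OBO qualifier block, e.g. {source="A", source="B"}
QUALIFIERS = re.compile(r"\s*\{(.*)\}\s*$")
SOURCE = re.compile(r'source="([^"]*)"')

# Assumption: source values in this set state semantic precision;
# every other source value is treated as confirmatory evidence.
PRECISION_VALUES = {"MONDO:equivalentTo"}


def parse_line(line):
    """Classify one OBO tag-value line as mapping, cross-reference or linkout."""
    match = QUALIFIERS.search(line)
    sources = SOURCE.findall(match.group(1)) if match else []
    body = line[: match.start()] if match else line
    tag, _, value = body.partition(":")
    tag, tokens = tag.strip(), value.split()

    if tag == "xref" and tokens:
        return {
            "kind": "cross-reference",
            "target": tokens[0],
            "precision": [s for s in sources if s in PRECISION_VALUES],
            "evidence": [s for s in sources if s not in PRECISION_VALUES],
        }
    if tag == "property_value" and len(tokens) >= 2:
        if tokens[0] == "exactMatch":
            return {"kind": "mapping", "target": tokens[1]}
        if tokens[0] == "seeAlso":
            # The URL is quoted in OBO format; strip the quotes.
            return {"kind": "linkout", "target": tokens[1].strip('"'), "evidence": sources}
    return {"kind": "other"}


examples = [
    "property_value: exactMatch DOID:4",
    'xref: UMLS:C4310752 {source="MONDO:equivalentTo", source="MONDO:ncbi_mim2gene_medline"}',
    'property_value: seeAlso "https://rarediseases.info.nih.gov/diseases/10691/duane-syndrome-type-3" '
    'xsd:anyURI {source="GARD:0010691"}',
]
for example in examples:
    print(parse_line(example))
```

Note that the split between precision and evidence only works because the consumer hard-codes which "source" values mean what, which is the UI burden the Cross-references paragraph is worried about.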

cmungall commented 8 months ago

I think your 3 is just axiom annotations? Aka provenance. Not sure we need a new overloaded term for this

"Linkouts" already has a meaning, e.g. NCBI linkouts. And do we have so many seeAlsos?

matentzn commented 8 months ago

I don't think it's "overloaded" in the sense that they are not clearly distinguishable. I am not entirely opposed to using xrefs for case three, but case three is a conceptually clearly different case (as I defined in the huddle on Tuesday). Look at this:

https://search.clinicalgenome.org/kb/conditions/MONDO:0020119

How will we add this link to Mondo? As an xref: CLINGEN:MONDO:0020119?

cmungall commented 8 months ago

My default position is to treat this as a UI thing and have a mechanism like NCBI linkouts

However, I see the point that including in the ontology (presumably via automation) will make it independent of the Monarch UI, e.g. linkouts would show in OLS, OntoBee, Bioportal, ...

matentzn commented 8 months ago

Exactly - this has also been specifically requested by three of our collaborators to give these links wider exposure in "our world". The UI would have to know a lot of complicated things, like when a linkout is available (and for which class), while we "know" this because of our metadata and subsets. I think this is best done in the ontology - either as xrefs or seeAlsos (we have both of these right now).

cmungall commented 8 months ago

I see the advantages. But I always worry about doing things in a non-conventional way. Users are not accustomed to seeing these annotations in ontologies. Tools may not even do the right thing. It feels like too tight a coupling. It's different from how other ontologies that face the same problem do things (e.g. GO, all the MOD ontologies - e.g. MP would want linkouts to MGI). Is our plan to prototype this in Mondo and promulgate it in other ontologies? It places a lot of data-oriented pipeline work on the ontology technical group, who may already be overloaded...

matentzn commented 8 months ago

I am trying to share your concerns.. :P But I can't. seeAlso is a very typical mechanism, used widely (https://api.triplydb.com/s/NxGqn3SuN). They are also not going to confuse anyone because most people will ignore them. But they will be displayed in all generic browsers! There is also no big overhead for curating these, as they can all be autogenerated with a 5-line query..

cmungall commented 8 months ago

I'm probably misunderstanding. I assumed you were going to implement something like NCBI linkouts in the ontology. E.g. ClinGen provides all Mondo IDs they have data for. We autogenerate some kind of annotation axioms for each of these, refreshing with each ontology release.
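
A minimal sketch of the autogeneration idea, assuming a partner supplies a plain list of Mondo IDs: each ID is expanded into a seeAlso line in the same shape as the GARD example in this issue. The URL pattern is taken from the ClinGen link posted earlier in the thread; the function name `linkout_lines`, the `MONDO:CLINGEN` source label and the ID check are hypothetical and only serve the illustration.

```python
import re

# URL pattern inferred from the ClinGen link posted earlier in this thread.
CLINGEN_URL = "https://search.clinicalgenome.org/kb/conditions/{mondo_id}"

# Assumption: Mondo IDs look like "MONDO:" followed by seven digits.
MONDO_ID = re.compile(r"^MONDO:\d{7}$")


def linkout_lines(mondo_ids, url_template=CLINGEN_URL, source="MONDO:CLINGEN"):
    """Return {mondo_id: seeAlso line} for every valid ID in the partner list.

    The default source label is hypothetical. Regenerating the whole dict at
    each release keeps the linkouts in sync with what the partner provides.
    """
    lines = {}
    for mondo_id in sorted(set(mondo_ids)):
        if not MONDO_ID.match(mondo_id):
            raise ValueError(f"unexpected ID format: {mondo_id!r}")
        url = url_template.format(mondo_id=mondo_id)
        lines[mondo_id] = (
            f'property_value: seeAlso "{url}" xsd:anyURI {{source="{source}"}}'
        )
    return lines


# Example partner list with a duplicate, which is collapsed.
generated = linkout_lines(["MONDO:0020119", "MONDO:0020119"])
for mondo_id, line in generated.items():
    print(mondo_id, "->", line)
```

Because the lines are derived purely from the partner's ID list, nothing here needs manual curation; the open design question is only which property and which source value to emit.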

twhetzel commented 2 months ago

@matentzn is this still an active discussion? Should we add it to the next Tech call?

sabrinatoro commented 2 months ago

See discussion and decisions here