monarch-initiative / mondo-ingest

Coordinating the mondo-ingest with external sources
https://monarch-initiative.github.io/mondo-ingest/

mapping: some mappings are proposed even though there is already a MONDO:equivalentTo #220

Open sabrinatoro opened 1 year ago

sabrinatoro commented 1 year ago

from https://github.com/monarch-initiative/mondo-ingest/blob/main/src/ontology/lexmatch/unmapped_ncit_lex.tsv

| subject_id | subject_label | predicate_id | object_id | object_label | mapping_justification | mapping_tool | confidence | subject_match_field | object_match_field | match_string | comment |
|---|---|---|---|---|---|---|---|---|---|---|---|
| MONDO:0002870 | tricuspid valve insufficiency | MONDO:equivalentTo | NCIT:C50843 | Tricuspid Valve Regurgitation | semapv:LexicalMatching | oaklib | 0.8 | oio:hasExactSynonym | rdfs:label | tricuspid valve regurgitation | LEXMATCH |

MONDO:0002870 already has another NCIT xref (NCIT:C50842) that is marked MONDO:equivalentTo.

Is there something in place to check for this? Or are we relying on QC to catch cases where there is more than one exact match to a source?

sabrinatoro commented 1 year ago

These examples seem to be NCIT terms that have been "included in" the DO entry. Example: MONDO:0003652 - 'acute urate nephropathy'

--> The synonyms from NCIT:C123245 are in Mondo (from DO) as exact synonyms, though this is not always correct. Also, since NCIT:C123037 is already there, these matches should probably be questioned.

matentzn commented 1 year ago

But would you make a general rule out of this observation, e.g. "If there is already an equivalent on this class, do not propose a new one"? Let me know if you want us to act here. We will do whatever you see fit, but I am not sure I would in this case. QC should flag this up!

What about:

  1. Deliberate proxy merges (Mondo decides two classes in a source are actually the same thing)
  2. Mistakes in equivalent class mappings, which could be revealed this way?
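
To make the trade-off concrete, here is a minimal sketch of a check that flags such candidates for review instead of dropping them, so that both cases in the list above stay visible to a curator. It is not the mondo-ingest implementation: the function name `flag_conflicting_candidates`, the `existing_equivalents` column, and the in-memory input format are assumptions for illustration only. The row values are taken from the example in this issue.

```python
from collections import defaultdict


def prefix(curie: str) -> str:
    """Return the prefix of a CURIE, e.g. 'NCIT' for 'NCIT:C50843'."""
    return curie.split(":", 1)[0]


def flag_conflicting_candidates(candidates, existing_equivalents):
    """Flag candidate mappings whose subject already has a different
    equivalent in the same source.

    candidates: list of dicts with 'subject_id' and 'object_id' keys
        (hypothetical in-memory form of the lexmatch TSV rows).
    existing_equivalents: iterable of (subject_id, object_id) pairs that
        are already asserted as MONDO:equivalentTo.
    """
    # Index existing equivalents by (Mondo ID, source prefix).
    index = defaultdict(set)
    for subject_id, object_id in existing_equivalents:
        index[(subject_id, prefix(object_id))].add(object_id)

    flagged = []
    for row in candidates:
        key = (row["subject_id"], prefix(row["object_id"]))
        # Ignore the case where the candidate is itself already asserted.
        others = sorted(index.get(key, set()) - {row["object_id"]})
        if others:
            # Keep the row and annotate it rather than discarding it.
            flagged.append({**row, "existing_equivalents": others})
    return flagged


candidates = [
    {
        "subject_id": "MONDO:0002870",
        "predicate_id": "MONDO:equivalentTo",
        "object_id": "NCIT:C50843",
    }
]
existing = [("MONDO:0002870", "NCIT:C50842")]

flagged = flag_conflicting_candidates(candidates, existing)
for row in flagged:
    print(row["subject_id"], row["object_id"], row["existing_equivalents"])
```

Flagging instead of filtering keeps deliberate proxy merges possible and still surfaces potentially wrong existing equivalences for review.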