Open matentzn opened 1 year ago
Uberon:nerve --[semapv:crossSpeciesExactMatch]--> Xenopus:nerve
Uberon:nerve --[semapv:crossSpeciesExactMatch]--> Xenopus:peripheral_nerve
Where do those semapv:crossSpeciesExactMatch come from?
As far as I know, the Uberon/XAO mappings are still expressed as oboInOwl:hasDbXref; they have not switched to SSSOM and SEMAPV properties yet.
Anyway, while it could be that merging the two XAO terms upstream is the appropriate solution in this particular case, I want to point out that SSSOM and SEMAPV will offer greater flexibility in cases like this.
Currently, the meaning of cross-species mappings (and so the type of bridging axioms) is decided once and for all for every taxon-specific ontology. In the case of XAO for example, we have this declaration:
treat-xrefs-as-reverse-genus-differentia: XAO part_of NCBITaxon:8353
which means that any XAO term mapped to an Uberon term is equivalent to the Uberon term (with the added restriction on taxon). Subsequently, any two XAO terms that are mapped to the same Uberon term (as in the example above) would end up being equivalent.
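To make the consequence concrete, here is a minimal sketch (not the actual bridge pipeline) that treats each XAO cross-reference as an equivalence with the taxon-restricted Uberon class and then computes the resulting equivalence classes with a small union-find. The two XAO identifiers are the ones mapped to UBERON:0001021 (nerve) in Uberon; the function names and the string rendering of the restricted class are assumptions made for illustration.

```python
# Each XAO xref is read as: XAO term == (Uberon term and part_of some taxon).
# We model the right-hand side as a plain string (illustrative rendering only).
TAXON = "NCBITaxon:8353"


def restricted(uberon_id, taxon=TAXON):
    """Informal rendering of the taxon-restricted Uberon class."""
    return f"{uberon_id} and part_of some {taxon}"


def find(parent, x):
    """Return the representative of x, creating a singleton set if needed."""
    parent.setdefault(x, x)
    while parent[x] != x:
        parent[x] = parent[parent[x]]  # path halving
        x = parent[x]
    return x


def equivalence_classes(pairs):
    """Merge both sides of every (class_expression, foreign_term) pair."""
    parent = {}
    for a, b in pairs:
        ra, rb = find(parent, a), find(parent, b)
        if ra != rb:
            parent[rb] = ra
    groups = {}
    for term in list(parent):
        groups.setdefault(find(parent, term), set()).add(term)
    return [sorted(g) for g in groups.values()]


pairs = [
    (restricted("UBERON:0001021"), "XAO:0000204"),
    (restricted("UBERON:0001021"), "XAO:0003047"),
]
classes = equivalence_classes(pairs)
for group in classes:
    print(group)
```

Both XAO terms land in the same class, which is exactly the undesired "proxy merge".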
But with SSSOM and SEMAPV, we no longer need to treat all mappings with a given foreign ontology as if they were all of the same type. We can use semapv:crossSpeciesExactMatch for mappings that are indeed supposed to be between equivalent terms (again modulo the taxon restriction), and other mapping properties for mappings between terms that should not be equivalent.
For example, we could have:
Uberon:nerve --[semapv:crossSpeciesCloseMatch]--> Xenopus:nerve
Uberon:nerve --[semapv:crossSpeciesCloseMatch]--> Xenopus:peripheral_nerve
which could be translated as Xenopus:nerve and Xenopus:peripheral_nerve being subclasses of Uberon:nerve, rather than equivalent classes (thereby not leading to an undesired equivalence between the two XAO terms).
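As a rough illustration of what a predicate-dependent translation could look like, the sketch below picks the axiom type from the mapping predicate. It is only an assumption about how a future bridge generator might behave, not a description of the current one; the function name and the Manchester-like output strings are made up for the example.

```python
EXACT = "semapv:crossSpeciesExactMatch"
CLOSE = "semapv:crossSpeciesCloseMatch"


def bridge_axiom(subject, predicate, obj, taxon="NCBITaxon:8353"):
    """Return an informal, Manchester-like rendering of one bridging axiom."""
    if predicate == EXACT:
        # Equivalence modulo the taxon restriction.
        return f"{obj} EquivalentTo: {subject} and (part_of some {taxon})"
    if predicate == CLOSE:
        # Assumed translation, as suggested in the comment: a plain subclass axiom.
        return f"{obj} SubClassOf: {subject}"
    raise ValueError(f"no translation defined for {predicate}")


close_axioms = [
    bridge_axiom("UBERON:0001021", CLOSE, xao_id)
    for xao_id in ("XAO:0000204", "XAO:0003047")
]
for axiom in close_axioms:
    print(axiom)
```

With two subclass axioms, nothing forces the two XAO terms to be equivalent to each other.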
Not saying this would be the right thing to do in this particular case (maybe XAO should revise their concepts of nerves, I don’t know), but let’s not be too fast in telling foreign ontologies that they should start merging terms just to keep Uberon happy. SSSOM gives us other solutions that may in some cases be better.
Where do those semapv:crossSpeciesExactMatch come from?
It was just me projecting into the future. 🔥 They are indeed hasDbXrefs.
Fully agreed with the rest - this is not how bridge generation works right now, but this is exactly what I think we should go towards (ok, I would use broadMatch, not closeMatch most likely, but we get to this when we get to this).
We will have to separate genuine proxy merges from false ones using manual curation, but I am assuming there won't be too many.
This issue has not seen any activity in the past 6 months; it will be closed automatically one year from now if no action is taken.
but I am assuming there won't be too many.
With sssom-cli (included in the latest ODK) you can quickly find all the cases for a given foreign ontology.
First make sure you have the tmp/uberon-mappings.sssom.tsv file (containing all the mappings extracted from cross-references in Uberon); that file is normally generated automatically as part of the bridge pipeline, but if needed:
sh run.sh make tmp/uberon-mappings.sssom.tsv
Then to get all the XAO terms that are mapped to more than one Uberon/CL term:
sh run.sh sssom-cli --prefix-map-from-input \
-i tmp/uberon-mappings.sssom.tsv -i mappings/cl-mappings.sssom.tsv \
-R '!object==XAO:* -> stop()' -R 'cardinality==1:n -> include()'
This will (1) read the Uberon mappings (re-generated above) and the CL mappings (already stored in the repository in mappings/cl-mappings.sssom.tsv), (2) filter out any mapping whose object is not a XAO term (!object==XAO:* -> stop()), and (3) out of the remaining mappings, select only those with a cardinality of 1:n, meaning many objects mapped to the same subject (cardinality==1:n -> include()).
Output:
#curie_map:
# UBERON: "http://purl.obolibrary.org/obo/UBERON_"
# XAO: "http://purl.obolibrary.org/obo/XAO_"
#mapping_set_id: "http://purl.obolibrary.org/obo/uberon/core/mappings.sssom.tsv"
#license: "http://creativecommons.org/licenses/by/3.0/"
subject_id subject_label predicate_id object_id mapping_justification
UBERON:0001021 nerve semapv:crossSpeciesExactMatch XAO:0000204 semapv:UnspecifiedMatching
UBERON:0001021 nerve semapv:crossSpeciesExactMatch XAO:0003047 semapv:UnspecifiedMatching
UBERON:0001675 trigeminal ganglion semapv:crossSpeciesExactMatch XAO:0000427 semapv:UnspecifiedMatching
UBERON:0001675 trigeminal ganglion semapv:crossSpeciesExactMatch XAO:0000428 semapv:UnspecifiedMatching
UBERON:0001785 cranial nerve semapv:crossSpeciesExactMatch XAO:0000429 semapv:UnspecifiedMatching
UBERON:0001785 cranial nerve semapv:crossSpeciesExactMatch XAO:0003089 semapv:UnspecifiedMatching
UBERON:0002100 trunk semapv:crossSpeciesExactMatch XAO:0000054 semapv:UnspecifiedMatching
UBERON:0002100 trunk semapv:crossSpeciesExactMatch XAO:0003025 semapv:UnspecifiedMatching
UBERON:0003071 eye primordium semapv:crossSpeciesExactMatch XAO:0000227 semapv:UnspecifiedMatching
UBERON:0003071 eye primordium semapv:crossSpeciesExactMatch XAO:0004090 semapv:UnspecifiedMatching
UBERON:0004535 cardiovascular system semapv:crossSpeciesExactMatch XAO:0000100 semapv:UnspecifiedMatching
UBERON:0004535 cardiovascular system semapv:crossSpeciesExactMatch XAO:0001010 semapv:UnspecifiedMatching
UBERON:0005487 vitelline vein semapv:crossSpeciesExactMatch XAO:0000376 semapv:UnspecifiedMatching
UBERON:0005487 vitelline vein semapv:crossSpeciesExactMatch XAO:0004147 semapv:UnspecifiedMatching
UBERON:0005870 olfactory pit semapv:crossSpeciesExactMatch XAO:0000275 semapv:UnspecifiedMatching
UBERON:0005870 olfactory pit semapv:crossSpeciesExactMatch XAO:0004073 semapv:UnspecifiedMatching
Here you are. Eight cases in XAO.
Likewise for ZFA:
sh run.sh sssom-cli --prefix-map-from-input \
-i tmp/uberon-mappings.sssom.tsv -i mappings/cl-mappings.sssom.tsv \
-R '!object==ZFA:* -> stop()' -R 'cardinality==1:n -> include()'
#curie_map:
# UBERON: "http://purl.obolibrary.org/obo/UBERON_"
# ZFA: "http://purl.obolibrary.org/obo/ZFA_"
#mapping_set_id: "http://purl.obolibrary.org/obo/uberon/core/mappings.sssom.tsv"
#license: "http://creativecommons.org/licenses/by/3.0/"
subject_id subject_label predicate_id object_id mapping_justification
UBERON:0000165 mouth semapv:crossSpeciesExactMatch ZFA:0000547 semapv:UnspecifiedMatching
UBERON:0000165 mouth semapv:crossSpeciesExactMatch ZFA:0000590 semapv:UnspecifiedMatching
UBERON:0001230 glomerular capsule semapv:crossSpeciesExactMatch ZFA:0005254 semapv:UnspecifiedMatching
UBERON:0001230 glomerular capsule semapv:crossSpeciesExactMatch ZFA:0005310 semapv:UnspecifiedMatching
UBERON:0001286 Bowman's space semapv:crossSpeciesExactMatch ZFA:0005283 semapv:UnspecifiedMatching
UBERON:0001286 Bowman's space semapv:crossSpeciesExactMatch ZFA:0005312 semapv:UnspecifiedMatching
UBERON:0003054 roof plate semapv:crossSpeciesExactMatch ZFA:0001436 semapv:UnspecifiedMatching
UBERON:0003054 roof plate semapv:crossSpeciesExactMatch ZFA:0007058 semapv:UnspecifiedMatching
UBERON:2000089 actinotrichium semapv:crossSpeciesExactMatch ZFA:0000089 semapv:UnspecifiedMatching
UBERON:2000089 actinotrichium semapv:crossSpeciesExactMatch ZFA:0005435 semapv:UnspecifiedMatching
Five cases in ZFA.
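For anyone who wants to see the logic of that filter spelled out, here is a small Python approximation that works on an in-memory TSV. It is a sketch of the idea only, not a reimplementation of sssom-cli: the cardinality test is my own reading of "1:n" (a subject with several objects, each of which has only that subject), and the last row with the 9999999 identifiers is hypothetical, added just to show a 1:1 mapping being dropped. The other rows are copied from the outputs above.

```python
import csv
import io
from collections import defaultdict

SAMPLE_TSV = """subject_id\tsubject_label\tpredicate_id\tobject_id\tmapping_justification
UBERON:0001021\tnerve\tsemapv:crossSpeciesExactMatch\tXAO:0000204\tsemapv:UnspecifiedMatching
UBERON:0001021\tnerve\tsemapv:crossSpeciesExactMatch\tXAO:0003047\tsemapv:UnspecifiedMatching
UBERON:0002100\ttrunk\tsemapv:crossSpeciesExactMatch\tXAO:0000054\tsemapv:UnspecifiedMatching
UBERON:0002100\ttrunk\tsemapv:crossSpeciesExactMatch\tXAO:0003025\tsemapv:UnspecifiedMatching
UBERON:0000165\tmouth\tsemapv:crossSpeciesExactMatch\tZFA:0000547\tsemapv:UnspecifiedMatching
UBERON:0000165\tmouth\tsemapv:crossSpeciesExactMatch\tZFA:0000590\tsemapv:UnspecifiedMatching
UBERON:9999999\thypothetical term\tsemapv:crossSpeciesExactMatch\tXAO:9999999\tsemapv:UnspecifiedMatching
"""


def one_to_many(rows, prefix):
    """Keep mappings to `prefix` terms whose subject has more than one object."""
    # Step 1: drop every mapping whose object is not in the wanted ontology.
    kept = [r for r in rows if r["object_id"].startswith(prefix + ":")]
    # Step 2: count objects per subject and subjects per object.
    objects = defaultdict(set)
    subjects = defaultdict(set)
    for r in kept:
        objects[r["subject_id"]].add(r["object_id"])
        subjects[r["object_id"]].add(r["subject_id"])
    # Step 3: keep only the 1:n mappings.
    return [
        r for r in kept
        if len(objects[r["subject_id"]]) > 1 and len(subjects[r["object_id"]]) == 1
    ]


rows = list(csv.DictReader(io.StringIO(SAMPLE_TSV), delimiter="\t"))
for r in one_to_many(rows, "XAO"):
    print(r["subject_id"], r["subject_label"], r["object_id"])
```

Calling one_to_many(rows, "ZFA") on the same data returns the two mouth mappings instead.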
Very nice analysis @gouttegd!
Do you think we should try to fix all proxy merges (I am assuming they are always an indication of "wrong")? And then add proxy-merge checking explicitly to QC?
I think I could justify passing the proxymerge review to a curator to fix. Maybe Ray or Arwa.
To streamline 1, it would be good if we could run runoak fill-table or some such to add object_label, but I guess they could also learn to do this themselves, now that sssom-java is in ODK.
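If proxy-merge checking were added to QC, one possible shape for it is a check that reports every Uberon term mapped to several terms of the same foreign ontology unless a curator has already reviewed it. The sketch below is purely illustrative: the function name and the idea of a "reviewed" set are assumptions, and marking UBERON:0002100 as reviewed is a made-up curation decision used only to show the mechanism.

```python
from collections import defaultdict


def unreviewed_proxy_merges(pairs, reviewed):
    """Return {subject: [objects]} for unreviewed subjects with several objects."""
    by_subject = defaultdict(set)
    for subject, obj in pairs:
        by_subject[subject].add(obj)
    return {
        subject: sorted(objs)
        for subject, objs in by_subject.items()
        if len(objs) > 1 and subject not in reviewed
    }


# (subject, object) pairs taken from the XAO output earlier in this thread.
pairs = [
    ("UBERON:0001021", "XAO:0000204"),
    ("UBERON:0001021", "XAO:0003047"),
    ("UBERON:0002100", "XAO:0000054"),
    ("UBERON:0002100", "XAO:0003025"),
]
reviewed = {"UBERON:0002100"}  # hypothetical: a curator accepted this case

problems = unreviewed_proxy_merges(pairs, reviewed)
for subject, objs in problems.items():
    print(f"proxy merge: {subject} -> {', '.join(objs)}")
# A QC target could exit with a non-zero status when `problems` is not empty.
```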
To streamline 1, it would be good if we could run runoak fill-table or some such to add object_label
With SSSOM-Java >= 0.7.7 (in ODK 1.5.2), you can do:
$ sssom-cli -p -i ../mappings/uberon.sssom.tsv \
--exclude '!object==ZFA:*' \
--include 'cardinality==1:n' \
--update-from-ontology=imports/local-zfa.owl:object \
--catalog none
This will (1) read the mappings, (2) exclude any mapping whose object is not in ZFA, (3) include only the mappings where many objects are mapped to the same subject, and (4) update the resulting mapping set by filling the object labels with labels from the ZFA ontology.
(The --catalog none option is to prevent sssom-cli from trying to read Uberon’s catalog-v001.xml. That file has a syntax error that SSSOM-Java, which uses a stricter parser than ROBOT or Protégé, refuses to silently ignore.)
Output:
#curie_map:
# ORCID: https://orcid.org/
# UBERON: http://purl.obolibrary.org/obo/UBERON_
# ZFA: http://purl.obolibrary.org/obo/ZFA_
# obo: http://purl.obolibrary.org/obo/
#mapping_set_id: http://purl.obolibrary.org/obo/uberon/core/mappings.sssom.tsv
#creator_id:
# - ORCID:0000-0002-1373-1705
# - ORCID:0000-0002-6095-8718
#license: http://creativecommons.org/licenses/by/3.0/
#subject_source: obo:uberon/core.owl
#object_source: obo:zfa.owl
subject_id subject_label predicate_id object_id object_label mapping_justification
UBERON:0000165 mouth semapv:crossSpeciesExactMatch ZFA:0000547 mouth semapv:UnspecifiedMatching
UBERON:0000165 mouth semapv:crossSpeciesExactMatch ZFA:0000590 oral region semapv:UnspecifiedMatching
UBERON:0001230 glomerular capsule semapv:crossSpeciesExactMatch ZFA:0005254 renal glomerular capsule semapv:UnspecifiedMatching
UBERON:0001230 glomerular capsule semapv:crossSpeciesExactMatch ZFA:0005310 pronephric glomerular capsule semapv:UnspecifiedMatching
UBERON:0001286 Bowman's space semapv:crossSpeciesExactMatch ZFA:0005283 renal capsular space semapv:UnspecifiedMatching
UBERON:0001286 Bowman's space semapv:crossSpeciesExactMatch ZFA:0005312 pronephric capsular space semapv:UnspecifiedMatching
UBERON:0003054 roof plate semapv:crossSpeciesExactMatch ZFA:0001436 roof plate neural tube region semapv:UnspecifiedMatching
UBERON:0003054 roof plate semapv:crossSpeciesExactMatch ZFA:0007058 roof plate semapv:UnspecifiedMatching
UBERON:2000089 actinotrichium semapv:crossSpeciesExactMatch ZFA:0000089 fin fold actinotrichium semapv:UnspecifiedMatching
UBERON:2000089 actinotrichium semapv:crossSpeciesExactMatch ZFA:0005435 actinotrichium semapv:UnspecifiedMatching
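The label-filling step is conceptually just a lookup from object_id to label. Here is a minimal Python sketch of that idea, assuming the labels have already been extracted from the ontology into a dictionary (the hard-coded dictionary stands in for that extraction, and the function name is made up). It is not how sssom-cli implements the option.

```python
def fill_object_labels(rows, labels):
    """Return copies of `rows` with a missing object_label filled from `labels`."""
    filled = []
    for row in rows:
        new_row = dict(row)  # do not modify the caller's data
        if not new_row.get("object_label"):
            new_row["object_label"] = labels.get(new_row["object_id"], "")
        filled.append(new_row)
    return filled


# Labels copied from the output above; in practice they would come from the ontology.
zfa_labels = {
    "ZFA:0000547": "mouth",
    "ZFA:0000590": "oral region",
}
mappings = [
    {"subject_id": "UBERON:0000165", "object_id": "ZFA:0000547"},
    {"subject_id": "UBERON:0000165", "object_id": "ZFA:0000590"},
]
labelled = fill_object_labels(mappings, zfa_labels)
for m in labelled:
    print(m["subject_id"], m["object_id"], m["object_label"])
```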
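We have a bunch of cases like the one shown at the top of this thread, where two terms from an external ontology are mapped to the same Uberon term.
This means that the external ontology treats as distinct two concepts that Uberon believes are the same.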
As @cmungall commented in #2833, we should probably review these proxy merges and make issue tracker items for all of these, encouraging the external ontologies (like in this case XAO) to merge the terms.