Open jvendetti opened 2 years ago
We might want to double-check whether this is a pipeline error rather than an error in the mapping algorithm itself.
I would only expect a single point of contact between the two, on this term, based on the highlighted exact synonym:
id: MONDO:0000001
name: disease or disorder
def: "A disease is a disposition to undergo pathological processes that exists in an organism because of one or more disorders in that organism." [OGMS:0000031]
synonym: "condition" EXACT [NCIT:C2991]
synonym: "disease" EXACT [NCIT:C2991]
If synonyms are somehow not used in the mapping, then the expected number of mappings between the two may be zero.
@jvendetti log: /srv/ncbo/ncbo_cron-ALTERNATE_4store/logs/mappings_20220303
Yes, you're right @sierra-moxon.
The "intermediary" triples that I've referred to in our telecons that the API uses to materialize LOOM mappings look like the following:
# MONDO graph:
<http://purl.obolibrary.org/obo/MONDO_0000001> <http://data.bioontology.org/metadata/def/mappingLoom> "diseaseordisorder"^^<http://www.w3.org/2001/XMLSchema#string> .
# BIOLINK graph:
<https://w3id.org/biolink/vocab/Disease> <http://data.bioontology.org/metadata/def/mappingLoom> "disease"^^<http://www.w3.org/2001/XMLSchema#string> .
... where the subject is the class ID, the predicate is the mapping type (in this case LOOM), and the object is the preferred name of the class in lowercase with all spaces and punctuation removed. The API finds no lexical match between MONDO's "diseaseordisorder" and BIOLINK's "disease".
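To make the comparison concrete, here is a minimal sketch of the normalization described above. The function name `loom_key` is hypothetical, and the ASCII-only regex is an assumption; the actual API code may treat Unicode or other edge cases differently.

```python
import re


def loom_key(pref_label: str) -> str:
    """Lowercase the label and drop every character that is not a-z or 0-9.

    Hypothetical helper: it approximates "lowercase with all spaces and
    punctuation removed" and is not taken from the API source.
    """
    return re.sub(r"[^a-z0-9]", "", pref_label.lower())


# Preferred names as they appear in the two graphs quoted above.
mondo_key = loom_key("disease or disorder")
biolink_key = loom_key("disease")

print(mondo_key)                  # diseaseordisorder
print(biolink_key)                # disease
print(mondo_key == biolink_key)   # False: no lexical match on preferred names
```

Only a key derived from MONDO's exact synonym "disease" would equal the BIOLINK key, which is consistent with finding no mapping when only preferred names are compared.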
@sierra-moxon reported that the following REST call returns an empty set:
http://data.bioontology.org/mappings?ontologies=BIOLINK,MONDO&display_context=false&display_links=false
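For anyone who wants to reproduce the call from a script, the following is a rough sketch. The helper names `mappings_url` and `fetch_mappings` are hypothetical, and passing the API key through an `Authorization: apikey token=...` header is an assumption about how your account is set up; nothing here is taken from the thread beyond the URL itself.

```python
import json
from urllib.parse import urlencode
from urllib.request import Request, urlopen

BASE_URL = "http://data.bioontology.org/mappings"


def mappings_url(acronym_a: str, acronym_b: str) -> str:
    """Build the pairwise mappings URL quoted above for two ontology acronyms."""
    params = {
        "ontologies": f"{acronym_a},{acronym_b}",
        "display_context": "false",
        "display_links": "false",
    }
    # safe="," keeps the comma between the acronyms unescaped.
    return f"{BASE_URL}?{urlencode(params, safe=',')}"


def fetch_mappings(url: str, api_key: str):
    """Fetch and decode the JSON response (assumes header-based API key auth)."""
    request = Request(url, headers={"Authorization": f"apikey token={api_key}"})
    with urlopen(request) as response:
        return json.load(response)


print(mappings_url("BIOLINK", "MONDO"))
```

`fetch_mappings` is defined but not called here, since it needs network access and a valid key.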
@alexskr - is there any logging output from the last time you manually regenerated the mapping counts, as described in https://github.com/ncbo/bioportal-project/issues/210? I'd like to check the logs to see if any errors occurred while generating the pairwise counts for the BIOLINK + MONDO combination.
According to the code, the log output should be in a file called scheduler-mapping-counts.log, but it looks like this file hasn't been updated since April '21.