NCATSTranslator / testing

Materials and tools for testing Translator components

Aicardi Syndrome (MONDO:0014007) Correlated With Diseases #58

Open sstemann opened 3 years ago

sstemann commented 3 years ago

Query: aicardiCorrelated
PK: c9042e31-5862-4f17-8ced-057a3d33d01b
MONDO:0014007 (Disease)
Results Tracking Sheet

Results returned from: Unsecret Agent COHD

dkoslicki commented 3 years ago

The queried disease appears to be a subtype of Aicardi-Goutieres syndrome (MONDO:0018866). ARAX has an open issue about super-classes, https://github.com/RTXteam/RTX/issues/1367, and plans to implement that once we resolve the subclass_of issues. In the meantime, replacing MONDO:0014007 with MONDO:0018866 does return one (not interesting) result: https://arax.ncats.io/?r=8503

More interesting is if you remove the (imho, ambiguous) correlated_with predicate. Then you get quite a few results: https://arax.ncats.io/?r=8504

@CaseyTa I would be interested in hearing why COHD returns results while we (utilizing COHD) do not. Maybe some data is stale, or our "hooks" into COHD are antiquated?

marcdubybroad commented 3 years ago

The genetics KP doesn't have disease to disease edges, although we are investigating including such data in the future.

rtroper commented 3 years ago

While we have disease-disease associations in our EHR risk KP, we don't currently have this disease. I checked the EHR we have access to and there are a few hundred individuals with this, so we might be able to add it in a future version of our KP.

vdancik commented 3 years ago

MolePro does not support answering such questions.

dkoslicki commented 3 years ago

@CaseyTa During our investigation of this issue, we discovered something interesting: the input CURIE MONDO:0014007 is Aicardi-Goutieres syndrome 6, but COHD returns "basal ganglia disease", which corresponds to MONDO:0003996. When we query with the first CURIE, we get no results from COHD, but we do get results when we use the second CURIE. Do you know why COHD is calling MONDO:0014007 basal ganglia disease?

I suspect it might be due to the large OMOP mapping distance of 3:

"n0": {
  "category": [
    "biolink:DiseaseOrPhenotypicFeature"
  ],
  "id": "MONDO:0014007",
  "mapped_omop_concept": {
    "distance": 3,
    "omop_concept_id": 378144,
    "omop_concept_name": "Disorder of basal ganglia"
  }
}

dkoslicki commented 3 years ago

This now works for Expander Agent by using the following TRAPI 1.1 query graph:

{
  "message": {
    "query_graph": {
      "nodes": {
        "n0": {
          "ids": ["MONDO:0014007"],
          "categories": ["biolink:Disease"]
        },
        "n1": {
          "categories": ["biolink:Disease"]
        }
      },
      "edges": {
        "e01": {
          "subject": "n0",
          "object": "n1",
          "predicates": ["biolink:correlated_with"]
        }
      }
    }
  }
}

Results here: https://arax.ncats.io/?r=10333
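
For anyone who wants to rerun this with a different CURIE, the query graph above can be generated programmatically. This is only a minimal sketch: the helper name `build_one_hop_query` and its default arguments are hypothetical, chosen to mirror the JSON above, and submitting the result to an ARA endpoint is not shown.

```python
import json


def build_one_hop_query(curie, category="biolink:Disease",
                        predicate="biolink:correlated_with"):
    """Build a one-hop query graph shaped like the TRAPI 1.1 example above.

    Hypothetical helper: n0 is pinned to `curie`, n1 is left open,
    and both nodes share the same category.
    """
    return {
        "message": {
            "query_graph": {
                "nodes": {
                    "n0": {"ids": [curie], "categories": [category]},
                    "n1": {"categories": [category]},
                },
                "edges": {
                    "e01": {
                        "subject": "n0",
                        "object": "n1",
                        "predicates": [predicate],
                    }
                },
            }
        }
    }


query = build_one_hop_query("MONDO:0014007")
print(json.dumps(query, indent=2))
```

Passing `"MONDO:0018866"` instead would produce the parent-disease variant of the same query.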

CaseyTa commented 3 years ago

The mapping error is coming in from our call to EMBL-EBI's OxO. OxO is going from OMIM:615010 (Aicardi-Goutieres syndrome 6) to Orphanet:51 (Aicardi-Goutières syndrome) to ICD9CM:333.0 (Disorder of basal ganglia).

I've updated COHD to add better provenance on node mappings in the KG Node. Each array element is one step in the mapping chain and names the tool that produced it:

[
    {
        "distance": 1,
        "input_id": "MONDO:0014007",
        "input_label": "Aicardi-Goutieres syndrome 6",
        "output_id": "OMIM:615010",
        "source": "SRI Normalizer"
    },
    {
        "distance": 2,
        "input_id": "OMIM:615010",
        "output_id": "ICD9CM:333.0",
        "source": "EMBL-EBI OxO"
    },
    {
        "distance": 1,
        "input_id": "ICD9CM:333.0",
        "output_id": "OMOP:378144",
        "output_label": "Disorder of basal ganglia",
        "source": "OMOP"
    }
]
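
A consumer could use this provenance to spot suspicious mappings. The sketch below is an illustration rather than anything COHD ships: the function names and the "distance greater than 1" heuristic are assumptions. It checks that each step starts where the previous one ended, then flags steps that hide intermediate hops, such as the OxO step that passes through Orphanet:51.

```python
# Provenance steps copied from the array above.
provenance = [
    {"distance": 1, "input_id": "MONDO:0014007",
     "input_label": "Aicardi-Goutieres syndrome 6",
     "output_id": "OMIM:615010", "source": "SRI Normalizer"},
    {"distance": 2, "input_id": "OMIM:615010",
     "output_id": "ICD9CM:333.0", "source": "EMBL-EBI OxO"},
    {"distance": 1, "input_id": "ICD9CM:333.0",
     "output_id": "OMOP:378144",
     "output_label": "Disorder of basal ganglia", "source": "OMOP"},
]


def mapping_path(steps):
    """Return the chain of IDs, raising if consecutive steps do not connect."""
    if not steps:
        return []
    path = [steps[0]["input_id"]]
    for step in steps:
        if step["input_id"] != path[-1]:
            raise ValueError(
                f"broken chain: expected {path[-1]}, got {step['input_id']}"
            )
        path.append(step["output_id"])
    return path


def multi_hop_steps(steps):
    """Assumed heuristic: a step with distance > 1 hides intermediate mappings."""
    return [s for s in steps if s["distance"] > 1]


print(" -> ".join(mapping_path(provenance)))
for step in multi_hop_steps(provenance):
    print(f"check {step['source']}: {step['input_id']} -> {step['output_id']}")
```

Here the only flagged step is the EMBL-EBI OxO one, which is where the jump from a specific syndrome to a broad ICD9CM category happens.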