Closed AlanSimmons closed 1 year ago
I modified the script. This entailed a couple of minor related changes to the generation scripts for HUBMAP and UNIPROTKB, as well as unit testing.
Relying more heavily on ro.json, the current source of information for RO, required addressing two types of flaws in that file:
Some relations do not have inverses. The script creates a "pseudo-inverse" in the form of a relationship with prefix "inverse_". (The earlier script also did this, but missed some relationships.)
Some relations had incomplete information regarding their inverses. For example, RO_0002206 (expressed in) is listed as the inverse of RO_0002292 (expresses), but RO_0002292 is not listed as the corresponding inverse of RO_0002206. The script can now identify the appropriate inverse relationship instead of just creating a pseudo-inverse.
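As a rough illustration of the two cases above, here is a minimal sketch, not the actual script. It assumes the relation data has already been reduced to two plain dictionaries (`labels` and `declared_inverse`, both hypothetical names); the real structure of ro.json is not shown here. `EX_0000001` is a made-up relation used only to demonstrate the pseudo-inverse fallback.

```python
def resolve_inverses(labels, declared_inverse):
    """Return a mapping of relation ID -> label of its inverse.

    labels: relation ID -> label (hypothetical, simplified input).
    declared_inverse: relation ID -> ID of its declared inverse; may be
        one-directional, as in the RO_0002292 / RO_0002206 example.
    """
    # Make the inverse declarations symmetric: if A declares B as its
    # inverse, then B's inverse is A even when B does not say so.
    inverse = {}
    for rel, inv in declared_inverse.items():
        inverse.setdefault(rel, inv)
        inverse.setdefault(inv, rel)

    result = {}
    for rel, label in labels.items():
        inv = inverse.get(rel)
        if inv is not None and inv in labels:
            # A real inverse is known (declared in either direction).
            result[rel] = labels[inv]
        else:
            # No inverse available: fall back to a pseudo-inverse.
            result[rel] = "inverse_" + label
    return result


labels = {
    "RO_0002292": "expresses",
    "RO_0002206": "expressed in",
    "EX_0000001": "example relation",  # hypothetical, has no inverse
}
# Only one direction is declared, mirroring the flaw described above.
declared_inverse = {"RO_0002292": "RO_0002206"}

inverse_labels = resolve_inverses(labels, declared_inverse)
```

With this input, both RO relations resolve to each other's labels, and only the hypothetical relation receives an "inverse_" pseudo-inverse.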
Results of regression testing are here.
The new relations code improved the identification of relationships in the resulting knowledge graph.
Issue
The OWLNETS-UMLS-GRAPH script assumes the presence of 3 files that are the output of the PheKnowLator-based OWL-OWLNETS converter:
We recently started the Data Distillery (DD) project. We agreed to use a single code base for the ontology generation framework; that is, ontology graphs for DD, HuBMAP, SenNet, etc. would be generated identically.
For DD, we specified only two files:
The assumption was that the information in OWLNETS_relations.txt was redundant, and could be derived from the predicate field of OWLNETS_edges.txt.
Solution
The OWLNETS-UMLS-GRAPH script must be modified so that it no longer depends on the relations file to obtain relationship information.
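One way to obtain the set of relations without the relations file is to collect the distinct values of the predicate field in the edges file. The sketch below is an assumption-laden illustration, not the script's code: it assumes a tab-delimited file with a header row, and the `subject` and `object` column names and the sample rows are hypothetical. Only the `predicate` field is taken from the description above.

```python
import csv
import io


def distinct_predicates(edge_lines):
    """Return the sorted, de-duplicated predicate values from edge rows.

    edge_lines: any iterable of text lines (an open file works), assumed
        to be tab-delimited with a header that includes 'predicate'.
    """
    reader = csv.DictReader(edge_lines, delimiter="\t")
    return sorted({row["predicate"] for row in reader if row["predicate"]})


# Hypothetical sample standing in for OWLNETS_edges.txt.
sample_edges = io.StringIO(
    "subject\tpredicate\tobject\n"
    "EX:1\thttp://purl.obolibrary.org/obo/RO_0002292\tEX:2\n"
    "EX:2\thttp://purl.obolibrary.org/obo/RO_0002206\tEX:1\n"
    "EX:3\thttp://purl.obolibrary.org/obo/RO_0002292\tEX:4\n"
)

predicates = distinct_predicates(sample_edges)
```

For a real file, the same function could be called on an open file handle, for example `with open("OWLNETS_edges.txt", newline="") as f: distinct_predicates(f)`. Labels and inverses for each predicate would still have to come from another source, such as ro.json.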
Unintended consequences that need to be addressed