hubmapconsortium / ontology-api

The HuBMAP Ontology Service
MIT License

Ontology: remove SUIs #170

Closed: AlanSimmons closed this issue 1 year ago

AlanSimmons commented 1 year ago

Issue

The framework currently assumes that SUIs (String Unique Identifiers, the UMLS's permanent identifiers for strings) are relevant. For codes from "ontologies" that are not in the UMLS, the framework generates unique base64 SUIs. The framework associates a SUI with every Term node.
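
As a rough illustration of why a generated SUI may add little beyond the term string, here is a minimal sketch that assumes the generated SUI is simply a URL-safe base64 encoding of the term text. That is an assumption for illustration only; `make_sui` and `term_from_sui` are hypothetical helpers, not functions from the framework.

```python
import base64


def make_sui(term: str) -> str:
    """Hypothetical helper: build a SUI-like identifier by base64-encoding the term string."""
    return base64.urlsafe_b64encode(term.encode("utf-8")).decode("ascii")


def term_from_sui(sui: str) -> str:
    """Hypothetical helper: recover the term string from a SUI built by make_sui."""
    return base64.urlsafe_b64decode(sui.encode("ascii")).decode("utf-8")


sui = make_sui("Kidney")
# The identifier is fully determined by the term string and decodes back to it,
# so under this assumption it carries no information that the name lacks.
print(sui, "->", term_from_sui(sui))
```

Under this assumption, two Term nodes have the same SUI exactly when they have the same name, which is consistent with treating the name as the thing that matters.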

We have decided that the SUIs are no longer important: what matters is the actual term string, which corresponds to the name property of the Term node.

The request is to remove the dependency on SUIs from the framework. We anticipate that this will reduce both the complexity of the UBKG (e.g., by obviating the need for a SUIs.CSV file) and the size of the CSV source (by up to about 1 GB).

Solution

  1. Modify the OWLNETS-to-CSV script (OWLNETS-UMLS-GRAPH.py) so that it no longer works with SUI-related files. This may entail changes to assignments between CUIs and terms.
  2. Modify the script that exports the UMLS data from Neptune to "seed CSVs" so that it no longer exports SUI data.
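
One possible shape for the change described in the steps above is to link codes directly to term names instead of routing through a SUI key. The sketch below uses made-up CSV layouts, column names, and identifiers (`SUI:ID`, `:START_ID`, `:END_ID`, `EXAMPLE 0001`, `C0000001`); the real files produced by the framework may be organized differently.

```python
import csv
import io

# Hypothetical stand-ins for a SUI lookup file and a code-to-SUI relationship file.
SUIS_CSV = "SUI:ID,name\nS001,Kidney\nS002,Renal organ\n"
CODE_SUIS_CSV = (
    ":START_ID,:END_ID,:TYPE,CUI\n"
    "EXAMPLE 0001,S001,PT,C0000001\n"
    "EXAMPLE 0001,S002,SY,C0000001\n"
)


def read_rows(text):
    """Parse CSV text into a list of dicts keyed by column name."""
    return list(csv.DictReader(io.StringIO(text)))


def link_codes_to_names(sui_rows, code_sui_rows):
    """Replace each SUI reference with the term name it stands for."""
    name_by_sui = {row["SUI:ID"]: row["name"] for row in sui_rows}
    return [
        {
            "code": row[":START_ID"],
            "name": name_by_sui[row[":END_ID"]],
            "type": row[":TYPE"],
            "cui": row["CUI"],
        }
        for row in code_sui_rows
    ]


code_terms = link_codes_to_names(read_rows(SUIS_CSV), read_rows(CODE_SUIS_CSV))
for row in code_terms:
    print(row)
```

After a transformation like this, the SUI lookup file is no longer needed downstream, which is the kind of simplification the issue anticipates. Whether CUI-to-term assignments survive unchanged would still need regression testing.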

Risks

This is a fundamental change to the functioning of the generation framework. Regression testing will be required.

AlanSimmons commented 1 year ago

I am recommending that we not remove SUIs.

Email attached: Mail - Simmons, Alan - Outlook.pdf

AlanSimmons commented 1 year ago

Moved to UBKG.