NCATSTranslator / Feedback

A repo for tracking gaps in Translator data and finding ways to fill them.

There are two results for Levothyroxine, which should be combined. https://ui.transltr.io/main/results?l=Hypothyroidism&i=MONDO:0005420&t=0&r=0&q=3b2da60c-9056-43f5-9001-ce9ec1cb3294 #757

Closed TranslatorIssueCreator closed 2 months ago

TranslatorIssueCreator commented 5 months ago

Type: Bug Report

URL: https://ui.transltr.io/main/results?l=Hypothyroidism&i=MONDO:0005420&t=0&r=0&q=3b2da60c-9056-43f5-9001-ce9ec1cb3294

ARS PK: 3b2da60c-9056-43f5-9001-ce9ec1cb3294

Steps to reproduce:

Search for "What drugs may treat Hypothyroidism?"

Screenshots:

gglusman commented 5 months ago

Looks like one result is a drug, the other is identified as a protein (weird).

Tested the same in CI, and the result is again duplicated, but there one is a 'drug' and the other is a 'small molecule'.

Noting that there's a third result in both cases, 'levothyroxine sodium'.

sandrine-muller-research commented 5 months ago

The 2 IDs reported separately for Levothyroxine are PUBCHEM.COMPOUND:5819 and UMLS:C0040165. In Dev instances, NodeNorm normalizes both IDs properly:

{
  "PUBCHEM.COMPOUND:5819": {
    "id": {
      "identifier": "CHEBI:18332",
      "label": "L-thyroxine"
    },
    "equivalent_identifiers": [
      {
        "identifier": "CHEBI:18332",
        "label": "L-thyroxine"
      },
      {
        "identifier": "CHEBI:30660",
        "label": "thyroxine"
      },
      {
        "identifier": "PUBCHEM.COMPOUND:5819",
        "label": "Levothyroxine"
      },
      {
        "identifier": "PUBCHEM.COMPOUND:853",
        "label": "DL-Thyroxine"
      },
      {
        "identifier": "CHEMBL.COMPOUND:CHEMBL1624",
        "label": "LEVOTHYROXINE"
      },
      {
        "identifier": "UNII:Q51BO43MG4",
        "label": "LEVOTHYROXINE"
      },
      {
        "identifier": "DRUGBANK:DB00451"
      },
      {
        "identifier": "MESH:D013974",
        "label": "Thyroxine"
      },
      {
        "identifier": "CAS:300-30-1"
      },
      {
        "identifier": "CAS:51-48-9"
      },
      {
        "identifier": "CAS:7488-70-2"
      },
      {
        "identifier": "DrugCentral:2646",
        "label": "levothyroxine"
      },
      {
        "identifier": "GTOPDB:2635",
        "label": "T4"
      },
      {
        "identifier": "HMDB:HMDB0000248",
        "label": "Thyroxine"
      },
      {
        "identifier": "KEGG.COMPOUND:C01829",
        "label": "Thyroxine"
      },
      {
        "identifier": "INCHIKEY:XUIIKFGFIJCVMT-LBPRGKRZSA-N"
      },
      {
        "identifier": "UMLS:C0040165",
        "label": "levothyroxine"
      },
      {
        "identifier": "RXCUI:10582"
      }
    ],
    "type": [
      "biolink:SmallMolecule",
      "biolink:MolecularEntity",
      "biolink:ChemicalEntity",
      "biolink:NamedThing",
      "biolink:PhysicalEssence",
      "biolink:ChemicalOrDrugOrTreatment",
      "biolink:ChemicalEntityOrGeneOrGeneProduct",
      "biolink:ChemicalEntityOrProteinOrPolypeptide",
      "biolink:PhysicalEssenceOrOccurrent"
    ],
    "information_content": 78.5
  }
}

NodeRes could map only the UMLS ID, not the PUBCHEM ID.

gaurav commented 2 months ago

Both PUBCHEM.COMPOUND:5819 and UMLS:C0040165 are merged into a single clique as CHEBI:18332 "T4" all the way up to NodeNorm Prod, so I'm going to close this issue. But if other L-thyroxines show up, please let me know!
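
For anyone who wants to re-check this kind of duplicate later, here is a minimal offline sketch of the "same clique" test. It assumes a response shaped like the NodeNorm JSON posted earlier in this thread (trimmed here to a few entries). The helper names `clique_leader` and `in_same_clique` are made up for illustration and are not part of NodeNorm or any Translator library.

```python
from typing import Optional

# Trimmed copy of the NodeNorm response quoted in this thread (Dev instance).
SAMPLE_RESPONSE = {
    "PUBCHEM.COMPOUND:5819": {
        "id": {"identifier": "CHEBI:18332", "label": "L-thyroxine"},
        "equivalent_identifiers": [
            {"identifier": "CHEBI:18332", "label": "L-thyroxine"},
            {"identifier": "PUBCHEM.COMPOUND:5819", "label": "Levothyroxine"},
            {"identifier": "UMLS:C0040165", "label": "levothyroxine"},
            {"identifier": "RXCUI:10582"},
        ],
    }
}


def clique_leader(response: dict, curie: str) -> Optional[str]:
    """Return the preferred identifier for `curie`, or None if it is absent."""
    entry = response.get(curie)
    if not entry:
        return None
    return entry["id"]["identifier"]


def in_same_clique(response: dict, curie_a: str, curie_b: str) -> bool:
    """True if `curie_b` is listed among the equivalents of `curie_a`."""
    entry = response.get(curie_a)
    if not entry:
        return False
    members = {e["identifier"] for e in entry.get("equivalent_identifiers", [])}
    return curie_b in members


if __name__ == "__main__":
    print(clique_leader(SAMPLE_RESPONSE, "PUBCHEM.COMPOUND:5819"))
    print(in_same_clique(SAMPLE_RESPONSE, "PUBCHEM.COMPOUND:5819", "UMLS:C0040165"))
```

If both IDs resolve to the same clique but the UI still shows two results, the duplication is probably being introduced somewhere other than NodeNorm.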