Open TranslatorIssueCreator opened 6 months ago
What genes may be downregulated by: Lepirudin
Also added to the chemical names asset sheet.
Note that the returned CURIE is a CHEMBL target type and not the actual protein CURIE for Thrombin, which should be CHEMBL:2108110.
Tested today at RENCI dev endpoints:
NameRes
Message:
{
  "curies": [
    "UniProtKB:P00734"
  ]
}
Response:
{
  "UniProtKB:P00734": {
    "curie": "UniProtKB:P00734",
    "names": [
      "F2",
      "DCP",
      "hF2",
      "PIVKA-II",
      "Factor II",
      "EC 3.4.21.5",
      "Prothrombin",
      "F2 protein, human",
      "prothrombin (human)",
      "Coagulation Factor II",
      "Des-Gamma Carboxyprothrombin",
      "Des-Gamma-Carboxy Prothrombin",
      "coagulation factor II (human)",
      "THRB_HUMAN Prothrombin (sprot)",
      "Protein Induced by Vitamin K Absence-II",
      "Protein Induced by Vitamin K Absence/Antagonist-II",
      "Protein Induced by Vitamin K Absence or Antagonist II"
    ],
    "types": [
      "Protein",
      "GeneProductMixin",
      "Polypeptide",
      "ChemicalEntityOrGeneOrGeneProduct",
      "ChemicalEntityOrProteinOrPolypeptide",
      "BiologicalEntity",
      "ThingWithTaxon",
      "NamedThing",
      "Entity",
      "GeneOrGeneProduct",
      "MacromolecularMachineMixin"
    ],
    "preferred_name": "THRB_HUMAN Prothrombin (sprot)",
    "shortest_name_length": 2,
    "clique_identifier_count": 6,
    "id": "82926183-9370-4db8-b0bd-99922e2f8fd1",
    "_version_": 1796561715672907800
  }
}
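For anyone who wants to reproduce the NameRes lookup above, here is a minimal sketch. The base URL and the `/reverse_lookup` path are assumptions about the deployment being tested, so substitute whichever instance you are pointing at. The helper names `build_reverse_lookup_request` and `preferred_names` are hypothetical and only illustrate the message shape quoted in this issue.

```python
import json
import urllib.request

# Assumption: base URL of a NameRes deployment; replace with the instance under test.
NAMERES_BASE_URL = "https://name-resolution-sri.renci.org"


def build_reverse_lookup_request(curies, base_url=NAMERES_BASE_URL):
    """Build (but do not send) a POST request carrying the {"curies": [...]} message."""
    body = json.dumps({"curies": list(curies)}).encode("utf-8")
    return urllib.request.Request(
        url=base_url.rstrip("/") + "/reverse_lookup",  # assumed endpoint path
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def preferred_names(response):
    """Map each CURIE in a NameRes-style response to its preferred_name (or None)."""
    return {curie: entry.get("preferred_name") for curie, entry in response.items()}


request = build_reverse_lookup_request(["UniProtKB:P00734"])
print(request.get_method(), request.full_url)
```

Sending it would be `urllib.request.urlopen(request)`, after which the decoded JSON can be passed to `preferred_names` to pull out the label in question.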
NodeNorm
{
  "UNIPROTKB:P00734": {
    "id": {
      "identifier": "NCBIGene:2147",
      "label": "F2"
    },
    "equivalent_identifiers": [
      { "identifier": "NCBIGene:2147", "label": "F2" },
      { "identifier": "ENSEMBL:ENSG00000180210", "label": "F2 (Hsap)" },
      { "identifier": "HGNC:3535", "label": "F2" },
      { "identifier": "OMIM:176930" },
      { "identifier": "UMLS:C1414504", "label": "F2 gene" },
      { "identifier": "UniProtKB:P00734", "label": "THRB_HUMAN Prothrombin (sprot)" },
      { "identifier": "PR:P00734", "label": "prothrombin (human)" },
      { "identifier": "ENSEMBL:ENSP00000308541" },
      { "identifier": "ENSEMBL:ENSP00000308541.5" },
      { "identifier": "UMLS:C3540506", "label": "Des-Gamma Carboxyprothrombin" },
      { "identifier": "UMLS:C5552806", "label": "F2 protein, human" }
    ],
    "type": [
      "biolink:Gene",
      "biolink:BiologicalEntity",
      "biolink:NamedThing",
      "biolink:GeneOrGeneProduct",
      "biolink:GenomicEntity",
      "biolink:ChemicalEntityOrGeneOrGeneProduct",
      "biolink:PhysicalEssence",
      "biolink:OntologyClass",
      "biolink:ThingWithTaxon",
      "biolink:PhysicalEssenceOrOccurrent",
      "biolink:MacromolecularMachineMixin",
      "biolink:Protein",
      "biolink:Polypeptide",
      "biolink:GeneProductMixin",
      "biolink:ChemicalEntityOrProteinOrPolypeptide"
    ],
    "information_content": 76.2
  }
}
At this point (perhaps just a versioning issue), I am not sure why NameRes is choosing "THRB_HUMAN Prothrombin (sprot)" as the preferred label given the NodeNorm output...
> Note that the returned CURIE is a CHEMBL target type and not the actual protein CURIE for Thrombin, which should be CHEMBL:2108110.
Yup, that's the key to what's going on here! It looks like "UNIPROTKB:P00734" is the label for the identifier http://identifiers.org/chembl.target/CHEMBL204 (presumably CHEMBL.TARGET:CHEMBL204). However, we don't have a CHEMBL.TARGET:CHEMBL204 in NodeNorm at all. I'm searching through our Proteins to see if it has a different prefix.
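A rough sketch of the URL-to-CURIE guess made in this comment. The rule used here (upper-case the identifiers.org namespace and join it to the local ID with a colon) is an assumption that matches the "presumably CHEMBL.TARGET:CHEMBL204" guess; it is not a guaranteed Biolink prefix mapping, and `identifiers_org_to_curie` is a hypothetical helper.

```python
from urllib.parse import urlparse


def identifiers_org_to_curie(url):
    """Turn http(s)://identifiers.org/<namespace>/<local_id> into NAMESPACE:local_id.

    Assumption: the CURIE prefix is just the upper-cased namespace.
    """
    parsed = urlparse(url)
    if parsed.netloc.lower() != "identifiers.org":
        raise ValueError(f"not an identifiers.org URL: {url}")
    parts = [part for part in parsed.path.split("/") if part]
    if len(parts) != 2:
        raise ValueError(f"expected /<namespace>/<local_id> in: {url}")
    namespace, local_id = parts
    return f"{namespace.upper()}:{local_id}"


print(identifiers_org_to_curie("http://identifiers.org/chembl.target/CHEMBL204"))
# prints CHEMBL.TARGET:CHEMBL204
```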
> At this point (perhaps just a versioning issue), I am not sure why NameRes is choosing "THRB_HUMAN Prothrombin (sprot)" as the preferred label given the NodeNorm output...
This is because that NodeNorm output has gene-protein conflation turned on, so it returns the preferred ID of NCBIGene:2147 ("F2"). NameRes currently has gene-protein conflation turned off, so UNIPROTKB:P00734 returns the label for UniProtKB:P00734, which is "THRB_HUMAN Prothrombin (sprot)". If you look this up on NodeNorm Prod with gene-protein conflation turned off, you'll see it's more similar to the NameRes output.
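To make the difference concrete, here is a small sketch that uses a trimmed copy of the conflated NodeNorm output quoted earlier in this thread. With conflation on, the clique's preferred label is the gene's ("F2"), while the label attached to UniProtKB:P00734 itself is the one NameRes reports. The base URL, the `get_normalized_nodes` path and the `conflate` query parameter are assumptions about the NodeNorm deployment, and the function names are hypothetical.

```python
import urllib.parse

# Assumption: base URL of a NodeNorm deployment; replace with the instance under test.
NODENORM_BASE_URL = "https://nodenormalization-sri.renci.org"


def normalized_nodes_url(curie, conflate, base_url=NODENORM_BASE_URL):
    """Build a lookup URL with gene-protein conflation switched on or off (assumed parameter)."""
    query = urllib.parse.urlencode({"curie": curie, "conflate": str(conflate).lower()})
    return f"{base_url}/get_normalized_nodes?{query}"


def clique_label(entry):
    """Label of the clique's preferred identifier."""
    return entry["id"].get("label")


def label_for(entry, curie):
    """Label of one specific clique member.

    Compared case-insensitively because the thread writes the prefix
    both as UNIPROTKB and as UniProtKB.
    """
    for ident in entry["equivalent_identifiers"]:
        if ident["identifier"].upper() == curie.upper():
            return ident.get("label")
    return None


# Trimmed copy of the conflated NodeNorm entry quoted in this thread.
conflated_entry = {
    "id": {"identifier": "NCBIGene:2147", "label": "F2"},
    "equivalent_identifiers": [
        {"identifier": "NCBIGene:2147", "label": "F2"},
        {"identifier": "UniProtKB:P00734", "label": "THRB_HUMAN Prothrombin (sprot)"},
        {"identifier": "PR:P00734", "label": "prothrombin (human)"},
    ],
}

print(clique_label(conflated_entry))                   # prints F2
print(label_for(conflated_entry, "UNIPROTKB:P00734"))  # prints THRB_HUMAN Prothrombin (sprot)
print(normalized_nodes_url("UniProtKB:P00734", conflate=False))
```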
We should still come up with a better name for NCBIGene:2147 ("F2"). I think we can try using the label prefix boosting to do this. I'm tracking this at https://github.com/TranslatorSRI/Babel/issues/312
> However, we don't have a CHEMBL.TARGET:CHEMBL204 in NodeNorm at all. I'm searching through our Proteins to see if it has a different prefix.
We really don't have CHEMBL.TARGET:CHEMBL204 (labelled "UNIPROTKB:P00734") anywhere in NodeNorm, but I agree that it is a terrible label :)
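A quick way to spot this kind of gap in bulk, as a sketch only. It assumes that NodeNorm maps a CURIE it does not know to null in its response; the `example_response` below is a hypothetical illustration of that shape, not real output, and the helper names are made up for this example.

```python
def missing_curies(normalized):
    """Return the CURIEs whose entry is None, i.e. assumed unknown to NodeNorm."""
    return sorted(curie for curie, entry in normalized.items() if entry is None)


def missing_prefixes(normalized):
    """Return the distinct prefixes of the missing CURIEs."""
    return sorted({curie.split(":", 1)[0] for curie in missing_curies(normalized)})


# Hypothetical response shape, for illustration only.
example_response = {
    "CHEMBL.TARGET:CHEMBL204": None,
    "UniProtKB:P00734": {"id": {"identifier": "NCBIGene:2147", "label": "F2"}},
}

print(missing_curies(example_response))    # prints ['CHEMBL.TARGET:CHEMBL204']
print(missing_prefixes(example_response))  # prints ['CHEMBL.TARGET']
```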
@cbizon Looking at https://arax.test.transltr.io/?r=22907b79-f50e-42fe-a9ba-5da28b380cbc, it looks like CHEMBL.TARGET:CHEMBL204 is being returned by Aragorn -- do you know where it's coming from? Should we add CHEMBL.TARGET to NodeNorm?
It looks like those edges are coming from MolePro. Tagging @vdancik
I'll check what went wrong
Type: Bug Report
URL: https://ui.test.transltr.io/main/results?l=Lepirudin&i=PUBCHEM.COMPOUND:118856773&t=4&r=0&q=9c30d7a5-be5f-4f44-a12e-376369954f57
ARS PK: 9c30d7a5-be5f-4f44-a12e-376369954f57
Steps to reproduce:
search for UNIPROTKB:P00734 (current top answer)