NCATSTranslator / Feedback

A repo for tracking gaps in Translator data and finding ways to fill them.

multiple nodes for Glatimer #576

Open TranslatorIssueCreator opened 1 year ago

TranslatorIssueCreator commented 1 year ago

Type: Bug Report

URL: https://ui.transltr.io/main/results?l=Multiple%20Sclerosis&i=MONDO:0005301&t=0&q=7a9df8f8-b08d-483e-bcba-f01d2903579f

ARS PK: 7a9df8f8-b08d-483e-bcba-f01d2903579f

Steps to reproduce:

MVP1 for multiple sclerosis filter for Glatimer

Screenshots:

gglusman commented 1 year ago

'Glatimer' finds nothing, but 'Glatiramer' does indeed find duplicate results:

1st 'glatiramer': http://identifiers.org/unii/U782C039QP
2nd 'glatiramer': http://identifiers.org/pubchem.compound/65370
1st 'glatiramer acetate': http://identifiers.org/chembl.compound/CHEMBL1201507
2nd 'glatiramer acetate': http://identifiers.org/pubchem.compound/3081884
3rd 'glatiramer acetate': http://identifiers.org/umls/C0289884
3rd 'glatiramer': http://identifiers.org/drugbank/DB05259
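For anyone wanting to feed the identifiers listed in this comment into NodeNorm, here is a minimal sketch that turns the identifiers.org URLs into CURIEs. The helper name `identifiers_org_to_curie` is hypothetical, and upper-casing the prefix is an assumption that happens to match the usual Biolink-style prefixes (UNII, PUBCHEM.COMPOUND, CHEMBL.COMPOUND, UMLS, DRUGBANK); it is not a general-purpose converter.

```python
from urllib.parse import urlparse


def identifiers_org_to_curie(url: str) -> str:
    """Convert http://identifiers.org/<prefix>/<local id> to PREFIX:local_id.

    Assumption: the upper-cased identifiers.org prefix is the CURIE prefix.
    """
    path = urlparse(url).path.strip("/")
    prefix, _, local_id = path.partition("/")
    if not prefix or not local_id:
        raise ValueError(f"not an identifiers.org compact URL: {url!r}")
    return f"{prefix.upper()}:{local_id}"


# The six URLs reported in this comment, in the order they were listed.
GLATIRAMER_URLS = [
    "http://identifiers.org/unii/U782C039QP",
    "http://identifiers.org/pubchem.compound/65370",
    "http://identifiers.org/chembl.compound/CHEMBL1201507",
    "http://identifiers.org/pubchem.compound/3081884",
    "http://identifiers.org/umls/C0289884",
    "http://identifiers.org/drugbank/DB05259",
]

GLATIRAMER_CURIES = [identifiers_org_to_curie(u) for u in GLATIRAMER_URLS]
```

`GLATIRAMER_CURIES` can then be passed to whichever normalization service is being checked.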

gglusman commented 1 year ago

Noting also one result for 'dexamethasone' and two for 'dexamethasone sodium phosphate'.

gglusman commented 1 year ago

Actually there are plenty of duplicates. Even the top result 'mitoxanthrone' is returned six times: three have the exact same name 'Mitoxanthrone', one is 'Mitoxanthrone hydrochloride', and the other two have UNII and CHEMBL.COMPOUND CURIEs displayed.

sandrine-muller-research commented 1 year ago

On Glatimer:

NameRes does not output anything for this compound.
NodeNorm does not normalize UNII and DrugBank IDs.

cbizon commented 1 year ago

Yeah, pretty strange. At prod nodenorm, none of the Mitoxanthrone identifiers seem to be returning anything but null. @gaurav could there be a load problem for prod NN?

On dev nn, Mitoxanthrone is nicely normalized, and if you use the chemical conflation, it even pulls in the mitoxanthrone hydrochloride.
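A rough sketch of how the prod-versus-dev comparison described here could be scripted. It assumes NodeNorm exposes a `get_normalized_nodes` endpoint that takes repeated `curie` parameters plus `conflate` and `drug_chemical_conflate` flags, and that the JSON response maps each input CURIE to either an object or null. The base URL is a placeholder assumption and should be replaced with the instance under test. No request is sent here; the code only builds the URL and inspects an already-decoded response.

```python
from urllib.parse import urlencode

# Placeholder (assumption): point this at the prod or dev NodeNorm instance.
NODENORM_BASE = "https://nodenormalization-sri.renci.org"


def nodenorm_url(curies, drug_chemical_conflate=False, base=NODENORM_BASE):
    """Build a get_normalized_nodes URL for the given CURIEs.

    The endpoint and parameter names are assumptions, not confirmed in this thread.
    """
    params = [("curie", c) for c in curies]
    params.append(("conflate", "true"))
    params.append(
        ("drug_chemical_conflate", "true" if drug_chemical_conflate else "false")
    )
    return f"{base}/get_normalized_nodes?{urlencode(params)}"


def unnormalized(response: dict) -> list:
    """Return the input CURIEs whose response entry is null (None in Python)."""
    return sorted(curie for curie, entry in response.items() if entry is None)
```

Running `unnormalized` on the decoded responses from two instances and comparing the lists would show whether one instance has a load problem rather than a genuine gap in the data.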

gaurav commented 3 months ago

NameRes Prod now returns only a single result for Mitoxanthrone -- and it's a small molecule! https://name-lookup.ci.transltr.io/lookup?string=Mitoxanthrone&autocomplete=true&offset=0&limit=10
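To repeat this lookup for other drug names, the query string can be assembled programmatically. This sketch uses only the host, path, and parameters visible in the link above (`string`, `autocomplete`, `offset`, `limit`); the function name `lookup_url` is hypothetical, and nothing is assumed about the response format.

```python
from urllib.parse import urlencode

# Host and path copied from the link in this comment.
NAMERES_LOOKUP = "https://name-lookup.ci.transltr.io/lookup"


def lookup_url(name: str, autocomplete: bool = True, offset: int = 0, limit: int = 10) -> str:
    """Build a NameRes lookup URL with the same parameters as the link above."""
    query = urlencode(
        {
            "string": name,
            "autocomplete": str(autocomplete).lower(),  # "true" / "false"
            "offset": offset,
            "limit": limit,
        }
    )
    return f"{NAMERES_LOOKUP}?{query}"
```

For example, `lookup_url("Glatiramer")` yields the equivalent query for the compound this issue was originally filed about.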

Re: Glatiramer, it's a less pretty picture: with drug_chemical_conflation turned on, we return four cliques:

I don't think we want to conflate glatiramer with glatiramer acetate (unless we do?), but we definitely want to reduce this to two cliques.
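One way to track progress toward the two-clique target is to count how many distinct cliques a set of input CURIEs normalizes to. The sketch below assumes a NodeNorm-style decoded response in which each input CURIE maps to null or to an object whose preferred identifier sits at `["id"]["identifier"]`; that shape is an assumption, and the function name `cliques` is hypothetical.

```python
from collections import defaultdict


def cliques(response: dict) -> dict:
    """Group input CURIEs by the preferred identifier they normalize to.

    CURIEs that did not normalize (null entries) are grouped under None.
    Assumed response shape: {curie: {"id": {"identifier": ...}, ...} or None}.
    """
    groups = defaultdict(list)
    for curie, entry in response.items():
        preferred = entry["id"]["identifier"] if entry else None
        groups[preferred].append(curie)
    return dict(groups)


def clique_count(response: dict) -> int:
    """Number of distinct cliques, ignoring CURIEs that failed to normalize."""
    return sum(1 for preferred in cliques(response) if preferred is not None)
```

If glatiramer and glatiramer acetate are kept separate, `clique_count` on the full set of glatiramer identifiers would be expected to drop from four to two once the fix lands.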

Estimating for Hammerhead.

gaurav commented 3 months ago

> Noting also one result for 'dexamethasone' and two for 'dexamethasone sodium phosphate'.

This also appears to be fixed with drug-chemical conflation (see NameLookup CI result), but we're calling this conflated clique "dexamethasone acetate" for some reason. I've filed that as https://github.com/TranslatorSRI/Babel/issues/318