Open TranslatorIssueCreator opened 1 year ago
A better example: "Pubchem.compound:151537" instead of "4'-Epidoxorubicin (hydrochloride)". Or: "Pubchem.compound:9841834" instead of "Istaroxime".
I queried the compound Pubchem.compound:151537 through the NodeNorm endpoint and found that the label for this compound (decided by NodeNorm) is:
"(7S,9S)-7-[(2S,4S,5R,6S)-4-amino-5-hydroxy-6-methyloxan-2-yl]oxy-6,9,11-trihydroxy-9-(2-hydroxyacetyl)-4-methoxy-8,10-dihydro-7H-tetracene-5,12-dione"
So here is my guess of what is happening but I'll need confirmation from the UI team @dnsmith124 : when the label is too long, the decision was made to show the ID. Here the user is asking whether another rule could be used?
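If that guess is right, the display rule would look roughly like the sketch below. This is only an illustration of the guessed behavior, not the UI's actual code: the function name `display_name` and the 60-character threshold are hypothetical.

```python
# Hypothetical threshold; the real cutoff (if any) is not stated in this thread.
MAX_LABEL_LENGTH = 60


def display_name(curie, label, max_len=MAX_LABEL_LENGTH):
    """Return the label unless it is missing or too long, else the CURIE."""
    if not label or len(label) > max_len:
        return curie
    return label


if __name__ == "__main__":
    long_label = (
        "(7S,9S)-7-[(2S,4S,5R,6S)-4-amino-5-hydroxy-6-methyloxan-2-yl]oxy-"
        "6,9,11-trihydroxy-9-(2-hydroxyacetyl)-4-methoxy-8,10-dihydro-7H-"
        "tetracene-5,12-dione"
    )
    print(display_name("PUBCHEM.COMPOUND:151537", long_label))
    print(display_name("PUBCHEM.COMPOUND:9841834", "Istaroxime"))
```

An alternative rule would be to truncate the long label instead of replacing it, but a truncated systematic name is unlikely to be more readable than the CURIE.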
The second issue here is that NodeNorm is not choosing the optimal label. This is a known issue.
@gprice1129 can you speak to whether the backend is doing this with the names? The UI's frontend simply displays the names provided, and in the results I'm seeing from the example the 'Pubchem' terms are being given as the names for these results.
@sandrine-muller-research where does your preferred name come from?
@dnsmith124 @sandrine-m @MarkDWilliams the backend just takes the names we are given by the ARS. The ARS should be converting these names from CURIEs to whatever name is decided as the "best" one by NodeNorm.
@Genomewide from the NodeNorm PROD endpoint. @MarkDWilliams does the ARS do anything on top of NodeNorm to decide the best label for the compound?
This still happens; I don't know if there is a solution. @gaurav https://ui.test.transltr.io/main/results?l=VPS13B%20(Human)&i=NCBIGene:157680&t=1&r=0&q=d9bc14f5-c11a-4625-aef7-1ed76c3f7179
Here's how we're doing on NodeNorm CI:
I'm tracking poor preferred names in this spreadsheet as well as in https://github.com/TranslatorSRI/Babel/issues/306, but that work won't help these two cliques, because none of the other identifiers in them has a good label. So we will probably need to pull in additional sources of labels and identifiers to fully fix this. I'm going to come back to this in Hammerhead, but unless there's a good source we're missing, this will likely go unfixed this year.
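To make the "no good label in the clique" case concrete, here is a minimal sketch of scanning a clique for a usable label. It assumes a NodeNorm-style entry with an `id` object and an `equivalent_identifiers` list, each carrying `identifier` and an optional `label`. The helper names, the length cutoff, and the stereodescriptor regex are all hypothetical heuristics, not how Babel or NodeNorm actually rank labels.

```python
import re

# Hypothetical heuristic: stereodescriptor prefixes such as "(7S,9S)" or "(2S,4S,5R,6S)"
# suggest a systematic (IUPAC-style) name rather than a common name.
_STEREO = re.compile(r"\(\d*[RSEZ](,\d*[RSEZ])*\)")


def looks_systematic(label, max_len=50):
    """Guess whether a label is a systematic name (too long or stereo-prefixed)."""
    return len(label) > max_len or bool(_STEREO.search(label))


def pick_label(entry):
    """Return the first non-systematic label in the clique, or None if there is none."""
    candidates = [entry.get("id", {})] + entry.get("equivalent_identifiers", [])
    for candidate in candidates:
        label = candidate.get("label")
        if label and not looks_systematic(label):
            return label
    return None
```

When `pick_label` returns `None`, which is the situation described above for these two cliques, the caller would still need a fallback such as showing the CURIE or pulling labels from an additional source.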
(There's another ticket where we're discussing other solutions, such as having the UI display the CURIE -- "PUBCHEM.COMPOUND:151537" instead of "(7S,9S)-..." -- see https://github.com/NCATSTranslator/Feedback/issues/759)
It still happens a lot and is most obvious on a new query, since Improve and Unsecret are the fastest to return.
Type: Bug Report
URL: https://ui.transltr.io/main/results?l=VPS13B%20(Human)&i=NCBIGene:157680&t=1&q=9c390038-73fc-4d7d-8c63-c08d85eda8b0
ARS PK: 9c390038-73fc-4d7d-8c63-c08d85eda8b0
Steps to reproduce:
Search for drugs that upregulate VPS13B
Screenshots: