rtroper closed 2 years ago
@amykglen Is this a Genetics Provider issue? In spot checking, it appears that all the spurious edges are from them, and not from RTX-KG2. @NCATSTranslator/genetics-provider Are you aware of this error?
@amykglen, perhaps a quick fix is to have expand check whether the returned results are synonymous with the node given in the input query.
yes, I concur with @dkoslicki that this seems to be a Genetics KP issue.
one problem with @dkoslicki's proposed patch is that they could be rightfully returning some diseases that are subclasses of systemic scleroderma, and our synonymization wouldn't know that those are OK. though I suppose we could accept throwing those out as part of the temporary patch.
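For reference, the temporary patch discussed above could look roughly like the sketch below. This is not ARAX's actual Expand code: the edge dictionaries, the `filter_edges_for_pinned_node` helper, and the `EXAMPLE:` curies are all hypothetical, and the synonym set is assumed to come from whatever synonymizer is already in use. The optional `allowed_descendants` argument reflects the subclass caveat: if it is left empty, legitimate subclass answers get thrown out too.

```python
from typing import Dict, Iterable, List, Optional


def filter_edges_for_pinned_node(
    edges: List[Dict[str, str]],
    pinned_key: str,
    input_curie: str,
    synonyms: Iterable[str],
    allowed_descendants: Optional[Iterable[str]] = None,
) -> List[Dict[str, str]]:
    """Keep only edges whose pinned node is the input curie or an accepted equivalent.

    Hypothetical helper: `edges` is a simplified list of dicts, not a TRAPI structure.
    """
    # Everything we are willing to accept in place of the input curie.
    allowed = {input_curie} | set(synonyms) | set(allowed_descendants or ())
    return [edge for edge in edges if edge[pinned_key] in allowed]


# Placeholder curies (EXAMPLE:...) stand in for whatever the KP actually returned.
returned_edges = [
    {"subject": "EXAMPLE:gene_a", "object": "MONDO:0005100"},
    {"subject": "EXAMPLE:gene_b", "object": "EXAMPLE:synonym_of_input"},
    {"subject": "EXAMPLE:gene_c", "object": "EXAMPLE:unrelated_disease"},
]
kept = filter_edges_for_pinned_node(
    returned_edges,
    pinned_key="object",
    input_curie="MONDO:0005100",
    synonyms=["EXAMPLE:synonym_of_input"],
)
print([edge["subject"] for edge in kept])
```

With an empty `allowed_descendants`, only the edges for `gene_a` and `gene_b` survive; passing a set of known subclasses would let those through as well.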
does anyone at Genetics KP have an estimate as to when this could be addressed? (@marcdubybroad) depending on that we can decide if it's worth putting a patch in place on our end.
I'll look into this after the 3pm meeting.
We do have an issue, which we are in the process of fixing, where we return the originally submitted curie but with the name of a descendant disease. Could this be causing this issue?
@marcdubybroad I don't think descendants is the issue here: these diseases are not descendants of scleroderma
If I submit the following one hop query to the genetics kp, I get no results returned. I assume that other curies are being provided to the genetics kp. Does anyone have these?
{ "message": { "query_graph": { "edges": { "e00": { "subject": "n00", "object": "n01" } }, "nodes": { "n00": { "categories": ["biolink:Gene"] }, "n01": { "ids": ["MONDO:0005100"] } } } } }
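In case it helps with trying other curies, here is a small sketch that rebuilds the same one-hop query for any disease curie. The `build_one_hop_query` helper is hypothetical and only constructs the JSON body shown above; it does not send anything, since the endpoint to post to is not stated in this thread.

```python
import json
from typing import Any, Dict


def build_one_hop_query(
    disease_curie: str, subject_category: str = "biolink:Gene"
) -> Dict[str, Any]:
    """Build the one-hop query body above for an arbitrary disease curie (hypothetical helper)."""
    return {
        "message": {
            "query_graph": {
                "edges": {"e00": {"subject": "n00", "object": "n01"}},
                "nodes": {
                    # Unpinned subject node, constrained only by category.
                    "n00": {"categories": [subject_category]},
                    # Pinned object node: the disease curie under test.
                    "n01": {"ids": [disease_curie]},
                },
            }
        }
    }


query = build_one_hop_query("MONDO:0005100")
print(json.dumps(query, indent=2))
```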
yes, here's the problem query (for qedge e02): https://arax.ncats.io/api/arax/v1.2/status?id=21952
Fixed and deployed. Will create a unit test for this issue for future integration testing.
awesome, thanks @marcdubybroad! confirmed the problem appears resolved in the larger ARAX query: https://arax.ncats.io/?r=27989 (looks like Genetics KP doesn't find any answers for e02)
think this issue should be good to close, @rtroper?
Excellent, thank you, everyone! That was fast. The results look great now. I'll go ahead and close it.
Query C.2 was run using systemic scleroderma (MONDO:0005100) as the disease node and tocilizumab (CHEMBL.COMPOUND:CHEMBL1237022) as the specified drug node. Results are here: https://arax.ncats.io/?r=27582.
Several result graphs legitimately contain systemic scleroderma (e.g., see Results 3 - 5: methotrexate, prednisone, cyclophosphamide). However, there are also several results that have other diseases or concepts in place of systemic scleroderma. Below are some examples.
Upon closer inspection, it appears that for each of the unique drug results (node 3 in the query), a result exists for systemic scleroderma as well as for each of the conditions/concepts above (lymphocyte count, hypothyroidism, Crohn's disease, leukocyte count). So, it may be that the set of unique drug results is legitimate, but that the replicate results with these other diseases/concepts are the consequence of a disease synonymization/normalization bug somewhere.
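The pattern described above can be checked mechanically. The sketch below assumes the results have already been flattened into hypothetical `{"drug": ..., "disease": ...}` records (this is not the ARAX response format), and the sample rows are illustrative only. It groups results by drug and reports any disease/concept that appears alongside the expected disease.

```python
from collections import defaultdict
from typing import Dict, Iterable, List


def off_target_diseases_by_drug(
    results: Iterable[Dict[str, str]], expected_disease: str
) -> Dict[str, List[str]]:
    """For each drug that has the expected disease, list the other diseases/concepts seen."""
    diseases_by_drug = defaultdict(set)
    for result in results:
        diseases_by_drug[result["drug"]].add(result["disease"])
    report = {}
    for drug, diseases in diseases_by_drug.items():
        extras = diseases - {expected_disease}
        # Only flag drugs that have BOTH the expected disease and something else.
        if expected_disease in diseases and extras:
            report[drug] = sorted(extras)
    return report


# Illustrative rows only, in a hypothetical flattened form.
sample_results = [
    {"drug": "methotrexate", "disease": "systemic scleroderma"},
    {"drug": "methotrexate", "disease": "hypothyroidism"},
    {"drug": "methotrexate", "disease": "Crohn's disease"},
    {"drug": "prednisone", "disease": "systemic scleroderma"},
]
report = off_target_diseases_by_drug(sample_results, "systemic scleroderma")
print(report)
```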
I'm not sure where this issue is arising, so I'm not sure whom to assign it to. For now, I'm assigning it to the ARAX group, since the results linked above come from ARAX.
Here is the full query for reference: