Open gglusman opened 6 months ago
Likely related to #657.
https://ui.test.transltr.io/main/results?l=Aggressive+Systemic+Mastocytosis&i=MONDO%3A0020333&t=0&r=f58a3afe&q=246f5c17-fd36-4980-92fe-e8cf105e53fc: top 52 results have 0 publications, 0 clinical trials, and 1 source (Unsecret Agent) supporting.
BTE is showing up as a primary knowledge source on a treats edge for Etanercept
It's not totally clear whether these are inferred edges missing their support graphs or lookup edges with a misassigned primary knowledge source, but in either case, the new UI in test will make this more straightforward to diagnose.
EDIT: Looks like a UI issue?
Regarding PRP + Etanercept: see BTE's response in ARAX-UI for this PK, result 2
PRP + Tretinoin / Fluconazole: see BTE's response in ARAX-UI for this PK: Tretinoin is result 4, Fluconazole is result 10
Note for Aggressive Systemic Mastocytosis and Unsecret Agent: I suspect it's the same problem as BTE's. Now the results are probably missing, and the ARAX-UI shows that Unsecret agent has inferred edges with support-graphs.
agree with @colleenXu's analysis -- I think BTE is reporting what it knows appropriately.
@gprice1129 do you agree that this is a UI issue?
@colleenXu @gprice1129 @andrewsu I think I see what the technical problem is. I don't think it is a UI problem.
I also don't think fixing it will really address @gglusman's issue. I think the real issue here is that the AEOLUS website for the primary source is unhelpful in finding out why that edge is there. I think this is related to the infores work, which is hopefully getting better. Is that right, Gwênlyn?
I think I found the technical issue though! It looks like BTE may have missing edges. (Edit - I explain more below - it may be in the merge in ARS?)
The edge IDs suggest BTE meant this to be a 1-hop inferred treats? With one edge as the aux_graph. And Aragorn and ARAX meant there to be a 1-hop lookup edge. All pointing to aeolus as the knowledge source. Which makes this an odd case, but allowed and we would support it. If all 3 had just been a lookup then the display would definitely be correct. What is missing is the 'inferred' part.
The lack of support graph data is what breaks it according to Gus.
I see the following for BTE:
BTE analysis:
{
  "score": 0.9504392119646858,
  "attributes": null,
  "resource_id": "infores:biothings-explorer",
  "edge_bindings": {
    "t_edge": [
      {
        "id": "b4eb32ffb57766c71724794168601b13",
        "attributes": null
      },
      {
        "id": "inferred-UNII:OP401G7OJC-treats-MONDO:0100017",
        "attributes": null
      }
    ]
  },
  "scoring_method": null,
  "support_graphs": null
}
Edge 1
"inferred-UNII:OP401G7OJC-treats-MONDO:0100017": {
  "object": "MONDO:0100017",
  "sources": [
    {
      "resource_id": "infores:biothings-explorer",
      "resource_role": "primary_knowledge_source",
      "upstream_resource_ids": []
    }
  ],
  "subject": "UNII:OP401G7OJC",
  "predicate": "biolink:treats",
  "attributes": [
    {
      "value": [
        "inferred-UNII:OP401G7OJC-treats-MONDO:0100017-support0"
      ],
      "attribute_type_id": "biolink:support_graphs"
    }
  ]
},
I don't see any edges for the graph "inferred-UNII:OP401G7OJC-treats-MONDO:0100017-support0". So there would not be any paths shown under the inferred edge for this. It would just be displayed like an inferred edge with BTE as the knowledge source. Or it may break it, I am not sure.
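A minimal sketch of the check described above: list every edge whose biolink:support_graphs attribute points at an auxiliary graph that is missing or has no edges. It assumes a TRAPI-style message held as a plain Python dict with knowledge_graph.edges and auxiliary_graphs keyed by ID; the function name and the trimmed example message are hypothetical, not taken from any Translator codebase.

```python
SUPPORT_GRAPHS = "biolink:support_graphs"


def dangling_support_graphs(message):
    """Map edge ID -> support graph IDs that are missing or empty in the message."""
    edges = message.get("knowledge_graph", {}).get("edges", {})
    aux_graphs = message.get("auxiliary_graphs") or {}
    problems = {}
    for edge_id, edge in edges.items():
        for attr in edge.get("attributes") or []:
            if attr.get("attribute_type_id") != SUPPORT_GRAPHS:
                continue
            for graph_id in attr.get("value") or []:
                graph = aux_graphs.get(graph_id)
                # A graph that is absent, or present without edges, cannot be drawn.
                if not graph or not graph.get("edges"):
                    problems.setdefault(edge_id, []).append(graph_id)
    return problems


# Trimmed-down stand-in for the merged message discussed in this thread.
INFERRED = "inferred-UNII:OP401G7OJC-treats-MONDO:0100017"
example_message = {
    "knowledge_graph": {
        "edges": {
            INFERRED: {
                "subject": "UNII:OP401G7OJC",
                "object": "MONDO:0100017",
                "predicate": "biolink:treats",
                "attributes": [
                    {"attribute_type_id": SUPPORT_GRAPHS,
                     "value": [INFERRED + "-support0"]}
                ],
            },
            "b4eb32ffb57766c71724794168601b13": {
                "subject": "UNII:OP401G7OJC",
                "object": "MONDO:0100017",
                "predicate": "biolink:treats",
                "attributes": [],
            },
        }
    },
    "auxiliary_graphs": {},  # the support graph is referenced but not present
}

print(dangling_support_graphs(example_message))
```

Running this against both the raw ARA response and the merged response would show at which stage the support graph goes missing.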
Edge 2
"b4eb32ffb57766c71724794168601b13": {
  "object": "MONDO:0100017",
  "sources": [
    {
      "resource_id": "infores:aeolus",
      "resource_role": "primary_knowledge_source",
      "upstream_resource_ids": []
    },
    {
      "resource_id": "infores:mychem-info",
      "resource_role": "aggregator_knowledge_source",
      "upstream_resource_ids": [
        "infores:aeolus"
      ]
    },
    {
      "resource_id": "infores:biothings-explorer",
      "resource_role": "aggregator_knowledge_source",
      "upstream_resource_ids": [
        "infores:mychem-info"
      ]
    }
  ],
  "subject": "UNII:OP401G7OJC",
  "predicate": "biolink:treats",
  "attributes": []
},
This would show a lookup.
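To make the difference between the two edges concrete, here is a rough sketch that labels an edge as "inferred" or "lookup" and pulls out its primary knowledge source. The rule used (a non-empty biolink:support_graphs attribute means inferred) is my assumption based on the two edges above, not necessarily the rule the UI applies, and describe_edge is a hypothetical helper.

```python
def describe_edge(edge):
    """Summarize an edge: inferred vs lookup, plus its primary knowledge source(s)."""
    primary = [
        source["resource_id"]
        for source in edge.get("sources") or []
        if source.get("resource_role") == "primary_knowledge_source"
    ]
    # Assumption: an edge counts as inferred when it carries support graphs.
    inferred = any(
        attr.get("attribute_type_id") == "biolink:support_graphs" and attr.get("value")
        for attr in edge.get("attributes") or []
    )
    return {"kind": "inferred" if inferred else "lookup", "primary": primary}


edge_1 = {
    "sources": [
        {"resource_id": "infores:biothings-explorer",
         "resource_role": "primary_knowledge_source"},
    ],
    "attributes": [
        {"attribute_type_id": "biolink:support_graphs",
         "value": ["inferred-UNII:OP401G7OJC-treats-MONDO:0100017-support0"]},
    ],
}
edge_2 = {
    "sources": [
        {"resource_id": "infores:aeolus",
         "resource_role": "primary_knowledge_source"},
        {"resource_id": "infores:mychem-info",
         "resource_role": "aggregator_knowledge_source"},
        {"resource_id": "infores:biothings-explorer",
         "resource_role": "aggregator_knowledge_source"},
    ],
    "attributes": [],
}

print(describe_edge(edge_1))
print(describe_edge(edge_2))
```

Under that assumption, Edge 1 comes out as inferred with BTE as primary source and Edge 2 as a lookup with AEOLUS as primary source, which matches how the two are described here.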
I'm not sure I understand @Genomewide - the ARAGORN example uses the attribute "biolink:support_graphs" on the inferred edge, just as BTE does. That, as I understand it, is the right attribute name in both cases. And both point to auxiliary graphs that have edges in them.
The only thing I can't verify for sure from this comment is whether the BTE aux graph "inferred-UNII:OP401G7OJC-treats-MONDO:0100017-support0" has the right edge in it.
But if it does, then I'm not clear on why the UI will show the aux graph for ARAGORN's and not BTE's result.
Also, I'm not clear on what your last comment there is referring to:
Also, should this be caught in ARAX and show a warning or something?
Maybe I'm missing the point here...
@cbizon You are right, I had to back that out of what I put above. I edited to remove it, but you may still see the old answer. I think BTE is just missing support graphs.
Here is the weird kicker! And I think Gus just figured it out!
ARAX displays this for the BTE result.
I only look at the merged JSON and not the individual ones. I bet it is getting cut in the merge. According to the data I see inferred-PUBCHEM.COMPOUND:3365-treats-MONDO:0100017-support0 has no edges. It is just referenced like the other answer above.
@MarkDWilliams can you check this?
Here is the ref link again.
https://arax.ci.transltr.io/?r=cfcdc63b-f49f-4ebd-bda1-c2510bd353f1
Oh ok, thanks @Genomewide !
I am so sorry you read all of that! I did not want to leave it in bc it was confusing so I rewrote history a bit, but you were too on top of it and got the incorrect and (what I hope are) correct parts.
Just to be extra clear on why the BTE edges are missing: the current UI code treats the entire analysis as invalid if it can't find any of the referenced nodes, edges, or support graphs in the analysis, no matter how many levels deep in the support graphs the missing reference occurs.
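For anyone who wants to reproduce that behavior outside the UI, a rough sketch of this kind of all-or-nothing check follows. It is not the UI's actual code: it assumes a TRAPI-style message as a Python dict, and analysis_is_valid and the tiny example are hypothetical.

```python
def analysis_is_valid(analysis, message):
    """False if any node, edge, or support graph reachable from the analysis is missing."""
    kg = message.get("knowledge_graph", {})
    nodes = kg.get("nodes", {})
    edges = kg.get("edges", {})
    aux_graphs = message.get("auxiliary_graphs") or {}
    seen_graphs = set()  # guards against cycles between support graphs

    def edge_ok(edge_id):
        edge = edges.get(edge_id)
        if edge is None:
            return False
        if edge.get("subject") not in nodes or edge.get("object") not in nodes:
            return False
        for attr in edge.get("attributes") or []:
            if attr.get("attribute_type_id") == "biolink:support_graphs":
                if not all(graph_ok(g) for g in attr.get("value") or []):
                    return False
        return True

    def graph_ok(graph_id):
        if graph_id in seen_graphs:
            return True  # already checked, or currently being checked
        seen_graphs.add(graph_id)
        graph = aux_graphs.get(graph_id)
        if graph is None:
            return False
        return all(edge_ok(e) for e in graph.get("edges") or [])

    for bindings in (analysis.get("edge_bindings") or {}).values():
        for binding in bindings:
            if not edge_ok(binding["id"]):
                return False
    return all(graph_ok(g) for g in analysis.get("support_graphs") or [])


message = {
    "knowledge_graph": {
        "nodes": {"UNII:OP401G7OJC": {}, "MONDO:0100017": {}},
        "edges": {
            "lookup": {"subject": "UNII:OP401G7OJC", "object": "MONDO:0100017",
                       "attributes": []},
            "inferred": {"subject": "UNII:OP401G7OJC", "object": "MONDO:0100017",
                         "attributes": [{"attribute_type_id": "biolink:support_graphs",
                                         "value": ["support0"]}]},
        },
    },
    "auxiliary_graphs": {},  # "support0" was dropped somewhere upstream
}
analysis = {"edge_bindings": {"t_edge": [{"id": "lookup"}, {"id": "inferred"}]}}

print(analysis_is_valid(analysis, message))
```

In the example the lookup edge is perfectly fine, but the single missing support graph still invalidates the whole analysis, so neither edge would be shown for that ARA.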
I also verified that in the raw message from BTE the support graphs referenced in the missing edges on the UI do appear in the auxiliary_graph field. So this is definitely being removed somewhere in the ARS merge @MarkDWilliams.
Taking a look at this now to see what the root cause of the issue is.
Shervin was able to dig into these results (thanks @ShervinAbd92 !) and I believe she found the issue. Reposting her comments here as she's AFK for a bit.
In the removed_block function, the aux_graph "inferred-UNII:OP401G7OJC-treats-MONDO:0100017-support0" is added to the aux_graph_to_remove list since there is an overlap between "a8095addc72a5c9785059bda32cd940f" and "MONDO:0100017-has_phenotype-MONDO:0005070-via_subclass", which is in the list of edges_to_remove because it has an "object" that is in the nodes_to_remove list from the block list.
So, it looks like MONDO:0005070: "Tumor"
We have a few options here as I see it, and I'm happy to facilitate whatever folks want to see.
For 3, I believe the behavior that we're seeing here is in line with what got discussed on the TAQA breakout for how the blocking should work, but I'm happy to change that if, seeing it in action, we have different feelings about it. I'll lay out the broad strokes of what we have implemented below for clarity:
- We look at each of the auxiliary_graphs, and if it contains edges that we removed, we remove the whole aux graph. The thinking here was that aux graphs were often interconnected, and removing just one (or some subset of the total) edge would leave a graph that didn't make sense.
- We then go through the results. Results that have a blocked node as part of their node_bindings on the actual result object just get removed entirely. That is, if a result says "Water treats Diabetes", we remove the whole result.
- If the result is ok at the top level, we move on to looking in the analyses.
- If there are edge_bindings in the analysis that are among our bad edges, we remove just those.
- If a support_graph in the analysis (which is basically a list of edge_bindings) has any of our edges to be removed, we remove those from the support_graph as well.
- If we end up with a result that now has no analyses (i.e. a result that only had "bad" evidence supporting it), we remove that result because we don't want to show something with zero evidence.

Apologies for the long post with a series of lists, but I just wanted to make sure everything was as clear as possible. Does anyone have any thoughts on which option we should pursue?
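The steps above can be sketched roughly as follows. This is not the ARS implementation: apply_blocklist, the example IDs ("lookup", "inferred", "support0", and so on), and the rule for when an analysis is dropped (any query edge left without bindings) are assumptions for illustration. The analysis-level support_graphs are treated as a list of aux graph IDs, which is also an assumption.

```python
import copy


def apply_blocklist(message, blocked_nodes):
    """Return a filtered copy of a TRAPI-style message (illustrative only)."""
    msg = copy.deepcopy(message)
    blocked = set(blocked_nodes)
    kg = msg["knowledge_graph"]

    # Drop blocked nodes, then every edge whose subject or object is blocked.
    kg["nodes"] = {n: v for n, v in kg["nodes"].items() if n not in blocked}
    bad_edges = {e for e, v in kg["edges"].items()
                 if v["subject"] in blocked or v["object"] in blocked}
    kg["edges"] = {e: v for e, v in kg["edges"].items() if e not in bad_edges}

    # An aux graph containing any removed edge is removed as a whole.
    aux = msg.get("auxiliary_graphs") or {}
    bad_graphs = {g for g, v in aux.items() if bad_edges & set(v.get("edges", []))}
    msg["auxiliary_graphs"] = {g: v for g, v in aux.items() if g not in bad_graphs}

    kept_results = []
    for result in msg.get("results") or []:
        bound = {b["id"] for bs in result.get("node_bindings", {}).values() for b in bs}
        if bound & blocked:
            continue  # a blocked node is bound at the top level: drop the result
        kept_analyses = []
        for analysis in result.get("analyses") or []:
            bindings = {q: [b for b in bs if b["id"] not in bad_edges]
                        for q, bs in (analysis.get("edge_bindings") or {}).items()}
            if not all(bindings.values()):
                continue  # assumption: drop an analysis with an unbound query edge
            analysis["edge_bindings"] = bindings
            if analysis.get("support_graphs"):
                analysis["support_graphs"] = [
                    g for g in analysis["support_graphs"] if g not in bad_graphs]
            kept_analyses.append(analysis)
        if kept_analyses:  # results with no remaining analyses are dropped
            result["analyses"] = kept_analyses
            kept_results.append(result)
    msg["results"] = kept_results
    return msg


DRUG, DISEASE, TUMOR = "UNII:OP401G7OJC", "MONDO:0100017", "MONDO:0005070"
example = {
    "knowledge_graph": {
        "nodes": {DRUG: {}, DISEASE: {}, TUMOR: {}},
        "edges": {
            "lookup": {"subject": DRUG, "object": DISEASE, "attributes": []},
            "inferred": {"subject": DRUG, "object": DISEASE,
                         "attributes": [{"attribute_type_id": "biolink:support_graphs",
                                         "value": ["support0"]}]},
            "disease-tumor": {"subject": DISEASE, "object": TUMOR, "attributes": []},
        },
    },
    "auxiliary_graphs": {"support0": {"edges": ["lookup", "disease-tumor"]}},
    "results": [{
        "node_bindings": {"sn": [{"id": DRUG}], "on": [{"id": DISEASE}]},
        "analyses": [{"edge_bindings": {"t_edge": [{"id": "lookup"}, {"id": "inferred"}]}}],
    }],
}

filtered = apply_blocklist(example, [TUMOR])
print(sorted(filtered["knowledge_graph"]["edges"]), filtered["auxiliary_graphs"])
```

Note what happens in the example: "support0" is removed as a whole even though it also contains an unblocked edge, and the surviving "inferred" edge still lists "support0" in its biolink:support_graphs attribute, because none of the steps touch edge attributes.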
This mostly makes sense, but if I understand it all, then the removal of Tumor should eventually have led to the removal of the result, but it didn't. Is that wrong?
I guess I also wonder about whether Tumor should be on the block list. For instance something like Chemical reduces Tumors and therefore treats Cancer X seems like a valid path?
Are we mixing up different uses for the blocklist i.e. is Tumor on there for another good reason that I'm not thinking of?
So it looks like most of the discussion is about BTE's PRP disease and Etanercept drug result.
Here's the screenshot for that (Andy's post shows a different result - Fluconazole drug):
And the aux-graph inferred-UNII:OP401G7OJC-treats-MONDO:0100017-support0 edges:
Mark said:
the Aux graph is slated to be removed because it contained an edge to be removed (and no other edges. If it had other "legitimate" edges, it would only have the blocked edges removed but the aux graph as a whole would remain.)
But it looks like this aux-graph has many edges that don't involve tumor MONDO:0005070. So shouldn't the support graph have been kept - just with the tumor-edges + tumor-node removed from the aux-graph/knowledge-graph?
- MONDO:0005070 = neoplasm here, not tumor?
- [...] inferred-PUBCHEM.COMPOUND:3365-treats-MONDO:0100017-support0 than just the ones for tumor/neoplasm/MONDO:0005070
The initial thinking with this logic was that removing edges from aux graphs would leave aux graphs that were disconnected or didn't make sense. So, we remove the whole aux_graph. If folks want this behavior changed, we could remove just the tumor edge and leave the rest. It might just leave us with some funky aux graphs in the future.
@cbizon It would only "trickle up" to remove the whole result if removing this aux graph left us with a result that had no supporting evidence.
@MarkDWilliams Is there a time when this would leave us with an inferred edge that has no aux graphs?
Also, would be good to have @gprice1129 look at the explanation and see when he thinks our system would just boot the result. The reason this one still showed up was because others reported it. It would disappear if not.
A couple things I want to clarify:
My main point is that if the ARS is removing anything, it needs to systematically remove it everywhere.
Agree. Dangling references are a bug on the ARS side and should be fixed. Also agree with the overall principle that if you're removing something, you should remove all references to it.
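A minimal sketch of what "remove all references to it" could look like for support graphs, assuming a TRAPI-style message as a Python dict. The function name is hypothetical, and what to do with an inferred edge that ends up with no support graphs at all is left to the caller, since that question is still open in this thread.

```python
SUPPORT_GRAPHS = "biolink:support_graphs"


def prune_support_graph_refs(message):
    """Drop references to aux graphs that no longer exist (modifies message in place).

    Returns the IDs of edges whose support_graphs attribute ended up empty.
    """
    aux_graphs = message.get("auxiliary_graphs") or {}
    edges = message.get("knowledge_graph", {}).get("edges", {})
    unsupported = []
    for edge_id, edge in edges.items():
        for attr in edge.get("attributes") or []:
            if attr.get("attribute_type_id") != SUPPORT_GRAPHS:
                continue
            # Keep only references that still resolve to an aux graph.
            attr["value"] = [g for g in attr.get("value") or [] if g in aux_graphs]
            if not attr["value"]:
                unsupported.append(edge_id)
    return unsupported


merged = {
    "knowledge_graph": {"edges": {
        "edge-a": {"attributes": [{"attribute_type_id": SUPPORT_GRAPHS,
                                   "value": ["support0", "support1"]}]},
        "edge-b": {"attributes": [{"attribute_type_id": SUPPORT_GRAPHS,
                                   "value": ["support0"]}]},
        "edge-c": {"attributes": []},
    }},
    "auxiliary_graphs": {"support1": {"edges": ["edge-c"]}},  # "support0" was removed
}

print(prune_support_graph_refs(merged))
```

Here "edge-a" keeps its one surviving support graph, while "edge-b" is reported as an inferred edge with nothing left to support it, so a follow-up step could remove it or the result that depends on it.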
from TAQA: this is an ARS issue in progress
Noting that Etanercept has a much lower score now, in the 2's vs ~5.
No "Tretinoin" but two with that in the name, one has a score of 5. It seems reasonable to me after taking a look at the publications. But also noting that it says it has 9 publications when really it only has 2.
What drugs may treat PRP, 2024/1/10 edition.
Result 3/960 is Etanercept (score 4.99). Evidence cites 0 publications, 0 clinical trials, and 2 sources. These are AEOLUS and BTE, each linking to their respective wiki pages. These links don't help get more evidence for the assertion. Result 6/960 is Tretinoin (score 4.96). Evidence: 0 pubs, 0 CTs, 1 source - BTE. Again, an EPC dead end. Same for result 12/960, Fluconazole (score 4.92), etc.