NCATSTranslator / Feedback

A repo for tracking gaps in Translator data and finding ways to fill them.

Inferred within Inferred #776

Closed sstemann closed 1 month ago

sstemann commented 5 months ago

Is this allowed?

image

https://ui.test.transltr.io/main/results?l=Maturity-onset%20Diabetes%20Of%20The%20Young&i=MONDO:0018911&t=0&r=0&q=45260776-7bff-41c0-a8cb-132ba7ba5974

dnsmith124 commented 5 months ago

As far as I'm aware there's nothing that prohibits this from happening, but I'm going to dig into this specific example to make sure it's not a UI issue.

dnsmith124 commented 5 months ago

Looking at the raw result data being returned to the FE, the indicated path in the screenshot has support paths attached to it, so it's marked as inferred. I can confirm it's not an FE bug at least

gprice1129 commented 5 months ago

@sstemann this can happen, but the UI does not support it right now (in the sense that a user could expand an inferred edge in a supporting path).
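To make "inferred within inferred" concrete, here is a minimal sketch for spotting such edges in a response. It assumes a TRAPI-style message in which `knowledge_graph.edges` is keyed by edge ID, inferred edges carry a `biolink:support_graphs` attribute, and `auxiliary_graphs` maps each graph ID to an object with an `edges` list. The function names are hypothetical and not part of any Translator tool.

```python
from typing import Any

SUPPORT_ATTR = "biolink:support_graphs"


def support_graph_ids(edge: dict[str, Any]) -> list[str]:
    """Return the auxiliary-graph IDs listed on an edge (empty if not inferred)."""
    ids: list[str] = []
    for attr in edge.get("attributes") or []:
        if attr.get("attribute_type_id") == SUPPORT_ATTR:
            value = attr.get("value") or []
            # Tolerate a single ID given as a bare string.
            ids.extend([value] if isinstance(value, str) else value)
    return ids


def nested_inferred_edges(message: dict[str, Any]) -> list[tuple[str, str]]:
    """Return (parent_edge_id, child_edge_id) pairs where the child edge sits in
    one of the parent's support graphs and itself has support graphs."""
    edges = message["knowledge_graph"]["edges"]
    aux = message.get("auxiliary_graphs") or {}
    pairs: list[tuple[str, str]] = []
    for parent_id, parent in edges.items():
        for graph_id in support_graph_ids(parent):
            for child_id in aux.get(graph_id, {}).get("edges", []):
                child = edges.get(child_id)
                if child is not None and support_graph_ids(child):
                    pairs.append((parent_id, child_id))
    return pairs
```

An empty list from `nested_inferred_edges` would suggest the response contains no inferred edge nested inside another inferred edge's support graph.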

dnsmith124 commented 5 months ago

Are nested inferred edges something we want to support?

sierra-moxon commented 5 months ago

from TAQA: yes we want to support this in the UI.

use cases - multi-hop MCQ queries (even if you do a 1-hop, you have this output ->has_phenotype->disease group, and supported by indv. edges, but then that is used in another inferred edge in a multi-hop...don't have a choice).

tagging as part of the MCQ project; should be handled in the same time frame (could be handled already in the UI). we need the ability to get to the support graph in the inferred inferred edge, leaving open.

sierra-moxon commented 4 months ago

adding back "needs review" here so I have a simple label to group "things that might need to be fixed before the end of Translator phase 1" for the upcoming relay.

sstemann commented 2 months ago

Boy, these nested support paths are hard to understand when so many of them are the same width as the super support path, and the nesting seems to be repeated any time that same relationship appears in a super support path. Maybe it's me, but it's overwhelming.

In this example, I'm fairly certain that the nested paths under the first nested inferred edge "Glibenclamide Treats Maturity-onset Diabetes Of The Young" and under the second occurrence of that same edge are all the same paths.

I'm not sure about releasing it with duplicative/repeating nested support paths.

image

sstemann commented 2 months ago

after discussion with @cartmanbeck I'm tagging this a showstopper.

sstemann commented 2 months ago

i also am tagging Andrew and Mark. BTE is the source. And I could see how a possible solution may be merging.

sierra-moxon commented 2 months ago

From TAQA: for @andrewsu - is the extra inference here a way to get more than 2 hops into the UI? Or is there some sort of collapsing that needs to happen?

From the UI, the filter side bar will be collapsible, so we will have more horizontal space. To support pathfinder, we will also support 4 hops. Horizontal scrolling will also happen, but still, a limitless number of hops will be difficult.

If the evidence is bigger for something - we've gotta support that somehow. this is a good answer actually but we can't show it.

should the "similar to" pop you out to another answer instead of displaying as evidence? - eg. there are two glipizide and glyburide in the result list - can we link between two answers instead of showing the reasoning paths for both independently. [from sierra: there was an agreed answer that many results here are using the same reasoning path and instead of duplicating path support, we should link the primary results(?). hoping @sstemann and @dnsmith124 can characterize their ideas here]

sierra-moxon commented 2 months ago

from TAQA: if this is a showstopper, then we can not move Fugu release to PROD until something is done to reduce the redundancy in the display here -- maybe that is something from BTE, or some sort of limiting in the UI? we would love feedback from @andrewsu's team before deciding.

sstemann commented 2 months ago

from what i can tell - UI rolled back showing sub-sub-paths in Test. Evidence for this is:

https://ui.test.transltr.io/main/results?l=Maturity-onset%20Diabetes%20Of%20The%20Young&i=MONDO:0018911&t=0&r=0&q=69990034-e7ed-49d7-b8cd-6876c22212dc

Expand Result T4. The first path, which states "Hypothyroidism Phenotype Of Maturity-onset Diabetes Of The Young", has EPC BTE on one of the supporting hops.

image

In ARAX GUI, the BTE result still seems to have subpaths (two even)

image

so @gprice1129 @Genomewide @andrewsu - should this be moved to Guppy for further review/design/analysis?

andrewsu commented 2 months ago

I was going to review this with our team at our internal meeting tomorrow, so I'll have more info on BTE's behavior after that. But I'll defer to others to decide if the apparent UI change is sufficient to remove this as a showstopper for Fugu...

colleenXu commented 2 months ago

Here's what I've dug up so far...

opening comments: May, test-instance

BTE is not the agent responsible for the "nested inferred" paths. Instead, I suspect Aragorn is. @cbizon

Reasoning 1: BTE's templating system doesn't generate paths with "similar to" or "subject of treatment..." edges. I searched BTE's response and didn't find any "similar to" edges.

Reasoning 2: BTE's data matches paths 2-7, 9. Not the "nested inferred paths"

The UI result corresponds to result 365 in BTE:

![Screen Shot 2024-08-15 at 9 11 59 PM](https://github.com/user-attachments/assets/a5da9113-3e78-478e-83f8-c9b049407983)

The top-edge links this support-graph, which corresponds to paths 2-7 in the UI:

![Screen Shot 2024-08-15 at 9 12 51 PM](https://github.com/user-attachments/assets/f2c7f950-48ed-4ec5-b291-fc5f6c3e1a3a)

![Screen Shot 2024-08-15 at 9 14 24 PM](https://github.com/user-attachments/assets/df758afb-6919-4d45-97af-c40e5c26f1fe)

The bottom-edge links this support-graph, which corresponds to path 9 in the UI:

![Screen Shot 2024-08-15 at 9 16 17 PM](https://github.com/user-attachments/assets/fcde5aee-8a29-4675-9cd7-3b4aff6791dc)

![Screen Shot 2024-08-15 at 9 17 54 PM](https://github.com/user-attachments/assets/523a0b44-c2e2-4965-ad8e-7a978e2a11d1)

Reasoning 3: suspecting Aragorn

UI says Aragorn + BTE + Service Provider are the agents for this answer

![Screen Shot 2024-08-15 at 9 19 33 PM](https://github.com/user-attachments/assets/1ffc2e1b-11a9-4b1a-8cbf-e251937880d3)

**I can't see Aragorn's response in the ARAX-CI UI, but I can look in the JSON.** I did some digging, but not enough to fully recreate the paths seen in the UI.

I find the "similar to" edge seen in some "nested inferred" paths: "7b3d295bd1e8". Its primary source is hetionet, which matches what the UI shows for that edge.

```
{
  "subject": "CHEMBL.COMPOUND:CHEMBL1481",
  "object": "PUBCHEM.COMPOUND:3488",
  "predicate": "biolink:similar_to",
  "sources": [
    {
      "resource_id": "infores:aragorn",
      "resource_role": "aggregator_knowledge_source",
      "upstream_resource_ids": ["infores:automat-hetionet"]
    },
    {
      "resource_id": "infores:automat-hetionet",
      "resource_role": "aggregator_knowledge_source",
      "upstream_resource_ids": ["infores:hetionet"]
    },
    {
      "resource_id": "infores:hetionet",
      "resource_role": "primary_knowledge_source",
      "upstream_resource_ids": []
    },
    {
      "resource_id": "infores:automat-robokop",
      "resource_role": "aggregator_knowledge_source",
      "upstream_resource_ids": ["infores:hetionet"]
    }
  ],
  "attributes": [
    {
      "attribute_type_id": "biolink:Attribute",
      "value": "not_provided",
      "value_type_id": "EDAM:data_0006",
      "original_attribute_name": "knowledge_level"
    },
    {
      "attribute_type_id": "biolink:Attribute",
      "value": ["Dice similarity of ECFPs"],
      "value_type_id": "EDAM:data_0006",
      "original_attribute_name": "hetio_source"
    },
    {
      "attribute_type_id": "biolink:Attribute",
      "value": "not_provided",
      "value_type_id": "EDAM:data_0006",
      "original_attribute_name": "agent_type"
    }
  ]
}
```
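For anyone repeating this kind of check, a small sketch of the two lookups involved: filtering knowledge-graph edges by predicate (for example `biolink:similar_to`) and reading an edge's primary source from its `sources` list. It assumes the edges are already loaded as a dict keyed by edge ID with the same fields as the JSON above; the helper names are made up for illustration.

```python
from typing import Any, Optional


def edges_with_predicate(
    kg_edges: dict[str, dict[str, Any]], predicate: str
) -> dict[str, dict[str, Any]]:
    """Return the subset of edges whose predicate matches exactly."""
    return {eid: e for eid, e in kg_edges.items() if e.get("predicate") == predicate}


def primary_source(edge: dict[str, Any]) -> Optional[str]:
    """Return the resource_id flagged as primary_knowledge_source, if any."""
    for source in edge.get("sources") or []:
        if source.get("resource_role") == "primary_knowledge_source":
            return source.get("resource_id")
    return None
```

Run against the edge shown above, `primary_source` would be expected to pick the entry whose `resource_role` is `primary_knowledge_source` and ignore the aggregator entries.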

Aragorn's 34th result is glimepiride (CHEMBL.COMPOUND:CHEMBL1481 at that time): take the first edge bound

```
{
  "node_bindings": {
    "sn": [
      { "id": "CHEMBL.COMPOUND:CHEMBL1481", "attributes": [] }
    ],
    "on": [
      { "id": "MONDO:0018911", "attributes": [] },
      { "id": "MONDO:0010894", "query_id": "MONDO:0018911", "attributes": [] },
      { "id": "MONDO:0007452", "query_id": "MONDO:0018911", "attributes": [] }
    ]
  },
  "analyses": [
    {
      "resource_id": "infores:aragorn",
      "edge_bindings": {
        "t_edge": [
          { "id": "4afabcbf-aee6-470f-882c-32e1e8d1e653", "attributes": [] },
          { "id": "c7d0b5e340c2", "attributes": [] }
        ]
      },
      "support_graphs": ["OMNICORP_support_graph_187"],
      "score": 0.9275446917362057
    }
  ],
  "normalized_score": 93.2
}
```

This first edge-bound (4afabcbf-aee6-470f-882c-32e1e8d1e653) has a lot of support-graphs

```
{
  "subject": "CHEMBL.COMPOUND:CHEMBL1481",
  "object": "MONDO:0018911",
  "predicate": "biolink:treats",
  "sources": [
    {
      "resource_id": "infores:aragorn",
      "resource_role": "primary_knowledge_source",
      "upstream_resource_ids": []
    }
  ],
  "attributes": [
    {
      "attribute_type_id": "biolink:support_graphs",
      "value": [
        "053900de-e5e7-4362-a805-12d5ac44b666",
        "0b138c2c-22dc-41ed-98d2-ad8a6a3850d6",
        "ddba8e4a-9960-4b84-8338-200b87ee73de",
        "4bfa7e4f-42aa-429c-b2c7-0cf2ba61b359",
        "1d3ff042-ca56-486b-858d-c6afde5857f4",
        "bf66f118-891b-4bb5-8bb1-47bc51b65bc6",
        "7f42862c-99a2-4e42-95dc-6d443caf485a",
        "61ebfe55-7433-4bcb-8a4f-8436dc3b6419",
        "c04cb08b-a50c-42bc-bc11-f0ab5afccaab",
        "08894b27-98af-4850-adfe-d132a123dc08",
        "b0492276-0a83-47e9-b086-334a9cace371",
        "186545ba-bdde-4b6e-ab5a-c0c4a4e3e4aa",
        "a13f34d6-3527-4e3a-83dc-48f7afa58b03",
        "b990b091-58d5-42b0-b1ea-694803148322",
        "3c60a70c-8f7f-43bf-9e03-6537934e52c5",
        "c266667b-6fd1-4663-b534-19b059457439"
      ]
    },
    {
      "attribute_type_id": "biolink:agent_type",
      "value": "computational_model",
      "attribute_source": "infores:aragorn"
    },
    {
      "attribute_type_id": "biolink:knowledge_level",
      "value": "prediction",
      "attribute_source": "infores:aragorn"
    }
  ]
}
```

Some of those support-graphs include the "similar to" edge `7b3d295bd1e8`

* "1d3ff042-ca56-486b-858d-c6afde5857f4"
* "bf66f118-891b-4bb5-8bb1-47bc51b65bc6"
* "7f42862c-99a2-4e42-95dc-6d443caf485a"
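A rough sketch of how that lookup could be scripted rather than done by hand: given an inferred edge and the response's auxiliary graphs, list which of the edge's support graphs contain a particular edge ID. It assumes `auxiliary_graphs` is a dict mapping each support-graph ID to an object with an `edges` list (an assumption about the response layout, since the thread only shows the edge itself); `support_graphs_containing` is a hypothetical helper name.

```python
from typing import Any


def support_graphs_containing(
    edge_id: str,
    inferred_edge: dict[str, Any],
    auxiliary_graphs: dict[str, dict[str, Any]],
) -> list[str]:
    """Return the IDs of the inferred edge's support graphs that include edge_id."""
    hits: list[str] = []
    for attr in inferred_edge.get("attributes") or []:
        if attr.get("attribute_type_id") != "biolink:support_graphs":
            continue
        for graph_id in attr.get("value") or []:
            # Graph IDs missing from auxiliary_graphs are skipped silently.
            if edge_id in auxiliary_graphs.get(graph_id, {}).get("edges", []):
                hits.append(graph_id)
    return hits
```

Calling it with the `similar_to` edge ID and the `treats` edge above would return the matching support-graph IDs in the order they are listed on the edge.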

Links: UI-Test ARAX-CI viewing Aragorn JSON response, downloaded from ARS

colleenXu commented 2 months ago

Sarah's recent comment: 8/13, test-instance

Glimepiride answer is from Aragorn, BTE/Service Provider, Unsecret.

Screen Shot 2024-08-15 at 9 53 58 PM

The nested inferred paths are no longer present. I can also match all of BTE's and Aragorn's data to the UI paths. @cbizon

BTE's data matches almost all paths (not the "3rd to last" or "last" paths)

The UI result corresponds to result 43 in BTE:

![Screen Shot 2024-08-15 at 9 57 52 PM](https://github.com/user-attachments/assets/9f457efb-2205-4410-a4b9-fb19baa21101)

The top-edge links this support-graph, which corresponds to many paths in the UI

![Screen Shot 2024-08-15 at 9 58 42 PM](https://github.com/user-attachments/assets/ba76406c-49c0-488f-9a88-35295573eac3)

![Screenshot 2024-08-15 at 21-59-10 Results - NCATS Biomedical Data Translator](https://github.com/user-attachments/assets/63b8a6f0-f653-4e09-98d1-ff48245c446c)

The bottom-edge links this support-graph, which corresponds to the "2nd to last" path in the UI:

![Screen Shot 2024-08-15 at 10 03 36 PM](https://github.com/user-attachments/assets/a29b8e47-fbf7-4a44-a137-d65b13f62f2b)

![Screen Shot 2024-08-15 at 10 04 07 PM](https://github.com/user-attachments/assets/0a336904-ce28-4921-b2c2-59cf7821c06a)

Aragorn's data matches the 1st and the "last 3" paths

The UI result corresponds to result 31 in Aragorn:

![Screen Shot 2024-08-15 at 10 08 31 PM](https://github.com/user-attachments/assets/4073cd23-90df-44f6-943e-3d0d7553dd6e)

The top-edge links this support-graph, which also corresponds to the "2nd to last" path in the UI:

![Screen Shot 2024-08-15 at 10 09 31 PM](https://github.com/user-attachments/assets/89e5b31c-9ebb-4bff-a729-b7be896d6b21)

![Screen Shot 2024-08-15 at 10 04 07 PM](https://github.com/user-attachments/assets/ac4388eb-3445-4d40-a6e7-a84e97a877bb)

The bottom-edge links two support-graphs, which correspond to paths in the UI:

![Screen Shot 2024-08-15 at 10 10 27 PM](https://github.com/user-attachments/assets/e792f4a1-9d59-45e2-87bb-3c8097a8e151)

Top support-graph corresponds to the first + last paths in the UI

![Screen Shot 2024-08-15 at 10 10 54 PM](https://github.com/user-attachments/assets/397949cc-715f-4c7a-8280-b23160918853)

Bottom support-graph corresponds to the first + 3rd-to-last paths in the UI

![Screen Shot 2024-08-15 at 10 15 29 PM](https://github.com/user-attachments/assets/3908d65e-3aae-4019-b0d4-243d47e5e38c)

Links: UI-Test ARAX-CI viewing


@sstemann I also ran this query in UI-CI earlier today, and had similar paths for Glimepiride

UI-CI ARAX-CI viewing

colleenXu commented 2 months ago

"Duplicate paths" seen in 8/13, test-instance

@sstemann @sierra-moxon

This seems to be happening when BTE's data has two kinds of edges with the same predicate: with a support-graph (for subclassing) and without. The UI doesn't merge those two kinds of edges and it also doesn't show the support-graphs....leading to two "duplicate-looking" edges/paths.
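One way to flag these cases in a response, sketched under assumptions: group knowledge-graph edges by (subject, predicate, object) and report the groups that mix edges with a `biolink:support_graphs` attribute and edges without one. This only illustrates the pattern described above; it is not how BTE or the UI handle it, and the function names are hypothetical.

```python
from collections import defaultdict
from typing import Any

Triple = tuple[str, str, str]


def has_support_graphs(edge: dict[str, Any]) -> bool:
    """True if the edge lists at least one support graph."""
    return any(
        attr.get("attribute_type_id") == "biolink:support_graphs" and attr.get("value")
        for attr in edge.get("attributes") or []
    )


def duplicate_looking_groups(
    kg_edges: dict[str, dict[str, Any]],
) -> dict[Triple, dict[str, list[str]]]:
    """Map each subject/predicate/object triple that has both supported and
    unsupported edges to the edge IDs of each kind."""
    groups: dict[Triple, list[str]] = defaultdict(list)
    for edge_id, edge in kg_edges.items():
        groups[(edge["subject"], edge["predicate"], edge["object"])].append(edge_id)

    mixed: dict[Triple, dict[str, list[str]]] = {}
    for triple, edge_ids in groups.items():
        supported = [i for i in edge_ids if has_support_graphs(kg_edges[i])]
        plain = [i for i in edge_ids if not has_support_graphs(kg_edges[i])]
        if supported and plain:
            mixed[triple] = {"with_support": supported, "without_support": plain}
    return mixed
```

Each triple returned is a candidate for merging, or for showing the support graph, so the two edges stop looking like duplicates in the path view.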

Links: UI-Test ARAX-CI viewing

Example from glimepiride discussion in the previous post

The 4th-5th paths look almost identical in the UI, but the 5th one has a curated symbol on "phenotype of"

![Screen Shot 2024-08-15 at 10 25 33 PM](https://github.com/user-attachments/assets/9c224ac2-decf-4db0-a29f-489644037920)

In BTE's response (43rd result, top edge's support-graph), this corresponds to those two paths. Note that there are 3 has_phenotype edges.

![Screen Shot 2024-08-15 at 10 27 23 PM](https://github.com/user-attachments/assets/83d958b3-c82a-4585-8608-789c54e25b05)

One edge has support-graphs due to subclassing and its primary source is BTE. This corresponds to the 4th path (phenotype edge's source says BTE):

![Screen Shot 2024-08-15 at 10 40 32 PM](https://github.com/user-attachments/assets/0c6eae2e-9cca-4af9-b656-2382b4ac5a4c)

The other two edges have no support-graphs and correspond to the 5th path (curated symbol on "phenotype of")

* From hpo-annotations in **MyDisease**: `1cd06fc4e7a942a45a1739ff9a41c641`
* From hpo-annotations in **Monarch**: `4ab6a3c78f84ce3ef4a6e246d5fbca51`

sstemann commented 2 months ago

it looks like the current solution is to not show nested support graphs in the UI, which seems fine, but I don't know if there are plans to handle this differently in Guppy @dnsmith124 ?

colleenXu commented 2 months ago

Err...actually the problematic nested support-graphs are no longer in the data/ARA responses (my analysis)...

See my summary in Translator Slack.

dnsmith124 commented 2 months ago

@sstemann we're happy to implement that solution during Guppy!

colleenXu commented 2 months ago

Just noting another issue https://github.com/NCATSTranslator/Feedback/issues/831.

My opinion is that this issue's problem is not a UI problem...and doesn't seem to exist anymore. (Note: people can retest if needed...)

andrewsu commented 1 month ago

Given that the behavior is not observed on either the UI or the ARA/ARS level, I suggest that this issue can be closed...