ranking-agent / aragorn

A Translator ARA combining asynchronous database querying, answer coalescence, and answer ranking.
MIT License

Cannot run direct three-hop ARAGORN queries for Workflow B #26

Open karafecho opened 3 years ago

karafecho commented 3 years ago

This issue is to report that both @xu-hao and I cannot run direct three-hop ARAGORN queries for Workflow B. I initially thought the error was on my end, but if Hao is encountering issues, then I think there's something not quite right on the ARAGORN side.

The TRAPI query can be found here. Note that Hao tested e01 with both biolink:has_real_world_evidence_of_association_with and biolink:correlated_with. I only tested the latter predicate, as that's the one I used when testing direct three-hop ARAX queries.

Here's the command:

curl -XPOST https://aragorn.renci.org/1.1/query -d '{
    "message": {
        "query_graph": {
            "nodes": {
                "n0": {
                    "ids": ["MESH:D056487"],
                    "categories": ["biolink:DiseaseOrPhenotypicFeature"]
                },
                "n1": {
                    "categories": ["biolink:DiseaseOrPhenotypicFeature"]
                },
                "n2": {
                    "categories": ["biolink:Gene"]
                },
                "n3": {
                    "categories": ["biolink:ChemicalEntity"]
                }
            },
            "edges": {
                "e01": {
                    "subject": "n0",
                    "object": "n1",
                    "predicates": ["biolink:correlated_with"]
                },
                "e02": {
                    "subject": "n2",
                    "object": "n1",
                    "predicates": ["biolink:gene_associated_with_condition"]
                },
                "e03": {
                    "subject": "n2",
                    "object": "n3",
                    "predicates": ["biolink:related_to"]
                }
            }
        }
    }
}' -H "Content-Type: application/json"

Here's the error message that Hao received from e01 biolink:has_real_world_evidence_of_association_with:

{"message":{"query_graph":{"nodes":{"n0":{"ids":["MESH:D056487"],"categories":["biolink:DiseaseOrPhenotypicFeature"],"is_set":false,"constraints":null},"n1":{"ids":null,"categories":["biolink:DiseaseOrPhenotypicFeature"],"is_set":false,"constraints":null},"n2":{"ids":null,"categories":["biolink:Gene"],"is_set":false,"constraints":null},"n3":{"ids":null,"categories":["biolink:ChemicalEntity"],"is_set":false,"constraints":null}},"edges":{"e01":{"subject":"n0","object":"n1","predicates":["biolink:has_real_world_evidence_of_association_with"],"relation":null,"constraints":null},"e02":{"subject":"n2","object":"n1","predicates":["biolink:gene_associated_with_condition"],"relation":null,"constraints":null},"e03":{"subject":"n2","object":"n3","predicates":["biolink:related_to"],"relation":null,"constraints":null}}},"knowledge_graph":{"nodes":{},"edges":{}},"results":[]},"logs":[{"timestamp":"2021-08-24T22:13:34.599883","level":"WARNING","code":null,"message":"warning: empty returned"},{"timestamp":"2021-08-24T22:13:34.648997","level":"ERROR","code":null,"message":"No results to coalesce"},{"timestamp":"2021-08-24T22:13:34.653174","level":"ERROR","code":null,"message":"answer_coalesce error: HTML error status code 422 returned."},{"timestamp":"2021-08-24T22:13:34.785124","level":"WARNING","code":null,"message":"warning: empty returned"},{"timestamp":"2021-08-24T22:13:34.836039","level":"WARNING","code":null,"message":"warning: empty returned"},{"timestamp":"2021-08-24T22:13:34.882219","level":"WARNING","code":null,"message":"warning: empty returned"}],"status":null,"workflow":["lookup","enrich_results","connect_knodes","score"]}

And here's the error message that was returned with e01 biolink:correlated_with:

{"message":{"query_graph":{"nodes":{"n0":{"ids":["MESH:D056487"],"categories":["biolink:DiseaseOrPhenotypicFeature"],"is_set":false,"constraints":null},"n1":{"ids":null,"categories":["biolink:DiseaseOrPhenotypicFeature"],"is_set":false,"constraints":null},"n2":{"ids":null,"categories":["biolink:Gene"],"is_set":false,"constraints":null},"n3":{"ids":null,"categories":["biolink:ChemicalEntity"],"is_set":false,"constraints":null}},"edges":{"e01":{"subject":"n0","object":"n1","predicates":["biolink:correlated_with"],"relation":null,"constraints":null},"e02":{"subject":"n2","object":"n1","predicates":["biolink:gene_associated_with_condition"],"relation":null,"constraints":null},"e03":{"subject":"n2","object":"n3","predicates":["biolink:related_to"],"relation":null,"constraints":null}}},"knowledge_graph":null,"results":null},"logs":[{"timestamp":"2021-08-24 22:36:29.878473","level":"ERROR","message":"Exception 'logs'","code":null}],"status":null}
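When comparing responses like the two above, it can help to pull out just the result count and the log entries rather than reading the raw JSON. The following is a minimal Python sketch, assuming only the response shape shown in this issue (a top-level "logs" list whose entries carry "level" and "message", and a "message.results" field that may be null). The function name summarize_response is hypothetical and is not part of ARAGORN.

```python
from collections import Counter


def summarize_response(response):
    """Return (result_count, Counter of log levels, list of ERROR messages).

    Hypothetical helper; it assumes only the response shape shown in this issue.
    """
    message = response.get("message") or {}
    results = message.get("results") or []  # "results" can be null on failure
    logs = response.get("logs") or []
    levels = Counter(entry.get("level") for entry in logs)
    errors = [entry.get("message") for entry in logs if entry.get("level") == "ERROR"]
    return len(results), levels, errors


# Trimmed copy of the biolink:correlated_with response above.
sample = {
    "message": {"knowledge_graph": None, "results": None},
    "logs": [
        {
            "timestamp": "2021-08-24 22:36:29.878473",
            "level": "ERROR",
            "message": "Exception 'logs'",
            "code": None,
        }
    ],
    "status": None,
}

count, levels, errors = summarize_response(sample)
print(count, dict(levels), errors)
```

Run against the first response, the same helper would report zero results, four WARNING entries, and two ERROR entries.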

Any chance you all can work on this query and send me/Hao both the executable query and the associated JSON output, so that Hao and I can figure out what we did wrong and (importantly) I can review the answers? I honestly think this might be the more efficient testing approach.

karafecho commented 3 years ago

Apparently, I cannot assign anyone to this issue, so I'll suggest @cbizon and @patrickkwang.

cbizon commented 3 years ago

A few things:

With real_world_whatever, what you are getting is 0 responses. That's because (I think?) KPs are not exposing this predicate in their meta_knowledge_graph endpoints. At least when I just checked ICEES, I didn't see it. Unless it's in there, we have no way to run the query.

With correlated_with, strider is putting together about 13000 answers, so that seems to be working fine, but aragorn is causing a 500. So that part is not something you're doing, it's our bug to fix.
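The meta_knowledge_graph check described above can be sketched in Python. This assumes the TRAPI 1.1 layout, in which the endpoint returns an "edges" list whose entries each have "subject", "predicate", and "object". The function name kp_advertises and the sample document are hypothetical, and the comparison is an exact string match that ignores the Biolink predicate hierarchy.

```python
def kp_advertises(meta_kg, predicate, subject=None, obj=None):
    """Return True if a meta knowledge graph lists the given predicate.

    Hypothetical helper. `subject` and `obj` optionally restrict the match
    to specific categories. Matching is exact; Biolink descendants or
    ancestors of the predicate are not considered.
    """
    for edge in meta_kg.get("edges") or []:
        if edge.get("predicate") != predicate:
            continue
        if subject is not None and edge.get("subject") != subject:
            continue
        if obj is not None and edge.get("object") != obj:
            continue
        return True
    return False


# Hypothetical meta knowledge graph, for illustration only (not real ICEES output).
example_meta_kg = {
    "nodes": {"biolink:DiseaseOrPhenotypicFeature": {"id_prefixes": ["MESH"]}},
    "edges": [
        {
            "subject": "biolink:DiseaseOrPhenotypicFeature",
            "predicate": "biolink:correlated_with",
            "object": "biolink:DiseaseOrPhenotypicFeature",
        }
    ],
}

print(kp_advertises(example_meta_kg, "biolink:correlated_with"))
print(kp_advertises(example_meta_kg, "biolink:has_real_world_evidence_of_association_with"))
```

In practice the dictionary would come from the KP's meta_knowledge_graph endpoint rather than a literal.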

cbizon commented 3 years ago

I've verified that this is a bug somewhere in AC (answer coalescence). I think it's a botched biolink issue. Either way, you can send a workflow to aragorn to bypass AC for the moment, so your query would look like:

{
"message": messagestuff,
"workflow": ["lookup","connect_knodes","score"]
}

I've verified that this works when sent to aragorn.
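For anyone scripting this, the workaround amounts to adding a top-level "workflow" list next to "message". Below is a minimal Python sketch, assuming that messagestuff above stands for the "message" object of the original query. The helper names with_workflow and post_query are hypothetical, and post_query is defined but not called here, since it would make a live request.

```python
import copy
import json
import urllib.request

ARAGORN_URL = "https://aragorn.renci.org/1.1/query"  # URL from the curl command in this issue


def with_workflow(query, steps):
    """Return a copy of a TRAPI query with an explicit top-level workflow list."""
    payload = copy.deepcopy(query)  # leave the caller's query untouched
    payload["workflow"] = list(steps)
    return payload


def post_query(payload, url=ARAGORN_URL):
    """POST the payload as JSON and return the decoded response (not called here)."""
    request = urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request) as response:
        return json.load(response)


# Stand-in for the three-hop query; only the outer structure matters here.
query = {"message": {"query_graph": {"nodes": {}, "edges": {}}}}

# Skip the enrich_results step, as suggested in the comment above.
payload = with_workflow(query, ["lookup", "connect_knodes", "score"])
print(json.dumps(payload, indent=2))
```

Building the payload separately from sending it makes it easy to inspect exactly what will be submitted before running it against the service.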

karafecho commented 3 years ago

WRT biolink:has_real_world_evidence_of_association_with, I don't think the meta-KG is the issue. Specifically, the ICEES DILI instance returns results to this query when run directly against ARAX: https://arax.ncats.io/?r=18599. Hao and Patrick fixed the meta-KG issue, which was actually with the ICEES asthma instance, not the ICEES DILI instance. However, when Hao later tested the query by running it directly against ARAGORN, it returned no results. So, I think something else is going on.

The workflow trick worked, however, so that much is good.

cbizon commented 3 years ago

Tried just running the 1-hop directly on ICEES DILI, and got 62 results. Same query through strider, 0 results. It looks like the KP is registered, and has the has_real_evidence... predicate. Furthermore, strider reports that it is calling ICEES DILI, but not getting any results back. Also strange, it appears to be calling many KPs that I don't think should have this predicate, like covidkop.

Assigning @patrickkwang to pick up the strider thread.

cbizon commented 3 years ago

This was the query:

{
    "message": {
        "query_graph": {
            "nodes": {
                "n0": {
                    "ids": [
                        "MESH:D056487"
                    ],
                    "categories": [
                        "biolink:DiseaseOrPhenotypicFeature"
                    ]
                },
                "n1": {
                    "categories": [
                        "biolink:DiseaseOrPhenotypicFeature"
                    ]
                }
            },
            "edges": {
                "e01": {
                    "subject": "n0",
                    "object": "n1",
                    "predicates": [
                        "biolink:has_real_world_evidence_of_association_with"
                    ]
                }
            }
        }
    }
}
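A one-hop query like this one can also be derived from the full three-hop query by keeping a single edge and its two endpoint nodes, which makes it easier to test each hop against a KP, strider, and aragorn in turn. The following is a minimal Python sketch of that idea; the function name one_hop_subquery is hypothetical and assumes the query-graph layout used throughout this issue.

```python
import copy


def one_hop_subquery(query, edge_id):
    """Return a new TRAPI query containing only `edge_id` and its two nodes."""
    qgraph = query["message"]["query_graph"]
    edge = qgraph["edges"][edge_id]
    node_ids = (edge["subject"], edge["object"])
    return {
        "message": {
            "query_graph": {
                "nodes": {n: copy.deepcopy(qgraph["nodes"][n]) for n in node_ids},
                "edges": {edge_id: copy.deepcopy(edge)},
            }
        }
    }


# The three-hop query from the top of this issue, with the predicate tested here.
three_hop = {
    "message": {
        "query_graph": {
            "nodes": {
                "n0": {
                    "ids": ["MESH:D056487"],
                    "categories": ["biolink:DiseaseOrPhenotypicFeature"],
                },
                "n1": {"categories": ["biolink:DiseaseOrPhenotypicFeature"]},
                "n2": {"categories": ["biolink:Gene"]},
                "n3": {"categories": ["biolink:ChemicalEntity"]},
            },
            "edges": {
                "e01": {
                    "subject": "n0",
                    "object": "n1",
                    "predicates": ["biolink:has_real_world_evidence_of_association_with"],
                },
                "e02": {
                    "subject": "n2",
                    "object": "n1",
                    "predicates": ["biolink:gene_associated_with_condition"],
                },
                "e03": {
                    "subject": "n2",
                    "object": "n3",
                    "predicates": ["biolink:related_to"],
                },
            },
        }
    }
}

one_hop = one_hop_subquery(three_hop, "e01")
print(sorted(one_hop["message"]["query_graph"]["nodes"]))
```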