ranking-agent / aragorn

A Translator ARA combining asynchronous database querying, answer coalescence, and answer ranking.
MIT License

Creative query not returning any results #269

Open maximusunc opened 1 month ago

maximusunc commented 1 month ago

Given the following query:

{
  "message": {
    "query_graph": {
      "edges": {
        "t_edge": {
          "attribute_constraints": [],
          "knowledge_type": "inferred",
          "object": "on",
          "predicates": ["biolink:treats"],
          "subject": "sn"
        }
      },
      "nodes": {
        "on": {
          "categories": ["biolink:Disease"],
          "ids": ["MONDO:0100320"]
        },
        "sn": {
          "categories": ["biolink:ChemicalEntity"]
        }
      }
    }
  },
  "parameters": {
    "overwrite_cache": true,
    "timeout_seconds": 360,
    "kp_timeout": 360
  }
}

No results are returned. We expect results. Looking at the logs, it looks like there's an error happening in Strider and every subquery is failing.

maximusunc commented 1 month ago

So here's what's happening: MONDO:0100320 is "post-COVID-19 disorder", and node normalizer knows about it, but there are no other equivalent identifiers. When sent to all KPs, Automat-Robokop is the only KP that responds, with 52 results. No other KPs give any results. Looking at the results from Robokop, it returns only subclass_of nodes:

"MONDO:0100233" = "long COVID-19"
"MONDO:0100319" = "COVID-19–associated multisystem inflammatory syndrome in adults"
"MONDO:0100163" = "COVID-19–associated multisystem inflammatory syndrome in children"

It doesn't actually return MONDO:0100320, so that node is not in the knowledge_graph, and we error out with no results.
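For anyone trying to reproduce this, the condition can be checked directly on a response message. The following is only a rough sketch, not Strider's actual code: the function name `missing_pinned_ids` is hypothetical, and the sample knowledge graph is trimmed to the three subclass nodes listed above. It assumes the usual TRAPI layout, where `knowledge_graph.nodes` is keyed by CURIE.

```python
def missing_pinned_ids(message):
    """Return {qnode_key: [curies]} for pinned query ids absent from the KG.

    Hypothetical helper for illustration; not part of Strider or Aragorn.
    """
    qnodes = (message.get("query_graph") or {}).get("nodes") or {}
    kg_nodes = (message.get("knowledge_graph") or {}).get("nodes") or {}
    missing = {}
    for key, qnode in qnodes.items():
        # Unpinned nodes (no "ids") have nothing to check.
        absent = [curie for curie in (qnode.get("ids") or []) if curie not in kg_nodes]
        if absent:
            missing[key] = absent
    return missing


# Trimmed example: the query graph from this issue, and a KG holding only
# the subclass nodes that Robokop returned.
message = {
    "query_graph": {
        "nodes": {
            "on": {"categories": ["biolink:Disease"], "ids": ["MONDO:0100320"]},
            "sn": {"categories": ["biolink:ChemicalEntity"]},
        }
    },
    "knowledge_graph": {
        "nodes": {
            "MONDO:0100233": {"name": "long COVID-19"},
            "MONDO:0100319": {},
            "MONDO:0100163": {},
        },
        "edges": {},
    },
}

print(missing_pinned_ids(message))  # {'on': ['MONDO:0100320']}
```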

I'm not sure whether Strider just needs to handle this case or whether this is an issue with Robokop not returning the node. @cbizon, do you have any input?

cbizon commented 1 month ago

Hmm that's a good find. In the version of subclass reasoning that we're employing here, where the subclass node is bound directly to the query node, we don't require the parent node to be returned in the KG. It usually is, because there are also usually answers related to the parent node.

We could have Plater handle this, but there's no guarantee that other KPs would do the same, so Strider should probably still watch out for it.

maximusunc commented 1 month ago

How should Strider get this parent node if it's not returned by any KPs? I believe if we just passed this along as is, it would get flagged as being invalid upstream too.
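One conceivable option, which nobody in this thread has confirmed, would be to synthesize a minimal node from the query graph itself when no KP returns it. The sketch below is purely illustrative: `add_stub_nodes` is a hypothetical name, and it assumes that a node carrying only the query node's categories and an empty attribute list is acceptable. Whether such a stub would pass validation upstream is exactly the open question.

```python
def add_stub_nodes(message):
    """Add a bare KG node for every pinned query id that no KP returned.

    Hypothetical helper for illustration; returns the CURIEs it added.
    """
    qnodes = (message.get("query_graph") or {}).get("nodes") or {}
    kg = message.get("knowledge_graph") or {}
    message["knowledge_graph"] = kg
    if kg.get("nodes") is None:
        kg["nodes"] = {}
    kg_nodes = kg["nodes"]

    added = []
    for qnode in qnodes.values():
        for curie in qnode.get("ids") or []:
            if curie not in kg_nodes:
                # Assumption: the query node's categories are the only
                # information available, so the stub has no name or attributes.
                kg_nodes[curie] = {
                    "categories": list(qnode.get("categories") or []),
                    "attributes": [],
                }
                added.append(curie)
    return added


# Example with the query graph from this issue and a KG missing the parent.
example = {
    "query_graph": {
        "nodes": {
            "on": {"categories": ["biolink:Disease"], "ids": ["MONDO:0100320"]},
            "sn": {"categories": ["biolink:ChemicalEntity"]},
        }
    },
    "knowledge_graph": {"nodes": {"MONDO:0100233": {}}, "edges": {}},
}
added = add_stub_nodes(example)
```

A stub like this has no name or provenance, so a lookup against node normalizer (or having the KP return the parent) would likely give a better node if that turns out to be feasible.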