NCATSTranslator / testing

Materials and tools for testing Translator components

Comparison and exclusion templates #227

Closed jh111 closed 8 months ago

jh111 commented 1 year ago

Translator cannot directly compare two answer sets to identify overlap and distinction, as in the use case in QotM five. For example: 'genes associated with Psoriatic Arthritis, distinct from genes for Psoriasis'. This missing capability prompted suggestions for post-relay templated questions:

Simple presence/absence would be useful, but a more sophisticated comparison might also consider relative ranking. For example, a gene may be weakly associated with Psoriasis, but strongly associated with Psoriatic Arthritis.
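A rank-aware comparison of this kind could be sketched as follows. This is only an illustration, assuming each answer set has already been reduced to a mapping from gene identifier to a numeric score (higher meaning a stronger association). The function names, gene identifiers, and scores are hypothetical and are not Translator output.

```python
def rank_of(scores):
    """Map each key to its 1-based rank, highest score first.

    Ties keep their insertion order, because sorted() is stable.
    """
    ordered = sorted(scores, key=lambda k: scores[k], reverse=True)
    return {key: i + 1 for i, key in enumerate(ordered)}


def compare_rankings(scores_a, scores_b):
    """Return {gene: (rank_in_a, rank_in_b)} for genes present in both sets."""
    ranks_a = rank_of(scores_a)
    ranks_b = rank_of(scores_b)
    shared = ranks_a.keys() & ranks_b.keys()
    return {gene: (ranks_a[gene], ranks_b[gene]) for gene in shared}


# Hypothetical scores for illustration only.
psoriatic_arthritis = {"GENE_A": 0.9, "GENE_B": 0.4, "GENE_C": 0.7}
psoriasis = {"GENE_A": 0.2, "GENE_B": 0.8, "GENE_D": 0.5}

shared_ranks = compare_rankings(psoriatic_arthritis, psoriasis)
for gene, (rank_a, rank_b) in sorted(shared_ranks.items()):
    print(gene, "rank in A:", rank_a, "rank in B:", rank_b)
```

A gene with a high rank in one set and a low rank in the other (GENE_A in this toy data) is the sort of case that simple presence/absence would miss.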

karafecho commented 1 year ago

I'll add that this issue has surfaced in more than one QotM challenge, as well as in other efforts (e.g., one branch of the CDC get_creative() workflow).

dkoslicki commented 1 year ago

@karafecho @jh111 This functionality actually does exist! Here are two ways to do it (look for genes connected to Psoriatic Arthritis but not Psoriasis):

TRAPI (the key part is "exclude": true on edge e1):

{
  "workflow": [],
  "message": {
    "query_graph": {
      "edges": {
        "e0": {
          "subject": "n0",
          "object": "n1"
        },
        "e1": {
          "subject": "n2",
          "object": "n1",
          "exclude": true
        }
      },
      "nodes": {
        "n0": {
          "ids": [
            "MONDO:0011849"
          ],
          "is_set": false,
          "name": "MONDO:0011849"
        },
        "n1": {
          "is_set": false,
          "categories": [
            "biolink:Gene"
          ]
        },
        "n2": {
          "ids": [
            "MONDO:0005083"
          ],
          "is_set": false,
          "name": "MONDO:0005083"
        }
      }
    }
  }
}
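For anyone assembling this query programmatically, the same query graph could be built along these lines. This is a minimal sketch: the helper name `build_exclusion_query` is hypothetical, and it only constructs and prints the JSON payload; it does not submit it anywhere.

```python
import json


def build_exclusion_query(include_id, exclude_id, category="biolink:Gene"):
    """Build a query for `category` nodes linked to include_id but not exclude_id."""
    return {
        "message": {
            "query_graph": {
                "nodes": {
                    "n0": {"ids": [include_id]},
                    "n1": {"categories": [category]},
                    "n2": {"ids": [exclude_id]},
                },
                "edges": {
                    "e0": {"subject": "n0", "object": "n1"},
                    # "exclude" is the non-standard flag discussed in this
                    # thread; it marks edge e1 as one that must NOT exist.
                    "e1": {"subject": "n2", "object": "n1", "exclude": True},
                },
            }
        }
    }


# Psoriatic Arthritis (MONDO:0011849) but not Psoriasis (MONDO:0005083)
query = build_exclusion_query("MONDO:0011849", "MONDO:0005083")
print(json.dumps(query, indent=2))
```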

ARAXi (wrapped in TRAPI):

{
  "message": {},
  "operations": {
    "actions": [
      "# This program creates three query nodes and two query edges (one marked exclude), looks for matching edges in the KG,",
      "# and returns the top 30 results",
      "add_qnode(ids=MONDO:0011849, key=n0)",
      "add_qnode(categories=biolink:Gene, key=n1)",
      "add_qnode(ids=MONDO:0005083, key=n2)",
      "add_qedge(subject=n0, object=n1, key=e0)",
      "add_qedge(subject=n2, object=n1, key=e1, exclude=true)",
      "expand()",
      "resultify()",
      "filter_results(action=limit_number_of_results, max_results=30)",
      ""
    ]
  },
  "submitter": "ARAX GUI",
  "stream_progress": true,
  "query_options": {
    "kp_timeout": "30",
    "prune_threshold": "50"
  }
}

Note: I've only tested this on ARAX, as the ARS appears to be down at the moment. But at the very least, ARAX will respect the exclude tag when it's sent through the ARS (or directly to ARAX).

karafecho commented 1 year ago

Wow, this is great! You are full of surprises, @dkoslicki.

Let's leave this ticket open for a while, as I'm pretty sure others are likewise unaware of this functionality. In fact, @cbizon, @Genomewide (Andy C.), and I briefly discussed the issue during today's TACT call. I will leave a link to this ticket in the QotM summary that will be published in the August Gazette, so hopefully, folks will see it.

Thank you!

colleenXu commented 1 year ago

Note that "exclude" doesn't seem to be in the TRAPI spec in the master branch or 1.3 branch (perhaps it falls under additionalProperties).

BTE currently won't flag "exclude" as something it doesn't understand.

edeutsch commented 1 year ago

Yes, "exclude" is not formally part of the schema. It was proposed and there is a PR for it: https://github.com/NCATSTranslator/ReasonerAPI/pull/306 but there was enough concern about it that we haven't merged it yet. If I recall correctly, some people wanted a detailed implementation document first, since implementation could be tricky.

So, at the moment, only ARAX supports this functionality. Others can support it too, of course, since additionalProperties is permitted here. If there is a groundswell of support for it, please make it known so that we can prioritize it for TRAPI 1.4.
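Because additionalProperties is permitted, a service that does not understand "exclude" will silently ignore it unless it checks for unknown properties. A rough sketch of such a check is below. The set of standard query-edge property names is an assumption based on roughly TRAPI 1.3 and should be verified against the schema; the function name is hypothetical.

```python
# Assumed standard query-edge properties (roughly TRAPI 1.3); verify
# against the actual schema before relying on this list.
ASSUMED_QEDGE_PROPERTIES = {
    "subject",
    "object",
    "predicates",
    "knowledge_type",
    "attribute_constraints",
    "qualifier_constraints",
}


def nonstandard_edge_properties(query_graph, known=ASSUMED_QEDGE_PROPERTIES):
    """Return {edge_key: [unknown property names]} for edges that have any."""
    flagged = {}
    for key, edge in query_graph.get("edges", {}).items():
        extras = sorted(set(edge) - known)
        if extras:
            flagged[key] = extras
    return flagged


query_graph = {
    "edges": {
        "e0": {"subject": "n0", "object": "n1"},
        "e1": {"subject": "n2", "object": "n1", "exclude": True},
    }
}
flagged = nonstandard_edge_properties(query_graph)
print(flagged)
```

A service could log or return a warning for each flagged property, so that a caller learns when an exclusion was not applied.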

cbizon commented 1 year ago

I understand that there's a way to do this in non-standard TRAPI that we could make standard. But is there an operations approach that we should implement instead, similar to what the ARAX calls look like?

dkoslicki commented 1 year ago

I tried a fair bit yesterday with filtering, and couldn't quite get it to work with Operations and Workflow. So implementing a new operation is the way to go, I think.

dkoslicki commented 1 year ago

As an update to this issue, see https://github.com/NCATSTranslator/Feedback/issues/55, specifically:

Per the operations and workflow (O/W) meeting Jan. 10th, O/W will not be implementing this functionality. The ability to exclude things is on the TRAPI radar per https://github.com/NCATSTranslator/ReasonerAPI/pull/306 by @edeutsch. Other light-weight set operations (union, intersection, difference of result sets) will be handled at the UI level and/or user level after UI data export is operational.
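At the user level, the light-weight set operations mentioned above might look like the following once results are exported. This is a sketch under assumptions: the export format is not specified in this thread, so it assumes each result set has already been reduced to a list of identifiers. The function name and the identifiers are hypothetical.

```python
def compare_result_sets(ids_a, ids_b):
    """Return union, intersection, and both differences of two identifier lists."""
    a, b = set(ids_a), set(ids_b)
    return {
        "union": a | b,
        "intersection": a & b,
        "only_in_a": a - b,
        "only_in_b": b - a,
    }


# Hypothetical exported identifiers for illustration only.
psoriatic_arthritis_genes = ["GENE_A", "GENE_B", "GENE_C"]
psoriasis_genes = ["GENE_B", "GENE_D"]

comparison = compare_result_sets(psoriatic_arthritis_genes, psoriasis_genes)
for name, members in comparison.items():
    print(name, sorted(members))
```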

sstemann commented 8 months ago

@jh111 should this go to TAQA or TACT?

karafecho commented 8 months ago

@sstemann : I think the UI team has taken ownership of this issue under this ticket https://github.com/NCATSTranslator/Feedback/issues/55. As such, I am closing the ticket here.