NCATSTranslator / testing

Materials and tools for testing Translator components
1 stars 9 forks source link

Disease (MONDO:0005737) <subclass_of> Disease #173

Open sstemann opened 2 years ago

sstemann commented 2 years ago

Query: subclass.json PK: 24506c4-b3e4-464e-89e9-eb4e456b9ef9 Control: Query looking for what diseases Ebola is a subclass of. Controls should be things like viral infection, filoviridae, or viral hemorrhagic fever

image

finnagin commented 2 years ago

@MarkDWilliams I made a workflowitized version of this query which overlays ngd values and fisher exact test scores. It then scores the results based on those values and ranks them.

Workflow Query: subclassWorkflow.json ARAX Results

MarkDWilliams commented 2 years ago

Expected result

Subject Predicate Object
MONDO:0005737 biolink:subclass_of MONDO:0018087
MONDO:0005737 biolink:subclass_of MONDO:0005108
MONDO:0005737 biolink:subclass_of MONDO:0005762
edeutsch commented 2 years ago

@cbizon we are doing a little testing to reranking messages from other ARAs and are hitting some snags. We tried reranking the Aragorn results above, but ARAX had trouble with it. The main reason seems be bindings that don't go anywhere. In the Aragorn message: https://arax.ncats.io/api/arax/v1.2/response/99adc6b5-0803-4c52-972e-fe587744a7aa

we see:

    "query_graph": {
      "edges": {
        "e00": {
          "constraints": [],
          "object": "n01",
          "predicates": [
            "biolink:subclass_of"
          ],
          "subject": "n00"
        }
      },
      "nodes": {
        "n00": {
          "categories": [
            "biolink:Disease"
          ],
          "constraints": [],
          "ids": [
            "MONDO:0005737"
          ],
          "is_set": false
        },
        "n01": {
          "categories": [
            "biolink:Disease"
          ],
          "constraints": [],
          "ids": null,
          "is_set": false
        }
      }
    },
    "results": [
...
      {
        "edge_bindings": {
          "e00": [
            {
              "attributes": [
                {
                  "attribute_source": null,
                  "attribute_type_id": "biolink:has_numeric_value",
                  "attributes": null,
                  "description": null,
                  "original_attribute_name": "weight",
                  "value": 0.2679491924311228,
                  "value_type_id": "EDAM:data_1669",
                  "value_url": null
                }
              ],
              "id": "16888608"
            }
          ],
          "s14": [
            {
              "attributes": [
                {
                  "attribute_source": null,
                  "attribute_type_id": "biolink:has_numeric_value",
                  "attributes": null,
                  "description": null,
                  "original_attribute_name": "weight",
                  "value": 0.06359757610223138,
                  "value_type_id": "EDAM:data_1669",
                  "value_url": null
                }
              ],
              "id": "1dfbc4c2-3884-4455-9b55-6e7433e7059b"
            }
          ]
        },

So it seems that a result has an edge binding from a KG edge to s14, which does not exist in the QG? So binds to nothing. Did we ever decide whether that is legal TRAPI? ARAX doesn't do that and got very unhappy when it saw it here. We're working to make ARAX more robust in the face of such things. This will be a wider issue where we try to operate on each others' TRAPI and have different assumptions.

We're fixing ARAX to be tolerant of this, but this might cause problems for others as well.

Perhaps this is an issue that we should face head on: is the above semantically valid TRAPI? or should a semantic validator of TRAPI (which we don't have today but should have) flag that as an error?

edeutsch commented 2 years ago

Current TRAPI documentation would seem to imply that all edge_binding keys must resolve in the QueryGraph:

        edge_bindings:
          type: object
          description: >-
            The dictionary of Input Query Graph to Result Knowledge Graph edge
            bindings where the dictionary keys are the key identifiers of the
            Query Graph edges and the associated values of those keys are
            instances of EdgeBinding schema type (see below). This value is an
            array of EdgeBindings since a given query edge may resolve to
            multiple knowledge graph edges in the result.
edeutsch commented 2 years ago

Here's a different question related to this test. Maybe we should all be on the same page on this.

Are all entities always a subclass of themselves?

I thought I heard it stated once that this was on a stone tablet? And Aragorn's results state it. But it is not in Mark's list.

cbizon commented 2 years ago

Yeah, pretty sure that according to the data modeling group, A is a subclass of A.

In terms of Mark's list, I don't think it's meant to be exhaustive, but a list of what a particular educated user might expect to be highly ranked answers.

edeutsch commented 2 years ago

But isn't the list what we intend to test against with your testing framework? If so, the scoring on the test should be quite different on whether the self subclass is there. With the current list, Aragorn should be penalized substantially because its number one answer (Ebola itself) is not in the list. If we were to add Ebola itself to the list, then ARAX would be penalized because the number one answer (Ebola itself) is nowhere in our results.

cbizon commented 2 years ago

Yes the goal is to capture some good answers to build benchmarks. But (IMO) each benchmark represents one view of what makes a good answer, and a different benchmark might emphasize different qualities.

This set of benchmarks is driven by what a user might expect for simple one hops. I would argue (based on this particular result and the discussion on the call) that no user particularly wants to see the self-subclass result, even if it's useful at the modeling level in maintaining logical consistency.

In other words, I think it's a useful datapoint for ARAGORN to know that it's returning an uninteresting answer (defined in one way) to a user.

edeutsch commented 2 years ago

Sounds good to me. I would agree that is the best way forward.

So it seems like the proposed formal policy is: From a modeling perspective every concept is also a subclass of itself, but when users of Translator request items with a subclass relationship, they should not be shown the self class as a result.

Should we document that?