biothings / biothings_explorer

TRAPI service for BioThings Explorer
https://explorer.biothings.io
Apache License 2.0

combine edges in results for multi-hop queries #164

Closed andrewsu closed 3 years ago

andrewsu commented 3 years ago

https://github.com/NCATSTranslator/testing/issues/55 issues a multi-hop query. In the BTE results, we have a separate entry in the results for each edge (e01 and e02 in screenshot 1 below), rather than a single entry for each path that contains both edges. Compare that to the Improving Agent output in screenshot 2 below, which I think is how TRAPI intends to capture results.

Screenshot 1
image
Screenshot 2
image
newgene commented 3 years ago

Should be implemented here: https://github.com/biothings/bte_trapi_query_graph_handler/blob/main/src/query_results.js#L39

ariutta commented 3 years ago

I'm currently getting the following, which doesn't match up with either of the outputs above:

{
  "message": {
    "query_graph": {
      "nodes": {
        "n0": {
          "id": "MONDO:0005812",
          "category": "biolink:Disease"
        },
        "n1": {
          "category": "biolink:PhenotypicFeature"
        },
        "n2": {
          "category": "biolink:Disease"
        }
      },
      "edges": {
        "e01": {
          "subject": "n0",
          "object": "n1"
        },
        "e02": {
          "subject": "n1",
          "object": "n2"
        }
      }
    },
    "knowledge_graph": {
      "nodes": {},
      "edges": {}
    },
    "results": []
  },
  "logs": [
    {
      "timestamp": "2021-06-15T00:16:56.980Z",
      "level": "DEBUG",
      "message": "BTE identified 3 QNodes from your query graph",
      "code": null
    },
    {
      "timestamp": "2021-06-15T00:16:56.980Z",
      "level": "DEBUG",
      "message": "BTE identified 2 QEdges from your query graph",
      "code": null
    },
    {
      "timestamp": "2021-06-15T00:16:56.980Z",
      "level": "DEBUG",
      "message": "BTE identified your query graph as a 1-depth query graph",
      "code": null
    },
    {
      "timestamp": "2021-06-15T00:16:56.981Z",
      "level": "DEBUG",
      "message": "REDIS cache is not enabled.",
      "code": null
    }
  ]
}
ariutta commented 3 years ago

Maybe related, I'm getting some error messages when I try running the query for "examples/v1.1/query_multihop_disease_gene_chemical.json":

biothings-explorer-trapi:batch_edge_query Start to query BTEEdges.... +17ms
biothings-explorer-trapi:batch_edge_query BTEEdges are successfully queried.... +1m
biothings-explorer-trapi:batch_edge_query Failed to filter 27279 results due to TypeError: Cannot read property 'map
biothings-explorer-trapi:batch_edge_query Total number of response is 27279 +0ms
biothings-explorer-trapi:batch_edge_query Start to update nodes,hi. +0ms
biothings-explorer-trapi:batch_edge_query update nodes completed +39ms
biothings-explorer-trapi:main Query for depth 2 completes. +1m
biothings-explorer-trapi:main Start to notify subscribers now. +0ms
biothings-explorer-trapi:QueryResult Updating query results now! +1m
biothings-explorer-trapi:Graph Updating BTE Graph now. +1m
biothings-explorer-trapi:main Updated TRAPI knowledge graph using query results for depth 2 +526ms
biothings-explorer-trapi:cron Updating local copy of SmartAPI specs now at Tue, 15 Jun 2021 00:30:00 GMT! +0ms
biothings-explorer-trapi:cron Lining up 19 items to get predicates from +3s
biothings-explorer-trapi:cron Successfully got /meta_knowledge_graph for https://spokekp.healthdatascience.cloud/api
biothings-explorer-trapi:cron Successfully got /meta_knowledge_graph for https://ia.healthdatascience.cloud/api/v1.1
biothings-explorer-trapi:cron [error]: API "RTX KG2" failed to get /predicates for https://arax.ncats.io/api/rtxkg2/ tatus code 500 +29ms
biothings-explorer-trapi:cron [error]: API "RTX KG2" failed to get /meta_knowledge_graph for https://arax.ncats.io/a led with status code 500 +1ms
biothings-explorer-trapi:cron [error]: API "OpenAPI for NCATS Biomedical Translator Reasoners" failed to get /meta_k 2 due to error Error: Request failed with status code 404 +196ms
biothings-explorer-trapi:cron Successfully got /meta_knowledge_graph for https://cohd.io/api/ +2ms
biothings-explorer-trapi:cron Successfully got /predicates for https://covid.cohd.io/api/ +1ms
biothings-explorer-trapi:cron Successfully got /meta_knowledge_graph for https://icees.renci.org:16339 +19ms
biothings-explorer-trapi:cron Successfully got /meta_knowledge_graph for https://icees.renci.org:16341 +1ms
biothings-explorer-trapi:cron Successfully got /predicates for http://chp.thayer.dartmouth.edu/ +24ms
biothings-explorer-trapi:cron Successfully got /meta_knowledge_graph for https://translator.broadinstitute.org/molep
biothings-explorer-trapi:cron Successfully got /predicates for https://translator.broadinstitute.org/molepro/trapi/v
biothings-explorer-trapi:cron Successfully got /meta_knowledge_graph for https://translator.broadinstitute.org/genet
biothings-explorer-trapi:cron Successfully got /meta_knowledge_graph for https://openpredict.semanticscience.org +11
biothings-explorer-trapi:cron Successfully got /meta_knowledge_graph for https://stars-app.renci.org/sparql-kp +201m
biothings-explorer-trapi:cron Successfully got /meta_knowledge_graph for https://api.collaboratory.semanticscience.o
biothings-explorer-trapi:cron Successfully got /meta_knowledge_graph for https://cam-kp-api-dev.renci.org +446ms
biothings-explorer-trapi:cron Successfully got /meta_knowledge_graph for https://arax.ncats.io/api/arax/v1.1 +1s
biothings-explorer-trapi:cron Successfully got /meta_knowledge_graph for https://explanatory-agent.azurewebsites.net
biothings-explorer-trapi:cron Got 16 successful requests +2ms
biothings-explorer-trapi:cron Successfully updated the local copy of SmartAPI specs. +3ms

colleenXu commented 3 years ago

First, your query is not correct (not TRAPI v1.1). Second, there are known issues with BTE and one of its KP APIs right now. To avoid hitting it, I would do a query like this. This two-hop query takes my local instance ~32 seconds to return, and the response is 20.31 MB.

{
    "message": {
        "query_graph": {
            "nodes": {
                "n0": {
                    "categories": ["biolink:Gene"],
                    "ids": ["NCBIGene:3778"]
                },
                "n1": {
                    "categories": ["biolink:Gene"]
                },
                "n2": {
                    "categories": ["biolink:ChemicalSubstance"]
                }
            },
            "edges": {
                "e01": {
                    "subject": "n0",
                    "object": "n1"
                },
                "e02": {
                    "subject": "n1",
                    "object": "n2"
                }
            }
        }
    }
}
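
For readers comparing the two queries in this thread: the earlier one uses the singular node keys `id` and `category`, while the one above uses the plural array keys `ids` and `categories`. A minimal sketch of that rewrite follows. `upgradeQueryGraph` is a hypothetical helper written for illustration, not part of BTE, and it assumes the only differences that matter are those two node keys plus `predicate` becoming `predicates` on edges.

```javascript
// Hypothetical helper (not part of BTE): rewrite singular query-graph keys
// (id, category, predicate) to the plural array keys used in the query above.
function upgradeQueryGraph(queryGraph) {
  const toArray = (value) => (Array.isArray(value) ? value : [value]);

  // Return a shallow copy of obj with key `from` renamed to `to`,
  // wrapping the value in an array when it is not one already.
  const rename = (obj, from, to) => {
    const copy = { ...obj };
    if (from in copy) {
      copy[to] = toArray(copy[from]);
      delete copy[from];
    }
    return copy;
  };

  const nodes = {};
  for (const [key, node] of Object.entries(queryGraph.nodes)) {
    nodes[key] = rename(rename(node, "id", "ids"), "category", "categories");
  }

  const edges = {};
  for (const [key, edge] of Object.entries(queryGraph.edges)) {
    edges[key] = rename(edge, "predicate", "predicates");
  }

  return { nodes, edges };
}
```

Keys that are already plural pass through unchanged, so running the helper on the query above returns an equivalent query graph.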
ariutta commented 3 years ago

@colleenXu, I added your query as a new example file: https://github.com/biothings/BioThings_Explorer_TRAPI/blob/fix_issue_164/examples/v1.1/query_multihop_gene_gene_chemical.json

I also added a test (currently failing) for the desired result for this issue. @newgene (and anyone else), can I get a confirmation that this test correctly describes the desired result? https://github.com/biothings/BioThings_Explorer_TRAPI/blob/646c18c3a1c0dbac3e25612864b1a17f930c9be1/__test__/integration/routes/v1query.test.js#L63

If the test is good, I'll refactor the code to make it pass.

ariutta commented 3 years ago

Feel free to suggest modifications to the test, if you think it should be more tightly defined or be located in a different file.

andrewsu commented 3 years ago

Yes, I think that test accurately describes the desired result. If you want to be even more explicit before writing code, you could create two example outputs, one which fails the test and a reformatted version that passes. But not necessary -- only if useful for you.

colleenXu commented 3 years ago

@ariutta Is there a way to make the test more detailed? For example, given this query, this entire result object should exist in the results? I give an example below.

@andrewsu Please advise, since I think this is also desired behavior that we need a test for. I think the TRAPI standard says incomplete paths aren't in the results. This would be a change to BTE's current behavior.

For example, in the example below, KCNMA1 (NCBIGene:3778) -> pulmonary hypertension (MONDO:0005149) exists as an e0 edge, but there are no matching e1 edges for the second hop (pulmonary hypertension -> other genes). I think there are no matching edges because of the predicate restriction. There should therefore be no results object with pulmonary hypertension as its n1, since there's no complete path that includes it.
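
The rule described above (an e0 edge only counts if some e1 edge continues from its object) can be sketched as a small join. This is an illustration only: `keepCompletePaths` and the `{ subject, object }` record shape are assumptions made for the sketch, not BTE's internal data model.

```javascript
// Hypothetical record shape: { subject: "<CURIE>", object: "<CURIE>" }.
// Returns one entry per complete two-hop path n0 -> n1 -> n2.
function keepCompletePaths(firstHop, secondHop) {
  // Index second-hop edges by their subject (the shared n1 node).
  const bySubject = new Map();
  for (const e1 of secondHop) {
    if (!bySubject.has(e1.subject)) bySubject.set(e1.subject, []);
    bySubject.get(e1.subject).push(e1);
  }

  const paths = [];
  for (const e0 of firstHop) {
    // A first-hop edge with no continuation contributes nothing.
    for (const e1 of bySubject.get(e0.object) || []) {
      paths.push({ n0: e0.subject, n1: e0.object, n2: e1.object });
    }
  }
  return paths;
}
```

With the IDs from this comment, an e0 edge to MONDO:0005149 that has no e1 continuation is dropped, while an e0 edge to MONDO:0011122 followed by an e1 edge to NCBIGene:7289 yields one path.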


For example, this is a query I ran that makes more biological sense (take the gene KCNMA1, find diseases that are caused by it, look at what other genes cause those diseases). It takes my local instance of BTE 1 min 15 sec - 1 min 40 sec to run, and the size of the response is 4.22 MB. If it's different for you, then something is likely different in the data, so the result object below might not be there.

{
    "message": {
        "query_graph": {
            "nodes": {
                 "n0": {
                    "ids": ["HGNC:6284"],
                    "categories":["biolink:Gene"]
                },
                "n1": {
                    "categories": ["biolink:Disease"]
                },
                "n2": {
                    "categories": ["biolink:Gene"]
                }
            },
            "edges": {
                "e0": {
                    "subject": "n0",
                    "object": "n1",
                    "predicates": ["biolink:gene_associated_with_condition"]
                },
                "e1": {
                    "subject": "n1",
                    "object": "n2",
                    "predicates": ["biolink:condition_associated_with_gene"]
                }
            }
        }
    }
}

Based on the nodes/edges in the Response object, I would expect the following object (1 result) somewhere in the results array. In human-readable terms, it says KCNMA1 -> obesity disorder (disease) -> TULP3 (gene)

            {
                "node_bindings": {
                    "n0": [
                        {
                            "id": "NCBIGene:3778"
                        }
                    ],
                    "n1": [
                        {
                            "id": "MONDO:0011122"
                        }
                    ],
                    "n2": [
                        {
                            "id": "NCBIGene:7289"
                        }
                    ]
                },
                "edge_bindings": {
                    "e0": [
                        {
                            "id": "NCBIGene:3778-biolink:gene_associated_with_condition-MONDO:0011122"
                        }
                    ],
                    "e1": [
                        {
                            "id": "MONDO:0011122-biolink:condition_associated_with_gene-NCBIGene:7289"
                        }
                    ]
                },
                "score": "1.0"
            },
andrewsu commented 3 years ago

@colleenXu I think your proposed test is good but much more brittle, i.e., it's dependent on data resources continuing to return the same data. So, let's consider this in the future. But for the moment, I think the simpler test that Anders proposed should move us substantially toward fixing this issue. Let's reassess after this test passes.

colleenXu commented 3 years ago

After discussion with Andrew today, the test could have a simple addition:

The node bindings would have n0, n1, and n2, for the specified query.
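
A minimal sketch of that check, written as a plain function so it could sit inside any test runner. `bindsWholeQueryGraph` is a hypothetical name; it goes slightly beyond the addition described above by checking edge bindings as well as node bindings.

```javascript
// Hypothetical check: does a single result bind every node and every edge
// of the query graph, and nothing else?
function bindsWholeQueryGraph(result, queryGraph) {
  const covers = (bindings, expected) => {
    const bound = bindings || {};
    const expectedKeys = Object.keys(expected);
    return (
      Object.keys(bound).length === expectedKeys.length &&
      expectedKeys.every(
        (key) => Array.isArray(bound[key]) && bound[key].length > 0
      )
    );
  };
  return (
    covers(result.node_bindings, queryGraph.nodes) &&
    covers(result.edge_bindings, queryGraph.edges)
  );
}
```

A test could then assert that every item in `message.results` satisfies this function, which does not depend on which specific IDs the data sources return.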

colleenXu commented 3 years ago

Note for @andrewsu and me:

ariutta commented 3 years ago

@colleenXu, as @andrewsu said, the test you suggested would be brittle for this repo, but it might actually be perfect for QueryResult.test.js in the bte_trapi_query_graph_handler repo. The difference is that we have control over the records in QueryResult.test.js because we're specifying them manually, not getting them from an API every time.

What do you and @andrewsu think of adding your test there? I got the test started, but it needs some updating to match what you suggested (I just duplicated the record from Kevin's test): https://github.com/biothings/bte_trapi_query_graph_handler/blob/fix_issue_164/__test__/integration/QueryResult.test.js

I think this would help to make sure we get what is desired.

ariutta commented 3 years ago

@andrewsu, I have a rough WIP going on here: https://github.com/biothings/bte_trapi_query_graph_handler/commit/97963b6f3379de4f1ed7ff8e5bedbcd303f34271

The basic idea is that if we want QueryResult.update() to only have queryResult as input, we'll need to do some kind of mapping from previous output result index to node ID. I think my current code fails because it assumes there's no expansion between hops. To handle expansion, we might need to create a new array to serve as the next this.results. For each record, we would merge its hop with a deep copy of the corresponding item in this.results and add the merged item to the new array. At the end of the update call, we would set this.results equal to the new array.
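
The approach described above can be sketched roughly as follows. This is not the code in the linked commit: `PathResults`, `bindRecord`, `isCompatible`, and the flat record shape are all hypothetical names chosen for the sketch, and the deep copy uses a JSON round trip for brevity.

```javascript
// Hypothetical record shape for one hop:
// { qEdgeID, subjectQNodeID, objectQNodeID, subject, object, edgeID }

// Write one record's nodes and edge into a (partial) result.
function bindRecord(result, record) {
  result.node_bindings[record.subjectQNodeID] = [{ id: record.subject }];
  result.node_bindings[record.objectQNodeID] = [{ id: record.object }];
  result.edge_bindings[record.qEdgeID] = [{ id: record.edgeID }];
  return result;
}

// A record extends a partial result when it shares at least one already
// bound query node, and agrees with every binding it touches.
function isCompatible(result, record) {
  const ends = [
    [record.subjectQNodeID, record.subject],
    [record.objectQNodeID, record.object],
  ];
  let shared = false;
  for (const [qNodeID, id] of ends) {
    const bound = result.node_bindings[qNodeID];
    if (!bound) continue;
    if (bound[0].id !== id) return false;
    shared = true;
  }
  return shared;
}

class PathResults {
  constructor() {
    this.results = null; // null until the first hop arrives
  }

  // Call once per hop with all records for that hop.
  update(records) {
    const next = [];
    if (this.results === null) {
      for (const record of records) {
        next.push(bindRecord({ node_bindings: {}, edge_bindings: {} }, record));
      }
    } else {
      for (const result of this.results) {
        for (const record of records) {
          if (isCompatible(result, record)) {
            // Deep copy, so one partial result can expand into several.
            next.push(bindRecord(JSON.parse(JSON.stringify(result)), record));
          }
        }
      }
    }
    this.results = next; // partial results with no continuation are dropped
  }
}
```

Because each hop builds a fresh array, a partial result that matches several records in the next hop expands into several results, and one that matches none disappears.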

andrewsu commented 3 years ago

reassigned this to Colleen to do some confirmation testing that the behavior is as expected

colleenXu commented 3 years ago

[EDIT: Anders's work hasn't been pushed to the npm release yet, which is why my current environment and the public BTE instance haven't changed yet] Strangely, I don't see the expected "fix" and the behavior looks the same as before.


This is what I did for a quick test:

Because there are edges from the input ID CHEBI:41423 -> NCBIGene:5170 (PDPK1) and NCBIGene:5170 -> CHEBI:91439 (BX-795), I expect something like this in the results section:

            {
                "node_bindings": {
                    "n0": [
                        {
                            "id": "CHEBI:41423"
                        }
                    ],
                    "n1": [
                        {
                            "id": "NCBIGene:5170"
                        }
                    ],
                    "n2": [
                        {
                            "id": "CHEBI:91439"
                        }
                    ]
                },
                "edge_bindings": {
                    "e0": [
                        {
                            "id": "CHEBI:41423-biolink:physically_interacts_with-NCBIGene:5170"
                        }
                    ],
                    "e1": [
                        {
                            "id": "NCBIGene:5170-biolink:physically_interacts_with-CHEBI:91439"
                        }
                    ]
                },
                "score": "1.0"
            },

Instead, I see these two objects in the results section:

            {
                "node_bindings": {
                    "n0": [
                        {
                            "id": "CHEBI:41423"
                        }
                    ],
                    "n1": [
                        {
                            "id": "NCBIGene:5170"
                        }
                    ]
                },
                "edge_bindings": {
                    "e0": [
                        {
                            "id": "CHEBI:41423-biolink:physically_interacts_with-NCBIGene:5170"
                        }
                    ]
                },
                "score": "1.0"
            },
            {
                "node_bindings": {
                    "n1": [
                        {
                            "id": "NCBIGene:5170"
                        }
                    ],
                    "n2": [
                        {
                            "id": "CHEBI:91439"
                        }
                    ]
                },
                "edge_bindings": {
                    "e1": [
                        {
                            "id": "NCBIGene:5170-biolink:physically_interacts_with-CHEBI:91439"
                        }
                    ]
                },
                "score": "1.0"
            },

The query:

{
    "message": {
        "query_graph": {
            "nodes": {
                "n0": {
                    "ids":["PUBCHEM.COMPOUND:2662"],
                    "categories":["biolink:ChemicalSubstance"]
                },
                "n1": {
                    "categories":["biolink:Gene"]
               },
                "n2": {
                    "categories":["biolink:ChemicalSubstance"]
               }
            },
            "edges": {
                "e0": {
                    "subject": "n0",
                    "object": "n1"
                },
                "e1": {
                    "subject": "n1",
                    "object": "n2"
                }
            }
        }
    }
}
andrewsu commented 3 years ago

From the 2021-07-13 Translator meeting:

ariutta commented 3 years ago

Missing Score

Old code - automatically always sets it as '1.0' here: https://github.com/biothings/bte_trapi_query_graph_handler/blob/87878936ce00b4bd3893243e65e0bdeb778660a2/src/query_results.js#L45

New code - maybe add "score": "1.0" here: https://github.com/biothings/bte_trapi_query_graph_handler/blob/7ec30dcfc14a1e2a822aedd3d54128a90a376b7a/src/query_results.js#L74

ariutta commented 3 years ago

@newgene, before we close this issue, do you want to delete _createEdgeBindings and _createNodeBindings? I left them to be conservative with my change, but they aren't used anywhere except the test, so leaving them might actually be confusing.

https://github.com/biothings/bte_trapi_query_graph_handler/blob/7ec30dcfc14a1e2a822aedd3d54128a90a376b7a/src/query_results.js#L86

colleenXu commented 3 years ago

Performance with Anders's code

TLDR: very similar to before. I ran the 4 tests below, with small-scale multi-hop queries.

- dev has Anders's multi-hop code, doesn't have scoring (BUG)
- prod doesn't have multi-hop code, does have scoring
- testing involves running 1 query on prod, then waiting several minutes before running that query on dev (or vice versa)


test 1

Query 1 (see section below), to BTE-MyChem only (dev url, prod url).

Dev: 9.47 sec to return 4.73 MB response. Prod: 9.5 sec to return 3.9 MB response

test 2

Query 1 (see section below), to BTE-DGIdb only (dev url, prod url).

Dev: 16.81 sec to return 11.48 MB response. Prod: 16.24 sec to return 10.07 MB response

test 3

Query 2 (see section below), to BTE (dev url, prod url).

Dev: 34.66 sec to return 31.16 MB response. Prod: 36.55 sec to return 25.37 MB response

test 4

Query 3 (see section below), to BTE (dev url, prod url).

Dev: 9.53 sec to return 4.97 MB response. Prod: 8.98 sec to return 4.7 MB response


Query 1:

{
    "message": {
        "query_graph": {
            "nodes": {
                "n0": {
                    "ids":["PUBCHEM.COMPOUND:2662"],
                    "categories":["biolink:ChemicalSubstance"]
                },
                "n1": {
                    "categories":["biolink:Gene"]
               },
                "n2": {
                    "categories":["biolink:ChemicalSubstance"]
               }
            },
            "edges": {
                "e0": {
                    "subject": "n0",
                    "object": "n1",
                    "predicates": ["biolink:physically_interacts_with"]
                },
                "e1": {
                    "subject": "n1",
                    "object": "n2",
                    "predicates": ["biolink:physically_interacts_with"]
                }
            }
        }
    }
}

Query 2:

{
    "message": {
        "query_graph": {
            "nodes": {
                "n0": {
                    "categories": ["biolink:Gene"],
                    "ids": ["NCBIGene:3778"]
                },
                "n1": {
                    "categories": ["biolink:Gene"]
                },
                "n2": {
                    "categories": ["biolink:ChemicalSubstance"]
                }
            },
            "edges": {
                "e01": {
                    "subject": "n0",
                    "object": "n1"
                },
                "e02": {
                    "subject": "n1",
                    "object": "n2"
                }
            }
        }
    }
}

Query 3:

{
    "message": {
        "query_graph": {
            "nodes": {
                "n0": {
                    "categories": ["biolink:Pathway"],
                    "ids": ["REACT:R-HSA-1368082"]
                },
                "n1": {
                    "categories": ["biolink:Gene"]
                },
                "n2": {
                    "categories": ["biolink:ChemicalSubstance"]
                }
            },
            "edges": {
                "e01": {
                    "subject": "n0",
                    "object": "n1"
                },
                "e02": {
                    "subject": "n1",
                    "object": "n2"
                }
            }
        }
    }
}
ariutta commented 3 years ago

I added score, removed unused methods _createEdgeBindings and _createNodeBindings and updated tests to reflect these changes.

Based on @colleenXu 's results, it appears there's no significant change to performance.

As far as I can see, this issue is ready to be closed.

colleenXu commented 3 years ago

I can confirm by sending queries to the dev instance of BTE, that scores are now appearing in the correct parts of the result object for single and multi-hop Predict-style responses.

Closing this issue

andrewsu commented 3 years ago

reopening this issue because I think I'm seeing the same (or a related) issue. Here is the query (three-node, two edges):

{
    "message": {
        "query_graph": {
            "nodes": {
                "n0": {
                    "ids": [
                        "PUBCHEM.COMPOUND:5358"
                    ],
                    "categories": [
                        "biolink:ChemicalSubstance"
                    ]
                },
                "n1": {
                    "categories": [
                        "biolink:BiologicalProcessOrActivity"
                    ]
                },
                "n2": {
                    "categories": [
                        "biolink:Disease"
                    ]
                }
            },
            "edges": {
                "e0": {
                    "subject": "n0",
                    "object": "n1",
                    "predicates": [
                        "biolink:related_to"
                    ]
                },
                "e1": {
                    "subject": "n2",
                    "object": "n1",
                    "predicates": [
                        "biolink:related_to"
                    ]
                }
            }
        }
    }
}

The results objects only have nodes for n0 and n1, and only have an edge for e0.

image

@ariutta can you have a look please?

ariutta commented 3 years ago

@andrewsu, I created a draft version of a test for the case you described above: https://github.com/biothings/bte_trapi_query_graph_handler/blob/fix_issue_164a/__test__/integration/QueryResult.test.js#L333

The records have some placeholder values (see please_fix), but I expected this test to still correctly match the real example of what you saw. However, it doesn't, because it currently returns 0 results, and I'm not sure why.

Should there be two queryResult.update calls (not a single call with two records)? https://github.com/biothings/bte_trapi_query_graph_handler/blob/fix_issue_164a/__test__/integration/QueryResult.test.js#L422

ariutta commented 3 years ago

In the two results you got, I notice n1 and e0 are different for each result, but both of the n1 values appear to be in the biolink:BiologicalProcessOrActivity category (Acclimatization and Cerebrovascular Circulation).

andrewsu commented 3 years ago

> @andrewsu, I created a draft version of a test for the case you described above: https://github.com/biothings/bte_trapi_query_graph_handler/blob/fix_issue_164a/__test__/integration/QueryResult.test.js#L333
>
> The records have some placeholder values (see please_fix), but I expected this test to still correctly match the real example of what you saw. However, it doesn't, because it currently returns 0 results, and I'm not sure why.

Can you confirm you see the results above when POSTing the query to https://api.bte.ncats.io/v1/query? I confirm that I get zero results when posting to dev (which I thought was due to the Biolink Model v2.1 update, but changing ChemicalSubstance to SmallMolecule doesn't fix it, so I think I'm missing something here...)

> Should there be two queryResult.update calls (not a single call with two records)? https://github.com/biothings/bte_trapi_query_graph_handler/blob/fix_issue_164a/__test__/integration/QueryResult.test.js#L422

Hmm, that's a bit too deep into the code for me to have an informed answer on...

> In the two results you got, I notice n1 and e0 are different for each result, but both of the n1 values appear to be in the biolink:BiologicalProcessOrActivity category (Acclimatization and Cerebrovascular Circulation).

Yes, from my view, it looks like only e0 is being reported and e1 is being dropped.

ariutta commented 3 years ago

I got a response from https://api.bte.ncats.io/v1/query. The value for results is []. I notice in the logs, there are several ERRORs, e.g.: "message": "call-apis: Failed to make to following query: {\"url\":\"https://automat.renci.org/cord19-scigraph/query\",\"data\":{\"message\":{\"query_graph\":{\"nodes\":{\"n0\":{\"ids\":[\"PUBCHEM.COMPOUND:5358\"],\"categories\":[\"biolink:ChemicalSubstance\"]},\"n1\":{\"categories\":[\"biolink:BiologicalProcess\"]}},\"edges\":{\"e01\":{\"subject\":\"n0\",\"object\":\"n1\",\"predicates\":[\"biolink:related_to\"]}}}}},\"method\":\"post\",\"timeout\":3000,\"headers\":{\"Content-Type\":\"application/json\"}}. The error is Error: timeout of 3000ms exceeded"

colleenXu commented 3 years ago

@andrewsu @ariutta I believe this issue arises because BTE actually doesn't retrieve any nodes/answers for e1.

In this situation, BTE is currently returning everything it found for the e0 step, even though it didn't find anything for the e1 step (so there are no complete paths to put in the results).

[EDIT:] We want BTE's behavior to return nothing (empty knowledge_graph / results sections) if it doesn't find any complete paths.

Marco has been dealing with related behavior (with the new querying logic).

ariutta commented 3 years ago

@colleenXu, I think you're correct about e1.

Going back to @andrewsu's first post on this issue, I'm wondering how we want to handle the case where one of the forks in the query graph has zero hits but the other fork has at least one hit.

Update: @andrewsu said that if any link in the query graph is broken, we should return zero results.

ariutta commented 3 years ago

If any link in the query graph is broken, we want to return zero results. If we can safely assume that the query graph is always linear, then the code in branch fix_issue_164b will give us what we want.

If we allow non-linear query graphs (two examples below), the code will always return at least one result as long as there's at least one path from a start node to an end node, e.g., if there are records connecting Gene -> Disease -> ChemicalSubstance but no records connecting Gene -> BiologicalProcess -> ChemicalSubstance, the code will still return results.

We could refactor the code to observe the query graph from the record(s) passed into queryResult.update([record]) calls, but if there are zero records for BiologicalProcess -> ChemicalSubstance, we would never call queryResult.update with a record telling us this hop in the query graph even exists.

image

image

TLDR

  1. Can we assume linear query graphs for now?
  2. If no, how should QueryResult know what the query graph looks like in the case where there are zero records returned for one or more edges of the query graph?
andrewsu commented 3 years ago
> 1. Can we assume linear query graphs for now?

First, do we agree that "linear" is irrespective of the directionality of the edges? So A --> B <-- C is as linear as A --> B --> C? If yes, then I am fine making the assumption for now that all query graphs will be linear. If that's the route we go, then it would be best if we can somehow detect non-linear query graphs and proactively return an error that says "we don't support this" rather than failing in non-obvious or unpredictable ways.
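
One way such a check could look, as a sketch only: treat the query graph as undirected and test whether it forms a single simple path. `isLinearQueryGraph` is a hypothetical helper, not something in BTE, and it assumes a query graph with at least one edge.

```javascript
// Hypothetical check: is the query graph a single simple path when edge
// direction is ignored? Assumes at least one edge.
function isLinearQueryGraph(queryGraph) {
  const nodeIDs = Object.keys(queryGraph.nodes);
  const edges = Object.values(queryGraph.edges);

  // A path over n nodes has exactly n - 1 edges.
  if (nodeIDs.length < 2 || edges.length !== nodeIDs.length - 1) return false;

  // Build an undirected adjacency map.
  const neighbors = new Map(nodeIDs.map((id) => [id, new Set()]));
  for (const { subject, object } of edges) {
    if (!neighbors.has(subject) || !neighbors.has(object)) return false;
    if (subject === object) return false; // self-loop
    neighbors.get(subject).add(object);
    neighbors.get(object).add(subject);
  }

  // No node on a path touches more than two others.
  for (const adjacent of neighbors.values()) {
    if (adjacent.size > 2) return false;
  }

  // Every node must be reachable from the first one.
  const seen = new Set([nodeIDs[0]]);
  const stack = [nodeIDs[0]];
  while (stack.length > 0) {
    for (const next of neighbors.get(stack.pop())) {
      if (!seen.has(next)) {
        seen.add(next);
        stack.push(next);
      }
    }
  }
  return seen.size === nodeIDs.length;
}
```

Under this definition A --> B <-- C and A --> B --> C are both linear, while a graph in which two branches fork from one node and rejoin at another is not, so a caller could return an explicit "not supported" error for the latter.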

> 2. If no, how should QueryResult know what the query graph looks like in the case where there are zero records returned for one or more edges of the query graph?

Hmm, I'm afraid this is a bit too deep into the implementation for me to make any intelligent comment. But perhaps this is an argument that the pruning of the KG needs to be performed at the time of assembling the results object, rather than iteratively done as each edge is queried?
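
To make that suggestion concrete, here is a rough sketch of assembling results once, after all edges have been queried, with the query graph in hand. It is an illustration under stated assumptions, not the approach taken in the handler: `assembleResults`, the `recordsByEdge` map, and the `{ subject, object, id }` record shape are all hypothetical.

```javascript
// Hypothetical inputs:
//   queryGraph.edges: { [qEdgeID]: { subject: qNodeID, object: qNodeID } }
//   recordsByEdge:    { [qEdgeID]: [{ subject, object, id }] }
// Returns one result per assignment of IDs that satisfies every query edge.
function assembleResults(queryGraph, recordsByEdge) {
  const qEdges = Object.entries(queryGraph.edges);
  if (qEdges.length === 0) return [];

  // Partial assignments, stored as plain qNodeID -> id and qEdgeID -> id maps.
  let partials = [{ nodes: {}, edges: {} }];
  for (const [qEdgeID, qEdge] of qEdges) {
    const records = recordsByEdge[qEdgeID] || [];
    const next = [];
    for (const partial of partials) {
      for (const record of records) {
        const boundSubject = partial.nodes[qEdge.subject];
        const boundObject = partial.nodes[qEdge.object];
        // Skip records that contradict a node chosen for an earlier edge.
        if (boundSubject !== undefined && boundSubject !== record.subject) continue;
        if (boundObject !== undefined && boundObject !== record.object) continue;
        next.push({
          nodes: {
            ...partial.nodes,
            [qEdge.subject]: record.subject,
            [qEdge.object]: record.object,
          },
          edges: { ...partial.edges, [qEdgeID]: record.id },
        });
      }
    }
    // An edge with zero (compatible) records leaves no partials at all,
    // so the final result list is empty.
    partials = next;
  }

  // Convert to the node_bindings / edge_bindings shape used in this thread.
  const wrap = (bindings) =>
    Object.fromEntries(
      Object.entries(bindings).map(([key, id]) => [key, [{ id }]])
    );
  return partials.map((partial) => ({
    node_bindings: wrap(partial.nodes),
    edge_bindings: wrap(partial.edges),
  }));
}
```

Because the loop walks the query graph's own edge list, an edge that returned zero records is still visited, which addresses the concern about never being told that the hop exists. The cost is that all records must be held until the end rather than pruned hop by hop.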

ariutta commented 3 years ago
> 1. Can we assume linear query graphs for now?

It ended up being most practical to just handle all possible query graphs rather than trying to separate out and return errors for special cases.

This issue should now be resolved in this commit: https://github.com/biothings/bte_trapi_query_graph_handler/pull/34

colleenXu commented 3 years ago

I consider this closed since all of my tested queries have had expected behavior (each result is a unique set of nodes that satisfies the query-graph, with the edges between the nodes).