Closed by andrewsu 2 years ago
The options explored were:
Exactly what the "some limit" is depended on which option was chosen and what was happening in the query specified in #323
Will be addressed in https://github.com/biothings/bte_trapi_query_graph_handler/pull/53
PR has been merged and deployed. Closing...
@marcodarko to add a sample query where this threshold will be triggered, and a sample output showing the error message
I believe Marco is still working on this issue (quote from Slack):
I'm actually gonna make some changes to the entity max solution, I realized it was getting invoked at the wrong place so it wasn't always checked... so fixing that but also how the error is thrown, I don't think I can send a 200 code error (not sure if possible actually)
Note that I'm using a local API list (it removes the pending BioThings APIs, except for the clinical risk KP API / multiomics wellness) for all of these examples...
Therefore, we expect an error to be triggered if we add another hop that uses those 1060 IDs as input. This does happen...
Other queries that correctly trigger the exception are any Workflow B.1 queries with an e03 predict edge - since the number of genes is too large to use as input to another step. Note that this is likely to fail at an earlier edge if the full api list is used...
A related query to B.1 would previously crash our programs because the computer/server would run out of memory. It now correctly fails...
This query seems to run fully (it doesn't hit the error). I believe that's correct because of the filtering down that happens with intersections (Explain style).
Perhaps we could fail earlier in the process. Sometimes, before the failure point, BTE takes a while with ID resolution because there are >60,000 IDs to send to the ID resolver; see https://github.com/biothings/BioThings_Explorer_TRAPI/issues/338#issuecomment-954466062 .
What do you think, @andrewsu @newgene ?
Closing this.
After discussion with Andrew, I'll clarify #338 and we'll see how things progress. If needed, we could add a cap related to ID resolution (BTE would return a failure), using a multiple of this entity cap (for example 10,000, i.e. 10 * 1000, where 1000 is the entity cap)...
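To make the arithmetic concrete, here is a rough sketch of what such an ID-resolution cap could look like. It is written in Python purely for illustration and is not BTE's code; `ENTITY_CAP`, `ID_RESOLUTION_MULTIPLIER`, `id_resolution_cap`, and `check_before_resolution` are hypothetical names, and the values 1000 and 10 are just the figures mentioned above.

```python
# Hypothetical constants, taken from the numbers floated in this thread.
ENTITY_CAP = 1000               # assumed entity cap
ID_RESOLUTION_MULTIPLIER = 10   # assumed multiplier for the ID-resolution cap


def id_resolution_cap(entity_cap=ENTITY_CAP, multiplier=ID_RESOLUTION_MULTIPLIER):
    """Return the largest number of IDs we would be willing to resolve."""
    return entity_cap * multiplier


def check_before_resolution(ids, entity_cap=ENTITY_CAP,
                            multiplier=ID_RESOLUTION_MULTIPLIER):
    """Fail fast, before any (slow) ID resolution call would be made.

    Returns the de-duplicated, sorted IDs if they fit under the cap;
    raises ValueError otherwise.
    """
    limit = id_resolution_cap(entity_cap, multiplier)
    unique_ids = set(ids)  # duplicates should not count toward the cap
    if len(unique_ids) > limit:
        raise ValueError(
            f"{len(unique_ids)} IDs exceed the ID resolution cap of {limit}"
        )
    return sorted(unique_ids)
```

With these assumed values, a batch of more than 60,000 IDs (as in the case linked above) would be rejected up front instead of being sent to the ID resolver.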
For longer and/or open-ended queries, the number of entities being tracked by BTE can grow absurdly high. These cases may contribute to out-of-memory errors and server instability. As one possible solution, we could implement a configurable cap on the number of entities being tracked by BTE. If that cap is exceeded at any point in the execution, BTE could respond with an error and gracefully exit.
https://github.com/biothings/BioThings_Explorer_TRAPI/issues/323 may contain a possible example query to test.
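As a rough illustration of the proposal, a configurable cap checked after every hop might look something like the sketch below. It is written in Python for brevity and is not BTE's implementation; `EntityCapExceeded`, `EntityTracker`, `run_query`, and the shape of the returned dictionaries are all hypothetical, and the default of 1000 is only the figure mentioned elsewhere in this thread.

```python
class EntityCapExceeded(Exception):
    """Raised when the number of tracked entities goes over the cap."""


class EntityTracker:
    """Tracks unique entity IDs seen during a query (hypothetical helper)."""

    def __init__(self, max_entities=1000):  # default cap is an assumption
        self.max_entities = max_entities
        self._seen = set()

    @property
    def count(self):
        return len(self._seen)

    def add(self, ids):
        """Record new IDs and enforce the cap immediately."""
        self._seen.update(ids)
        if self.count > self.max_entities:
            raise EntityCapExceeded(
                f"tracking {self.count} entities, "
                f"which exceeds the cap of {self.max_entities}"
            )


def run_query(hops, max_entities=1000):
    """Simulate a multi-hop query; `hops` is a list of ID lists, one per hop.

    The cap is checked after every hop, so an oversized query stops with
    an error result instead of continuing to grow in memory.
    """
    tracker = EntityTracker(max_entities)
    try:
        for hop_ids in hops:
            tracker.add(hop_ids)
    except EntityCapExceeded as err:
        return {"status": "error", "message": str(err)}
    return {"status": "ok", "entity_count": tracker.count}
```

The key design point is that the check runs on every hop rather than once at the end, so a query that balloons partway through is stopped before it can exhaust memory.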