Closed: andrewsu closed this issue 3 years ago
After further testing: on my local instance I consistently receive 2083 results from both endpoints, whereas on prod I currently receive 1113 on /query and 790 on /asyncquery. The tests for the two endpoints were run within minutes of each other.
@tokebe Can you see in the logs any obvious differences in which APIs are being called, or the number of results being returned?
What immediately jumps out is that two queries to https://mychem.info/v1/query are failing on the async endpoint with the error 'getaddrinfo ENOTFOUND mychem.info', which is a very odd error, to say the least. Going by the number of hits on the sync endpoint, these account for all of the results missing from async.
Just a note that this query https://github.com/NCATSTranslator/testing/blob/main/ars-requests/not-none/1.2/D.1-hgd-alkaptonuria.json resulted in the same error from https://api.bte.ncats.io/v1/asyncquery/ (https://api.bte.ncats.io/v1/check_query_status/kVmlkzvxkc):
{
  "id": "kVmlkzvxkc",
  "state": "completed",
  "returnvalue": {
    "response": {
      "error": "Error",
      "message": "getaddrinfo ENOTFOUND mychem.info"
    },
    "callback": ""
  },
  "progress": 0
}
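For anyone scripting this check, here is a minimal Python sketch of detecting this failure in a status payload. It assumes the payload has the shape shown above (an error reported under `returnvalue.response` with `error` and `message` keys); the helper name `extract_async_error` is hypothetical and not part of BTE.

```python
import json

# Status payload copied from the check_query_status response above.
STATUS_PAYLOAD = """
{
  "id": "kVmlkzvxkc",
  "state": "completed",
  "returnvalue": {
    "response": {
      "error": "Error",
      "message": "getaddrinfo ENOTFOUND mychem.info"
    },
    "callback": ""
  },
  "progress": 0
}
"""


def extract_async_error(status):
    """Return the error message from a status payload, or None if absent.

    Hypothetical helper: assumes errors appear under returnvalue.response
    with "error" and "message" keys, as in the payload above.
    """
    response = (status.get("returnvalue") or {}).get("response") or {}
    if "error" in response:
        return response.get("message", response["error"])
    return None


status = json.loads(STATUS_PAYLOAD)
error_message = extract_async_error(status)
if error_message is not None:
    print(f"job {status['id']} ({status['state']}): {error_message}")
```

Note that the job state is "completed" even though the response carries an error, so checking `state` alone would not catch this failure.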
Results returned fine from the sync endpoint https://api.bte.ncats.io/v1/query/
Probably a DNS issue for the instance we route asyncquery over.
The DNS issue should be fixed now (one way or another); the following domains are resolvable and work within the VPC:
Let me know if more are needed and I will fix that.
Verified the instance running bte async queries can resolve these biothings API hostnames now.
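A resolution check like this could be scripted roughly as follows. This is only a sketch, not the procedure that was actually used: `check_hosts` is a hypothetical helper, and the host list passed to it is up to the caller (`mychem.info` is the only hostname named in the error above).

```python
import socket


def check_hosts(hostnames, resolver=socket.getaddrinfo):
    """Map each hostname to None if it resolves, else to the error text.

    `resolver` is injectable so the check can be exercised without
    network access; by default it is socket.getaddrinfo.
    """
    results = {}
    for host in hostnames:
        try:
            # Port 443 is an assumption (HTTPS); only name resolution matters here.
            resolver(host, 443)
            results[host] = None
        except socket.gaierror as exc:
            # The Python-side counterpart of Node's "getaddrinfo ENOTFOUND".
            results[host] = str(exc)
    return results
```

Running `check_hosts(["mychem.info"])` on the instance that serves asyncquery would return `{"mychem.info": None}` if the name resolves there, and an error string otherwise.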
@andrewsu if you can verify that the asyncquery results are the same as those from the sync query with your test queries, we can close this issue now.
@andrewsu @newgene Confirmed, sync and async are both returning 2414 results, with no subqueries failing due to either ENOTFOUND or timeouts.
I tested https://github.com/NCATSTranslator/minihackathons/blob/main/2021-12_demo/workflowB/B.2a_DILI-fourth-one-hop-from-CHEBI_41879_Dexamethasone.json on both the sync and async endpoints (on the prod instance), and I also get the same number of results (1441). (I'll note that in my testing I did get one response on the sync endpoint with a smaller number of results, but that appears to be due to a timeout on one of the APIs. I've since rerun it on sync and gotten the expected number.) So closing this issue!
The workflow progress tracker ran Query B.2a: https://github.com/NCATSTranslator/minihackathons/blob/main/2021-12_demo/workflowB/B.2a_DILI-fourth-one-hop-from-CHEBI_41879_Dexamethasone.json via both a synchronous query (https://arax.ncats.io/?r=ae038a8a-e0dc-44a5-962d-efbdead8eae0, 2817 results) and an async query (https://arax.ncats.io/?r=1ef52ad6-a081-4b98-bb39-147c65e92982, 1609 results), within a few hours of each other. The number of results returned differs between these two calls. I suspect it has to do with a timeout/availability issue, but as we shift toward pushing the use of async over sync, I would just like to confirm that there isn't some difference in logic that slipped through...
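To make this kind of sync/async comparison repeatable, a small Python sketch along these lines might help. It assumes both responses are TRAPI-style JSON with results listed under `message.results`; the function names are hypothetical, and fetching the responses is left out.

```python
def count_results(trapi_response):
    """Count entries in message.results.

    Returns 0 when the list is missing or null, or when "message" is not
    an object (for example, an error payload whose message is a string).
    """
    message = trapi_response.get("message")
    if not isinstance(message, dict):
        return 0
    return len(message.get("results") or [])


def compare_counts(sync_response, async_response):
    """Return (sync_count, async_count, difference)."""
    sync_count = count_results(sync_response)
    async_count = count_results(async_response)
    return sync_count, async_count, sync_count - async_count


# Toy responses standing in for the real sync and async payloads.
sync_example = {"message": {"results": [{}, {}, {}]}}
async_example = {"message": {"results": [{}]}}
print(compare_counts(sync_example, async_example))
```

A nonzero difference only shows that the counts disagree. Telling a logic difference apart from a timeout or availability problem would still require looking at which subqueries failed in the logs.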