tagomoris / presto-client-node

Distributed query engine Presto client library for node.js
MIT License
126 stars 57 forks source link

Undefined error when nextUri request returns non-200 status code #72

Closed MasterOdin closed 12 months ago

MasterOdin commented 1 year ago

We hit the following error in production:

TypeError: Cannot read properties of undefined (reading 'state')
    at /app/foo/node_modules/presto-client/lib/presto-client/index.js:338:70
    at IncomingMessage.<anonymous> (/app/foo/node_modules/presto-client/lib/presto-client/index.js:133:13)
    at IncomingMessage.emit (node:events:525:35)
    at IncomingMessage.emit (node:domain:489:12)
    at endReadableNT (node:internal/streams/readable:1358:12)
    at processTicksAndRejections (node:internal/process/task_queues:83:21)

From debugging, it seems that the trino server we were querying against returned a 50x response (our logging did not capture this) on the request in fetch_next function, and so typeof response === 'string' and so response.stats === undefined in the callback.

Looking at the presto docs, it documents that a 503 error should trigger a retry while the trino docs says that a 502, 503, or 504 should trigger a retry. Anything beyond those and 200 should be considered an error case.

My proposed change would be that for both the initial request and all subsequent requests against nextUri, if they return 200, then handle it as right now. If it returns [502, 503, 504], then do a retry after 50-100ms. If the request returns something outside of those codes, then return an error with some generic message.

For the 50x case, will probably want to add some way to timeout querying so that if the server never responds, the query doesn't just permanently hang.

MasterOdin commented 1 year ago

The error can be forced to be reproduced by applying the following patch against #69 and then doing npm run test -- -t 'simple query':

diff --git a/index.spec.js b/index.spec.js
index 70c0bfc..bb170a3 100644
--- a/index.spec.js
+++ b/index.spec.js
@@ -18,7 +18,7 @@ test('cannot use basic and custom auth', function(){
   }).toThrow(new Error('Please do not specify basic_auth and custom_auth at the same time.'));
 });

-describe.each([['presto'], ['trino']])('%s', function(engine){
+describe.each([/*['presto']*/, ['trino']])('%s', function(engine){
   const client = new Client({
     host: 'localhost',
     port: engine === 'presto' ? 18080 : 18081,
diff --git a/lib/presto-client/index.js b/lib/presto-client/index.js
index eae5544..e4e6d2e 100644
--- a/lib/presto-client/index.js
+++ b/lib/presto-client/index.js
@@ -269,7 +269,8 @@ Client.prototype.statementResource = function(opts) {
             return;
         }
         var last_state = null;
-        var firstNextUri = data.nextUri; // TODO: check the cases without nextUri for /statement ?
+        var firstNextUri = 'https://httpbin.org/status/504'; // data.nextUri; // TODO: check the cases without nextUri for /statement ?
+        client.protocol = 'https:';
         var fetch_next = function(next_uri){
             /*
              * 1st time
@@ -330,6 +331,9 @@ Client.prototype.statementResource = function(opts) {
                 return;
             }
             client.request(next_uri, function(error, code, response){
+                console.log(error);
+                console.log(code);
+                console.log(response);
                 if (error || response.error) {
                     error_callback(error || response.error);
                     return;
tagomoris commented 1 year ago

Totally looks good to me.

My proposed change would be that for both the initial request and all subsequent requests against nextUri, if they return 200, then handle it as right now. If it returns [502, 503, 504], then do a retry after 50-100ms. If the request returns something outside of those codes, then return an error with some generic message.

For the 50x case, will probably want to add some way to timeout querying so that if the server never responds, the query doesn't just permanently hang.