googleapis / nodejs-bigquery

Node.js client for Google Cloud BigQuery: A fast, economical and fully-managed enterprise data warehouse for large-scale data analytics.
https://cloud.google.com/bigquery/
Apache License 2.0
464 stars 210 forks source link

BigQuery Response Time is very high #1383

Closed ParthGoyal1508 closed 1 month ago

ParthGoyal1508 commented 2 months ago

I'm attempting to fetch data from BigQuery using the Node.js API with the provided code snippet:

import { BigQuery } from "@google-cloud/bigquery";
  class BQ {
    private static bigqueryClient: BigQuery;
    static getInstance() {
      if (!this.bigqueryClient) {
        let options = {
          keyFilename: <KEY_FILE_PATH>,
          projectId: <PROJECT ID>,
        };
        this.bigqueryClient = new BigQuery(options);
      }
      return this.bigqueryClient;
    }
  }
  const [result] = await BQ.getInstance().query({ query: <QUERY_STRING> });

However, I'm experiencing longer response times compared to what's observed in the BigQuery console. The console shows around ~400ms (Elapsed Time, without cache) for the same query, whereas the code snippet takes around ~3-4 seconds, despite the table returning only one row. Could you please advise on how to optimize and reduce the latency in this scenario? Thank you.

leahecole commented 2 months ago

Hey @ParthGoyal1508! Some more questions - what version of node, npm, and the version of the library are you using?

ParthGoyal1508 commented 2 months ago

Hi @leahecole I'm using following versions:

@google-cloud/bigquery: "^6.0.3" nodeVersion: 16.19.1 npmVersion: 6.14.12

ParthGoyal1508 commented 2 months ago

@leahecole could you please help me with this issue?

alvarowolfx commented 2 months ago

Hey @ParthGoyal1508, sorry for the late reply on that, I was out on vacation.

We recently landed on some query improvements on the NodeJS SDK, on v7.6.0, which uses the jobs.query endpoint from BigQuery v2 to run queries faster in some scenarios.

I ran some tests here and one thing to keep in mind is that authentication can take some time, so is important to reuse the client. I wrote a script to separate the time taken for each step and you can see that reusing the client and after authentication, queries takes around 500ms (and I'm hitting the US region from Brazil).

Test script:

'use strict';

const {GoogleAuth} = require('google-auth-library');

function main() {
  const {BigQuery} = require('../build/src');
  const {version} = require('../package.json');

  async function queryPerf() {
    console.log('Node SDK Version:', version);
    console.log();

    const initAuth = `${process.env.INIT_AUTH}`.toLowerCase() === 'true';

    let authClient;
    if (initAuth) {
      console.time('instantiate auth client');
      authClient = new GoogleAuth({
        projectId: `PROJECT_ID`,
        keyFilename:
          'PATH_TO_SERVICE_ACCOUNT',
      });
      await authClient.getAccessToken(); // force auth init
      console.timeEnd('instantiate auth client');
    }

    console.time('instantiate BigQuery client');
    const bigqueryClient = new BigQuery({
      authClient: authClient
    });
    console.timeEnd('instantiate BigQuery client');
    console.log();

    const sqlQuery = 'SELECT 1';

    console.time('1st run query');
    let [rows, q, response] = await bigqueryClient.query(sqlQuery);
    console.timeEnd('1st run query');
    console.log('Query:', sqlQuery);
    console.log('Job ID:', q.job.id);
    console.log('Query Results:', rows);
    console.log();

    console.time('2nd run query');
    [rows, q, response] = await bigqueryClient.query(sqlQuery);
    console.timeEnd('2nd run query');
    console.log('Query:', sqlQuery);
    console.log('Job ID:', q.job.id);
    console.log('Query Results:', rows);
  }
  queryPerf();
}

main(...process.argv.slice(2));

Results with latest version of the BigQuery Node SDK (v7.7.1)

➜  nodejs-bigquery git:(main) ✗ time node samples/queryPerf.js               
Node SDK Version: 7.7.1

instantiate BigQuery client: 0.308ms

1st run query: 1.576s
Query: SELECT 1
Job ID: job_sWTWPtFzE379NUervxcL2cFT0OKn
Query Results: [ { f0_: 1 } ]

2nd run query: 547.743ms
Query: SELECT 1
Job ID: job_jjfkzQkx0wY8LKwPK6y0bZJlyJnn
Query Results: [ { f0_: 1 } ]
node samples/queryPerf.js  0.42s user 0.15s system 24% cpu 2.345 total

➜  nodejs-bigquery git:(main) ✗ time node samples/queryPerf.js
Node SDK Version: 7.7.1

instantiate BigQuery client: 0.294ms

1st run query: 1.382s
Query: SELECT 1
Job ID: job_qbU9S6MvCDP8G0e4Xu7loPe0QSYi
Query Results: [ { f0_: 1 } ]

2nd run query: 623.907ms
Query: SELECT 1
Job ID: job__NA7fs_FBKPDIgvp7medC0GswWTY
Query Results: [ { f0_: 1 } ]
node samples/queryPerf.js  0.40s user 0.09s system 23% cpu 2.117 total

➜  nodejs-bigquery git:(main) ✗ time node samples/queryPerf.js
Node SDK Version: 7.7.1

instantiate BigQuery client: 0.328ms

1st run query: 1.412s
Query: SELECT 1
Job ID: job_KldtFynTX6wWho1pJN_hs_mh2eZN
Query Results: [ { f0_: 1 } ]

2nd run query: 526.74ms
Query: SELECT 1
Job ID: job_NeHOUW7H59utj6nqWzgpXLdP4Nv-
Query Results: [ { f0_: 1 } ]
node samples/queryPerf.js  0.41s user 0.09s system 24% cpu 2.050 total

Results with latest version of BigQuery Node SDK (v7.7.1) and instantiating Auth separately:

➜  nodejs-bigquery git:(main) ✗ time INIT_AUTH=true node samples/queryPerf.js
Node SDK Version: 7.7.1

instantiate auth client: 318.25ms
instantiate BigQuery client: 0.475ms

1st run query: 743.654ms
Query: SELECT 1
Job ID: job_d_hjp7LatamFlAFw6BVPjHmQyVkm
Query Results: [ { f0_: 1 } ]

2nd run query: 486.248ms
Query: SELECT 1
Job ID: job_EiNnCEUxtXx1i9fDRGiS4ia4OT5i
Query Results: [ { f0_: 1 } ]
INIT_AUTH=true node samples/queryPerf.js  0.13s user 0.02s system 9% cpu 1.657 total

➜  nodejs-bigquery git:(main) ✗ time INIT_AUTH=true node samples/queryPerf.js
Node SDK Version: 7.7.1

instantiate auth client: 372.904ms
instantiate BigQuery client: 0.82ms

1st run query: 662.916ms
Query: SELECT 1
Job ID: job_fBf6iww9JemxLv4Qnlo7ug9lA3yX
Query Results: [ { f0_: 1 } ]

2nd run query: 618.017ms
Query: SELECT 1
Job ID: job_uty5KU5qRW3iO7xvLc1PcO4y8htJ
Query Results: [ { f0_: 1 } ]
INIT_AUTH=true node samples/queryPerf.js  0.14s user 0.03s system 9% cpu 1.765 total

➜  nodejs-bigquery git:(main) ✗ time INIT_AUTH=true node samples/queryPerf.js
Node SDK Version: 7.7.1

instantiate auth client: 331.183ms
instantiate BigQuery client: 0.833ms

1st run query: 663.205ms
Query: SELECT 1
Job ID: job_36zqgqYAxysoIHiKJDKpddgmnX-x
Query Results: [ { f0_: 1 } ]

2nd run query: 486.983ms
Query: SELECT 1
Job ID: job_yFYub4CQsDOXpZrA5eAM5qJpGm-2
Query Results: [ { f0_: 1 } ]
INIT_AUTH=true node samples/queryPerf.js  0.14s user 0.03s system 10% cpu 1.594 total

For comparison with the old versions before v7.6.0.

➜  nodejs-bigquery git:(v7.5.0) ✗ time node samples/queryPerf.js
Node SDK Version: 7.5.0

instantiate BigQuery client: 0.33ms

1st run query: 1.796s
Query: SELECT 1
Job ID: b077dc97-3668-45d0-9d6c-4ded8a6d98f1
Query Results: [ { f0_: 1 } ]

2nd run query: 801.428ms
Query: SELECT 1
Job ID: 22c7e696-48be-4877-b1f8-ebad35135ca4
Query Results: [ { f0_: 1 } ]
node samples/queryPerf.js  0.43s user 0.09s system 19% cpu 2.708 total

➜  nodejs-bigquery git:(v7.5.0) ✗ time node samples/queryPerf.js
Node SDK Version: 7.5.0

instantiate BigQuery client: 0.279ms

1st run query: 1.798s
Query: SELECT 1
Job ID: f7e460bd-be9b-4633-908b-41f30456a720
Query Results: [ { f0_: 1 } ]

2nd run query: 926.084ms
Query: SELECT 1
Job ID: 75288bee-0ca7-46de-8f19-7aba4e9d9c39
Query Results: [ { f0_: 1 } ]
node samples/queryPerf.js  0.40s user 0.09s system 17% cpu 2.833 total

➜  nodejs-bigquery git:(v7.5.0) ✗ time node samples/queryPerf.js
Node SDK Version: 7.5.0

instantiate BigQuery client: 0.266ms

1st run query: 2.088s
Query: SELECT 1
Job ID: 84739bc5-7ebd-478e-a6d9-e77c3c27926c
Query Results: [ { f0_: 1 } ]

2nd run query: 817.641ms
Query: SELECT 1
Job ID: c3b2ce05-0f4f-44c1-af92-9e27e434854e
Query Results: [ { f0_: 1 } ]
node samples/queryPerf.js  0.43s user 0.10s system 17% cpu 3.018 total
alvarowolfx commented 1 month ago

closing due to inactivity.