ibmdb / node-ibm_db

IBM DB2 and IBM Informix bindings for node
MIT License
188 stars 151 forks source link

ibm_db runs queries sequentially #916

Closed NahinChowdhury closed 1 year ago

NahinChowdhury commented 1 year ago

Is your feature request related to a problem? Please describe. ibm_db seems to run queries sequentially instead of concurrently. This heavily impacts our application performance when we have more than 20 concurrent users. We set our pool.init to be more than the concurrent users, so pool initialization doesn't seem to be a problem

Describe the solution you'd like It would be great if there was a way to run the queries concurrently.

Describe alternatives you've considered N/A

Additional context I created a script that makes requests to the database directly. In the output, we can see the gradual increase in response time as the number of concurrent queries increases. A query that takes 60ms to execute with 1 user will take 120ms, on average, to execute with 5users.

We also noticed a gradual ~30ms delay between the response times of each query. Not sure why that’s the case either. I set the maxPoolSize to be +1 of the number of concurrent queries we are running, so my queries are being sent virtually together(with a difference of 1-4ms between the sending time of the first and last query). Additionally, when calculating the response time, the script starts the timer right before sending the query.

We use ibm_db@2.7.0 in our application, but I tested the query with the latest ibm_db version(@3.1.0)

Any help would be greatly appreciated!

I have attached the script and some example output below:

1 query: Pasted Graphic

5 concurrent queries: Pasted Graphic 1

10 concurrent queries: Pasted Graphic 3

20 concurrent queries: Pasted Graphic 2

dbPerformanceTest.js.zip

huineng commented 1 year ago

i think your test is wrong , that is it's coded to serialize it happens when you already await your promise in your loop

Try to refactor first to create an array of promises without awaiting and then use that array in your promise.all

it's you await below that causes this, try to remove it

ids.map(async (id) => {
                try {
                    await new Promise((resolve, reject) => {
NahinChowdhury commented 1 year ago

Thanks for the suggestion!

I made the changes you suggested and it doesn’t seem to make a difference. I even tried to create an array of promises before the promise.all, but I ran into a problem where the promises were running before the promise.all(). I confirmed this by adding a console.log inside the Promises and also before running Promise.all(). The console.log inside the Promises run before the "Running after promise is created" console.log.

Pasted Graphic 1

After removing the await for the promise, I also noticed that the promises were ending before runtimes variable was populated. So I ran a setInterval that ran every 10 seconds to see the updated runtime. The observations were the same as the observations I had previously.

We still think that we are facing this issue because db.query might not be working concurrently. Below I have attached a picture of the start times of the queries and you will see that they are sent within 1-2 milliseconds of each other.

Pasted Graphic 2

Outputs

Here are the runtimes for the script I provided in this issue(The runtimes are slower than the ones in my previous comment because I am running these queries for a different location): 1 query: Pasted Graphic 17 5 query: Pasted Graphic 18 10 query: Pasted Graphic 19 20 query: Pasted Graphic 20
Here are the new runtimes with the proposed changes of removing await for the promises and creating the promises before the Promise.all() (I ran awaitAndArray.js): 1 query: Pasted Graphic 8 5 queries: Pasted Graphic 9 10 queries: Pasted Graphic 11 20 queries: Pasted Graphic 12
Here are the new runtimes with the proposed changes of removing await for the promises (I ran justAwait.js): 1 query: Pasted Graphic 13 5 queries: Pasted Graphic 14 10 queries: Pasted Graphic 15 20 queries: Pasted Graphic 16

If you still think the way we are running our script incorrectly, could you give us an example of how the right approach would look like?

I have attached the modified files I used for my testing.

Modified scripts.zip

NahinChowdhury commented 1 year ago

HI @huineng, did you get a chance to look at my comment?

bimalkjha commented 1 year ago

@NahinChowdhury Modify your runAllPromises function as below code:

    async function runAllPromises(){
        return Promise.all(
            ids.map((id) => {
        try {
            new Promise((resolve, reject) => {
                console.time(id + " Pool creation time")
                pool.open(connString, (err, db) => {
                    console.timeEnd(id + " Pool creation time")
                    if (err) {
                        reject(new Error("Error"));
                        return;
                    }

                    console.log(id + " Pool size after opening: " + pool.poolSize)
                    try{
                        console.log(id + " query sent!")
                        console.time(id)
                        const startTime = new Date();
                        console.log(id + " - start time: " + getSqlTime(startTime))
                        let rows = db.querySync(query[queryIndex]);
                        db.closeSync();
                                                // console.log(id + " Pool size after close: " + pool.poolSize)
                                                const endTime = new Date()
                                                runTimes[id] = endTime - startTime;
                                                console.log(id + " - end time: " + getSqlTime(endTime))
                                                console.timeEnd(id)
                                                resolve(rows);
                    }catch(err){
                        reject(new Error("Query error"))
                    }finally{
                        // console.log(id + " Pool size ending: " + pool.poolSize)
                    }
                  })
                });
                } catch (err) {
                    console.error(err);
                }
            })
        )
    }

The db.query() function submits sql to a simplequeue() from where it get executed one by one, else insert query may get executed before create table query. The use of async await is making it pure sequential. Use the synchronous function db.querySync() to fire all queries at once and you'll get results in parallel. I am getting average time as half seconds. Thanks.

bimalkjha commented 1 year ago

@NahinChowdhury I would suggest to go through this post first: https://betterprogramming.pub/execution-types-models-in-node-js-46f70d71abb#:~:text=There%20are%20three%20types%20of,takes%20some%20time%20to%20complete.

Then, check the closed ibm_db issue #210 that has detail discussion about this behavior, then check my this comment about need to set UV_THREADPOOL_SIZE to boost performance. Thanks.

bimalkjha commented 1 year ago

@NahinChowdhury Is there anything that we can help about this issue? Else can we close it? Thanks.

NahinChowdhury commented 1 year ago

Hi @bimalkjha sorry I forgot to comment before. You can close the issue. Thanks!

bimalkjha commented 11 months ago

Attaching the full test file here for reference. If you can call ibm_db.openSync() in non sequential way, you can get better result. But, that is not possible as nodejs is single threaded. Irrespective of value of runs, below code always takes similar time in ibm_db to open connection and execute query:

const ibm_db = require("ibm_db");
const config=require("config")
const uuid = require('uuid')

function getSqlTime(date){
    const sqlDate =
        ("00" + date.getHours()).slice(-2) +
        ":" +
        ("00" + date.getMinutes()).slice(-2) +
        ":" +
        ("00" + date.getSeconds()).slice(-2) +
        ":" +
        ("00" + date.getMilliseconds());
    return sqlDate
}

const query = [
    `SELECT 1 FROM sysibm.sysdummy1`];

async function main() {
    const queryIndex = 0;
    console.log("Query index: " + queryIndex)
    const runTimes = {};
    const connString = `DATABASE=${config.get('db.DATABASE')};HOSTNAME=${config.get('db.HOSTNAME')};PORT=${config.get('db.PORT')};UID=${config.get('db.UID')};PWD=${config.get('db.PWD')};Security=SSL;`;

    const runs = 20;

    console.log("Concurrent queries running: " + runs)
    const ids = [...new Array(runs)].map(id => uuid.v4());
    console.log("Query IDs in order")
    console.log(ids)

    await runAllPromises()
    .then((rows) => {
        setTimeout(() => {
          console.log("Runtimes = ", runTimes)
          const average = Object.values(runTimes).reduce((a, b) => a + b, 0) / Object.keys(runTimes).length;
          console.log("Average runtime: " + average + "ms")
        }, 5000 );
    });
    async function runAllPromises(){
        return Promise.all(
            ids.map(async (id) => {
                const startTime = new Date();
                console.log(id + " - start time: " + getSqlTime(startTime))
                try {
                    new Promise((resolve, reject) => {
                          try{
                              let db = ibm_db.openSync(connString, {connectTimeout: 120});
                              let rows = db.querySync(query[queryIndex]);
                              const endTime = new Date()
                              runTimes[id] = endTime - startTime;
                              db.closeSync();
                              //console.log("Data for id ", id, " = ", rows);
                              resolve(rows);
                          }catch(err){
                              reject(err)
                          }
                    });
                } catch (err) {
                    console.error(err);
                }
            })
        )
    }
}

main();

$ node test.js Query index: 0 Concurrent queries running: 20 Query IDs in order [ '92db1c46-d899-4175-af6a-aed098e70d46', '3b90264f-e79b-4fa4-b65a-c3eb576b60e3', '81baa75c-f4d5-459a-a341-114ede3252e1', '3cff6911-eee0-4bf7-a361-806379632886', '80434db0-8254-4bcc-b65b-c38b3854efd4', '92ce34b3-6b60-4db6-bcae-254f3d8a9ece', '6b178276-d74d-4739-8f12-aa8cf320b314', 'bb51fb2b-84f8-496e-b48b-1672683ab64e', '6a7fd78a-bfd0-4e1b-a182-c7f3473669cb', '0250e143-75c0-427e-9be4-828fe9d083d1', 'fa869924-f6ec-48fe-a280-cce2259257c2', 'f37c162a-c8b4-41bd-bae2-2fa7026b3e0f', '1f0135ae-7759-4637-bb78-e2f11496df9a', 'c64f1772-f456-43a1-91f7-d5d8acc67946', 'c19e906d-f751-4b6a-9fae-b0a60253fb4f', 'dec4f233-ae72-4d01-879d-7ea13e6c1502', '9197b826-c771-4b5f-a23d-e4af9694f095', '5e95cc29-a8fa-4cb7-aca3-49614ddece6e', '6d950c68-f74e-4031-b2f8-9dcbd329a04e', '8c8ec76f-f11c-4f7c-872f-04219adb0a06' ] 92db1c46-d899-4175-af6a-aed098e70d46 - start time: 06:38:35:00123 3b90264f-e79b-4fa4-b65a-c3eb576b60e3 - start time: 06:38:35:00253 81baa75c-f4d5-459a-a341-114ede3252e1 - start time: 06:38:35:00395 3cff6911-eee0-4bf7-a361-806379632886 - start time: 06:38:35:00489 80434db0-8254-4bcc-b65b-c38b3854efd4 - start time: 06:38:35:00583 92ce34b3-6b60-4db6-bcae-254f3d8a9ece - start time: 06:38:35:00679 6b178276-d74d-4739-8f12-aa8cf320b314 - start time: 06:38:35:00827 bb51fb2b-84f8-496e-b48b-1672683ab64e - start time: 06:38:35:00944 6a7fd78a-bfd0-4e1b-a182-c7f3473669cb - start time: 06:38:36:0037 0250e143-75c0-427e-9be4-828fe9d083d1 - start time: 06:38:36:00153 fa869924-f6ec-48fe-a280-cce2259257c2 - start time: 06:38:36:00247 f37c162a-c8b4-41bd-bae2-2fa7026b3e0f - start time: 06:38:36:00341 1f0135ae-7759-4637-bb78-e2f11496df9a - start time: 06:38:36:00438 c64f1772-f456-43a1-91f7-d5d8acc67946 - start time: 06:38:36:00555 c19e906d-f751-4b6a-9fae-b0a60253fb4f - start time: 06:38:36:00673 dec4f233-ae72-4d01-879d-7ea13e6c1502 - start time: 06:38:36:00790 9197b826-c771-4b5f-a23d-e4af9694f095 - start time: 06:38:36:00906 5e95cc29-a8fa-4cb7-aca3-49614ddece6e - start time: 06:38:37:0048 6d950c68-f74e-4031-b2f8-9dcbd329a04e - start time: 06:38:37:00187 8c8ec76f-f11c-4f7c-872f-04219adb0a06 - start time: 06:38:37:00281 Runtimes = { '92db1c46-d899-4175-af6a-aed098e70d46': 117, '3b90264f-e79b-4fa4-b65a-c3eb576b60e3': 121, '81baa75c-f4d5-459a-a341-114ede3252e1': 82, '3cff6911-eee0-4bf7-a361-806379632886': 82, '80434db0-8254-4bcc-b65b-c38b3854efd4': 83, '92ce34b3-6b60-4db6-bcae-254f3d8a9ece': 128, '6b178276-d74d-4739-8f12-aa8cf320b314': 100, 'bb51fb2b-84f8-496e-b48b-1672683ab64e': 81, '6a7fd78a-bfd0-4e1b-a182-c7f3473669cb': 100, '0250e143-75c0-427e-9be4-828fe9d083d1': 81, 'fa869924-f6ec-48fe-a280-cce2259257c2': 81, 'f37c162a-c8b4-41bd-bae2-2fa7026b3e0f': 85, '1f0135ae-7759-4637-bb78-e2f11496df9a': 101, 'c64f1772-f456-43a1-91f7-d5d8acc67946': 101, 'c19e906d-f751-4b6a-9fae-b0a60253fb4f': 101, 'dec4f233-ae72-4d01-879d-7ea13e6c1502': 100, '9197b826-c771-4b5f-a23d-e4af9694f095': 121, '5e95cc29-a8fa-4cb7-aca3-49614ddece6e': 119, '6d950c68-f74e-4031-b2f8-9dcbd329a04e': 81, '8c8ec76f-f11c-4f7c-872f-04219adb0a06': 100 } Average runtime: 98.25ms $ Thanks.

bimalkjha commented 11 months ago

From the above result, we can see the last new Promise() is called sequentially only and not concurrently. The illusion of concurrent queries has been created by using setTimeout() or setInterval(), but when we moved the startTime variable before new Promise call, it is clear that all promises are getting called sequentially only and not concurrently. There can't be a concurrent requests in node.js. So, using connection pool in the above shared test program will give best performance. Thanks.