brianc / node-postgres

PostgreSQL client for node.js.
https://node-postgres.com
MIT License

Proper way to batch queries? #3331

Open rberg89 opened 1 day ago

rberg89 commented 1 day ago

I have a large number of UPDATE statements to run. The most efficient approach I have found, though still slow, is:

await Promise.all(largeBatchOfQueries.map(async (query) => {
    const client = await pool.connect();
    try {
        await client.query(query.sql, [
            query.param1,
            query.param2,
            query.param3,
            query.param4
        ]);
    } finally {
        // release() is synchronous and must run even if the query throws,
        // otherwise the pool leaks a connection on every error
        client.release();
    }
}));

~160k UPDATEs take ~2.5 minutes, which is too slow for my situation. A pool of ~30 max connections gives me the best time.

I've seen some discussion around this topic, but I'm not sure whether I need a different approach or another library. I'd appreciate any insight: https://github.com/brianc/node-postgres/issues/1388

boromisp commented 13 hours ago

AFAIK there is no clean solution for now, as pg can't really "pipeline" individual queries. The connection will wait for the completion of the current query before sending the next to the server.

To speed this up, you will have to reduce the number of round trips to the server by reducing the number of queries.

The simplest way, without changing the queries themselves, is to interpolate the parameters on the client side and send the statements in batches. A number of JS libraries can safely interpolate parameters.
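
For example, a minimal sketch using the pg-format library, assuming every statement has the same shape (the items table and its columns are hypothetical placeholders):

const format = require('pg-format');

// %L escapes each value as a SQL literal, so the interpolation stays safe.
// Joining many statements with semicolons sends a whole chunk to the server
// in a single round trip over the simple query protocol.
const chunkSize = 1000;
for (let i = 0; i < largeBatchOfQueries.length; i += chunkSize) {
    const chunk = largeBatchOfQueries.slice(i, i + chunkSize);
    const sql = chunk
        .map((q) => format(
            'UPDATE items SET a = %L, b = %L, c = %L WHERE id = %L',
            q.param1, q.param2, q.param3, q.param4
        ))
        .join(';\n');
    await pool.query(sql);
}

Note that multi-statement query text only works when no parameter values are passed to query(), which is exactly why the interpolation has to happen client-side.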

Or, as described in the linked issue, depending on your specific queries, you could create a temp table to hold your parameters, fill it with pg-copy-streams, and execute your updates in a single step.
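
A rough sketch of that approach, again with hypothetical table and column names, and assuming the values contain no tabs or newlines (COPY's text format would need those escaped):

const { pipeline } = require('node:stream/promises');
const { Readable } = require('node:stream');
const { from: copyFrom } = require('pg-copy-streams');

const client = await pool.connect();
try {
    await client.query('BEGIN');
    await client.query(
        'CREATE TEMP TABLE updates (id int, a text, b text, c text) ON COMMIT DROP'
    );
    // Stream every row into the temp table with a single COPY.
    const copyStream = client.query(copyFrom('COPY updates FROM STDIN'));
    const rows = largeBatchOfQueries.map(
        (q) => `${q.param4}\t${q.param1}\t${q.param2}\t${q.param3}\n`
    );
    await pipeline(Readable.from(rows), copyStream);
    // Apply the whole batch in one statement.
    await client.query(
        `UPDATE items SET a = u.a, b = u.b, c = u.c
         FROM updates u WHERE items.id = u.id`
    );
    await client.query('COMMIT');
} catch (err) {
    await client.query('ROLLBACK');
    throw err;
} finally {
    client.release();
}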

There are a few other similar solutions; for example, you could create a stored procedure that executes the individual updates and pass it all the values as a single JSONB, or as SQL arrays.
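
One related variant skips the procedure entirely and passes the values as SQL arrays to a single UPDATE. A sketch with the same hypothetical table (multi-argument unnest() is only valid in the FROM clause):

// Each parameter becomes one array; unnest() zips the arrays back into
// rows server-side, so the whole batch is one parameterized query.
await pool.query(
    `UPDATE items SET a = u.a, b = u.b, c = u.c
     FROM unnest($1::int[], $2::text[], $3::text[], $4::text[]) AS u(id, a, b, c)
     WHERE items.id = u.id`,
    [
        largeBatchOfQueries.map((q) => q.param4),
        largeBatchOfQueries.map((q) => q.param1),
        largeBatchOfQueries.map((q) => q.param2),
        largeBatchOfQueries.map((q) => q.param3),
    ]
);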