brianc / node-postgres

PostgreSQL client for node.js.
https://node-postgres.com
MIT License

Client side socket timeout (not statement_timeout) #3124

Open omerd-cyera opened 10 months ago

omerd-cyera commented 10 months ago

Hi, I'm wondering if there is an option to set a client-side query timeout for a pool. I have a network issue where the socket stays open but is no longer connected to Postgres (I'm using some sort of proxy). This can be replicated with a simple `nc -nvlp 5432`.

I am aware of statement_timeout, but that is a server-side timeout, which will not work in my case. Ideally, I would like an option like socket_read_timeout that throws an error. Thanks!

Safricloud commented 9 months ago

My Azure App Service is doing the same thing.

omerd-cyera commented 9 months ago

@Safricloud Encountered this on azure too, when using a virtual network gateway. I ended up upgrading it, and it stopped dropping the connections

c032 commented 9 months ago

@omerd-cyera

This can be replicated by a simple nc -nvlp 5432.

I can't run that command (nc: cannot use -p and -l), but assuming nc -nvl 5432 is the same as what you want, that one can be solved by the connectionTimeoutMillis option.

Reproducing a connection that hangs specifically during a call to the query method looks a bit more involved. But glancing at the code, it seems that setting query_timeout should cancel the query client-side (test).

Another useful setting to keep in mind during these situations (though it wouldn't have helped here specifically) is setting keepAlive to true.
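Putting those three settings together, a pool configuration covering the cases discussed so far might look like the sketch below. The timeout values are illustrative only, and the connection string is a placeholder; tune both to your environment:

```javascript
// Sketch: a pg Pool config combining the client-side safeguards above.
// All values here are examples, not recommendations.
const poolConfig = {
  connectionString: 'postgresql://localhost:5432/postgres', // placeholder
  connectionTimeoutMillis: 5000, // fail fast if the initial connection hangs
  query_timeout: 10000,          // client-side: reject a query with no response
  keepAlive: true,               // enable TCP keepalive probes on the socket
};

// Then pass it to the pool:
// const { Pool } = require('pg');
// const pool = new Pool(poolConfig);
console.log(poolConfig);
```

Note that `connectionTimeoutMillis` only guards connection establishment, while `query_timeout` guards each query after the connection is up.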

omerd-cyera commented 9 months ago

one can be solved by the connectionTimeoutMillis option.

You are correct. I had this issue both while creating a new connection and while waiting for a response to a query. I managed to replicate the second case with a simple Python proxy that relays the traffic. The script also waits for a key press and then stops relaying packets between the sockets, which reproduced my issue locally. Here is the script, if you want to try to replicate it too:

```python
import socket
import sys
import threading
import time
import select

PROXY_PORT = 54321
POSTGRES_PORT = 5432
POSTGRES_HOST = 'localhost'
HALT_PROXY_KEY = 'q'
HALT_PROXY_TRANSFER_KEY = 'w'


def main():
    proxy_socket = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    proxy_socket.bind(('', PROXY_PORT))
    proxy_socket.listen(10)
    print(f"Press {HALT_PROXY_KEY} to halt proxy, {HALT_PROXY_TRANSFER_KEY} to halt transfer")
    try:
        while True:
            client_socket, addr = proxy_socket.accept()
            print('Got a connection from %s' % str(addr))
            thread = threading.Thread(target=handle_request, args=(client_socket,))
            thread.start()
    except:
        proxy_socket.close()


def handle_request(client_socket):
    postgres_socket = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    postgres_socket.connect((POSTGRES_HOST, POSTGRES_PORT))
    postgres_socket.settimeout(1)
    sockets = [client_socket, postgres_socket]
    stop_transfer = False
    while True:
        if sys.stdin in select.select([sys.stdin], [], [], 0)[0]:
            key = sys.stdin.read(1)
            if key == HALT_PROXY_KEY:
                break
            elif key == HALT_PROXY_TRANSFER_KEY:
                stop_transfer = True
        try:
            read_sockets, write_sockets, error_sockets = select.select(sockets, [], [])
            for sock in read_sockets:
                if sock == client_socket:
                    data = client_socket.recv(1024)
                    if not data:
                        break
                    print('> Packet from client')
                    if not stop_transfer:
                        postgres_socket.sendall(data)
                elif sock == postgres_socket:
                    data = postgres_socket.recv(1024)
                    if not data:
                        break
                    print('< Packet from postgres')
                    if not stop_transfer:
                        client_socket.sendall(data)
        except socket.timeout:
            print('timeout')
        except socket.error:
            print('error')
            break
    client_socket.close()
    postgres_socket.close()


if __name__ == '__main__':
    main()
```

glancing at the code, it seems that setting query_timeout should cancel the query client-side (test).

Can't remember exactly now why I decided it didn't work. I might try to get back to replicating this issue later this week with the proxy.

c032 commented 9 months ago

@omerd-cyera

Yep, I started your script, connected from pg to it, started running queries in a loop, pressed w (and then Enter) in your script, and got an exception from the query method after the expected amount of time (I used 3000 ms in my test script):

client.js

```js
import pg from 'pg'

const { Client } = pg;

function sleep(ms) {
  return new Promise((resolve) => {
    setTimeout(() => {
      resolve();
    }, ms);
  });
}

async function main() {
  const client = new Client({
    connectionString: "postgresql://localhost:54321",

    // This ///////////////////////////////////////////////////////////////
    query_timeout: 3000,
    ///////////////////////////////////////////////////////////////////////
  })

  await client.connect();

  for (let i = 0; i < 60; i++) {
    const result = await client.query('select current_timestamp ct;');
    console.log(result.rows[0].ct);

    await sleep(1000);
  }

  await client.end();
}

main();
```
```
/run/user/1000/tmp/node_modules/pg/lib/client.js:526
          Error.captureStackTrace(err);
                ^

Error: Query read timeout
    at /run/user/1000/tmp/node_modules/pg/lib/client.js:526:17
    at process.processTicksAndRejections (node:internal/process/task_queues:95:5)
    at async main (file:///run/user/1000/tmp/client.js:22:21)

Node.js v20.10.0
```

Hope this helps.

omerd-cyera commented 9 months ago

Not sure where my tests went wrong earlier. I'll check it out again. Thanks for the help!

Safricloud commented 9 months ago

I was able to solve my problem by setting this in pool connection options:

idleTimeoutMillis: 60000, // 60 seconds

This threshold matches what Azure uses on their proxy to time out connections, so now my app just reconnects when needed and never assumes a connection is alive when it's not.

omerd-cyera commented 9 months ago

@Safricloud If I am not mistaken, this will only fix the problem for idle connections, not for connections that are waiting for a response from the db.

Safricloud commented 9 months ago

For active queries you can set a timeout with this pool/connection option:

query_timeout: 30000, // 30 seconds

It will throw an error if the query does not get a response within that time frame, but bear in mind that this may impact long queries that would eventually return a result. That's why I preferred to find a way to use a working connection that I could verify.

I was using this getWorkingConnection function to verify my results:

const testConnection = async (connection: PoolClient): Promise<boolean> => {
    const MAX_WAIT = 3 * 1000; // 3 seconds
    // Create a timeout promise that rejects after MAX_WAIT milliseconds
    let timer: NodeJS.Timeout;
    const timeout = new Promise((_, reject) => {
        timer = setTimeout(() => {
            reject(new Error('Connection test timed out'));
        }, MAX_WAIT);
    });
    try {
        // Race the timeout against the query execution
        await Promise.race([
            connection.query('SELECT 1'), // Execute a simple query to test the connection
            timeout
        ]);
        return true; // The connection is working
    } catch (err) {
        console.error('Connection failed', err);
        return false;
    } finally {
        clearTimeout(timer!); // Clear the timer so it does not fire after the race settles
    }
};
const getWorkingConnection = async (retryNumber = 0): Promise<PoolClient> => {
    const MAX_RETRY = pool.totalCount || 10;
    if (retryNumber >= MAX_RETRY) {
        throw new Error('Exceeded maximum attempts to get a working PostgreSQL connection');
    }
    const conn = await pool.connect(); // Get a connection from the pool
    const isConnected = await testConnection(conn); // Test it
    if (isConnected) {
        console.log('A working connection has been acquired from the pool.');
        return conn; // Return the working connection
    } else {
        conn.release(true); // Passing true (or an Error) destroys the broken connection instead of returning it to the pool
        // Try again
        return getWorkingConnection(++retryNumber);
    }
};
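The Promise.race pattern in testConnection above can be factored into a small generic helper. One caveat worth handling is clearing the timer once the race settles, so a stray timeout does not keep firing (or keep the event loop alive). A pure-JS sketch with no pg dependency; `withTimeout` is a hypothetical helper, not part of pg:

```javascript
// Sketch: race any promise against a timeout, cleaning up the timer afterwards.
// withTimeout is a hypothetical helper name, not a pg API.
function withTimeout(promise, ms, message = 'Operation timed out') {
  let timer;
  const timeout = new Promise((_, reject) => {
    timer = setTimeout(() => reject(new Error(message)), ms);
  });
  // Whichever side settles first wins; clear the timer either way.
  return Promise.race([promise, timeout]).finally(() => clearTimeout(timer));
}
```

With this helper, the connection test reduces to something like `await withTimeout(connection.query('SELECT 1'), 3000)`.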
adamwan-nexplore commented 5 months ago

virtual network gateway

Hi, I have a similar issue. I'd like to learn more about what you upgraded.