GoogleCloudPlatform / cloud-sql-nodejs-connector

A JavaScript library for connecting securely to your Cloud SQL instances
Apache License 2.0
66 stars 8 forks source link

Why does the Connector handle "connection" event with a single-call handler? #349

Closed tSte closed 2 months ago

tSte commented 3 months ago

Question

The Connector provides capability to connect via UNIX sockets. I'd like to ask if there is a reason why Connector uses .once() instead of .on() when handling new connections. Is this a bug or by design? If it is by design, why? What will happen in case of ORM connects with multiple connections in connection pool (for example setting Prisma connection_limit greater than 1)?

jackwotherspoon commented 3 months ago

Probably just an oversight.

What will happen in case of ORM connects with multiple connections in connection pool (for example setting Prisma connection_limit greater than 1)?

Were you able to give this a try? If so I'd be curious about the results and if it errors with the current approach.

We will happily investigate changing it to .on and maybe update our tests to try multiple connections.

tSte commented 3 months ago

@jackwotherspoon I did not do so with the Connector - we have our internal UNIX socket implementation for two reasons:

  1. Historical - we were using this before it was an option with the Connector.
  2. Logging - the Connector does not really provide any ability to handle errors and/or closing events of sockets and does not seem to gracefully close connections.

Correct me if I'm wrong, but it does not .unpipe() or .end() sockets, rather just plainly calls .destroy() on TLS Socket. Also it does not seem to .end() UNIX socket stream, but only .close() UNIX socket. So while it will stop the server to accept new connection, it does not gracefully clean up existing connections. I have a suspicion that non-graceful termination will emit FATAL: connection to client lost by Cloud SQL in GCP logging with severity of ALERT - we have alerting policy on this one, waking up on-call engineers during night. That said, bear in mind that I'm not exactly an expert on this subject - that's why I'm asking this question.

To answer your question though, our implementation does use .on('connection', ...) event handling and we have 2 connections in connection pool, so we often see multiple connection events in our logs.

jackwotherspoon commented 3 months ago

@tSte I will be looking into this this week and will provide an update soon

rstreefland commented 2 months ago

@jackwotherspoon I can confirm that setting the prisma connection_limit greater than 1 consistently results in a timeout whenever prisma tries to establish more than one connection. With the connection_limit set to 1, I don't see this error at all. This one is also a blocker for us unfortunately because a single connection is insufficient.

Screenshot 2024-06-12 at 17 14 23

jackwotherspoon commented 2 months ago

@rstreefland Thanks for the extra details! Currently working on the fix for this and hoping to have it wrapped up shortly, will cut a release when the fix is complete.

rstreefland commented 2 months ago

@jackwotherspoon Great! Thanks again.

jackwotherspoon commented 2 months ago

@jackwotherspoon I did not do so with the Connector - we have our internal UNIX socket implementation for two reasons:

  1. Historical - we were using this before it was an option with the Connector.
  2. Logging - the Connector does not really provide any ability to handle errors and/or closing events of sockets and does not seem to gracefully close connections.

Correct me if I'm wrong, but it does not .unpipe() or .end() sockets, rather just plainly calls .destroy() on TLS Socket. Also it does not seem to .end() UNIX socket stream, but only .close() UNIX socket. So while it will stop the server to accept new connection, it does not gracefully clean up existing connections. I have a suspicion that non-graceful termination will emit FATAL: connection to client lost by Cloud SQL in GCP logging with severity of ALERT - we have alerting policy on this one, waking up on-call engineers during night. That said, bear in mind that I'm not exactly an expert on this subject - that's why I'm asking this question.

To answer your question though, our implementation does use .on('connection', ...) event handling and we have 2 connections in connection pool, so we often see multiple connection events in our logs.

@tSte I am going to merge the change to .on for establishing connections. I'd be curious to hear more about the .destroy vs .end as it may be something for me to look into as well.

Maybe you can try in a dev instance and see once the change is rolled out if your suspicions are correct about getting a FATAL: connection to client lost by Cloud SQL in GCP logging with severity of ALERT. I was under the assumption that .end only half closes the socket. We may need to add a bit of a logic to have more graceful exits than just .destroy

Happy to discuss more.

jackwotherspoon commented 2 months ago

@rstreefland I have a cut a release (v1.3.2) with the change: https://www.npmjs.com/package/@google-cloud/cloud-sql-connector

tSte commented 2 months ago

@jackwotherspoon IIUC, the .end() will send the FIN packet (graceful connection termination) so I'd go for handling connections in following "fashion" (I'll just write some code out of my head, did not test this!):

const disposeSockets = (proxySocket: Socket, connectorSocket: TLSSocket) => {
  // Unpipe streams first
  connectorSocket.unpipe(proxySocket)
  proxySocket.unpipe(connectorSocket)

  // Gracefully close connections
  [proxySocket, connectorSocket].forEach((s) => {
    s.end(() => s.destroy())
  })
}

and then something like

const openSockets = new Set()
unixProxy.on('connection', (proxySocket) => {
  const connectorSocket = stream() // from Connector.getOptions()
  [proxySocket, connectorSocket].forEach((socket) => {
    socket.once('close', (hadError) => {
      // Maybe you'd like to handle hadError = true in some way?
      disposeSockets(proxySocket, connectorSocket)
    })
    socket.once('error', () => {
      disposeSockets(proxySocket, connectorSocket)
    })
  })

  proxySocket.pipe(connectorSocket)
  connectorSocket.pipe(proxySocket)

  openSockets.add({ proxySocket, connectorSocket })
})

unixProxy.once('error', () => {
  // depending on error, but in general
  openSockets.forEach(({ proxySocket, connectorSocket }) => disposeSockets( proxySocket, connectorSocket))
  unixPoroxy.close()
  unlink(socketPath)
})

unixProxy.once('close', () => sqlConnector.close())

I hope it helps.