Closed mjl- closed 2 years ago
We are facing a similar issue: when the HTTP server cancels the context because the user cancelled the request, driver: bad connection is returned. In the end this seems to make the sql package reconnect over and over again, until for some reason the connection pool is exhausted (the exhaustion is most likely caused by a different part of the code, maybe even the stdlib or our own).
FWIW, I haven't seen unexpected connection pool exhaustion or problematic reconnects.
The workaround I put in my code is to check for database/sql/driver.ErrBadConn explicitly and treat it similarly to how I treat context.Canceled errors: marking it as a user error, not a server error. That saves me from getting alerted.
Was this issue solved in release 1.10.4 via https://github.com/lib/pq/pull/1064?
Should be!
@otan @mjl- We still have this issue.
I remember from last looking at the code thinking that lib/pq may not be adhering fully to the requirements set by database/sql about when it returns ErrBadConn. You could investigate in that direction...
I have been seeing unexpected "driver: bad connection" (driver.ErrBadConn) errors in my logging. In my case, these are returned by calls to database/sql's DB.QueryRowContext.Scan (and likely also on Tx's) when their context is cancelled. Cancelling contexts happens for me in practice because I pass contexts from http.Request's on to lib/pq. I believe DB.QueryRowContext.Scan should not be made to return driver.ErrBadConn when contexts are cancelled.
I'll add a small reproducer after I've created the issue.
It looks like the following ordered events take place in lib/pq:

1. conn.QueryContext, conn.query() returns successfully.
2. conn.watchCancel() (scheduled in conn.QueryContext) handles the cancellation, calling conn.setBad() to mark the connection.
3. rows.Next (for Scan after DB.QueryRowContext) checks the connection with conn.getBad() and returns a driver.ErrBadConn.

I would expect the Scan on the sql Row to return context.Canceled in this case.

Perhaps lib/pq's connections could start to keep track not just of whether connections are bad, but also of the error to return to future calls on the connection. For expired contexts, the errors would become e.g. context.Canceled or context.DeadlineExceeded.