emilioplatzer opened 9 years ago
Can I cancel a query inside query.on('row', ...)? I want to do something like:

query.on('row', function (row, result) {
  if (row.field1 == 'ok') {
    result.addRow(row);
  } else {
    pg.cancel(client.connectionParameters, client, query);
  }
});

The need is to not continue fetching rows if I detect an error condition.
1) Is this the way? 2) Why do I need to pass client.connectionParameters?
https://github.com/brianc/node-postgres/wiki/pg#cancelconfig-client-query
In #753 there is an example with client.cancel(query), but cancel isn't in the documentation.
The cancel code for the native client seems to be completely different from the non-native client's. The native client simply cancels the query if it is the active one (and removes it from the queue if not). The non-native code does some weird stuff with making a connection just to cancel the connection... See also #753
When the PostgreSQL back-end is handling a query on a client, it doesn't process any other incoming messages from the socket, so you can't actually cancel a query from the client that's running it. Hence having to pass in connection parameters & the client you'd like to cancel.
1) The API for canceling a query here in node-postgres is still very weird though, I agree. It should probably be made easier to reason about...instead of auto-connecting, the best thing is probably to have something like this...
var badClient = new Client()
badClient.connect()
badClient.query('SELECT pg_sleep(10000000000)')
var goodClient = new Client()
goodClient.cancel(badClient)
Something like that - I'll need to check the client/server protocol documentation to see the details - I know you need the client id of the client you'd like to cancel. Anyways...that might be a bit more explicit and a bit less magical - still cumbersome having to keep around a reference to the old client...but that's how it goes.
2) The native driver doesn't need this because libpq internally does a bunch of this management for you. I just call PQcancel, which internally likely opens a new client & sends a cancel request. Not 100% sure there.
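For reference, at the protocol level a cancel request is its own short-lived connection: the client opens a new socket, writes a single 16-byte CancelRequest message carrying the target backend's process ID and secret key (both delivered in the BackendKeyData message at startup), and the server closes the connection without replying. Roughly what that looks like with Node's net module, as an illustrative sketch only (assumes a non-TLS connection; the function name is made up):

const net = require('net')

function sendCancelRequest(host, port, processID, secretKey) {
  const buf = Buffer.alloc(16)
  buf.writeInt32BE(16, 0)        // message length, including itself
  buf.writeInt32BE(80877102, 4)  // CancelRequest code: 1234 << 16 | 5678
  buf.writeInt32BE(processID, 8)
  buf.writeInt32BE(secretKey, 12)

  const socket = net.connect(port, host, () => {
    socket.end(buf)              // the server processes the cancel and hangs up; no reply
  })
  socket.on('error', () => { /* cancellation is best-effort */ })
}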
You should be able to cancel queries using pg_cancel_backend(). I think the pattern looks like this:
const pg = require('pg')
const pool = new pg.Pool()

/* Check out a client */
pool.connect()
  .then((c) => {
    /* Get the backend PID for this client */
    c.query('SELECT pg_backend_pid()')
      .then((result) => {
        const pid = result.rows[0].pg_backend_pid
        const badResult = c.query('SELECT pg_sleep(10000000000)')
        /* Kill this after 1 second */
        setTimeout(() => {
          /* NOTE this query is executed by the pool, not the busy client */
          const cancelResult = pool.query('SELECT pg_cancel_backend($1)', [ pid ])
        }, 1000)
      })
  })
Is it OK to use the existing APIs? I'm considering adding type definitions to https://github.com/Microsoft/DefinitelyTyped and would like to confirm the intention before doing that.
interface ClientBase {
  cancel(client: ClientBase, query: Query): void;
}

interface NativeClientBase {
  cancel(query: Query): void;
}

interface Connection {
  cancel(processId: number, secretKey: number): void;
}
@akihikodaki The existing APIs don’t actually work right now. See #1392.
@charmander Thanks for your reply. But I have now realized that ClientBase.cancel works, and NativeClientBase.cancel may also work. I wonder if those low-level interfaces can be used for now.
I'm also looking for a way to cancel a statement; is there any way in 2019? Is the Client.cancel method supported? If so, there seem to be no types for it.
@pankleks I have shared what such an implementation would look like here:
https://github.com/gajus/slonik/issues/38
There isn't one readily available in node-postgres, though.
Is there any way in 2020? 😄 😭
heh sorry this isn't available exactly right now - I'll add it to the post-8.0 release roadmap and get to it as soon as I can. I have support for it at the protocol level, just don't have it exposed right now. It's a bit of a cumbersome operation in that you need to connect with a separate client to cancel the query from another one, so you need a few things:
1) the unique id of the client running the query you want to cancel (this is available)
2) the ability to connect a new client to your backend
3) handling all the errors with connecting the new client, issuing the cancel call, and disconnecting the new client
There are some implications on what this means for strictly controlled pool sizes as well...e.g. if you only ever want n number of clients under all circumstances and your pool is exhausted, what do you do? That could be left up to the consumer though....I'm thinking the API would probably need to look something like this:
const client = await pool.connect()
client.query('SELECT pg_sleep(1000000)')
const otherClient = await pool.connect()
otherClient.cancel(client)
Outstanding question: does the existing client running the query reject with an error? I'd assume so - so there'd be some tricky error handling in that interaction. Open to feedback here.
if you only ever want n number of clients under all circumstances and your pool is exhausted, what do you do?
I think this should be left up to the consumer for now; that would be simplest. That said, what would happen if you closed client in your example before requesting another client from the pool?
Outstanding question: does the existing client running the query reject with an error? I'd assume so - so there'd be some tricky error handling in that interaction. Open to feedback here.
Postgres should respond with a query error to that client, and I would assume all the existing error handling would catch that as normal. What's the tricky interaction?
https://www.postgresql.org/docs/9.3/protocol-flow.html#AEN100004
The way libpq does it is they just open a new connection w/ the cancel request. I'll noodle on this a bit after I get 8.0 out the door & get it working. Perhaps it does just look like client.cancel(), as the operation is a bit silent/mysterious when libpq does it.
Postgres should respond with a query error to that client, and I would assume all the existing error handling would catch that as normal. What's the tricky interaction?
yeah you're right, the existing query will reject, the cancel call never issues a response, and the client which issued it is closed immediately by postgres if I'm reading things right so...yeah I'll investigate a bit.
const client = await pool.connect()
client.query('SELECT pg_sleep(1000000)')
const otherClient = await pool.connect()
otherClient.cancel(client)
About this API - does the client have a clientId property? If so, this is only one line away from the solution available now:
const client = await pool.connect()
client.query('SELECT pg_sleep(1000000)')
const otherClient = await pool.connect()
otherClient.query('SELECT pg_cancel_backend($1)', [ client.clientId ])
Just making the client ID available as a property of the client might be enough to satisfy the users without introducing too much complexity.
Here's a POC
Just making the client ID available as a property of the client might be enough to satisfy the users without introducing too much complexity.
yeah the clientId is available as a property currently
Thoughts on:
pool.cancel(client)
and
pg.cancel(clientOptions, client)
// or, given what pg.Client currently represents,
new Client(clientOptions).cancel(client)
? The pool version is convenient, but a bit misleading since it wouldn’t be able to use a pooled client. (Unless it can and that option is documented. I still haven’t checked, haha. See comments in #1392.)
? The pool version is convenient, but a bit misleading since it wouldn’t be able to use a pooled client.
yeah you're right - my initial API ideas I threw out dealing w/ existing clients are actually not valid, as doing a cancel requires 1) a new connection 2) the connection be immediately closed after the cancel is received. I think the implementation I did in #2084 is mostly the right way. I didn't concern myself w/ "query queue" stuff as that's an anti-pattern I want to deprecate and it will confuse things when it comes to proper query pipelining, which is coming up.
Awesome to see movement on this!
My use case is that when an HTTP request is cancelled (socket closed), I want to cancel all data loading that was kicked off. Every request takes its own client from the pool (or uses pool.query()). Now a query may have finished already or may need to be cancelled. From that perspective, it would be nice if the cancel API is idempotent (a no-op if the query already returned). And it would need to be ensured that it is not possible to accidentally cancel an in-progress query on the client because it was already returned to the pool and is used by a new request.
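For what it's worth, one workaround possible with the current API is to record the backend PID when the client is checked out and issue pg_cancel_backend through the pool if the request goes away first. A rough sketch, assuming an AbortSignal wired to the HTTP request however you prefer (the queryWithCancel helper and the guard flag are mine, not part of pg):

const { Pool } = require('pg')
const pool = new Pool()

// Runs `text` on a dedicated client and asks the server to cancel it if
// `signal` aborts first. Cancelling an already-idle backend cancels nothing,
// and the `settled` guard makes sure a late abort can never cancel a query
// that a later request is running on the same (recycled) pooled client.
async function queryWithCancel(signal, text, values) {
  const client = await pool.connect()
  let settled = false
  try {
    const { rows } = await client.query('SELECT pg_backend_pid() AS pid')
    const pid = rows[0].pid

    const onAbort = () => {
      if (!settled) {
        // Issued through the pool, since the busy client can't be used for it.
        pool.query('SELECT pg_cancel_backend($1)', [pid]).catch(() => {})
      }
    }
    signal.addEventListener('abort', onAbort, { once: true })

    try {
      return await client.query(text, values)
    } finally {
      settled = true
      signal.removeEventListener('abort', onAbort)
    }
  } finally {
    client.release()
  }
}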
Outstanding question: does the existing client running the query reject with an error? I'd assume so - so there'd be some tricky error handling in that interaction. Open to feedback here.
It would have to throw/reject with an error in the Promise-returning query API. The Promise must be settled so that finally blocks are run and memory isn't leaked, and just returning/fulfilling with undefined would break the interface contract and have calling code continue as usual, expecting a result. This should be an error with a stable code, e.g. fetch() throws an error with err.name === 'AbortError' if a request was aborted. That way it can be handled specifically (e.g. making sure it's not accidentally caught in catch blocks).
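For context, PostgreSQL already reports a cancelled statement with SQLSTATE 57014 (query_canceled), and node-postgres exposes the SQLSTATE on err.code, so callers can single cancellation out much like an AbortError today. A small sketch:

const { Client } = require('pg')

async function run() {
  const client = new Client()
  await client.connect()
  try {
    await client.query('SELECT pg_sleep(60)')
  } catch (err) {
    // 57014 = query_canceled: the statement was cancelled (e.g. via
    // pg_cancel_backend), so handle it separately from other errors.
    if (err.code === '57014') {
      console.log('query was cancelled')
    } else {
      throw err
    }
  } finally {
    await client.end()
  }
}

run()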
The approach proposed in #2084 has a flaw in that if you run out of connection slots, you won't be able to cancel the query. I have experimented a bit with this here: https://github.com/gajus/slonik-utilities/issues/4. I ran into a similar problem when I had a low maxConnection count and parallelised cancellable queries.
I'd say that for this to be implemented safely, you should start the governing connection before starting the cancellable query. This way the query will always be cancellable safely. Connections are expensive, though.
start the governing connection before starting the cancellable query
This is VERY expensive, especially in a scenario where resources are limited.
A better approach would be to have a reserved pool for cancelling connections:
// My platform has allocated me 20 connections
const pool = new Pool({max: 19})
const adminPool = new Pool({max: 1})
const client = await pool.connect()
client.query('SELECT pg_sleep(1000000)')
const otherClient = await adminPool.connect()
otherClient.cancel(client)
// - or, preferably: -
await adminPool.cancel(client) // like Pool.query
Whether you want the user to instantiate the admin pool, or allow them to specify the number of reserved connections, I think either is fine. I expect the latter would be quite a significant refactor, to have Pool manage two logical pools.
It seems like the logic of cancel can be pretty complex. The backend can be in essentially 3 states:
- cancelled - which means either the backend pid is already killed, or select state from pg_stat_activity where pid = $1 is idle
- active - which means select state from pg_stat_activity where pid = $1 is active or fastpath
- waiting - which means select state from pg_stat_activity where pid = $1 is idle in transaction or idle in transaction (aborted)

Each of these states needs different treatment:
- cancelled - return immediately
- active - need to select pg_cancel_backend($1) with that pid, which can transition to either waiting or cancelled
- waiting - need to fetch the original client by pid and run client.query('rollback') on it; this way it can transition to cancelled

More than that, both active and waiting need to re-run the logic after some timeout in case it did not work. And if the state did not change, run select pg_terminate_backend($1)
Example pseudocode https://gist.github.com/pkit/58887bb0c5c13c8edacc5fd0cb05c53b
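For illustration, a rough sketch of that check-and-escalate logic against pg_stat_activity (my own approximation, not the gist above; the helper name and retry timeout are made up):

// Cancels the backend `pid`, escalating per the states described above.
// `pool` is a pg.Pool with at least one free connection for the admin queries.
async function cancelBackend(pool, pid, { retryAfterMs = 5000 } = {}) {
  const state = async () => {
    const { rows } = await pool.query(
      'SELECT state FROM pg_stat_activity WHERE pid = $1', [pid])
    return rows.length === 0 ? null : rows[0].state   // null: backend is gone
  }

  const current = await state()
  if (current === null || current === 'idle') return   // already "cancelled"

  if (current === 'active' || current === 'fastpath function call') {
    await pool.query('SELECT pg_cancel_backend($1)', [pid])
  } else {
    // idle in transaction (aborted or not): cancelling does nothing useful;
    // the transaction has to be rolled back on its original connection
    // instead, which requires access to the original client (omitted here).
  }

  // Re-check after a timeout and escalate to pg_terminate_backend if needed.
  await new Promise((resolve) => setTimeout(resolve, retryAfterMs))
  const after = await state()
  if (after !== null && after !== 'idle') {
    await pool.query('SELECT pg_terminate_backend($1)', [pid])
  }
}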
@pkit I'm not sure exactly what the approach you're describing is and isn't supposed to handle, but I don't think it's the job of the driver to do anything other than send a cancellation message (the equivalent of pg_cancel_backend(pid), but with a key) either way.
@charmander why?
@charmander why?
Because it is an opinionated implementation.
Use any one of the existing higher-level abstractions.