Testing revealed that pdb streaming queries definitely do continue running for a while after client disconnection. Long after the client disconnected, jetty would continue to call the query-eng/generated-stream read to stream the encoded json bytes which in turn keep the database query running. The current assumption is that the reads are caused by buffering before the next network write. A stack trace from the read reveals that there's a gzip handler, but since (at least from a cursory investigation) that handler appears to rely on a Deflate instance with a 512 byte buffer (fixed?), it's probably not the key reason for the continuing reads.
In any case, after further investigation, we didn't find a straightforward way to detect a client disconnection in the ring or tk-jetty9 world, but since the jetty/servlet response handler is given a ServletResponse object, which for jetty is an org.eclipse.jetty.server.Response, I wondered if that might provide access to the client connection. It does.
Jetty (9 at least) handles the connection via a chain of "interceptors" attached to the HttpOutput, which is itself attached to the Response. For a trivial jetty app there's just one interceptor, while pdb includes others (the gzip "layer" at least). The jetty HttpOutput Interceptor comments say that the last link in the chain will be an HttpChannel, and for pdb, that's currently an HttpChannelOverHttp instance, and that allows direct access to the jdk SocketChannel connected to the client.
Fortunately the client channel is also in non-blocking mode, and so even though the client isn't (and shouldn't be) sending any data, an attempt to read from the client channel will return 0 as long as the client is connected and -1 after it disconnects (verified via curl against both the simple test app and pdb itself). It's then straightforward to add a check in pdb's query-eng/body-stream's stream-rows that will cause pdb to abandon the query whenever the read indicates that the client is gone.
This approach relies on some assumptions:
The jetty transport is non-blocking.
The read from the client always returns -1 after a disconnect.
By the time pdb is streaming, it's acceptable to attempt to read bytes from the client, i.e. either the client won't be sending any bytes, or it's OK to discard them.
It would also require a change to trapperkeeper-webserver-jetty9, because ring handlers (both in ring itself and in tk-jetty9) don't provide the response instance to the handler, just the request map. One option would be to add an :include-response? option to the add-ring-handler service method. When true the handler call would become (handle request-map response) instead of (handle request-map).
Testing revealed that pdb streaming queries definitely do continue running for a while after client disconnection. Long after the client disconnected, jetty would continue to call the query-eng/generated-stream read to stream the encoded json bytes which in turn keep the database query running. The current assumption is that the reads are caused by buffering before the next network write. A stack trace from the read reveals that there's a gzip handler, but since (at least from a cursory investigation) that handler appears to rely on a Deflate instance with a 512 byte buffer (fixed?), it's probably not the key reason for the continuing reads.
In any case, after further investigation, we didn't find a straightforward way to detect a client disconnection in the ring or tk-jetty9 world, but since the jetty/servlet response handler is given a ServletResponse object, which for jetty is an org.eclipse.jetty.server.Response, I wondered if that might provide access to the client connection. It does.
Jetty (9 at least) handles the connection via a chain of "interceptors" attached to the HttpOutput, which is itself attached to the Response. For a trivial jetty app there's just one interceptor, while pdb includes others (the gzip "layer" at least). The jetty HttpOutput Interceptor comments say that the last link in the chain will be an HttpChannel, and for pdb, that's currently an HttpChannelOverHttp instance, and that allows direct access to the jdk SocketChannel connected to the client.
Fortunately the client channel is also in non-blocking mode, and so even though the client isn't (and shouldn't be) sending any data, an attempt to read from the client channel will return 0 as long as the client is connected and -1 after it disconnects (verified via curl against both the simple test app and pdb itself). It's then straightforward to add a check in pdb's query-eng/body-stream's stream-rows that will cause pdb to abandon the query whenever the read indicates that the client is gone.
This approach relies on some assumptions:
It would also require a change to trapperkeeper-webserver-jetty9, because ring handlers (both in ring itself and in tk-jetty9) don't provide the response instance to the handler, just the request map. One option would be to add an :include-response? option to the add-ring-handler service method. When true the handler call would become (handle request-map response) instead of (handle request-map).