Closed DaveRandom closed 8 years ago
We have implemented in application level
function execAsync($query)
{
if($this->busy)
{
$this->getResult();
}
parent::execAsync($query);
}
But perhaps there is a better solution.
@ParkFramework Actually, execAsync()
currently blocks until the query has been completely sent to the server. It's rare that this is an issue, but it can happen.
Hm, IIRC even the docs say it's merely only an issue with COPY (which is understandable), and our pq\COPY
doesn't do async yet.
It's possible that PQflush() could be folded sanely into Connection#poll() but I'm not sure (haven't fully thought it out).
I think explicit pq\Connection::$nonblocking
and pq\Connection::flush()
are preferable anyway.
Either way, Various Async methods will need to suddenly have a return value where they previously didn't have one, indicating whether the consumer will need to watch the socket for writability and re-flush.
Not necessarily (-: They'd produce an exception currently. And, to be fair, this interface seems pretty meh even on libpq level, quoting docs:
1 is returned if the command was successfully dispatched and 0 if not (in which case, use PQerrorMessage to get more information about the failure).
But PQerrorMessage()
, well, returns a message; you see what I'm up to? :-/
Indeed the cases where this happens will be considerably less common, however there are some places where I can see it being worth having:
I mostly raise this as a userland API discussion - it's not super urgent but it needs thinking about and I am struggling with clarity of thought on it :-)
Oh yes, these are all actually really good reasons as far as I can see.
What do you think about explicitly providing pq\Connection::$nonblocking
and pq\Connection::flush()
?
... and, what about the return value/exception/error message stuff?
Having an explicit Connection#flush()
does make sense and is required I think, we can't magically determine what would need to be done if it were folded into poll()
(see code below). I was struggling with this a little because of the existence of POLLING_WRITING
/POLLING_READING
, which don't actually ever seem to be returned when the underlying poll mechanism is PQconsumeInput()
? The function itself basically returns a bool.
Docs say this about PQflush()
:
Returns 0 if successful (or if the send queue is empty), -1 if it failed for some reason, or 1 if it was unable to send all the data in the send queue yet [...] After sending any command or data on a nonblocking connection, call
PQflush
. If it returns 1, wait for the socket to become read- or write-ready. If it becomes write-ready, callPQflush
again. If it becomes read-ready, callPQconsumeInput
, then callPQflush
again. Repeat untilPQflush
returns 0. (It is necessary to check for read-ready and drain the input withPQconsumeInput
, because the server can block trying to send us data, e.g. NOTICE messages, and won't read our data until we read its.) OncePQflush
returns 0, wait for the socket to be read-ready and then read the response as described above.
Based on that, I think the basic C pseudo program flow goes something like this (it's a little unclear):
if (!PQsendQuery(...)) {
int flushRes = PQflush(...);
while (flushRes) {
if (flushRes == -1) {
/* an actual error occurred, get info about it from PQerrorMessage() and explode */
}
watch_socket_for_read_or_write_ready();
if (socket_is_readable()) {
PQconsumeInput(...);
}
flushRes = PQflush(...);
}
}
Now, obviously in this scenario the wait for the socket to become read- or write-ready
part is done by the application, so all the extension should do in non-blocking mode is call PQflush()
when PQsendQuery()
"fails" and base the throwing logic on that result instead of the initial failure. Connection#flush()
will have a bool return value (to correspond with 1/0 from PQflush()
) and do this underneath:
PHP_METHOD(Connection, flush)
{
int res = PQflush(...);
if (res == -1) {
/* throw with PQerrorMessage() */
}
RETURN_BOOL(!res);
}
Dealing with handling read-readiness by calling poll()
will again be left to the application.
The existing Connection#*Async()
get a bool return value as well, which is generated like so:
if (!PQsendQuery(...)) {
if (!nonBlockingMode) {
/* throw with PQerrorMessage() */
}
int res = PQflush(...);
if (res == -1) {
/* throw with PQerrorMessage() */
}
RETURN_BOOL(!res);
}
RETURN_TRUE;
In blocking mode these functions will always either return true or throw.
Might want to flip res
for the bool return, such that Connection#flush()
returns true
when the query has been completely sent to the server. It could return int that corresponds to a constant, but since it's only ever going to be binary (we throw on error) I don't see the point, personally.
Thoughts?
I was struggling with this a little because of the existence of POLLING_WRITING/POLLING_READING, which don't actually ever seem to be returned when the underlying poll mechanism is PQconsumeInput()?
Oh, yes -- see https://github.com/m6w6/ext-pq/blob/master/src/php_pqconn.c#L1038 -- sorry for this tongue-in-cheek (-:
so all the extension should do in non-blocking mode is call PQflush() when PQsendQuery() "fails" and base the throwing logic on that result instead of the initial failure.
Not sure, if we even have to do that.
Consider (basically, nonpretty):
$c->nonblocking = true;
try {
$c->execAsync("");
} catch(pq\Exception $e) {
if ($e->getMessage() != "not sure yet, this part of libpq seems to suck") {
throw $e;
}
do {
$r=$w=[$c->socket];
if (stream_select($r,$w,$e)) {
if ($r) {
$c->poll();
}
if ($w) {
$flushed = $c->flush();
}
} while (!$flushed || $c->busy);
}
@m6w6 I updated my comment above while you were writing that as I realised I forgot to show what I meant by a return value for *Async() and how the error state is handled in my head.
For the record I'm not a fan of using an exception to detect incomplete writes in userland, for a few reasons but chiefly:
execAsync()
may not be the last thing I wanted to do in a try
block.I basically aggree on the exception and the try/catch madness, but AFAICS the docs say one has to consult PQerrorMessage() to see if a flush is needed?
Ohhhh... looking at libpq source, it already does one flush() in PQsendQuery()
The docs are clear as mud in this regard. However, the possible -1
return value for PQflush()
, along with this unequivocal statement:
After sending any command or data on a nonblocking connection, call
PQflush
...says to me that in non-blocking mode, the return value of PQsendQuery()
no longer has the same meaning (it will, after all, always return an "error" in this mode), and that it is the -1
return state of PQflush()
which should be used to detect actual errors.
Nope, PQsendQuery actually returns 1 if the connection is nonblocking and there's still data to flush, no error.
So... remove the try/catch from by POC above, and you're set.
Bearing in mind that 0
is PQsendQuery()
's error response, this is still going to result in an exception (at the moment).
The key issue is that the result of PQsendQuery()
-based functions becomes tri-state when you introduce non-blocking.
At the moment it's binary - an exception occurs or it doesn't - but in order to have this two-state success and an error state, it's going to need a return value or some other way to get that state out of the call.
Unless I'm missing something?
Sorry, edited my response (I should refrain from that...).
See
The key issue is that the result of PQsendQuery()-based functions becomes tri-state when you introduce non-blocking.
Reading up in the sources clarifies that it doesn't, but the docs totally confused me, too.
Stripping try/catch from my example:
$c->nonblocking = true;
$c->execAsync("");
do {
$r=$w=[$c->socket];
if (stream_select($r,$w,$e)) {
if ($r) {
$c->poll();
}
if ($w) {
$flushed = $c->flush();
}
} while (!$flushed || $c->busy);
This will work but it's not ideal. execAsync()
is still having to determine whether the 0
return value from PQsendQuery()
is an actual error or it just needs more flushes - either do a string comparison (eww) or call PQflush()
again. That's fine, we can deal with that.
However...
An application is forced to registered both read and write watchers on the socket, where the write watcher may not be necessary - in most cases execAsync()
will send the query fully, first time. If the next packet received from the server is ACK/PSH
(or OS buffers cause payload data to be mixed in with the ACK
that makes the socket writable again). An event loop implementation is not guaranteed to fire these watchers in any particular order, and if the write watcher fires first then the application could end up calling flush()
before calling poll()
, which will waste cycles in the app logic (flush()
will not do anything in this case because the input hasn't been consumed yet).
I do understand the desire to avoid a BC breakage by adding a return value, however in this particular case the benefits of doing so outweigh the breakage - no existing code should be using the return value for anything as it has a void
signature.
$c->nonblocking = true;
$flushed = $c->execAsync("");
while (!$flushed) {
// with a return value, this while block can be skipped
// and we never have to worry about writability
$r = $w = [$c->socket];
$e = null;
stream_select($r, $w, $e);
if ($r) {
$c->poll();
}
$flushed = $c->flush();
}
echo "Query sent to server\n";
while ($c->busy) {
$r = [$c->socket];
$e = $w = null;
stream_select($r, $w, $e);
$c->poll();
}
echo "Have results\n";
This is really a matter of efficiency, something that async people worry about a lot.
We cannot detect if PQsendQuery() sent the whole query to the server from its return value.
What I tried to say is that it will still return success even if not the whole output was already flushed when the connection is set nonblocking.
If you follow the links I posted above you can read up on the flow of libpq:
Did not see links until now, reading
Yeah, sorry. That's the confusion that's generated by posting multiple comments and possibly also editing them... That's where a chat comes in handy. Or patience when writing posts. Sorry again.
able to SO chat?
For clarity to future readers of this thread:
http://chat.stackoverflow.com/transcript/message/30587190#30587190
TL;DR
Connection#flush()
directly after calling execAsync()
to get that information.Connection::flush(): bool
will return true
when PQflush()
returns 0
, false
when PQflush()
returns 1
and throw when PQflush()
returns -1
.bool Connection::$nonblocking
will be a wrapper around PQsetnonblocking()
/PQisnonblocking()
.When it's done, please write examples, best practices nonblocking mode pqsocket. Thank.
We'll of course update the documentation!
Thank you.
Duh, it's been a while, but docs have been published: https://mdref.m6w6.name/pq/Connection/flush
http://www.postgresql.org/docs/9.5/static/libpq-async.html#LIBPQ-PQSETNONBLOCKING
The mechanics of this are relatively trivial, but exposing it to userland is less so.
It's possible that
PQflush()
could be folded sanely intoConnection#poll()
but I'm not sure (haven't fully thought it out). Either way, VariousAsync
methods will need to suddenly have a return value where they previously didn't have one, indicating whether the consumer will need to watch the socket for writability and re-flush. This doesn't need to be a BC-break - non-blocking defaults to off so behaviour will not change and the return value can be safely ignored.This has been supported by libpq for some time (8.4 is as far back as I bothered to look) so it won't introduce a new dependency on a higher version or a new layer of
#ifdef
hell.