Closed amotl closed 3 years ago
What @emilsedgh figured out and reported at https://github.com/brianc/node-postgres/issues/1872#issuecomment-720065853 about pg-protocol
's Parser.parse()
method sounds sensible.
However, I am not any way into any of the details which would permit me to have any judgement on this. Specifically if that potential flaw only reveals a problem with respect to error handling or if it also might be responsible for the statements being submitted out of order. Just FYI.
Dear Brian,
thank you very much for your prompt response and your investigations at https://github.com/brianc/node-postgres/issues/2455#issuecomment-769345077 which revealed a problem to us, where CrateDB was not responding correctly to PostgreSQL wire protocol requests. Sorry for the noise.
@mfussenegger (also thanks a stack!) added a mitigation on behalf of https://github.com/crate/crate/pull/10979:
If the
flush
triggers the deferred executions, then a follow upsync()
returned a completed future. This led thepg-write
layer to immediately send aReadyForQuery
message to the client, even if the operations triggered by the flush weren't finished.This patch improves the logic so that
flush
sets theactiveExecution
property, which will then be used bysync
. This ensures that theReadyForQuery
message is sent after the operations have completed.
I have been able to confirm that the repros now work flawlessly using the most recent nightly release of CrateDB. The fix will be part of the upcoming GA release 4.4.1.
With kind regards, Andreas.
So, what @emilsedgh figured out and reported at https://github.com/brianc/node-postgres/issues/1872#issuecomment-720065853 about pg-protocol
's Parser.parse()
method, is probably a different kind of story and might be followed up independently. Sorry Emil!
Hi Brian,
coming from #2454,
check-query-order-node-postgres.js
is a repro in order to trigger an async/timing issue withnode-postgres
. It eventually reveals a problem with in-order statement execution when using parameterized statements on the pure-JS driver variant.The documentation how to run it is inlined into the program (
check-query-order-node-postgres.js --parameterized
). As I've mentioned within #2454, the problem does not happen when usingpg-native
which you can conveniently enable by using the--native
command line option.From running it, you might be able to see that - while the SQL statements are submitted in order and the
await
ing of statements is hopefully correctly implemented - asking for data which has just been inserted into the database will yield an empty result. From inquiring thesys.jobs_log
table afterwards, we can see that the statements are not executed in order. The right order should obviously beDROP/CREATE
,INSERT
,REFRESH
,SELECT
.When executing the repro just once, we have only been able to reproduce the issue on a freshly started instance of CrateDB. That's probably because it initially responds a tick slower than after being warmed up. All subsequent invocations succeeded, that's why I first thought it would be a regression on CrateDB. However, the issue can be revealed again when running
check-query-order-node-postgres.js
in a loop by using thewatch
utility as outlined within the inline documentation.To be absolutely safe, I accompanied the repro with a respective example program using Python/asyncpg
check-query-order-python-asyncpg.py
, which does not reveal any flaw.Below, I've shared details of the tracebacks. The issue has already been discovered by @Page- and reported through #1872, @johan13, @laurisvan and @emilsedgh also have been able to confirm this to happen occasionally on their environments.
Thanks already for looking into this!
With kind regards, Andreas.
cc @sroze, @robobr, @charmander, @ceejbot
Besides the statements not being in order and receiving an empty result (see stdout output above), this exception also happens occasionally.
```javascript /path/to/node_modules/pg/lib/client.js:360 this.activeQuery.handleCommandComplete(msg, this.connection) ^ TypeError: Cannot read property 'handleCommandComplete' of null at Client._handleCommandComplete (/path/to/node_modules/pg/lib/client.js:360:22) at Connection.emit (events.js:315:20) at /path/to/node_modules/pg/lib/connection.js:115:12 at Parser.parse (/path/to/node_modules/pg-protocol/dist/parser.js:40:17) at Socket.TypeError: Cannot read property 'handleCommandComplete' of null
Occasionally, the repro will yield the following traceback. While it does not originate from within
node-postgres
but rather from our repro program, it is also coming from a not-in-order execution of the queries as it clearly indicates that thesys.jobs_log
table has either been queried before any other statements could have been recorded or, alternatively, the response handling ofnode-postgres
has a flaw.
```javascript Trace: Error: sys.jobs_log has no content at /path/to/crate-qa/tests/client_tests/node-postgres/check-query-order-node-postgres.js:71:13 at processTicksAndRejections (internal/process/task_queues.js:93:5) at async workload (/path/to/crate-qa/tests/client_tests/node-postgres/check-query-order-node-postgres.js:69:3) at /path/to/crate-qa/tests/client_tests/node-postgres/check-query-order-node-postgres.js:111:13 at processTicksAndRejections (internal/process/task_queues.js:93:5) ```Error: sys.jobs_log has no content