Closed wangxiaoying closed 1 year ago
Some questions:
- Why does this behavior only need to be changed for copy_out?
That's a good point. I do think modify encode directly should work and could unblock other methods that also wait for BindComplete
like query_raw
. But I only tested on copy_out
currently in my own project. Do you think I can update execute
directly?
- Do other clients (libpq, jdbc-postgres, etc) include this extra flush?
I don't have experience in those clients. But I saw in postgres document that:
A Flush must be sent after any extended-query command except Sync, if the frontend wishes to examine the results of that command before issuing more commands.
I think it matches in our case and matters to the the example scenario above.
I'd prefer to match the flushing behavior of existing clients, whatever that is. I'm not convinced that the overhead of splitting the results of every query across a flush balances against the benefit of this specific use case.
Context
Here is an example scenario:
It works great when the query is simple (e.g., select * from a table). However, when it gets complex (like this tpc-h query). Getting the reader will block for several seconds.
The reason for this is that currently copy_out needs to wait for receiving
BindComplete
andCopyOutResponse
message to return. These two messages are indeed extremely fast to generate in postgres, but since there is no manualflush
logic, they are not returned to the client until postgres generate some execution result to the buffer and make it reach to the flush threshold. Therefore, when query becomes complex and need more time to produce the first few result tuples, getting theCopyOutReader
will be blocked.The issue caused by this is that it will extend the end-to-end time for this example. Since the total time used before we retrieve the result would be time(postgres execution) + time(logic X).
Propose solution
A simple solution is to add a
Flush
message afterBind
in order to getBindComplete
immediately. And delay the receiving forCopyOutResponse
to result retrieval. This will unblockcopy_out
and make the above example from time(postgres execution) + time(logic X) to MAX(time(postgres execution), time(logic X)).Here is a PR for this: #971. Please let me know if I missed anything, thanks!