Open fritshoogland-yugabyte opened 1 year ago
We need to know how Postgres parsed the statement to construct the bind packet.
If you're concerned about overhead of small queries I would recommend preparing them up front which will be faster than preparing them each time either way.
Thank you for your response. I had the feeling there was some reason for it. Can you expand on that a bit? Why can java do this without requiring PostgreSQL to have parsed it, and be able to send out parse, bind and execute in a single go?
IIRC the Java client string-interpolates all of its query parameters or something.
I fully agree: that for small queries, preparing these upfront will be faster than preparing them each time. And the documentation suggests for queries that are reused to create a prepared statement.
However, generally, many statements are small and are used once (along with a lot of statements are reused over and over), and quite often these do not have any placeholders and variables to bind, wouldn't it for these be possible in the first place and and if so an improvement to push out the extended query phases of parse, bind and execute in one go?
Or is that simply not possible in the current form of the postgres driver, because it still needs the parse result for the bind packet?
Of course it's possible using the rust postgres driver to simply use the query_simple impl, but if possible for simple SQL reducing the number of packets is likely to make a huge difference with bigger deployments.
Without writing a client-side query parser (which I am not interested in doing) we don't have any reliable way to determine if the query has parameters or not.
Check. That is clear.
@sfackler is there a likewise reason every extended query protocol request uses a named statement? Using a named statement the minimal number of roundtrips for a query is 3: parse, bind/execute, close. If no named statement for non-prepared statements is used, it can be reduced by one, because it doesn't need close?
I think that would be possible with a more sophisticated implementation of ToStatement
, yeah.
Is there anything you want me to do about that, such as creating a separate issue?
When no prepared statement is used, it makes sense to use an unnamed statement. The reduction of roundtrips from 2 to 3 is likely significant especially in clouds where the client might not be physically close to the database. Obviously the close for database side prepared statements as a result of a statement being prepared client side makes sense, which is the current implementation.
Probably worth making a separate issue, yeah.
One complicating factor is that use of the unnamed statement will break when pipelining queries. When that happens, you can end up with "prepare 1, prepare 2, execute 1, execute 2", so the execute 1 step would end up using query 2's statement. That can probably be worked around with some extra state tracking to prevent overwrites of the unnamed statement but I don't know what it would look like off the top of my head.
Keep in mind as well that statements aren't closed synchronously - instead it just registers the fact that it needs to be closed in the future, and that'll be folded into the next request if there is one pending.
Can this issue be closed since #1147 was merged?
I investigated the packets that the rust postgres driver sends, and when using the extended protocol, a query executed when using an extended protocol execution method first sends a packet for the parse, and then combines the bind and execution in the next packet.
This is understandable for when using a prepared statement, however when the used query is not a prepared statement, this should be possible to reduce in a singe call? I just looked at the java packets, and JDBC seems to perform this in one packet.
This can mean a significant reduction of overhead for small queries where client response time is mostly composited of network times of flight, and potentially reduces CPU.
I have investigated the query method for as extended protocol execution and extended protocol with prepared statement, and the simple_query method for simple query protocol.