typelevel / skunk

A data access library for Scala + Postgres.
https://typelevel.org/skunk/
MIT License

Inefficient use of Flush in extended query protocol #605

Open astiob opened 2 years ago

astiob commented 2 years ago

When executing a prepared statement, Skunk currently uses the following protocol flow:

  1. Parse
  2. Flush
  3. Await response and check for errors
  4. Describe statement
  5. Flush
  6. Await response and check for errors
  7. Bind
  8. Flush
  9. Await response and check for errors
  10. Execute
  11. Flush
  12. Await response and check for errors
  13. Close portal
  14. Flush
  15. Await response and check for errors
  16. Close statement
  17. Flush
  18. Await response and check for errors

This is needlessly inefficient, as it prevents batching of request commands and requires a full network round-trip on every step.
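For illustration, here is a rough sketch of what a pipelined exchange could look like if the frontend messages were buffered and written in a single batch terminated by one Sync. The `FrontendMessage` classes and the `PipeliningSocket` trait below are made-up stand-ins, not Skunk's actual internals, and the real Bind/Describe/Close messages carry more fields than shown.

```scala
import cats.effect.IO

// Illustrative stand-ins for the extended-query frontend messages.
sealed trait FrontendMessage
final case class Parse(name: String, sql: String)        extends FrontendMessage
final case class Bind(portal: String, statement: String) extends FrontendMessage
final case class Describe(portal: String)                extends FrontendMessage
final case class Execute(portal: String, maxRows: Int)   extends FrontendMessage
final case class Close(portal: String)                   extends FrontendMessage
case object Sync                                         extends FrontendMessage

// Hypothetical socket that buffers encoded messages until flush is called.
trait PipeliningSocket {
  def send(m: FrontendMessage): IO[Unit] // append to the outgoing buffer
  def flush: IO[Unit]                    // write the buffer to the wire
}

def executePrepared(sock: PipeliningSocket, sql: String): IO[Unit] =
  for {
    _ <- sock.send(Parse("stmt", sql))
    _ <- sock.send(Bind("portal", "stmt")) // parameter values omitted
    _ <- sock.send(Describe("portal"))
    _ <- sock.send(Execute("portal", maxRows = 0))
    _ <- sock.send(Close("portal"))
    _ <- sock.send(Sync)
    _ <- sock.flush // one network write, one round trip for the whole exchange
    // then read ParseComplete, BindComplete, RowDescription, DataRow*,
    // CommandComplete, CloseComplete and finally ReadyForQuery in order,
    // checking for ErrorResponse along the way
  } yield ()
```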

In addition, this currently fails to work—and instead hangs indefinitely—with a Pgpool-II server, which does not fully honour Flush. (This is currently being discussed on the pgpool-general mailing list.)

As far as I can tell, the only reason it is done this way is that MessageSocket does not allow queueing up response processing asynchronously and submitting more commands in the meantime. For example, skunk.net.protocol.Parse currently does this:

https://github.com/tpolecat/skunk/blob/9e0744d00f6876f1eda5016ea34d7f0ecac40736/modules/core/shared/src/main/scala/net/protocol/Parse.scala#L38-L55

where flatExpect suspends the fibre handling the current database connection until a response message is received. In an ideal world, it would schedule the block to run whenever the next message arrives while letting the sending fibre continue, so that further protocol commands could be queued and the Flush could be omitted.

Any idea as to how this could be allowed in a clean way? I’m thinking that the sending and receiving should be separate fibres, but I’m not well-versed enough to see immediately how to achieve that.
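One possible shape, sketched with plain cats-effect primitives (the `BackendMessage` type, `readMessage`, and all other names below are placeholders, not Skunk's real ones): handlers are queued in the order their requests were sent, and a dedicated receiver fibre completes them as responses arrive, so the sending fibre never has to block.

```scala
import cats.effect.{Deferred, IO}
import cats.effect.std.Queue
import cats.syntax.all._

final case class BackendMessage(raw: String) // placeholder for real messages

// Pending response handlers, completed in FIFO order by the receiver fibre.
class Demultiplexer(
  pending: Queue[IO, Deferred[IO, BackendMessage]],
  readMessage: IO[BackendMessage]
) {

  // Called by the sending fibre right after it writes a command: registers
  // interest in the next response without blocking, so more commands can be
  // queued immediately.
  def expectNext: IO[Deferred[IO, BackendMessage]] =
    for {
      d <- Deferred[IO, BackendMessage]
      _ <- pending.offer(d)
    } yield d

  // Runs on its own fibre: hands each incoming response to the oldest waiter.
  def receiverLoop: IO[Nothing] =
    (for {
      msg <- readMessage
      d   <- pending.take
      _   <- d.complete(msg)
    } yield ()).foreverM
}

object Demultiplexer {
  def create(readMessage: IO[BackendMessage]): IO[Demultiplexer] =
    Queue.unbounded[IO, Deferred[IO, BackendMessage]]
      .map(new Demultiplexer(_, readMessage))
}
```

The sending fibre would call `expectNext` after each command and only `get` the `Deferred` when it actually needs the response, while `receiverLoop` runs concurrently (e.g. started with `.start` or under a `Supervisor`). Whether something like this can be reconciled with Skunk's error semantics is the open question.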

tpolecat commented 2 years ago

Hi, thanks for the issue.

The big design goal for Skunk is transparent semantics: when things happen in code they happen on the server, so errors happen in the expected place, on the expected fiber (sessions can be used concurrently and we don't want an error on one fiber to be detected on another). So exchanges with the database must be self-contained (i.e., they need to end with ReadyForQuery, which sometimes requires a Flush).

Having said that, we're doing more exchanges than we need to do, and we aren't caching everything that can be cached.

Once this is implemented, the current 6+ exchanges would be reduced to 2+ on the first use.

On subsequent use the initial exchange would be unnecessary, so interactions that fetch a single page of results would require only one exchange.
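For illustration, a minimal sketch of the kind of per-session caching alluded to above, keyed by the SQL text so the Parse/Describe exchange can be skipped on subsequent use; `StatementName` and `describeViaWire` are made-up placeholders, and Skunk's actual cache is more involved.

```scala
import cats.effect.{IO, Ref}

final case class StatementName(value: String) // placeholder for a server-side name

// Remembers which SQL statements have already been prepared on this session.
class PrepareCache(cache: Ref[IO, Map[String, StatementName]]) {

  // Only hit the wire (Parse/Describe) on a cache miss. Simplified: a real
  // implementation would also guard against concurrent misses for the same SQL.
  def prepared(sql: String)(describeViaWire: IO[StatementName]): IO[StatementName] =
    cache.get.map(_.get(sql)).flatMap {
      case Some(name) => IO.pure(name)
      case None =>
        describeViaWire.flatMap(name => cache.update(_ + (sql -> name)).as(name))
    }
}

object PrepareCache {
  val empty: IO[PrepareCache] =
    Ref.of[IO, Map[String, StatementName]](Map.empty).map(new PrepareCache(_))
}
```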

It probably makes sense to pre-fetch the next page of results when processing the result stream, so you can be reading the next page while you're streaming the current page back to the client or whatever.
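fs2 already provides a combinator for this kind of read-ahead; a hedged sketch, where `pages` is a hypothetical stream of already-decoded result pages:

```scala
import cats.effect.IO
import fs2.Stream

// Pull one page ahead of the consumer on a separate fibre, so the next
// Execute round trip can overlap with streaming the current page downstream.
def prefetchedRows[A](pages: Stream[IO, List[A]]): Stream[IO, A] =
  pages
    .prefetchN(1)
    .flatMap(Stream.emits)
```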

In any case there is a plan. Right now I'm working on replacing the session queue, which will make caching easier.