astiob opened this issue 2 years ago
Hi, thanks for the issue.
The big design goal for Skunk is transparent semantics: when things happen in code they happen on the server, so errors happen in the expected place, on the expected fiber (sessions can be used concurrently and we don't want an error on one fiber to be detected on another). So exchanges with the database must be self-contained (i.e., they need to end with `ReadyForQuery`, which sometimes requires a `Flush`).
Having said that, we're doing more exchanges than we need to, and aren't caching everything that can be cached:

- `Parse` and `Describe` are a single operation at the API level and can thus be pipelined into a single exchange. In addition, the results can normally be cached and the resulting statement can be reused. Right now we're only caching the `Describe` exchange.
- `Bind` and the initial related `Execute` are also a single operation at the API level and can thus be pipelined.

Once this is implemented, the current 6+ exchanges would be reduced to 2+ on first use. On subsequent use the initial exchange would be unnecessary, so interactions that fetch a single page of results would require only one.
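The before/after counts can be made concrete with a small sketch. The message names below are real PostgreSQL extended-query protocol messages, but the exact grouping is my reading of the plan, not Skunk's actual traces:

```scala
object ProposedExchanges {
  type Msg = String

  // First use: two pipelined exchanges. Each is self-contained because it
  // ends with Sync, which makes the server answer with ReadyForQuery.
  val firstUse: List[List[Msg]] = List(
    List("Parse", "Describe", "Sync"), // prepare + describe; result cacheable
    List("Bind", "Execute", "Sync")    // open portal + fetch first page
  )

  // Subsequent use: Parse/Describe are served from the statement cache,
  // so a single-page fetch is one exchange.
  val subsequentUse: List[List[Msg]] = firstUse.drop(1)
}
```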
It probably makes sense to pre-fetch the next page of results when processing the result stream, so you can be reading the next page while you're streaming the current page back to the client or whatever.
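One way to picture the pre-fetch, sketched with plain `scala.concurrent` Futures; `fetch` is a hypothetical stand-in for the `Execute` exchange that returns one page, and the names here are illustration, not Skunk's API:

```scala
import scala.concurrent.{Await, Future}
import scala.concurrent.ExecutionContext.Implicits.global
import scala.concurrent.duration._

object Prefetch {
  // fetch(n) stands in for the Execute exchange returning page n,
  // or None once the portal is exhausted.
  def prefetched[A](fetch: Int => Option[List[A]]): Iterator[List[A]] =
    Iterator.from(0)
      .map(n => Future(fetch(n)))  // start the fetch for page n
      .sliding(2)                  // keep one fetch in flight ahead of the consumer
      .map(w => Await.result(w.head, 10.seconds))
      .takeWhile(_.isDefined)      // stop once a fetch reports no more rows
      .flatMap(_.toList)
}
```

The `sliding(2)` window forces page *n+1*'s fetch to start while page *n* is being handed to the consumer, which is exactly the one-page lookahead described above (at the cost of one harmless extra fetch past the end).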
In any case there is a plan. Right now I'm working on replacing the session queue, which will make caching easier.
When executing a prepared statement, Skunk currently completes each protocol command as its own exchange, waiting for the server's response before sending the next command. This is needlessly inefficient: it prevents batching of request commands and requires a full network round-trip on every step.
In addition, this currently fails to work entirely (it hangs indefinitely) with a Pgpool-II server, which does not fully honour `Flush`. (This is currently being discussed on the pgpool-general mailing list.)
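To make the "6+ exchanges" concrete, here is one plausible reading of the current per-statement sequence, modeled as data; the message names are real protocol messages, but the exact sequence is my guess, not a captured trace:

```scala
object CurrentFlow {
  type Msg = String

  // Roughly the per-statement pattern today: each command is its own
  // exchange, forced to completion with a Flush so the client can wait
  // for the response before sending the next command.
  val exchanges: List[List[Msg]] = List(
    List("Parse", "Flush"),
    List("Describe", "Flush"),
    List("Bind", "Flush"),
    List("Execute", "Flush"),
    List("Close", "Flush"),
    List("Sync")
  )
}
```

A proxy that buffers output until `Sync` (rather than honouring each `Flush`) never answers the intermediate exchanges, so a client that waits after every `Flush` hangs, which matches the Pgpool-II behaviour described above.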
As far as I can tell, the only reason it is done this way is that `MessageSocket` does not allow queueing up response processing asynchronously and submitting more commands in the meantime. For example, `skunk.net.protocol.Parse` currently does this:

https://github.com/tpolecat/skunk/blob/9e0744d00f6876f1eda5016ea34d7f0ecac40736/modules/core/shared/src/main/scala/net/protocol/Parse.scala#L38-L55
where `flatExpect` suspends the fibre handling the current database connection until a response message is received. In an ideal world, it would schedule the block to be run whenever the next message arrives, but continue executing the sending fibre so more protocol commands can be queued; then the `Flush` could be omitted.

Any idea as to how this could be allowed in a clean way? I'm thinking that the sending and receiving should be separate fibres, but I'm not well-versed enough to see immediately how to achieve that.