sfackler / rust-postgres

Native PostgreSQL driver for the Rust programming language
Apache License 2.0

Refactor `tokio_postgres::Connection` implementation #1066

Closed: 50U10FCA7 closed this 1 month ago

50U10FCA7 commented 11 months ago

Minor changes improving the `tokio_postgres::Connection` implementation:

sfackler commented 11 months ago

What is the purpose of these changes?

50U10FCA7 commented 11 months ago

@sfackler Thanks for the quick reply.

These changes should make it possible to process something like this more efficiently:

let (client, connection) = tokio_postgres::connect(..).await?;

let rc_client = Rc::new(client);

let client1 = Rc::clone(&rc_client);
tokio::task::spawn_local(async move {
    client1.query(..).await
});

let client2 = Rc::clone(&rc_client);
tokio::task::spawn_local(async move {
    client2.query(..).await
});

without blocking `Connection` polling on a specific request/response from the `Client`.

Also, `Connection` now waits for `Connection::stream::poll_flush` to return `Poll::Ready` (to ensure the message was actually written).

Private docs have been updated to make the code a little more readable.

p.s. All changes are minor and don't break the public API; only the internal implementation is changed.

sfackler commented 11 months ago

without blocking `Connection` polling on a specific request/response from the `Client`.

What blocking is happening before this change?

Also, `Connection` now waits for `Connection::stream::poll_flush` to return `Poll::Ready` (to ensure the message was actually written).

In what context would a message not be written?

50U10FCA7 commented 11 months ago

@sfackler

What blocking is happening before this change?

`poll_read` on the master branch can process only one response at a time, even if the socket has already received messages belonging to the next responses:

But there can be a situation where the socket already holds a full response (all of its messages) plus some messages of another one. In that case we can try to send part of the second response within the current poll:
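The idea can be sketched in a synchronous analogue. This is a hypothetical illustration, not the real PostgreSQL wire protocol: messages here use a made-up 1-byte length prefix, and `drain_messages` stands in for the "pull every complete message out of the buffer in one pass" behavior:

```rust
// Sketch: drain every complete message currently buffered, instead of
// stopping after the first one. Framing is a made-up 1-byte length
// prefix, not the real PostgreSQL protocol.
fn drain_messages(buf: &mut Vec<u8>) -> Vec<Vec<u8>> {
    let mut msgs = Vec::new();
    while let Some(&len) = buf.first() {
        let len = len as usize;
        if buf.len() < 1 + len {
            break; // partial message: wait for more bytes from the socket
        }
        msgs.push(buf[1..1 + len].to_vec());
        buf.drain(..1 + len);
    }
    msgs
}

fn main() {
    // Buffer holds two complete messages plus the start of a third.
    let mut buf = vec![2, b'o', b'k', 1, b'A', 3, b'p'];
    let msgs = drain_messages(&mut buf);
    assert_eq!(msgs, vec![b"ok".to_vec(), b"A".to_vec()]);
    assert_eq!(buf, vec![3, b'p']); // incomplete message stays buffered
    println!("drained {} messages, {} bytes left", msgs.len(), buf.len());
}
```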

I think this should be more efficient, because we have no guarantee that the runtime will poll the `Connection` again right after the current poll. And because we don't poll the socket again (we are waiting on a `Client`'s receiver), `Connection::poll_read` will next be polled by the runtime only if:

  1. A new Client's request received;
  2. When Client's receiver is dropped/able to receive a response.

If we assume that a `Response`'s receiver is not polled for a long time (e.g. the caller is busy with more important things in `futures::future::select(client.query(..), another_fut)`), and another `Response` has already been received on the socket and is ready to be sent, then I think we should send it.

In what context would a message not be written?

In the case where `Framed` cannot write a full frame to the socket (I think this is possible if the peer advertises a small TCP window and the full frame cannot be written, though I'm not sure it happens in real-world scenarios).
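A short write like that can be simulated with a toy writer that accepts only a few bytes per call. `SmallWindow` and `flush_frame` are hypothetical names standing in for a socket with a small send window and for repeatedly driving a flush until everything is written:

```rust
use std::io::{self, Write};

// Toy writer accepting at most 4 bytes per call, simulating a socket
// whose send window is smaller than a full frame (hypothetical).
struct SmallWindow {
    buf: Vec<u8>,
}

impl Write for SmallWindow {
    fn write(&mut self, data: &[u8]) -> io::Result<usize> {
        let n = data.len().min(4);
        self.buf.extend_from_slice(&data[..n]);
        Ok(n) // short write: the caller must retry with the remainder
    }
    fn flush(&mut self) -> io::Result<()> {
        Ok(())
    }
}

// Analogue of driving `poll_flush` until `Poll::Ready`: keep writing
// until the whole frame is out. Returns the number of write calls taken.
fn flush_frame(w: &mut SmallWindow, frame: &[u8]) -> io::Result<usize> {
    let mut written = 0;
    let mut calls = 0;
    while written < frame.len() {
        written += w.write(&frame[written..])?;
        calls += 1;
    }
    Ok(calls)
}

fn main() -> io::Result<()> {
    let mut w = SmallWindow { buf: Vec::new() };
    let calls = flush_frame(&mut w, b"BindExecuteSync")?; // 15 bytes
    assert_eq!(w.buf, b"BindExecuteSync");
    assert_eq!(calls, 4); // 4 + 4 + 4 + 3 bytes
    println!("flushed in {calls} write calls");
    Ok(())
}
```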

p.s. I also found that both `TcpStream` and `UnixStream` support vectored writes, but `tokio_postgres::Socket` doesn't use them; is there any reason for that?

sfackler commented 11 months ago
  • receive all available BackendMessages;
  • try to send all messages of the pending responses within current poll.

A sufficiently fast server could cause that logic to OOM by giving it an unbounded amount of messages to buffer.
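The usual guard against unbounded buffering is a bounded queue that applies backpressure to the producer. A generic illustration with std's `sync_channel` (not tokio_postgres's actual mechanism; `bounded_roundtrip` is a made-up helper):

```rust
use std::sync::mpsc;
use std::thread;

// A bounded channel of capacity `cap`: a fast producer blocks once the
// buffer is full, so the consumer's memory use stays bounded. This is a
// generic backpressure illustration, not tokio_postgres internals.
fn bounded_roundtrip(n: u32, cap: usize) -> Vec<u32> {
    let (tx, rx) = mpsc::sync_channel::<u32>(cap);
    let producer = thread::spawn(move || {
        for i in 0..n {
            tx.send(i).unwrap(); // blocks while `cap` messages are queued
        }
    });
    let received: Vec<u32> = rx.into_iter().collect();
    producer.join().unwrap();
    received
}

fn main() {
    let received = bounded_roundtrip(10, 2);
    assert_eq!(received, (0..10).collect::<Vec<_>>());
    println!("received {} messages through a capacity-2 buffer", received.len());
}
```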

I think this should be more efficient, because we don't have a guarantee that runtime will poll the Connection again right after the current poll.

Efficiency claims should be measured, not assumed.

In case when Framed cannot write a frame to a socket (I think its possible if machine has a small window and full frame cannot be written, but not sure it is possible in real-world scenarios).

The next poll of the connection will continue to drive the flush.

p.s. I also found that both `TcpStream` and `UnixStream` support vectored writes, but `tokio_postgres::Socket` doesn't use them; is there any reason for that?

Because there aren't any vectored writes happening in the client AFAIK.
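For reference, a vectored write submits several buffers in a single call; on a real socket this becomes one `writev` syscall instead of first copying the pieces into a contiguous buffer. A minimal sketch using std's `Write::write_vectored`, with `Vec<u8>` standing in for the socket (`writev_demo` is a made-up name):

```rust
use std::io::{IoSlice, Write};

// One vectored write submits several buffers in a single call. Here a
// `Vec<u8>` stands in for a socket; `Vec`'s `write_vectored` appends
// every slice and returns the total number of bytes written.
fn writev_demo() -> std::io::Result<(usize, Vec<u8>)> {
    let header = [b'Q', 0, 0, 0, 5]; // e.g. a message tag plus length
    let body = b"X";
    let mut out: Vec<u8> = Vec::new();
    let n = out.write_vectored(&[IoSlice::new(&header), IoSlice::new(body)])?;
    Ok((n, out))
}

fn main() -> std::io::Result<()> {
    let (n, out) = writev_demo()?;
    assert_eq!(n, 6);
    assert_eq!(out, [b'Q', 0, 0, 0, 5, b'X']);
    println!("wrote {n} bytes in one vectored call");
    Ok(())
}
```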