`Async::IO::Stream#write` doesn't flush to IO even when `sync` is `true`

maruth-stripe commented 9 months ago

What the title says.

                         # Writes `string` to the buffer. When the buffer is full or #sync is true the
            # buffer is flushed to the underlying `io`.
            # @param string the string to write to the buffer.
            # @return the number of bytes appended to the buffer.
            def write(string)
                @write_buffer << string

                if @write_buffer.bytesize >= @block_size
                    flush
                end

                return string.bytesize
            end

In contrast to the documentation, the method only flushes when the write_buffer is full. This is what's causing the tests to hang in my PR to protocol-http2.

Specifically here in test/protocol/http2/client.rb it sends the connection preface, but that is too small to fill up the write buffer so it never gets flushed. Hence the following framer.read_connection_preface indefinitely hangs.

Changing the Async::IO::Stream#write to

                         # Writes `string` to the buffer. When the buffer is full or #sync is true the
            # buffer is flushed to the underlying `io`.
            # @param string the string to write to the buffer.
            # @return the number of bytes appended to the buffer.
            def write(string)
                @write_buffer << string

                if @io.sync || @write_buffer.bytesize >= @block_size
                                 #  ^^^^^^^^^^^ <-- the change
                    flush
                end

                return string.bytesize
            end

fixes it, and the test suite works. Is the unconditional buffering intended?

maruth-stripe commented 9 months ago

An alternative would be maintaining the current behavior for Async::IO::Stream but changing the code in protocol-http2 to explicitly flush when it writes small things that need to be sent immediately (such as connection preface etc)

ioquatix commented 9 months ago

Apologies for the delayed response, I've been dealing with a bunch of bug reports from the Ruby v3.3.0 release and ensuring we backport all the appropriate fixes that are on my radar.

So, I've been thinking about this problem and want to elaborate a couple of points.

I'd really like to deprecate async-io the gem. The reason is, this gem was designed for Async v1.x and provides shims/wrappers which are not needed on Ruby v3.1+ + Async v2. The goal is to slowly move away from Async v1.

To this end, I've been:

Moving bespoke functionality from async-io that turned out to be a good idea, directly into Ruby, e.g. Async::IO#timeout -> IO#timeout.
Rewriting specific parts of Async::IO that are generic enough and useful enough to stand on their own, e.g. Async::IO::Endpoint -> IO::Endpoint https://github.com/socketry/io-endpoint.
Rewriting gems that depend specifically on async-io (usually Async::IO::Stream and/or Async::IO::Endpoint) to use IO instances directly, e.g. https://github.com/socketry/async-http/pull/147

I'm okay to fix things here in async-io, especially if it aligns the interface with native io (and vice versa with PRs to CRuby), however one key part of the PRs you've been working on is the assumpion we have IO#peek(n) which doesn't exist on CRuby.

I'm also certain we want to be careful around buffering IO, and explicit #flush is usually better (but not always). Carefully tuning the HTTP/1 implementation to flush at the right times significantly improves latency in my micro-benchmarks (and probably more generally).

w.r.t. HTTP/2, I think flush only makes sense when you are expecting to read a response, or need to reduce round-trip latency (i.e. send this message right away). Regarding the connection preface, it's probably not a bad idea to flush it right away.

Sorry for the brain dump, but:

Let's consider explicit #flush where it makes sense.
Let's figure out if we can avoid using Async::IO::Stream#peek(n) OR consider upstreaming that interface to IO proper - both are reasonable to me, but it would take until Ruby v3.4.0 before we can use it... so a compatibility shim or separate implementation might be required.

zarqman commented 7 months ago

I just took a look at the flush situation and it appears that in normal, non-test usage, both send_connection_preface and read_connection_preface call read_frame, which in turn eventually calls read -> fill_read_buffer, which calls flush.

However, if send_connection_preface is given a block, as used in tests, then the block executes before the indirect flush gets called, hence the hanging behavior described here.

For this, I think a conditional flush when there's a block would address it.

# protocol-http2's client.rb

def send_connection_preface(settings = [])
  if @state == :new
    @framer.write_connection_preface

    send_settings(settings)

    ### begin: changes
    if block_given?
      # I believe @stream is only valid in async-http's subclass, so may need to be @framer.stream.flush
      @stream.flush
      yield
    end
    ### end: changes

    read_frame do |frame|
      raise ProtocolError, "First frame must be #{SettingsFrame}, but got #{frame.class}" unless frame.is_a? SettingsFrame
    end
  else
    raise ProtocolError, "Cannot send connection preface in state #{@state}"
  end
end

@maruth-stripe, does that seem like it would address the issue as you understand it?

PS: This doesn't address the question or usefulness of adding peek. I'd personally like to see peek kept (upstreamed or otherwise), as it's relevant not only to socketry/protocol-http2#15, but also to my own PR in socketry/async-http#128.

ioquatix commented 7 months ago

Sorry, I realise there is a bunch of stuff to unpack here, I'll try to circle back to it soon.

socketry / async-io

`Async::IO::Stream#write` doesn't flush to IO even when `sync` is `true` #79