Closed alexheretic closed 1 year ago
By writing the question _"Should pollready try to flush if it is !ready?" I've mostly convinced myself that it should.
~Someone stop me or I'll add that to this PR in a bit...~
Update: I did that and updated the PR description.
It looks like a quick fix, but it feels a bit risky as we need to make many assumptions about how tungstenite
should buffer the messages. I'm also not sure if there are no edge cases that we need to consider, e.g. the latest changes in tungstenite
allow specifying how much data would be buffered and not written to the underlying stream before hitting a limit, which theoretically means that if a size of that limit/threshold being close to the size of our buffer, we can get to the point where N feed()
s (i.e. poll_ready()
+ start_send()
) called in a row would simply buffer N messages filling up the internal out buffer in tungstenite
without writing anything to the stream causing the N+1 attempt to call feed()
to return an error telling that the buffer is full.
I don't think there is an elegant way to fix this issue (surely we could add some if
s comparing the threshold value ad the out buffer with the frame size, but I'm not sure if it's really worth the effort and if it introduces too many complexities).
Personally, I think that if we really want to have the maximum flexibility for the buffering/flushing, it would be better to introduce explicit methods for that and keep the former implementation of the Sink
(that seem to have always behaved in a correct [albeit inefficient] way).
I think it is important that feed
in general does not flush. This is the reason feed
and send
exist. Having both of them always flush seems just incorrect and a performance bug. We don't need to do that anymore. It seems a bit silly to have feed
behave incorrectly in order to ensure it behaves ... correctly.
To be clear the reason for feed
to exist is:
Unlike send, the returned future does not flush the sink.
I don't see the configurable write_buffer_size
& max_write_buffer_size
as an edge case, it is just a buffering behaviour within the general contract of a write interface. If the user sets the max_write_buffer_size
low enough they may see an error. A simpler example, if a user sets the max_write_buffer_size
smaller than the size of a message all feed
attempts will fail no matter how you flush. Can you explain why this breaks the Sink
contract?
As I see it now this behaviour looks correct. feed
generally won't flush, but now will try to flush
when writes block. This matches the Sink
docs suggestions for behaviour.
Implementations of poll_ready and start_send will usually involve flushing behind the scenes in order to make room for new messages. It is only necessary to call poll_flush if you need to guarantee that all of the items placed into the Sink have been sent.
All other tungstenite errors are simply returned to the user to deal with. That isn't a change from existing behaviour. So the only interesting error is WouldBlock
since it is not returned to the user. And that was the issue with https://github.com/snapview/tokio-tungstenite/pull/284 that it didn't handle this hidden error any differently from the flush happy past.
This PR addresses that by flushing after WouldBlock
errors. That means in general feed
will not flush (which is what it exists to not do) but if the underlying stream starts blocking it will try to flush meaning we'll either flush or return Pending
and reflect the blocking properly. I still think this PR seems a good fix for the issue.
I don't see the configurable write_buffer_size & max_write_buffer_size as an edge case, it is just a buffering behaviour within the general contract of a write interface. If the user sets the max_write_buffer_size low enough they may see an error. A simpler example, if a user sets the max_write_buffer_size smaller than the size of a message all feed attempts will fail no matter how you flush. Can you explain why this breaks the Sink contract?
If I set max_write_buffer_size
to a value that is in between the size of my messages and the size at which we start seeing WouldBlock
errors, what happens if I call feed()
repeatedly without flushing? I think (just from reading the changes, not trying it, so could certainly be wrong!) that:
self.ready = false
(because we haven't sent enough data to start seeing WouldBlock
), so poll_ready()
always returns Ready
max_write_buffer_size
, we'd get a failure in start_send
This seems to break the contract of feed
, right? I would expect to be able to feed()
continuously and the Sink
should flush as needed, which wouldn't happen in this case. Maybe my configuration is questionable, but it seems like it should be valid? I've based my max_write_buffer_size
on the size of my messages (and ensure it's large enough to hold some number of them), but because it's smaller than some underlying value I don't know I hit this edge case.
Firstly thinking about hypotheticals, max_write_buffer_size
should never be set at or below write_buffer_size
, its should always at least write_buffer_size + 1 message
. We should probably add some basic errors around this upstream. But in the following I'll assume these two are set to reasonable things or default.
We would never set self.ready (because we haven't sent enough data to start seeing WouldBlock), so poll_ready() always returns Ready
If you run your example you'll see this isn't the case. That's because the write-buffer isn't filling up. If the underlying stream is not returning WouldBlock
(or any other error that you would see) that means the writes are going into the underlying stream and not building up in tungstenite's write-buffer.
So that is the aim of this PR that you can call feed
continuously and flush will happen after WouldBlock
as needed, no build up of messages in tungstenite's write-buffer over the write_buffer_size + 1 message
.
I've based my max_write_buffer_size on the size of my messages (and ensure it's large enough to hold some number of them)
As mentioned before it if bounded the max should always be at least write_buffer_size + 1 message
. I actually don't see a huge benefit to bounding at all it unless you aim to ignore (or just log) most errors and only handle WriteBufferFull
.
The important point is the write-buffer only fills up past write_buffer_size
when the underlying stream writes are failing.
I think I have clearly not documented it well enough upstream :( Here's what I have currently, let me know if you have suggestions!
write_buffer_size: usize
The target minimum size of the write buffer to reach before writing the data to the underlying stream. The default value is 128 KiB.
Note: flush will always fully write the buffer regardless.
max_write_buffer_size: usize
The max size of the write buffer in bytes. Setting this can provide backpressure in the case the write buffer is filling up due to write errors. The default value is unlimited.
Note: Should always be set higher than write_buffer_size.
Update: I've tried to clarify the docs further based on this discussion: https://github.com/snapview/tungstenite-rs/pull/361
imo the approach in this pr seems good, I've tried to explain why as clearly i can. Are there remaining concerns @daniel-abramov?
imo the approach in this pr seems good, I've tried to explain why as clearly i can. Are there remaining concerns @daniel-abramov?
Apologies for the long delay.
Let's give it a try, thanks for explaining your point!
With this PR
poll_ready
continues to generally returnReady
. However, afterstart_send
encountersWouldBlock
poll_ready
will instead try to flush.When poll_ready->flush returns
Ready
poll_ready will revert to just returningReady
... until the nextWouldBlock
.This mechanism prevents the case tokio-tungstenite users filling up the tungstenite write-buffer without actually handling any errors (since
WouldBlock
errors are not returned to the user).We also maintains
feed
generally not flushing as it will only do so after encounteringWouldBlock
errors. So should perform well.See https://github.com/snapview/tokio-tungstenite/issues/286#issuecomment-1595678753.
Resolves #286
Questions/alternatives
WebSocketConfig::write_buffer_size
. I think this would achieve the same thing though it would require some changes upstream to expose the current write-buffer size. I'm not sure it's actually better than just tracking would-block as this PR does though.poll_ready
simply returnPending
afterWouldBlock
but trying to flush in this case seems more correct & useful, plus seems to fit with theSink
docs