Open elenaf9 opened 1 year ago
Yeah, sounds okay to me! I think the solution where finish only returns an error if some data was not acked sounds most elegant.
There are subtle hazards to assigning application-layer meaning to transport-layer ACKs. An endpoint ACKing a packet does not mean that the application layer has seen or will ever see the data borne in it. For example, your receiver might stop the stream prematurely, then receive packets that finish it, then acknowledge those packets. Packet loss or reordering might even cause the sender to receive those acknowledgements before the `STOP_SENDING`, which would look identical to a fully read stream. Similarly, if the receiver stops after reading all expected data, the sender might still see the `STOP_SENDING` before all relevant ACKs. Even now, `finish` succeeding is not a guarantee that the peer never called `stop`.
What does your application-layer protocol need to distinguish these cases for? If there's an error that must be communicated, could that perhaps be done instead by sending application-layer data, setting a distinctive stop error code in the fully-read case, or by closing the connection?
Good point @Ralith, thanks for raising it! I was thinking about the case of e.g. a simple request-response protocol, where after writing the response the remote doesn't know if the response was received unless finishing the stream succeeded. But then again, in that case there is not much that the remote can do anyway when an error happens; so it would only be of informative value.
So after rethinking this, I think I agree that as an application, we can never rely on it anyway. `is_fully_acked` won't solve the issue, and can only reduce the false negatives (negative => remote did not read the data).
But then I wonder: why even error in `quinn` on a stopped stream in the first place? It may cause false negatives, like the one shown in the test case that I linked above, when the `FIN` is not sent directly together with the last write data. So users most likely should choose to ignore the error anyway, right?
Write errors are emitted on stopped streams so that the sender doesn't waste resources trying to send data that will never be read. This is important regardless of what, if any, application-layer semantics are assigned to the error.
If you want to interpret `Stopped` errors as application-layer errors, the best way to avoid false negatives is to always read streams to the end unless an application-layer error occurs, ensuring that your receiver never emits `Stopped` under normal conditions. I think this is the idiomatic QUIC application protocol design pattern.
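A sketch of that receiver-side pattern against quinn's async API (method names as of roughly quinn 0.10, so treat the details as assumptions; `MAX_SIZE`, `APP_ERROR`, and `validate` are hypothetical application-level items):

```rust
use quinn::{RecvStream, VarInt};

const MAX_SIZE: usize = 64 * 1024; // hypothetical per-message size limit
const APP_ERROR: u32 = 1;          // hypothetical application error code

async fn drain_stream(mut recv: RecvStream) -> Result<(), quinn::ReadToEndError> {
    // Read the stream to its end so the sender never observes `Stopped`
    // under normal conditions; only stop on an application-layer error.
    let bytes = recv.read_to_end(MAX_SIZE).await?;
    if !validate(&bytes) {
        // A distinctive stop error code lets the sender tell an
        // application-layer failure apart from the fully-read case.
        let _ = recv.stop(VarInt::from_u32(APP_ERROR));
    }
    Ok(())
}

fn validate(_bytes: &[u8]) -> bool { true } // placeholder for app logic
```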
We should probably have some discussion of this in the documentation of the `Stopped` error.
Right now, as a user there is no way of knowing whether all write data on a stream has been acknowledged, until/unless `finish` resolves with an `Ok`. There is already an existing private method `quinn_proto::SendBuffer::is_fully_acked`. It would be great if this could be publicly exposed, e.g. by adding a new method `is_fully_acked` to `quinn_proto::SendStream` and `quinn::SendStream`. If you are open to it, I would create a PR to add them.
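To illustrate the intended shape, a hypothetical usage sketch (`is_fully_acked` is the proposed method and does not exist publicly today; the async `finish` matches the quinn API under discussion):

```rust
// Hypothetical: `send` is a quinn::SendStream on which all data has
// been written. None of the branches below are existing quinn behaviour
// guarantees; they only show what the proposal would enable.
match send.finish().await {
    Ok(()) => {
        // FIN acknowledged: the peer's transport received everything.
    }
    Err(quinn::WriteError::Stopped(_code)) => {
        if send.is_fully_acked() {
            // Proposed addition: all write data was ACKed before the
            // stop, so the data at least reached the peer's transport.
        } else {
            // Some write data was never ACKed and may never have
            // reached the peer.
        }
    }
    Err(e) => return Err(e.into()),
}
```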
Motivation
The information on whether all write data has been received by the remote is useful for us when dealing with a stopped stream. Consider the following scenario: it may happen that both peers at the same time want a stream to close in one direction (due to some application-layer logic). In that case one peer stops the stream, while the other tries to finish it. On the sending side, this currently causes a racy `WriteError::Stopped` if the `STOP_SENDING` is received before the `FIN` was acknowledged. If that happens, we unfortunately can't differentiate between the following two cases:

1. The remote stopped the stream before reading all of our write data.
2. The remote read all of our write data and then stopped the `RecvStream` because it knows on the application layer that no more data will be read.

Usually in the second case the `FIN` would be sent together with our last write data, and the peer would not stop a stream if it already received a `FIN` for it. However, if there is even a slight delay between the last `write` call and the call to `finish`, we may run into the behaviour described above. The issue can be reproduced with this test: https://github.com/elenaf9/quinn/blob/f6f8656ff4a14611a33337742ab3840f125f0241/quinn/src/tests.rs#L728-L784

Knowing if all write data has been acknowledged helps us to differentiate between case 1 and case 2.
An alternative (if you agree that the above scenario is an issue in the first place) would be to include that information in the `WriteError::Stopped` type, or to directly change `finish` to only return a `WriteError` if there is any unacknowledged write data.
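The second alternative can be modelled in plain Rust; this is only a sketch of the proposed semantics with illustrative names, not quinn code:

```rust
/// Model of the proposed `finish` semantics: a stop by the peer is only
/// surfaced as an error when unacknowledged write data remains.
/// (Illustrative names; not quinn API.)
#[derive(Debug, PartialEq)]
enum FinishOutcome {
    Ok,
    Stopped { unacked_bytes: u64 },
}

fn finish_outcome(peer_stopped: bool, unacked_bytes: u64) -> FinishOutcome {
    if peer_stopped && unacked_bytes > 0 {
        // Case 1: the peer stopped with data still in flight, so it may
        // never have read everything we wrote.
        FinishOutcome::Stopped { unacked_bytes }
    } else {
        // Either the peer never stopped, or it stopped only after every
        // byte was acknowledged (case 2): treat as success.
        FinishOutcome::Ok
    }
}

fn main() {
    // Case 2: peer stopped, but all data was acked first.
    assert_eq!(finish_outcome(true, 0), FinishOutcome::Ok);
    // Case 1: peer stopped with 10 bytes still unacknowledged.
    assert_eq!(
        finish_outcome(true, 10),
        FinishOutcome::Stopped { unacked_bytes: 10 }
    );
    // No stop at all: finish succeeds as today.
    assert_eq!(finish_outcome(false, 0), FinishOutcome::Ok);
    println!("ok");
}
```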