Add `SendStream::is_fully_acked` #1487

elenaf9 commented 1 year ago

Right now, there is no way for a user to know whether all data written to a stream has been acknowledged, until/unless `finish` resolves with an `Ok`. There is an existing private method `quinn_proto::SendBuffer::is_fully_acked`. It would be great if this could be publicly exposed, e.g. by adding a new method `is_fully_acked` to `quinn_proto::SendStream` and `quinn::SendStream`.

If you are open to it, I would create a PR to add them.
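For concreteness, here is a rough sketch of the proposed public surface. The signatures are the proposal; the bodies are placeholders, since the exact internal plumbing (how each stream handle reaches its `SendBuffer`) is an assumption on my side:

```rust
// Inside quinn-proto (sketch; real internals may differ):
impl SendStream<'_> {
    /// Returns true once every byte written to this stream has been
    /// acknowledged by the peer.
    pub fn is_fully_acked(&self) -> bool {
        // Delegate to the existing private SendBuffer::is_fully_acked.
        todo!("forward to this stream's SendBuffer")
    }
}

// Inside quinn (sketch): the high-level wrapper forwards to the proto method.
impl SendStream {
    pub fn is_fully_acked(&self) -> bool {
        todo!("lock the connection state and query quinn_proto::SendStream")
    }
}
```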

Motivation

Knowing whether all written data has been received by the remote is useful for us when dealing with a stopped stream. Consider the following scenario: both peers may, at the same time, want a stream to be closed in one direction (due to some application-layer logic). In that case one peer stops the stream while the other tries to finish it. On the sending side, this currently causes a racy `WriteError::Stopped` if the STOP_SENDING frame is received before the FIN was acknowledged. If that happens, we unfortunately can't differentiate between the following two cases:

  1. The peer stopped the stream without ever reading our data
  2. The peer read all our data and then dropped/stopped its `RecvStream` because it knows at the application layer that no more data will be read

Usually, in case 2, the FIN would be sent together with our last write data, and the peer would not stop a stream for which it has already received a FIN. However, if there is even a slight delay between the last write call and the call to `finish`, we may run into the behaviour described above. The issue can be reproduced with this test: https://github.com/elenaf9/quinn/blob/f6f8656ff4a14611a33337742ab3840f125f0241/quinn/src/tests.rs#L728-L784. A condensed sketch of both sides of the race follows below.
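Condensed from that test, assuming `send` is peer A's `quinn::SendStream` and `recv` is peer B's `quinn::RecvStream` (the stop code 0 is arbitrary):

```rust
// Peer A (sender): write, then finish slightly later.
send.write_all(b"response").await?;
// Any delay between the last write and finish() widens the race window,
// because the FIN then travels in a separate packet from the data.
match send.finish().await {
    Ok(()) => { /* FIN acknowledged: the peer received everything */ }
    Err(quinn::WriteError::Stopped(_code)) => {
        // Racy: this also fires when the peer read everything and stopped
        // the stream before our FIN was acknowledged (case 2 above).
    }
    Err(e) => return Err(e.into()),
}

// Peer B (receiver), concurrently: read the expected bytes, then stop,
// because the application knows no more data will follow.
let mut buf = [0u8; 8]; // length of b"response"
recv.read_exact(&mut buf).await?;
recv.stop(quinn::VarInt::from_u32(0))?;
```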

Knowing whether all written data has been acknowledged helps us differentiate between case 1 and case 2.

An alternative (if you agree that the above scenario is an issue in the first place) would be to include that information in the `WriteError::Stopped` variant, or to directly change `finish` so that it only returns a `WriteError` if there is any unacknowledged write data.
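The first alternative could look roughly like this (the variant shape is purely illustrative, not the actual quinn type):

```rust
use quinn_proto::VarInt;

// Hypothetical: WriteError::Stopped carries whether all prior writes
// had been ACKed when the STOP_SENDING arrived.
pub enum WriteError {
    Stopped {
        error_code: VarInt,
        /// True if every previously written byte was acknowledged
        /// before the stream was stopped.
        fully_acked: bool,
    },
    // ... remaining variants unchanged ...
}
```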

djc commented 1 year ago

Yeah, sounds okay to me! I think the solution where `finish` only returns an error if some data was not acked sounds most elegant.

Ralith commented 1 year ago

There are subtle hazards to assigning application-layer meaning to transport-layer ACKs. An endpoint ACKing a packet does not mean that the application layer has seen or will ever see the data borne in it. For example, your receiver might stop the stream prematurely, then receive packets that finish it, then acknowledge those packets. Packet loss or reordering might even cause the sender to receive those acknowledgements before the STOP_SENDING, which would look identical to a fully read stream. Similarly, if the receiver stops after reading all expected data, the sender might still see the STOP_SENDING before all relevant ACKs. Even now, `finish` succeeding is not a guarantee that the peer never called `stop`.

What does your application-layer protocol need to distinguish these cases for? If there's an error that must be communicated, could that perhaps be done instead by sending application-layer data, by setting a distinctive stop error code in the fully-read case, or by closing the connection?
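For instance, the distinctive-stop-code idea might look like this (the code values are arbitrary, application-defined examples):

```rust
// Receiver: use distinct, application-defined stop codes.
const STOP_DONE: u32 = 0;  // read everything we expected
const STOP_ABORT: u32 = 1; // application-layer failure, data discarded

// After consuming all expected data:
recv.stop(quinn::VarInt::from_u32(STOP_DONE))?;

// Sender: interpret the code carried by Stopped instead of guessing.
match send.finish().await {
    Ok(()) => { /* clean finish */ }
    Err(quinn::WriteError::Stopped(code))
        if code == quinn::VarInt::from_u32(STOP_DONE) =>
    {
        // The peer signalled a clean application-level close after reading.
    }
    Err(e) => return Err(e.into()), // STOP_ABORT or a transport error
}
```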

elenaf9 commented 1 year ago

Good point @Ralith, thanks for raising it! I was thinking about the case of e.g. a simple request-response protocol, where after writing the response the responding side doesn't know whether the response was received unless finishing the stream succeeds. But then again, in that case there is not much the responder can do anyway when an error happens, so it would only be of informative value.

So after rethinking this, I think I agree that as an application we can never rely on it anyway. `is_fully_acked` won't solve the issue; it can only reduce the false negatives (negative = the remote did not read the data). But then I wonder: why does quinn even return an error on a stopped stream in the first place? It may cause false negatives, like the one shown in the test case that I linked above, when the FIN is not sent directly together with the last write data. So users most likely should choose to ignore the error anyway, right?

Ralith commented 1 year ago

Write errors are emitted on stopped streams so that the sender doesn't waste resources trying to send data that will never be read. This is important regardless of what, if any, application-layer semantics are assigned to the error.

If you want to interpret `Stopped` errors as application-layer errors, the best way to avoid false negatives is to always read streams to the end unless an application-layer error occurs, ensuring that your receiver never emits `Stopped` under normal conditions. I think this is the idiomatic QUIC application protocol design pattern.
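A sketch of that receive-side pattern (`process` stands in for a hypothetical application handler):

```rust
// Drain the stream until the peer's FIN unless an application-layer
// error occurs, so Stopped is never sent under normal conditions.
loop {
    match recv.read_chunk(64 * 1024, true).await? {
        Some(chunk) => process(&chunk.bytes)?, // hypothetical handler
        None => break, // peer finished the stream; all data was read
    }
}
```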

We should probably have some discussion of this in the documentation of the `Stopped` error.