Stats for congestion control and bandwidth estimation

vasilvv commented 5 years ago

One of the use cases we are interested in is media streaming; when streaming media, the application can often decide to change the amount of data it sends based on how much bandwidth it expects to have available. Since all of the transports we define are congestion-controlled, we already naturally have to make some form of a guess regarding how much data the path can handle (even though it can be as rudimentary as CWND/RTT).

We should provide an API that lets the underlying transport library expose this kind of data to the Web application. My intuitive idea would be estimateBytesAvailable(time), or even estimateBytesAvailable(time, p), for models that accept the target probability of not oversending (for fancier models that accept such parameter, e.g. Sprout).

cc @keithw, who is an expert on this topic and might have a much better idea of how this API should look.

pthatcherg commented 5 years ago

I agree that providing information about the transport is important for the application to adapt to network conditions. The app may want estimated bandwidth and RTT. Currently the impl in Chromium provides three pieces of info easily in the C++ that could be exposed up to JS:

https://cs.chromium.org/chromium/src/net/third_party/quiche/src/quic/quartc/quartc_session.h?g=0&l=130

They are: bandwidth_estimate, pacing_rate (the rate at which the congestion controller wants to send right now, which may be higher or lower than the estimated bandwidth), and RTT, and they seem to be fired whenever we receive an ack:

https://cs.chromium.org/chromium/src/net/third_party/quiche/src/quic/core/quic_sent_packet_manager.cc?g=0&l=297

Every ack might be a bit too frequent, but perhaps an event throttled to some reasonable frequency would make sense (rather than polling).

keithw commented 5 years ago

Thank you for looping me in. I've skimmed the current WebTransport draft (at least the DatagramTransport-related parts) and pondered your question for a bit.

I think what you're trying to do here is super-interesting, and the readyToSendDatagram API where the application effectively gets an upcall when the congestion controller is willing to send one new datagram is a cool idea. One challenge is that often datagrams are not independent to the application, and an app that cares about the latency of individual messages doesn't want to have a 5-datagram-long message and get blocked after sending the first 4 datagrams. (If it knew that was going to happen, the app may have been able to send a shorter message.)

If estimateBytesAvailable(t) is a way for apps to ask, "How many times is readyToSendDatagram going to be willing to send a datagram in the next t milliseconds?" I think that's a helpful thing that can benefit an application (it's basically what we used in Sprout & Salsify).

But there's a difference between the way most UDP apps use this information and the way the current draft wants it to work. In Salsify/Sprout/Mosh, the transport gives the application its best guess about how much data can safely be transmitted (with some high probability) in the next n milliseconds, and then the app basically sends that amount of data in reliance on the estimate. If the estimate turns out to have been too high (e.g., the congestion controller changes its mind after receiving more information and becomes more conservative), these apps can react by bailing out midway (and sending a shorter message instead), or by proceeding headfirst and sending a too-big message and then making up for it later, effectively borrowing against a future allocation. (E.g., RFC 3448/5348 "TCP-friendly rate control" -- matching the long-term allocation of RFC-compliant AIMD TCP but on a longer timescale with slower variation than actual AIMD TCP.)

In the current draft, the app doesn't have this kind of flexibility -- there is a congestion controller on the other side of the API boundary that blocks or drops datagrams and doesn't have a specified way for the application to borrow against a future allocation (to achieve slowly-varying rate control) or even to give the app advance notice that it's not going to adhere to a prior estimate. That seems like a real challenge and it defeats the reason a lot of apps want to use UDP.

As far as I know, this kind of arm's-length separation between a congestion-control scheme and a datagram-based application has never been successfully executed by anybody (e.g., DCCP was a prior not-really-successful attempt). I'm not saying it's impossible, but I don't know of a successful example ready for standardization.

I can also see some additional practical challenges -- is the congestion controller really going to have no internal sender-side buffer and be willing to wait for the application each time it's "ready" to send something? This could be an unpleasant wait and could hurt performance (or the congestion-controller might even no longer by "ready" by the time the promise gets around to calling sendDatagram). Or is the congestion-controller going to force the promise returned by a prior readyToSendDatagram, let the promise call sendDatagram, let that SendDatagram call return successfully, and then stick the datagram onto an outgoing queue where it might have to wait for an inbound ACK (or a timer) before being put on the wire?

I wonder if you have a corpus in your heads of interesting UDP-based systems and some shared understanding of which ones you want to be implementable with this API and which you don't. Because my thinking is that the underlying congestion control, and the way you draw the API, is going to be key to how broadly useful this turns out to be.

Applications end up using datagram interfaces for a bunch of different reasons, including:

wanting different congestion-control behavior than the kernel's TCP gives them (e.g. the WebRTC.org codebase with its GCC scheme; Skype/FaceTime with similar schemes; BitTorrent's use of uTP/LEDBAT and MixApp's use of TCP Vegas-over-UDP to be "nicer" to cross traffic; TFRC's use of slowly-varying, but still long-term-TCP-friendly, rate control; Sprout/Salsify with their explicit modeling of evolving network capability to minimize queue buildup; QUIC with Google's desire to iterate on congestion control quickly by updating Chrome instead of depending on OS vendors; or simply everybody that wants to experiment with new congestion-control or loss-detection behaviors different from the default implementation)
wanting more information from the congestion controller than the kernel's TCP gives them (e.g., WebRTC.org and Salsify use congestion-control information to change video quality)
wanting sender-side behavior different from a reliable byte stream (e.g., when a datagram is lost, don't necessarily resend the contents of that particular datagram -- most real-time media and screen-sharing apps do this, and Mosh does it for text terminals)
wanting receiver-side behavior different from a reliable byte stream (e.g., schemes like Minion where the wire format is still TCP and the sender is unchanged, but the receiver gets access to incoming datagrams out-of-order; or schemes like Mosh/QUIC/OpenVPN where the client can roam to a new IP/port with a single authenticated packet)
other exotic features (multicast, or NAT traversal, or 1RTT round-trip transactions to servers with no pre-existing connection-specific state, as in the case of DNS or some new secure protocols -- TCP is not supposed to do this because the accepting application isn't supposed to get any data from the SYN payload until the client ACKs the server's SYN, as a security measure to make sure the client really controls its IP address)
wanting to know exactly when datagrams go out on the wire (e.g. for time estimation, in NTP or LEDBAT)
wanting to tunnel datagrams that are already subject to congestion control at a higher level (every VPN)

The current draft seems pretty heavily directed at use cases 3 and 4, and this issue is flirting in the direction of 2 (but see my comments above). My concern would be that in practice, a lot of UDP-using apps really care about case 1: they don't just want more information from the congestion controller; they really do want different congestion-control behavior, or tighter integration between the app and the congestion controller, or "binding" estimates (instead of just best guesses) from the congestion controller, or a different latency-vs-throughput tradeoff than they get by default, etc.

So if you want to support those use-cases (real-time video, probably some first-person-shooters, etc.), the API may need to support some "actuation" and not just information. As a first step, you could imagine letting the app express where it wants to be on the latency-vs-throughput tradeoff space and maybe the short-term fairness vs. long-term fairness tradeoff space, and having the browser choose an appropriate (but still safe) congestion-control behavior as a result.

I do think app developers who want to push the envelope (and most apps use UDP because they want to push the envelope somehow) are going to be curious about the threat model underlying, "All stream data is encrypted and congestion-controlled" and what this really means. We're talking about datagrams sent from an origin-controlled JavaScript program, to the origin. Is the congestion control going to be on a per-stream basis? Then if my needs are different from the default congestion control, I'm going to want to open 1024 streams and round-robin my datagrams among them, and then do the congestion control myself. Or on a per-origin basis? Then I'm going to want to have 1024 iframes from different origins, all going back to the same place, and again do the congestion control myself.

Given that most users are one click away from downloading an Android or iPhone or normal-computer app that has free access to the operating system's datagram interface and can send whenever it wants (and given that most webpages already cause browsers to open tens or hundreds of TCP connections, with the kernel doing congestion-control on a per-connection basis), what is the anti-congestion or safety-against-bad-apps property that the spec really wants the browser to enforce?

Atrius commented 5 years ago

WebRTC-over-QUIC efforts have discussed this point at some length internally, and with Victor and others on the gQUIC team.

We're now looking at using WebTransport for media as part of this effort, and I'm particularly interested in the implications for congestion control.

I'd summarize what I want as a more cooperative relationship between congestion control and the application.

I'd like to see something that lies somewhere on a spectrum between:

The ability to choose a media-friendly congestion-control algorithm (eg. GCC) and get feedback similar to what we get from the QuartcSession callback Peter pointed out
The ability to ship my own congestion controller running on top of a datagram transport (similar to the vpn tunnel use-case)

The closer we go to option 2, the more insight I'd want into transport-layer feedback: send and receive timestamps, acks, etc. I'd effectively be writing portions of the transport (eg. the send algorithm) myself and shipping them as a WASM module. I'm perfectly comfortable doing that, but I'm not sure it would make a great API for the web.

Related to the threat model, Victor pointed out to me that unlike raw UDP, QUIC datagrams still have acks and the transport can still see what's happening on the network. There might be a middle-ground option between "you get the congestion controller we give you" and "no congestion controller", where the transport runs a safety-net congestion controller.

I haven't put a lot of thought into this yet, but what I have in mind is something like BBR, except if I keep latency under control myself, it won't interrupt me for PROBE_RTT and low-gain cycles, and it will let me choose when I want to probe for more bandwidth (enter a high-gain cycle), but it will enforce some reasonable limits. If I send for a whole high-gain cycle unsuccessfully, it might force a subsequent low-gain cycle, and it might enforce some 'cooldown' between high-gain cycles.

keithw commented 5 years ago

The standardized "safety-net" approach would probably be to have the browser enforce an RFC 8084 "Network Transport Circuit Breaker" on whatever the app decides to send.

The threat model of trying to restrain an unfair/oversending app still doesn't quite make sense to me if the circuit breaker is per-stream or per-origin, since anybody who wants to circumvent the control is just going to open up a lot of streams and keep switching after the circuit breaker is triggered. You could consider having a single per-page circuit breaker (and, like, severely throttle all WebTransport for 60 seconds when the circuit breaker triggers?), but I wonder if you might find that simultaneously too restrictive (because it's a big penalty for the whole page) and also not restrictive enough (because the streams aren't really congestion-controlled by the browser until there's been a violation of basic norms for a significant amount of time).

My view is also that you'd want the API to encourage a pretty arm's-length relationship between the app and browser, and not introduce a sensitivity to the behavior or exposed state variables of a particular congestion-control scheme. Other browsers are going to choose different schemes or a different circuit breaker, and you wouldn't want apps written against this API for Chrome (using BBR or GCC) to end up with dramatically lower performance elsewhere because they have an unwitting latent sensitivity to whatever Chrome does or exposes. (BBRv1 turned out to be an "unfair/oversending app" itself in some cases [1], and BBRv2 is still under development and internal to Google. These things are evolving and the community is not always in agreement about how to evaluate new schemes. So I don't think baking BBR into a web standard, even de facto in that apps would be coded in a way that ends up depending on its behavior or its state variables, would be wise.)

Which is to say, an API that ends up like your "option 2 plus a long-term per-page circuit breaker/safety-net with a big penalty on breaking it" would seem reasonable to me, so maybe we are in agreement, but that's pretty far from the spec's current language on having all datagrams be congestion-controlled.

[1] https://platformlab.stanford.edu/Presentations/2019/retreat-2019/Keith%20Winstein.pdf, slides 15-18

Atrius commented 5 years ago

Yeah, I think we're in agreement that no built-in congestion-control plus an RFC 8084 circuit breaker sounds reasonable.

I also agree that the specifics of the congestion controller shouldn't be exposed to the app. I used BBR as an example because I'm familiar with it, and I think it's what the RTCQuicTransport origin trial uses, and I've heard it thrown around as the proposed congestion controller for WebTransport, too.

I'd like to either see one of:

The app requests "latency", "throughput", or "best-effort" and the browser chooses an appropriate congestion controller (eg. GCC, BBR, or LEDBAT) and then exposes the types of signals Victor and Peter mentioned.
The circuit-breaker option and I ship my own congestion controller based on raw congestion feedback signals.
Some middle ground? As you say the circuit-breaker might be too open to short-term abuse, yet too difficult for well-behaved applications to stay within the envelope 100% of the time and too punishing when they run amok.

The circuit-breaker idea does raise the question of whether QUIC datagrams should be congestion-controlled at the transport layer. I recall a leaning in that direction at the last IETF side-meeting on QUIC datagrams, but there were definitely use-cases like VPN raised which make it less clear.

juberti commented 5 years ago

Approach 2) makes sense to me, for the reasons articulated above. I do suspect an uncapped transport N open transports presents a different sort of risk than simply N open transports, but I think we're now focusing on the issue of abuse rather than fairness*, which seems more tractable.

cwmos commented 5 years ago

I have read this thread with great interest and I want to provide my two cents.

I really like the idea of allowing WebApps to innovate on congestion control algorithms and not have to rely on whatever congestion control algorithms are built into browsers. But I do have a concern about the performance and the timing if all JavaScript can do is to send a UDP packet immediately out on the network card and to receive a UDP packet as soon as it arrives:

Is JavaScript fast/accurate enough to handle each UDP packet separately for sufficiently high bitrates? For example, to implement something like TCP Reno where an incoming ACK triggers the immediate sending of the next IP packet.
Is the timing in JavaScript accurate enough to make good RTT and one-way-delay measurements?
Is the timing in JavaScript accurate enough to pace the sending of UDP packets with a specific rate in order to implement a rate-based congestion control algorithm?

I suspect the answer to these questions is “no”. This is most clear in the question in the last bullet since Window.setTimeout to my knowledge has very limited accuracy and that would need to be the mechanism that triggers the sending of a UDP packet.

Because of this, I think it could be very helpful if the API included possibilities to:

Set a specific send rate for data - and of course allow this rate to be updated at any time.
Request precise timing information for when a packet was sent and received to allow precise RTTs and one-way-delay measurements to be made. Ideally, the send time would be automatically available at the receiver of the packet.

With these two primitives, I believe that most rate-based congestion control algorithms - including GCC - could be implemented in JavaScript without the need to handle each UDP packet in real time.

One use case I am particularly interested in is P2P systems for large scale distribution of video data while it is being consumed. Several such commercial systems built on top of WebRTC DataChannels already exist today. Such systems want to be very non-aggressive in the upstream direction of end-users. The reason for this is that an individual end-user does not benefit from contributing upstream to other end-users. So it is desirable to only use upstream bandwidth if it does not impact other traffic.

For such a use case, I think it would be useful if the browser implemented an additional congestion control algorithm (such as TCP Reno, BBR, GCC, Ledbat, whatever) on top of the congestion control algorithm implemented in JavaScript. This would ensure that even down on the level of each individual packet, we would never be more aggressive than this additional algorithm. A similar approach is used by Ledbat which makes sure it is never more aggressive that TCP Reno.

With such a mechanism we would not need to have an RFC8084-style circuit breaker in the browser and ill-behaving WebApps will have a harder time congesting the network than if we relied on a circuit breaker. But obviously it will also rule out some use cases that a circuit breaker would allow for, so an idea could be to allow JavaScript to select between the two mechanisms.

vasilvv commented 5 years ago

Regarding letting the application to choose the congestion control algorithm, I've been thinking about it, and there are various extents to which we can go. I'll refer to them as "levels".

Level 0: the browser just uses its default congestion control algorithm that it uses for HTTP traffic. This is where the spec is currently.

Level 1: we allow the web application to switch between different algorithms. We could just export the full list of algorithms with their names, but I would prefer us to let the application to specify only the category of CC ("bulk" for Reno/CUBIC/BBR, "best-effort" for LEDBAT, "real-time" for WebRTC-like algorithms). This should be relatively simple to add to the spec, so we should just do it.

Level 2: we notify the web application about CC-level events (packet sent, acked, lost) and let the application set pacing rate and congestion window. This, of course, requires a "limiter" CC algorithm to run on top of whatever the web app runs, and designing one is a research topic (Keith points to RFC 8084, and that's a good start, though I am not sure it's enough). Designing a good API for this is also a research topic, but there's some prior work that might be helpful (e.g. this).

Level 3: instead of providing an API to set pacing rate and congestion window, we can let the web app load a WASM blob that is ran by the QUIC stack itself instead of the congestion control. This has almost native-level capabilities, but a much higher complexity and worse security properties, so I am not sure it's worth it.

keithw commented 5 years ago

@cwmos The current draft spec seems to assume that:

JavaScript is low-latency enough to be running in the critical path for putting individual datagrams on the wire one by one (the DatagramTransport readyToSendDatagram method returns a promise that will be resolved when the DatagramTransport can send an individual datagram, and then sendDatagram is specified to actually send a single datagram), and
the various congestion-control scheme implementations can efficiently signal when they are "ready" to actually send another datagram (not just ready to accept another datagram on a sender-side queue, for them to drain at their leisure), and are willing to wait around at least a little bit for JavaScript to supply the datagram in question after the promise is resolved.

If these assumptions can't practically be upheld in an implementation (you seem to also be a bit dubious about this in your comment), it seems to me that this is a bigger issue than just bandwidth prediction; the DatagramTransport interface will need to be refactored from what's there now.

keithw commented 5 years ago

@vasilvv I do want to keep asking: what is the threat model behind the "security" risks that can be cured with a "limiter" CC algorithm?

To slightly expand on what I wrote above, given that:

Most traffic, and therefore most "cheating" on congestion control, occurs in the downstream direction--which this spec so far makes zero effort to control
Most webpages already cause browsers to open tens or hundreds of TCP connections, with the kernel doing congestion-control on a per-connection basis (my guess is that the same behavior will likely happen with DatagramTransport congestion control, depending on the scope of the control, e.g., per-connection, per-origin—in my view the "scope" question matters a lot), and
Most users are one click away from downloading an Android or iPhone or normal-computer app that has free access to the operating system's datagram interface and can send whenever it wants

... What is the anti-congestion/fairness/safety-against-bad-apps property that the spec really wants the browser to enforce?

I think it might be best to first answer this question (i.e. specify the threat model and desired properties) and then work backwards to figure out what features of this new API need to be governed or limited by mandatory controls running in the browser.

Here would be my own suggestion as a straw-man along these lines: "the safety property that the browser enforces is to make sure that no matter how the page uses the WebTransport interface, each page will send outgoing traffic that is, in total across all WebTransport connections from that page, no more aggressive than four classical AIMD connections averaged over a 5-second sliding window. Downstream traffic is out of scope and is uncontrolled by the browser."

This gives some wiggle room for "type 1" apps (e.g. innovations in app-specific congestion control) because the control is done over a longish-term sliding window and does not have to match the packet-for-packet cwnd evolution of classical TCP Reno. But it also governs the behavior of the page in the aggregate to make sure that a page cannot be arbitrarily abusive or unfair by opening lots of DatagramTransports.

On the separate question of what API to support, my straw-man suggestion would be to give apps a choice between your "level 1" and "level 3."

In "level 1" ("browser-controlled") mode, the app just gets to pick from a small menu of choices—maybe just "prefer throughput" and "prefer low delay". And the guarantee would be that the app that chooses the "browser-controlled" mode will never trip the circuit breaker.
In "level 3" ("raw") mode, the app gets raw access to send a datagram whenever it wants, but whenever it violates the 5-second sliding window rule, it gets an extremely severe penalty (maybe the app gets one warning when it breaks the circuit breaker the first time, and then the second time the entire page gets killed).

alvestrand commented 5 years ago

The most "fun" things to defend aginst are channel monopolization and (D)DOS attacks. Channel monopolization will happen every time you have someone sending with a congestion control algorithm that's aggressvie enough to make TCP (and TCP cohabitants) stop sending, or send far less than a "fair share". DOS attacks can happen whenever someone is able to send a significant traffic volume towards a recipient that isn't interested in receiving it. Circuit breaker was specially tailored to be a backstop to help limit the DOS case - when you are pretty sure the recipient doesn't want to receive your traffic (no acks), you stop sending. It was standardized mostly because we didn't have a single CC algorithm we (WebRTC folks) could agree on; it's no replacement for CC.

The Web security model is intended to ensure that "nothing fatal happens if you run your enemy's Javascript". I think we can't get away from running a CC model in the browser that the user can't override or disable - in "ultimate freedom mode" (Vasili's mode 3), the user should be given free choice of what packet to send when - but the browser should refuse to put it on the network unless it fits within the CC envelope that the browser's CC has computed.

I see Vaslii's lower modes more as the browser giving more help with choosing what packet to send.

keithw commented 5 years ago

Hmm, I think we're still talking past one another. Let me try to say it a different way.

I don't think DoS attacks where the recipient isn't interested in the traffic are relevant here. All of the datagrams we're talking about controlling are being sent from origin-controlled JavaScript (let's assume this API is HTTPS-only, which it probably should be) to the origin. But the system probably should worry about fairness to competing traffic, and about DoS to network intermediaries along the way.
I don't think any comment here has advocated exposing an unrestricted datagram socket, i.e. an absence of any browser-enforced limiter on upstream-direction DatagramTransport datagrams.
But, I don't think it would be a good spec (or a good system) to say, "Each DatagramTransport object will run a congestion controller, and the readyToSendDatagram promise will only be resolved when that controller is ready to send a new datagram on that DatagramTransport."

For a few reasons:

A lot of media flows want to be fair to the typical TCP congestion controllers (e.g. Reno) on a multi-RTT timescale but not on shorter timescales. This is the point of RFC 3448/5348 TFRC as I understand it -- it's fair to TCP Reno on average, but doesn't increase or decrease as rapidly. If you force all datagram applications to back off as fast as Reno or BBR (or whatever the built-in controller is) in the presence of a single congestion signal, they're not going to be happy -- even if they would have ultimately been fair to Reno over a longer timescale.

This is why I proposed the "you get to send as much as four CC-controlled flows, averaged over a 5-second timescale" language.

We could amend this to make it a little more restrictive -- e.g., "the page gets to send, in total across all WebTransport connections, no more aggressively than 16 CC-controlled flows. This limit is imposed at all times. In addition, the average upstream traffic over a 5-second timescale must be less than 4 CC-controlled flows."

It's child's play to aggregate multiple datagram sockets together to produce a single mecha-socket, especially since they're all going to the same place. So if the CC is per DatagramTransport, it doesn't really prevent any of the DoS/fairness badness we're worried about.
I'm a bit skeptical that real congestion-control scheme implementations are going to be easily refactored to expose a promise that only gets resolved when they are literally ready to put a packet on the wire, and also willing to hang around for a bit (after resolving the promise) for some JS to execute to decide what the datagram's payload turns out to be.

I definitely love the idea of an "upcall"-based API to congestion control (this is what Mosh does -- it only calculates the payload contents once the transport is willing to send a packet), but I just don't know how practical it is when the "upcall" really is an arm's length API between the browser and some origin-controlled JavaScript and we're talking about doing stuff on short timescales. Maybe it really is practical (meaning, maybe at least CUBIC, GCC, and BBR can be refactored in this way, and there's a reasonable answer for what happens if the CC's opinion of the congestion window has changed in between when it resolved the promise and when the promise actually produced a payload), in which case, great!

gterzian commented 4 years ago

Regarding letting the application to choose the congestion control algorithm, I've been thinking about it, and there are various extents to which we can go. I'll refer to them as "levels".

In terms of API's I think you could look into the Web Audio API for possible inspiration.

"Level 1" sounds a bit like what that spec does in terms of offering various "native" audio processing capabilities to web developpers by way of various AudioNodes, which are natively implemented, but configurable by the developer.

And then "level 3" sound a bit like the AudioWorklet, which is essentially a way for the developer to provide a custom audio-node by providing a JS/Wasm blob for it.

I just don't know how practical it is when the "upcall" really is an arm's length API between the browser and some origin-controlled JavaScript and we're talking about doing stuff on short timescales.

With regards to performance and security, I think again the audio API could be quite interesting as a source of inspiration.

It seems mostly build around a separation between the "control thread", which is essentially the "webpage" from where the audio capabilities are used, but not where the actual audio processing happens.

Then there is an "rendering thread", where the actual audio processing happens, via user-configered but natively implemented nodes and/or fully programmable audio worklets.

Both the "control thread" and the "rendering thread" I believe would be running in the same low-capability "content process"(where user content runs), and communicate with a backend in another process with access to system resources. It's that backend that would, via IPC, call into the "rendering thread" at each processing interval.

So the "rendering thread" is a bit like a web worker, although it runs a specialized loop meant to be usable in a low-latency context of audio processing. The "control thread" is just running a normal HTML event-loop, and does receives some events and so on, but those are not involved in the actual audio processing, hence are not subjected to the same kind of performance requirements.

See https://webaudio.github.io/web-audio-api/#processing-model

aboba commented 3 years ago

From RTP over QUIC Section 4.1:

" Additionally, a QUIC implementation MUST expose the recorded RTT statistics as described in Section 5 of [RFC9002] to the application. These statistics include the minium observed RTT over a period of time ("min_rtt"), exponentially-weighted moving average ("smoothed_rtt") and the mean deviation ("rtt_var"). These values are necessary to perform congestion control as explained in Section 4.2."

aboba commented 2 years ago

RTP over QUIC Section 5.1 describes the statistics necessary for application congestion control (most of which are not provided in WebTransport).

jan-ivar commented 2 years ago

We have half of those (timestamp, min_rtt, smoothed_rtt, andrtt_var).

Of the remaining, pkt_arrival (for client-side sending) would presumably have to come from the server through some app back channel, right?

That leaves:

pkt_departure — I believe await writer.write(data); gives you this?
latest_rtt — should be trivial
ecn — from a search I found ECN-in-QUIC... Is that the right link? Has anyone implemented it?

Correlating departure times with arrival times from the server then seems like an app problem.

Is providing these two stats all we need to offload this problem to the app? If so, should we add them?

mengelbart commented 2 years ago

There were two reasons why we included different RTT values in the first draft. 1. If the congestion controller uses the acknowledgements and departure and arrival times only to calculate an RTT, we can directly use the RTT that is calculated by QUIC. 2. Since QUIC acknowledgements do not include an arrival timestamp (pkt_arrival), we could use the RTT to calculate an approximation of the arrival time. pkt_arrival would be more accurate and could be added to QUIC using an extension like draft-smith-quic-receive-ts or draft-huitema-quic-ts.

If the arrival time is not available in QUIC through any extension, it could also still be implemented at the application layer, but that would use more bandwidth and may be less precise if the application cannot access the exact timestamp at which packets were sent/received.

I think quic-go implements ECNs, but I don't know if the congestion controller acts on it. I don't know about other implementations.

jan-ivar commented 2 years ago

Meeting:

ecn doesn't seem like a thing to expose to JS
Where are the above measurements made? May be differences between QUIC native apps and WebTransport (browsers) here
Might get more info from the next AVTCore meeting

yutakahirano commented 2 years ago

pkt_departure — I believe await writer.write(data); gives you this?

IIUC RTP over QUIC uses datagrams, and the promise returned by writeDatagrams gives you effectively nothing.

aboba commented 2 years ago

AVTCORE WG virtual interim slides are here Minutes are here.

Comparison of existing stats against those supported by RFC 8888 and draft-engelbart-rtp-quic is here.

Summary: latest_rtt, packet_deparature, packet_arrival times missing, as well as ecn and ACK info.

jan-ivar commented 2 years ago

... the promise returned by writeDatagrams gives you effectively nothing.

@yutakahirano I've filed https://github.com/w3c/webtransport/issues/400 on this.

jan-ivar commented 2 years ago

It sounds like we want a stat for packet_departure at least (I don't see how packet_arrival of data from the server is relevant stat, unless this is about ACKs?)

Regarding where this stat goes, I have a question for the group:

IIUC RTP over QUIC uses datagrams

Are we only talking about media streaming over datagrams?

jan-ivar commented 2 years ago

Meeting:

shoot I forgot to ask this question

wilaw commented 2 years ago

Does the application require absolute packet-arrival and packet-departure times? I would think that any algorithm attempting RTP over WT would care about the delta between the two (i.e packet-transfer-time) more so than the absolute timestamps? If so, how does this differ from RTT/2 ?

alvestrand commented 2 years ago

https://www.rfc-editor.org/rfc/rfc8888.pdf gives details of a feedback format that seemed to the authors to support all the requirements of NADA, SCReAM and the Google Congestion Control algorithm as they were understood at the time. If we can get all that information, I think we have enough information.

The important thing for GCC (which I was a bit familiar with at one point) is that it tries to detect changes in transit delay that indicate queue buildup and tries to act on it before the queue is full; for that, the more information you can have about packet arrival times, the better.

aboba commented 2 years ago

While the browser has the ability to measure packet departure and arrival times, an application will have difficulty measuring this with much accuracy. It's important to not to mix in queueing delays (in WHATWG streams or the QUIC stack send/receive queues). There are proposals for QUIC timestamps, Receiver timestamps and ACK frequency that can help provide the estimates with greater accuracy and frequency.

jan-ivar commented 2 years ago

Meeting:

one issue: per-stream is about bytes, not packets
I'm sending data down a stream, how fast can I send it?
why can't UAs make stats that tell me what my estimated bandwidth are directly?
Easier to add stats for the entire connection, would go into getStats.
This issue has evolved into discussing stats like packet_departure, which is more for apps wanting to implement their own congestion control. Propose (renaming this issue and) opening a new issue for UA provided bandwidth estimation

aboba commented 1 year ago

Question: Are things that are to be used for congestion control better provided via events rather than stats? You don't want to encourage frequent polling of stats. But if the application is looking to calculate a target bitrate based on the info described in RFC 8888 or draft-ietf-avtcore-rtp-over-quic, then an Event might make more sense.

w3c / webtransport

Stats for congestion control and bandwidth estimation #21