private-octopus / picoquic

Minimal implementation of the QUIC protocol
MIT License

First cut at describing a new path API #1482

Closed huitema closed 1 year ago

huitema commented 1 year ago

We want to develop a new set of path-related APIs, per issues #1459 and #1412.

The first step here is to propose changes in "picoquic.h" describing the new API. The next steps will be to implement these changes.

alexrabi commented 1 year ago

I'm going to go ahead and butt into the conversation to give my two cents (since I am currently looking at similar problems with picoquic).

In my use-case, I am primarily focusing on using the datagram API, and I am interested in doing per-datagram scheduling onto the different paths (i.e. using picoquic_callback_prepare_datagram to decide which datagram should be scheduled onto which path, if any at all). For this to work, the application would have to be aware of which path is considered for sending whenever picoquic_callback_prepare_datagram is called. I therefore suggest that the datagram callbacks also use the "stream_id" field to carry the "unique_path_id" similarly to the newly suggested callbacks, such that the application may make such decisions.
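To make that concrete, here is a minimal sketch of what such a callback handler could look like, assuming the proposed convention that `stream_id` carries the `unique_path_id` for datagram events. `app_pick_datagram_for_path` is a hypothetical application-side scheduler, not part of picoquic:

```c
#include <string.h>
#include "picoquic.h"

/* Hypothetical application-side scheduler: picks (or declines) a datagram
 * for the path under consideration. */
extern int app_pick_datagram_for_path(uint64_t unique_path_id,
    uint8_t* payload, size_t max_length, size_t* datagram_length);

int app_callback(picoquic_cnx_t* cnx, uint64_t stream_id,
    uint8_t* bytes, size_t length, picoquic_call_back_event_t fin_or_event,
    void* callback_ctx, void* v_stream_ctx)
{
    if (fin_or_event == picoquic_callback_prepare_datagram) {
        /* Proposed convention: stream_id carries the unique_path_id of
         * the path being considered for sending. */
        uint64_t unique_path_id = stream_id;
        uint8_t payload[1536];
        size_t max_length = (length < sizeof(payload)) ? length : sizeof(payload);
        size_t datagram_length = 0;

        if (app_pick_datagram_for_path(unique_path_id, payload, max_length,
            &datagram_length)) {
            /* For this event, "bytes" is an opaque context from which
             * the send buffer is obtained. */
            uint8_t* buffer = picoquic_provide_datagram_buffer(bytes, datagram_length);
            if (buffer != NULL) {
                memcpy(buffer, payload, datagram_length);
            }
        }
        return 0;
    }
    /* ... handle the other events ... */
    return 0;
}
```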

In terms of path quality information, I think that the proposed picoquic_path_quality_t structure is a good start. However, from my experience in scheduling for MPTCP and MP-DCCP (including an early MP-QUIC prototype), the information contained within this structure is currently far too coarse-grained for use cases such as mine. Ideally, as much information from the congestion control state as possible should be exposed (e.g. bytes_in_transit, rtt_min, rtt_variant, and more).

huitema commented 1 year ago

@alexrabi Good point about the datagram callback. Providing the path identifier in the callback is a very good idea.

There is a related point concerning the scheduling of the callback. The API function picoquic_mark_datagram_ready states that the application is ready to send datagrams, presumably on any path. Should we add a path identifier there too?

huitema commented 1 year ago

@alexrabi I wonder about the granularity issue. I see the point about exposing as much quality information as the congestion control protocols maintain, but that information is the result of decisions by the congestion control protocols, which set the pacing rate and the congestion window.

Of course, that information is available if the application opens the "internal" header, but the internal information will change over time as the implementation evolves. The public header "picoquic.h" defines the information that will remain stable, so the application does not need to be updated each time the internal implementation changes. Thus, a "quality" API needs to only expose information that is well defined and unlikely to change with the implementation. Consider for example issue #1481 about RTT spikes on Wi-Fi paths. Part of the issue there is that the RTTVAR is computed using a low-pass filter with a relatively short period (beta set to 1/4), which causes information about past spikes to be quickly forgotten. The fix will probably add an extra variable, maybe RTTMAX, so that events that happened long ago can be remembered. So, there will be changes.
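For reference, the smoothing in question follows the RFC 6298 pattern; with beta = 1/4 it reduces to the following, which shows why a single spike fades after just a few samples:

```c
#include <stdint.h>

/* RTTVAR update per RFC 6298, with beta = 1/4:
 * rttvar = (1 - beta) * rttvar + beta * |srtt - sample| */
uint64_t update_rtt_variant(uint64_t rtt_variant, uint64_t smoothed_rtt,
    uint64_t rtt_sample)
{
    uint64_t deviation = (rtt_sample > smoothed_rtt) ?
        rtt_sample - smoothed_rtt : smoothed_rtt - rtt_sample;
    /* Each new sample replaces a quarter of the old variance estimate,
     * so the effect of a past spike is mostly gone after a handful of
     * acknowledgements. */
    return (3 * rtt_variant + deviation) / 4;
}
```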

We could certainly add rtt_min, rtt_variant and bytes_in_transit to the API, because those are well defined. And that's probably enough to make application-level scheduling decisions.
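As a sketch, the extended public structure might then look something like this; only the fields discussed in this thread are shown, and the exact names and layout are assumptions, not the final picoquic.h definition:

```c
#include <stdint.h>

/* Sketch of an extended path quality structure for picoquic.h.
 * Field names and layout are assumptions based on this discussion. */
typedef struct st_picoquic_path_quality_t {
    uint64_t pacing_rate;       /* bytes per second, set by congestion control */
    uint64_t cwin;              /* congestion window, in bytes */
    uint64_t rtt;               /* smoothed RTT, in microseconds */
    uint64_t rtt_min;           /* minimum RTT observed, in microseconds */
    uint64_t rtt_variant;       /* RTT variation (RTTVAR), in microseconds */
    uint64_t bytes_in_transit;  /* bytes sent but not yet acknowledged */
} picoquic_path_quality_t;
```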

huitema commented 1 year ago

I am contemplating a different change. I would rather not have the "quality" callback change every time some information changes, because that could happen on every acknowledgement or maybe on every packet. To issue the callback less often, the stack needs to remember the values that it previously announced, and decide when there was a "significant" change. That means pretty much keeping a copy of the "quality" information in the path context. If there is such a copy, then it could be passed along the callback, say by repurposing the "bytes" and "length" parameters to provide a pointer to the up to date "quality" structure for that path, and its length. Would that be a good idea?

Obvious answer: no, it would not. Passing pointers to components of internal structures is a recipe for all kinds of trouble. Let's not do that.
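Keeping a private copy for change detection still seems worthwhile, though, even without exposing a pointer to it. An illustrative "significant change" filter, using the structure sketched above and arbitrary thresholds:

```c
#include <stdint.h>

/* Illustrative filter: signal a quality change only when the pacing rate
 * or the smoothed RTT moved noticeably since the last announced copy.
 * The thresholds are arbitrary, for illustration only. */
static int quality_changed_significantly(const picoquic_path_quality_t* last,
    const picoquic_path_quality_t* current)
{
    /* Pacing rate moved by more than 25%? */
    uint64_t rate_delta = (current->pacing_rate > last->pacing_rate) ?
        current->pacing_rate - last->pacing_rate :
        last->pacing_rate - current->pacing_rate;
    if (4 * rate_delta > last->pacing_rate) {
        return 1;
    }
    /* Smoothed RTT moved by more than one RTTVAR? */
    uint64_t rtt_delta = (current->rtt > last->rtt) ?
        current->rtt - last->rtt : last->rtt - current->rtt;
    return (rtt_delta > last->rtt_variant) ? 1 : 0;
}
```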

alexrabi commented 1 year ago

@huitema

> There is a related point concerning the scheduling of the callback. The API function picoquic_mark_datagram_ready states that the application is ready to send datagrams, presumably on any path. Should we add a path identifier there too?

A call to specify on which paths datagrams are ready to be sent sounds like a great idea; there could be many reasons why an application may prefer to wait for another path to become available rather than send immediately. Currently there is no way to signal this from the application. This may need to be coupled with some change in picoquic_select_next_path_mp such that a path is "penalized" in the path selection when it is application-limited, so that the path is not selected over and over again.
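A hedged sketch of what a path-scoped ready mark could look like; the function name and signature below are hypothetical, for discussion only:

```c
#include "picoquic.h"

/* Hypothetical path-scoped variant of picoquic_mark_datagram_ready();
 * name and signature are assumptions, not an actual picoquic API. */
int picoquic_mark_datagram_ready_path(picoquic_cnx_t* cnx,
    uint64_t unique_path_id, int is_ready);

/* Example: offer datagrams only on the low-latency path, and let the
 * other path idle rather than being selected over and over again. */
void app_set_datagram_path_preference(picoquic_cnx_t* cnx,
    uint64_t low_latency_path_id, uint64_t backup_path_id)
{
    (void)picoquic_mark_datagram_ready_path(cnx, low_latency_path_id, 1);
    (void)picoquic_mark_datagram_ready_path(cnx, backup_path_id, 0);
}
```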

alexrabi commented 1 year ago

@huitema I see your point regarding the stability of information in the public header. However, if it is expected for datagrams to be scheduled on paths at the application level, then non-stable, congestion control algorithm specific information can be very useful. For example, BBR can provide some useful information that can be used to keep the latency low, which other algorithms like New Reno and Cubic do not. Perhaps there could be some "advanced" API function (non-callback related, keep that stable and generic) that the application may call in order to fetch such information?
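Something along these lines, perhaps; the type and function below are hypothetical sketches of such an "advanced" query, not an existing picoquic API:

```c
#include "picoquic.h"

/* Hypothetical algorithm-specific state report; names are assumptions. */
typedef struct st_picoquic_cc_state_t {
    char const* algorithm_name;  /* e.g. "bbr", "newreno", "cubic" */
    uint64_t bandwidth_estimate; /* bytes per second, if tracked */
    uint64_t cc_state;           /* algorithm-specific state code */
} picoquic_cc_state_t;

/* Polled by the application when needed, so the callback API can stay
 * stable and generic. Returns 0 on success. */
int picoquic_get_path_cc_state(picoquic_cnx_t* cnx,
    uint64_t unique_path_id, picoquic_cc_state_t* cc_state);
```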

huitema commented 1 year ago

@alexrabi Your point about scheduling and path preferences is well taken. Affinity means the application states that it can send data on a specific path -- whether through datagrams or through streams does not matter. If the stack then goes on waking up the other paths, we have a problem.

That means actually implementing path affinity implies reworking the scheduler. Currently, the scheduler maintains a global "next wake time". If we want to do path affinity properly, we want to manage a wake time per path. But it is complicated, because some data can be sent on any path. Complicated means "will take longer than initially expected." But that probably means we still need to do it in this PR, because the goal of the signalling API is to enable path affinity.
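In outline, the per-path computation could look like this; the names and the bookkeeping are illustrative, not the actual picoquic scheduler:

```c
#include <stdint.h>
#include <stddef.h>

/* Illustrative rule: each path keeps its own wake time, reflecting the
 * earliest instant at which it has something it is allowed to send
 * (data with affinity to it, or data with no affinity at all). The
 * connection then wakes at the minimum over all paths, instead of
 * keeping a single global "next wake time". */
uint64_t compute_next_wake_time(const uint64_t path_wake_time[], size_t nb_paths)
{
    uint64_t wake = UINT64_MAX;
    for (size_t i = 0; i < nb_paths; i++) {
        if (path_wake_time[i] < wake) {
            wake = path_wake_time[i];
        }
    }
    return wake;
}
```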

huitema commented 1 year ago

There is a little dilemma regarding how to handle streams that have no affinity specified. Suppose that we have two streams, S1 and S2, and two paths, P1 and P2. The application sets the affinity of S1 to any, and the affinity of S2 to P2. I think that means:

In the round-robin case, the stream S1 will be put back to the end of the queue whenever packets are sent on P1, so in practice S2 will get more of P2's capacity than S1.

I think that makes sense, but I would like a second opinion.
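A minimal sketch of that round-robin rule, with all types and names illustrative rather than picoquic internals:

```c
#include <stdint.h>
#include <stddef.h>

#define ANY_PATH UINT64_MAX  /* illustrative sentinel for "no affinity" */

/* Minimal stand-in for a stream queue entry; names are illustrative. */
typedef struct st_app_stream_t {
    struct st_app_stream_t* next;
    uint64_t affinity_path_id;   /* ANY_PATH or a specific unique_path_id */
} app_stream_t;

/* Round-robin pick for a given path: take the first eligible stream and
 * move it to the back of the queue. A stream with affinity "any" thus
 * cedes its turn whenever any path sends, while a pinned stream only
 * cycles when its own path sends. */
app_stream_t* pick_stream_for_path(app_stream_t** head, uint64_t path_id)
{
    app_stream_t** pp = head;
    while (*pp != NULL) {
        app_stream_t* s = *pp;
        if (s->affinity_path_id == ANY_PATH || s->affinity_path_id == path_id) {
            /* Unlink, then append at the tail. */
            *pp = s->next;
            s->next = NULL;
            app_stream_t** tail = head;
            while (*tail != NULL) {
                tail = &(*tail)->next;
            }
            *tail = s;
            return s;
        }
        pp = &s->next;
    }
    return NULL;
}
```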

huitema commented 1 year ago

Supporting stream affinity seems simpler than I thought. There is more testing to do, such as seeing what happens if there are two parallel streams with affinity to two paths, or if the affinity path breaks before the stream is done, but it looks simple enough.
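For illustration, a minimal usage sketch under the assumption of a per-stream affinity setter along the lines discussed here; the exact name and signature are assumptions:

```c
#include "picoquic.h"

/* Pin stream S2 to path P2, leaving S1 with the default "any path"
 * affinity. The function name and signature are assumptions based on
 * this discussion, not a confirmed API. */
int app_bind_stream_to_path(picoquic_cnx_t* cnx,
    uint64_t s2_stream_id, uint64_t p2_unique_path_id)
{
    return picoquic_set_stream_path_affinity(cnx, s2_stream_id, p2_unique_path_id);
}
```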

I need to complete these tests, and add something similar for datagrams. Bit of design work to do there, plus a fair amount of new test code.

huitema commented 1 year ago

And with that, we now have path affinity support for datagrams as well!