dotnet / runtime


QUIC Datagram API #53533

Open wegylexy opened 3 years ago

wegylexy commented 3 years ago

Background and Motivation

QUIC is finally a Proposed Standard RFC, with HTTP/3 and WebTransport on the way. To prepare for WebTransport and other use cases, such as unreliable message delivery in SignalR, .NET should implement a QUIC datagram API, which MsQuic already supports, to enable higher-level APIs such as WebTransport. Until WebTransport is standardized, the datagram API can be used directly to stream real-time game state and ultra-low-latency voice data, where dropped packets should not be retransmitted. Once this is done, SignalR may also support new features.

Proposed API

namespace System.Net.Quic
{
    public class QuicConnectionOptions
    {
+        public bool DatagramReceiveEnabled { get { throw null; } set { } }
    }
+    public delegate void QuicDatagramReceivedEventHandler(QuicConnection sender, ReadOnlySpan<byte> buffer);
+    public enum QuicDatagramSendState
+    {
+        Unknown,
+        Sent,
+        LostSuspect,
+        Lost,
+        Acknowledged,
+        AcknowledgedSpurious,
+        Canceled
+    }
+    public delegate void QuicDatagramSendStateChangedHandler(QuicDatagramSendState state, bool isFinal);
+    public sealed class QuicDatagramSendOptions
+    {
+        public bool Priority { get { throw null; } set { } }
+        public QuicDatagramSendStateChangedHandler? StateChanged { get { throw null; } set { } }
+    }
    public class QuicConnection
    {
+        public bool DatagramReceivedEnabled { get { throw null; } set { } }
+        public bool DatagramSendEnabled { get { throw null; } set { } }
+        public int DatagramMaxSendLength { get { throw null; } }
+        public event QuicDatagramReceivedEventHandler? DatagramReceived { add { } remove { } }
+        public void SendDatagram(ReadOnlyMemory<byte> buffer, QuicDatagramSendOptions? options = null) { throw null; }
+        public void SendDatagram(System.Buffers.ReadOnlySequence<byte> buffers, QuicDatagramSendOptions? options = null) { throw null; }
    }
}

See https://github.com/wegylexy/quic-with-datagram for implementation (with MsQuic 1.9.0).

Usage Examples

// receive
connection.DatagramReceived += (sender, buffer) =>
{
    // Parse the readonly span synchronously, without copying all the bytes, into an async task
    MyAsyncChannel.Writer.TryWrite(MyZeroCopyHandler.HandleAsync(buffer));
};
// send
var size = Unsafe.SizeOf<MyTinyStruct>();
Debug.Assert(size <= connection.DatagramMaxSendLength);
TaskCompletionSource tcs = new();
// note: max send length may vary throughout the connection
var array = ArrayPool<byte>.Shared.Rent(size);
try
{
    MemoryMarshal.Cast<byte, MyTinyStruct>(array).SetCurrentGameState();
    // may prefix with a ReadOnlyMemory<byte> of a WebTransport session ID into a ReadOnlySequence<byte>
    connection.SendDatagram(new ReadOnlyMemory<byte>(array, 0, size), new()
    {
        StateChanged = (state, isFinal) =>
        {
            if (isFinal)
            {
                tcs.TrySetResult();
            }
            Console.WriteLine(state);
        }
    });
    await tcs.Task; // wait until it is safe to return the array back to the pool
}
catch when (size > connection.DatagramMaxSendLength)
{
    Console.Error.WriteLine("Datagram max send length reduced, sending canceled.");
}
catch (Exception ex)
{
    Console.Error.WriteLine(ex);
}
finally
{
    ArrayPool<byte>.Shared.Return(array);
}

Alternative Designs

Receiving datagram buffers with a channel (similar to the stream API) was considered, but the MsQuic datagram buffer is merely a pointer to the underlying UDP buffer, so the buffer content is only available during the event callback. An async handler would require unnecessary cloning of possibly a thousand bytes and would increase GC pressure for every single datagram received.
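
A minimal sketch of how a receive callback might respect that lifetime constraint: decode a small value synchronously and pass only the decoded value, not the bytes, to a channel. The delegate here is a stand-in for the proposed QuicDatagramReceivedEventHandler (with an `object` sender so it compiles without System.Net.Quic), and `GameState`, `DatagramParser`, and the 8-byte payload layout are hypothetical names and assumptions made for illustration.

```csharp
using System;
using System.Buffers.Binary;
using System.Threading.Channels;

// Stand-in for the proposed handler type (assumption: not the real System.Net.Quic type).
public delegate void DatagramReceivedHandler(object? sender, ReadOnlySpan<byte> buffer);

// Hypothetical 8-byte payload: two little-endian 32-bit integers.
public readonly record struct GameState(int Tick, int Score);

public static class DatagramParser
{
    public const int PayloadSize = 8;

    // Decodes synchronously; the span must not be used after this method returns.
    public static bool TryParse(ReadOnlySpan<byte> buffer, out GameState state)
    {
        if (buffer.Length < PayloadSize)
        {
            state = default;
            return false;
        }
        state = new GameState(
            BinaryPrimitives.ReadInt32LittleEndian(buffer),
            BinaryPrimitives.ReadInt32LittleEndian(buffer.Slice(4)));
        return true;
    }

    // Builds a handler that forwards parsed values (not bytes) to a channel.
    public static DatagramReceivedHandler CreateHandler(ChannelWriter<GameState> writer) =>
        (sender, buffer) =>
        {
            if (TryParse(buffer, out var state))
            {
                writer.TryWrite(state);
            }
        };
}
```

Only the 8-byte struct crosses the callback boundary, so nothing refers to the transport's buffer once the handler returns.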

Sending with a readonly span was considered for stackalloc'ed buffers, but MsQuic needs to hold on to the memory until the datagram send state becomes final.

dotnet-issue-labeler[bot] commented 3 years ago

I couldn't figure out the best area label to add to this issue. If you have write-permissions please help me learn by adding exactly one area label.

wegylexy commented 3 years ago

@karelz seems the bot has trouble adding area-System.Net.Quic

ghost commented 3 years ago

Tagging subscribers to this area: @dotnet/ncl See info in area-owners.md if you want to be subscribed.

Issue Details
## Background and Motivation

QUIC is finally a proposed standard in RFC, with HTTP/3 and WebTransport on the way. To prepare for WebTransport and other use cases, such as unreliable message delivery in SignalR, .NET should implement QUIC datagram API, as msquic already supports it, to enable higher-level APIs such as WebTransport. Until WebTransport is standardized, it may be used today to stream real-time game state and ultra low-latency voice data where dropped packets should not be retransmitted. Once this is done, SignalR may support new features.

## Proposed API

```diff
namespace System.Net.Quic
{
    public class QuicClientConnectionOptions
    {
+        public bool DatagramReceiveEnabled { get { } set { } }
    }
+    public delegate void QuicDatagramReceivedEventHandler(object sender, ReadOnlySpan<byte> buffer);
    public class QuicConnection
    {
+        public bool DatagramReceivedEnabled { get { } set { } }
+        public bool DatagramSendEnabled { get { } set { } }
+        public ushort DatagramMaxSendLength { get { } }
+        public event QuicDatagramReceivedEventHandler? DatagramReceived { add { } remove { } }
+        public ValueTask<bool> SendDatagramAsync(ReadOnlyMemory<byte> buffer, bool priority = false) { }
    }
    public class QuicListenerOptions
    {
+        public bool DatagramReceiveEnabled { get { } set { } }
    }
}
```

See https://github.com/wegylexy/runtime/pull/1 for implementation with msquic (tested with v1.3.0).

## Usage Examples

``` C#
// receive
connection.DatagramReceived += (sender, buffer) =>
{
    // Parse the readonly span synchronously, without copying all the bytes, into an async task
    MyAsyncChannel.Writer.TryWrite(MyZeroCopyHandler.HandleAsync(buffer));
};
// send
var array = ArrayPool<byte>.Shared.Rent(Unsafe.SizeOf<MyTinyStruct>());
try
{
    MemoryMarshal.Cast<byte, MyTinyStruct>(array).SetCurrentGameState();
    await connection.SendDatagramAsync(array);
}
finally
{
    ArrayPool<byte>.Shared.Return(array);
}
```

## Alternative Designs

Receiving datagram buffers with a channel (similar to the stream API) was considered, but the msquic datagram buffer is merely a pointer to the underlying UDP buffer such that the buffer content is only available during the event callback. Async handler implies unnecessary cloning of possibly a thousand bytes and increase GC pressure for every single datagram received.

Sending with a readonly span was considered for `stackalloc`ed buffer, but msquic needs to hold on to the memory beyond the scope of the method.

Sending returns a bool to indicate whether the datagram is acknowledged without spurious loss, or throws when discarded/cancelled. Additional API may be added to handle suspected loss.

## Risks

`SendDatagramAsync()` could return something better than a simple `bool` when the task completes.
Author: wegylexy
Assignees: -
Labels: `api-suggestion`, `area-System.Net.Quic`, `untriaged`
Milestone: -

ThadHouse commented 3 years ago

One thing to note about the msquic side of this API is that DatagramMaxSendLength can change dynamically over the course of a connection. It currently can't go down (although that might change in the future), but it can go up after the connection has started. Just something to note.

wegylexy commented 3 years ago

@ThadHouse If notification is needed, QuicConnection could implement INotifyPropertyChanged. Otherwise, the getter could be invoked right before filling in the buffer to determine the current max. See also line 327, where the msquic datagram-state-changed event handler updates exactly this property.

ThadHouse commented 3 years ago

I know there's a general rule not to have properties update without user interaction, which is the main reason I brought it up. Not a bug in the implementation, just a note.

wegylexy commented 3 years ago

Besides, Connected (in the current API) and DatagramSendEnabled (in the proposed API) may also update without user interaction, due to peer or transport.

karelz commented 2 years ago

Triage: We will wait for the RFC to be finalized (to the best of our knowledge it is not yet) and for customer demand. Likely not in 7.0, or perhaps only as a preview.

@nibanks do you have insights from MsQuic view point on the RFC / customer demand?

nibanks commented 2 years ago

I expect the datagram spec to be an RFC by the time 7.0 is released. I have also had a significant number of folks, both internal and external, interested in datagram support, but I don't know how many would be using .NET. IMO, I would try really hard to get datagram supported in 7.0.

wegylexy commented 2 years ago

I'm using my fork of .NET 6 with msquic 1.8 for datagram.

karelz commented 2 years ago

Unless the RFC lands by March or earlier, we won't be able to react in time, based on this year's experience.

@wegylexy would you be interested in doing a prototype of the API as mentioned above? (Or we could run it by our API review when it is reasonably close to its final shape.) We might not be ready for that right now, but in a couple of months it should be feasible from our side.

wegylexy commented 2 years ago

@karelz wegylexy/runtime/.../feature/msquic-datagram is a prototype, forked from a .NET 6 preview when System.Net.Quic was still public.

wegylexy commented 2 years ago

HTTP/3 WebTransport will need HTTP Datagram which requires QUIC Datagram.

wegylexy commented 2 years ago

Alternative design: https://github.com/StirlingLabs/MsQuic.Net/blob/main/StirlingLabs.MsQuic/QuicDatagram.cs

wegylexy commented 2 years ago

Updated to use ReadOnlySequence<byte> to enable zero-copy prefixing the datagram payload with a WebTransport session ID, which is another ReadOnlyMemory<byte> of 1-8 bytes.
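
The zero-copy prefixing described above can be sketched with a small linked segment type. This is an illustration only, not the prototype's code: `MemorySegment` and `DatagramFraming.Prefix` are hypothetical names, and the sketch relies only on the standard `ReadOnlySequenceSegment<T>` and `ReadOnlySequence<T>` types from System.Buffers.

```csharp
using System;
using System.Buffers;

// Minimal linked segment for building a multi-part ReadOnlySequence<byte>.
public sealed class MemorySegment : ReadOnlySequenceSegment<byte>
{
    public MemorySegment(ReadOnlyMemory<byte> memory) => Memory = memory;

    // Appends a segment after this one and returns the new segment.
    public MemorySegment Append(ReadOnlyMemory<byte> memory)
    {
        var next = new MemorySegment(memory)
        {
            RunningIndex = RunningIndex + Memory.Length
        };
        Next = next;
        return next;
    }
}

public static class DatagramFraming
{
    // Chains the session ID and the payload without copying either buffer.
    public static ReadOnlySequence<byte> Prefix(
        ReadOnlyMemory<byte> sessionId, ReadOnlyMemory<byte> payload)
    {
        var first = new MemorySegment(sessionId);
        var last = first.Append(payload);
        return new ReadOnlySequence<byte>(first, 0, last, last.Memory.Length);
    }
}
```

The resulting sequence could then be passed to an overload such as the proposed `SendDatagram(ReadOnlySequence<byte>, ...)`. Both underlying buffers would have to stay alive until the send state is final.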

A prototype is available at https://github.com/wegylexy/quic-with-datagram and a simple WebTransport server prototype on top of that is coming soon.

wegylexy commented 2 years ago

WebTransport server prototype: https://github.com/wegylexy/webtransport

rzikm commented 2 years ago

Since the QUIC Datagram extension (https://datatracker.ietf.org/doc/rfc9221/) has been officially released as RFC, is it worth trying to get it in for 7.0? @karelz

nibanks commented 2 years ago

Big +1 to exposing the datagram APIs in .NET. I'd love to see that happen in 7.0. Let me know if you need any help from the MsQuic side.

karelz commented 2 years ago

Given the load on the team and our capacity, I don't think we will be able to do anything meaningful in 7.0.

rzikm commented 2 years ago

Some interesting pieces from the RFC to consider during design:

Application protocols that use datagrams MUST define how they react to the absence of the max_datagram_frame_size transport parameter. If datagram support is integral to the application, the application protocol can fail the handshake if the max_datagram_frame_size transport parameter is not present.

=> We need to allow user code to decide whether the connection is aborted when Datagram support is absent

Note that while the max_datagram_frame_size transport parameter places a limit on the maximum size of DATAGRAM frames, that limit can be further reduced by the max_udp_payload_size transport parameter and the Maximum Transmission Unit (MTU) of the path between endpoints. DATAGRAM frames cannot be fragmented; therefore, application protocols need to handle cases where the maximum datagram size is limited by other factors.

This is probably what ThadHouse meant when he said that the maximum datagram send size can change during the lifetime of the connection.

QUIC implementations SHOULD present an API to applications to assign relative priorities to DATAGRAM frames with respect to each other and to QUIC streams.

So far we don't have any way to express prioritization of QuicStreams, so not sure how we would fit Datagrams in the scheme. @nibanks does MsQuic support stream/datagram prioritization?

If a sender detects that a packet containing a specific DATAGRAM frame might have been lost, the implementation MAY notify the application that it believes the datagram was lost.

Similarly, if a packet containing a DATAGRAM frame is acknowledged, the implementation MAY notify the sender application that the datagram was successfully transmitted and received. Due to reordering, this can include a DATAGRAM frame that was thought to be lost but, at a later point, was received and acknowledged.

Putting here just to note that notifying user about datagram loss is not strictly required by the RFC.

nibanks commented 2 years ago

We need to allow user code to decide whether the connection is aborted when Datagram support is absent

We have the QUIC_CONNECTION_EVENT_DATAGRAM_STATE_CHANGED notification that informs you if it's enabled or disabled.

This is probably what ThadHouse meant when he said that the maximum datagram send size can change during the lifetime of the connection.

Correct. QUIC_CONNECTION_EVENT_DATAGRAM_STATE_CHANGED also informs you of the MaxSendLength.

So far we don't have any way to express prioritization of QuicStreams, so not sure how we would fit Datagrams in the scheme. @nibanks does MsQuic support stream/datagram prioritization?

For Streams, we have QUIC_PARAM_STREAM_PRIORITY which is just a uint16_t used to prioritize across streams. Currently, datagrams are always prioritized higher than streams.

Putting here just to note that notifying user about datagram loss is not strictly required by the RFC.

We already have a notification for indicating loss, as a part of QUIC_CONNECTION_EVENT_DATAGRAM_SEND_STATE_CHANGED. It indicates:

typedef enum QUIC_DATAGRAM_SEND_STATE {
    QUIC_DATAGRAM_SEND_UNKNOWN,                         // Not yet sent.
    QUIC_DATAGRAM_SEND_SENT,                            // Sent and awaiting acknowledgment
    QUIC_DATAGRAM_SEND_LOST_SUSPECT,                    // Suspected as lost, but still tracked
    QUIC_DATAGRAM_SEND_LOST_DISCARDED,                  // Lost and no longer being tracked
    QUIC_DATAGRAM_SEND_ACKNOWLEDGED,                    // Acknowledged
    QUIC_DATAGRAM_SEND_ACKNOWLEDGED_SPURIOUS,           // Acknowledged after being suspected lost
    QUIC_DATAGRAM_SEND_CANCELED,                        // Canceled before send
} QUIC_DATAGRAM_SEND_STATE;
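
A managed mirror of these states needs to know which ones are terminal, since a send buffer can only be released once the state is final. A minimal sketch, assuming that every state from LostDiscarded onward is final (this appears to match the QUIC_DATAGRAM_SEND_STATE_IS_FINAL macro in msquic.h, but treat it as an assumption). The enum and extension method names are hypothetical.

```csharp
// Hypothetical managed mirror of QUIC_DATAGRAM_SEND_STATE, in the same order.
public enum DatagramSendState
{
    Unknown,              // Not yet sent
    Sent,                 // Sent and awaiting acknowledgment
    LostSuspect,          // Suspected as lost, but still tracked
    LostDiscarded,        // Lost and no longer being tracked
    Acknowledged,         // Acknowledged
    AcknowledgedSpurious, // Acknowledged after being suspected lost
    Canceled              // Canceled before send
}

public static class DatagramSendStateExtensions
{
    // Assumption: states at or after LostDiscarded will not change again.
    public static bool IsFinal(this DatagramSendState state) =>
        state >= DatagramSendState.LostDiscarded;
}
```

Under this assumption, LostSuspect is not final, because a suspected loss can still be followed by AcknowledgedSpurious.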

wegylexy commented 2 years ago

We need to allow user code to decide whether the connection is aborted when Datagram support is absent

https://github.com/wegylexy/runtime/blob/6abc64405576ba8b740ea1ca7ec9109b956b9455/src/libraries/System.Net.Quic/src/System/Net/Quic/Implementations/MsQuic/MsQuicConnection.cs#L949-L959

This is probably what ThadHouse meant when he said that the maximum datagram send size can change during the lifetime of the connection.

https://github.com/wegylexy/runtime/blob/6abc64405576ba8b740ea1ca7ec9109b956b9455/src/libraries/System.Net.Quic/src/System/Net/Quic/Implementations/MsQuic/MsQuicConnection.cs#L512-L516

does MsQuic support stream/datagram prioritization?

https://github.com/wegylexy/runtime/blob/6abc64405576ba8b740ea1ca7ec9109b956b9455/src/libraries/System.Net.Quic/src/System/Net/Quic/Implementations/MsQuic/MsQuicConnection.cs#L969-L986

notifying user about datagram loss

https://github.com/wegylexy/runtime/blob/6abc64405576ba8b740ea1ca7ec9109b956b9455/src/libraries/System.Net.Quic/src/System/Net/Quic/Implementations/MsQuic/MsQuicConnection.cs#L997

wegylexy commented 1 year ago

Rebased my sample implementation onto .NET 7.0 https://github.com/wegylexy/runtime/commit/d4abf8e8fa0d09520d3d5651c41582c585673248

alexrp commented 1 year ago

+        public System.Threading.Tasks.Task<QuicDatagramSendingResult> SendDatagramAsync(ReadOnlyMemory<byte> buffer, bool priority = false) { throw null; }
+        public System.Threading.Tasks.Task<QuicDatagramSendingResult> SendDatagramAsync(System.Buffers.ReadOnlySequence<byte> buffers, bool priority = false) { throw null; }

The allocations implied by these APIs seem likely to be unacceptable to a lot of potential users. The QuicDatagramSendingResult class instance (and the tasks on it) should probably only be allocated if the user opts into doing so for a given call.

wegylexy commented 1 year ago

How about this?

public class DatagramSendOptions
{
    bool Priority { get { throw null; } set { } }
    Action? Sent { get { throw null; } set { } }
    Action? LostSuspect { get { throw null; } set { } }
    Action? Lost { get { throw null; } set { } }
    Action? Acknowledged { get { throw null; } set { } }
}

public Task SendDatagramAsync(ReadOnlyMemory<byte> buffer, DatagramSendOptions options = null);
public Task SendDatagramAsync(ReadOnlySequence<byte> buffer, DatagramSendOptions options = null);

When the task completes, it will be safe to release the buffer.

wegylexy commented 1 year ago

Edited proposed API just now to eliminate Task allocations for sending a datagram. See https://github.com/dotnet/runtime/compare/main...wegylexy:runtime:feature/msquic-datagram#diff-a9b6f4bb4f623894b968d15166375e929e14716f6910febcadd9a13e20a15b46R653-R740

safetonet-github commented 1 year ago

Can anyone review the above for integration and give any feedback? I have significant interest in this and am willing to help. It looks like @wegylexy has done significant work and presently this appears to be on a "sometime in the future" milestone. Seems like an easy integration with high value for next release.

rzikm commented 1 year ago

I don't speak for the whole networking team, but I am interested in this feature as well. We really wanted to take this in 8.0, but we were a bit short-staffed and had to cut some features. We will probably have enough bandwidth to work on this in 9.0 timeframe.

rzikm commented 11 months ago

Related: https://github.com/microsoft/msquic/issues/3906