dotnet / runtime

.NET is a cross-platform runtime for cloud, mobile, desktop, and IoT apps.
https://docs.microsoft.com/dotnet/core/
MIT License
15.45k stars 4.76k forks source link

[API Proposal]: Simple, modern TCP APIs #63162

Open geoffkizer opened 2 years ago

geoffkizer commented 2 years ago

Background and motivation

When writing a TCP client or server, you have a choice of two APIs today: Socket or TcpClient/TcpListener.

You can use the Socket class directly. However, Socket is a low-level, general-purpose API that isn’t specific to TCP – it supports UDP, Unix Domain Sockets, raw sockets, and arbitrary socket types. Using it requires understanding concepts like AddressFamily, SocketType, ProtocolType, dual mode sockets, the EndPoint abstract base class and derived IPEndPoint class, how and when to use Bind, which Socket APIs work for TCP (Send/Receive) vs disconnected UDP (SendTo, ReceiveFrom), etc. If you are already familiar with Sockets, this is not a big deal – though note I’ve seen us mess some of this up in our own code, e.g. not enabling dual mode sockets properly because we used the wrong Socket constructor. If you are not already familiar with Sockets, and just want to write some basic TCP client or server code, then understanding Sockets is an unnecessary barrier to entry.

Alternatively, you can use TcpClient and TcpListener, which provide a higher-level, TCP-specific API. These simplify creating and accepting TCP connections and provide some convenience APIs for common TCP tasks like setting NoDelay, controlling LingerState, or specifying a local IP and port to use when connecting.

Unfortunately, TcpClient is an awful API.

TcpClient is an old-style “create-set-use” API. You create an instance, configure it, and then call Connect[Async] to actually establish the connection. You then retrieve the associated NetworkStream using GetStream().

The overall TCP connection functionality is split between NetworkStream and TcpClient, with some (but not all) functionality in both places. Want to read or write? Use NetworkStream. Want to set NoDelay, or configure the send and receive buffer sizes? Use TcpClient. Want to set a read or write timeout? You can use either.

Want to close the connection? You can use either. Both NetworkStream and TcpClient implement IDisposable and also have a Close method. Either one will dispose both the NetworkStream and the TcpClient, but this is not at all obvious – and in fact the docs for GetStream() get this wrong; see #63154.

Even worse, TcpClient is finalizable even though its finalizer simply calls Dispose(false), which, unless you override it, does nothing. Socket itself is finalizable and so classes that use it, like TcpClient and NetworkStream, don’t need to be finalizable themselves.

Even worse, some basic TCP functionality is missing from both TcpClient and NetworkStream. There is no way to shutdown the connection, no access to local or remote endpoints, no way to configure TCP keep-alive.

On top of that, some of the TcpClient APIs are just confusing. The property you use to access the underlying socket is called “Client”… why? Why not “Socket”?? One constructor takes a hostname and port and performs a (synchronous) Connect for you; another takes an IPEndPoint, but instead of performing the Connect, it uses this as the local endpoint for the connection.

Even more confusing, TcpClient is also used in TCP server scenarios. TcpListener has AcceptTcpClient[Async] methods that return a TcpClient instance to represent the accepted connection. This allows you to configure NoDelay and get the associated NetworkStream. Or even access the underlying socket using the Client property, even though you’re a server… gahhhhhhh.

We shouldn’t have two types that each incompletely represent a TCP connection. We should have a single type that represents a TCP connection and allows you to perform all common TCP connection operations. And we should have simple APIs that return this type for TCP client scenarios (Connect) and TCP server scenarios (Listen/Accept).

API Proposal

(Note this is a general sketch and is not intended to include all potential overloads, optional params, etc.)

namespace System.Net.Sockets
{
    // New class
    public sealed class TcpConnection : NetworkStream
    {
        // Create from existing Socket. Socket must be connected. TcpConnection takes ownership.
        public TcpConnection(Socket socket);

        public IPAddress LocalAddress { get; }
        public int LocalPort { get; }
        public IPAddress RemoteAddress { get; }
        public int RemotePort { get; }

        public bool NoDelay { get; set; }

        public int SendBufferSize { get; set; }
        public int ReceiveBufferSize { get; set; }

        public void Shutdown(SocketShutdown how);

        public Socket Socket { get; }

        // Read[Async], Write[Async], and Close are inherited from NetworkStream
        // Other possible additions, now or in the future: TCP keep-alive, LingerState
    }

    // New class
    public static class Tcp
    {
        public static TcpConnection Connect(IPAddress address, int port);
        public static ValueTask<TcpConnection> ConnectAsync(IPAddress address, int port, CancellationToken cancellationToken = default);
        public static TcpConnection Connect(string hostname, int port);
        public static ValueTask<TcpConnection> ConnectAsync(string hostname, int port, CancellationToken cancellationToken = default);

        // Note, the returned TcpListener is already started
        public static TcpListener Listen(IPAddress address, int port, int backlog = 100);
    }

    // Existing class
    public class TcpListener
    {
        public TcpConnection AcceptConnection();
        public ValueTask<TcpConnection> AcceptConnectionAsync(CancellationToken cancellationToken = default);
    }    
}

The following are obsoleted: (1) TcpClient class (2) TcpListener.AcceptTcpClient[Async], Start(), existing constructors, etc.

API Usage

Client example:

    TcpConnection connection = await Tcp.ConnectAsync("www.microsoft.com", 80);
    Console.WriteLine($"Established connection from {connection.LocalAddress}:{connection.LocalPort} to {connection.RemoteAddress}:{connection.RemotePort}");

    // Do something with the connection

Server example:

    TcpListener listener = Tcp.Listen(IPAddress.Any, 80);
    Console.WriteLine($"Server listening on {listener.ListenAddress()}:{listener.ListenPort()}");

    while (true)
    {
        TcpConnection connection = await listener.AcceptConnectionAsync();
        Console.WriteLine($"Accepted connection from {connection.RemoteAddress}:{connection.RemotePort} to {connection.LocalAddress}:{connection.RemotePort}");

        // Do something with the connection
    }

Example for both client and server:

    using (TcpConnection connection = ...)
    {
        // Configure TCP connection before we use it
        connection.NoDelay = true;
        connection.ReceiveBufferSize = MyReceiveBufferSize;
        connection.SendBufferSize = MySendBufferSize;

        // Perform reads and writes here

        // Half-close the connection
        connection.Shutdown(SocketShutdown.Send);

        // Read any remaining data from peer
    }

Alternative Designs

Some of the methods/properties defined on TcpConnection above may make more sense on NetworkStream, since they are not specific to TCP. E.g. Send/ReceiveBufferSize. Since TcpConnection derives from NetworkStream, these will be available to users of TcpConnection either way.

Another alternative is to just obsolete TcpClient and TcpListener entirely and tell users to always use Sockets. I don't think this is ideal; I think there's value in having simple APIs specifically for TCP. But if we don't think there is, then let's please obsolete the existing awful APIs and point people in the right direction.

Related

We should consider similar API updates for SSL, UDP, and Unix Domain Sockets.

See also these related TcpListener issues: #63114, #63115, #63117

ghost commented 2 years ago

Tagging subscribers to this area: @dotnet/ncl See info in area-owners.md if you want to be subscribed.

Issue Details
### Background and motivation When writing a TCP client or server, you have a choice of two APIs today: Socket or TcpClient/TcpListener. You can use the Socket class directly. However, Socket is a low-level, general-purpose API that isn’t specific to TCP – it supports UDP, Unix Domain Sockets, raw sockets, and arbitrary socket types. Using it requires understanding concepts like AddressFamily, SocketType, ProtocolType, dual mode sockets, the EndPoint abstract base class and derived IPEndPoint class, how and when to use Bind, which Socket APIs work for TCP (Send/Receive) vs disconnected UDP (SendTo, ReceiveFrom), etc. If you are already familiar with Sockets, this is not a big deal – though note I’ve seen us mess some of this up in our own code, e.g. not enabling dual mode sockets properly because we used the wrong Socket constructor. If you are not already familiar with Sockets, and just want to write some basic TCP client or server code, then understanding Sockets is an unnecessary barrier to making progress. Alternatively, you can use TcpClient and TcpListener, which provide a higher-level, TCP-specific API. These simplify creating and accepting TCP connections and provide some convenience APIs for common TCP tasks like setting NoDelay, controlling LingerState, or specifying a local IP and port to use when connecting. Unfortunately, TcpClient is an awful API. TcpClient is an old-style “create-set-use” API. You create an instance, configure it, and then call Connect[Async] to actually establish the connection. You then retrieve the associated NetworkStream using GetStream(). The overall TCP connection functionality is split between NetworkStream and TcpClient, with some (but not all) functionality in both places. Want to read or write? Use NetworkStream. Want to set NoDelay, or configure the send and receive buffer sizes? Use TcpClient. Want to set a read or write timeout? You can use either. Want to close the connection? You can use either. Both NetworkStream and TcpClient implement IDisposable and also have a Close method. Either one will dispose both the NetworkStream and the TcpClient, but this is not at all obvious – and in fact the **docs for GetStream() get this wrong**; see #63154. Even worse, TcpClient is finalizable even though its finalizer simply calls Dispose(false), which, unless you override it, does nothing. Socket itself is finalizable and so classes that use it, like TcpClient and NetworkStream, don’t need to be finalizable themselves. Even **worse**, some basic TCP functionality is missing from both TcpClient and NetworkStream. There is no way to shutdown the connection, no access to local or remote endpoints, no way to configure TCP keep-alive. On top of that, some of the TcpClient APIs are just confusing. The property you use to access the underlying socket is called “Client”… why? Why not “Socket”?? One constructor takes a hostname and port and performs a (synchronous) Connect for you; another takes an IPEndPoint, but instead of performing the Connect, it uses this as the local endpoint for the connection. Even **more** confusing, TcpClient is also used in TCP server scenarios. TcpListener has AcceptTcpClient[Async] methods that return a TcpClient instance to represent the accepted connection. This allows you to configure NoDelay and get the associated NetworkStream. Or even access the underlying socket using the Client property, even though you’re a server… gahhhhhhh. We shouldn’t have two types that each incompletely represent a TCP connection. We should have a single type that represents a TCP connection and allows you to perform all common TCP connection operations. And we should have simple APIs that return this type for TCP client scenarios (Connect) and TCP server scenarios (Listen/Accept). ### API Proposal (Note this is a general sketch and is not intended to include all potential overloads, optional params, etc.) ```C# namespace System.Net.Sockets { // New class public sealed class TcpConnection : NetworkStream { // Create from existing Socket. Socket must be connected. TcpConnection takes ownership. public TcpConnection(Socket socket); public IPAddress LocalAddress { get; } public int LocalPort { get; } public IPAddress RemoteAddress { get; } public int RemotePort { get; } public bool NoDelay { get; set; } public int SendBufferSize { get; set; } public int ReceiveBufferSize { get; set; } public void Shutdown(SocketShutdown how); public Socket Socket { get; } // Read[Async], Write[Async], and Close are inherited from NetworkStream // Other possible additions, now or in the future: TCP keep-alive, LingerState } // New class public static class Tcp { public static TcpConnection Connect(IPAddress address, int port); public static ValueTask ConnectAsync(IPAddress address, int port, CancellationToken cancellationToken = default); public static TcpConnection Connect(string hostname, int port); public static ValueTask ConnectAsync(string hostname, int port, CancellationToken cancellationToken = default); // Note, the returned TcpListener is already started public static TcpListener Listen(IPAddress address, int port, int backlog = 100); } // Existing class public class TcpListener { public TcpConnection AcceptConnection(); public ValueTask AcceptConnectionAsync(CancellationToken cancellationToken = default); } } ``` The following are obsoleted: (1) TcpClient class (2) TcpListener.AcceptTcpClient[Async], Start(), existing constructors, etc. ### API Usage Client example: ```C# TcpConnection connection = await Tcp.ConnectAsync("www.microsoft.com", 80); Console.WriteLine($"Established connection from {connection.LocalIPAddress}:{connection.LocalPort} to {connection.RemoteIPAddress}:{connection.RemotePort}"); // Do something with the connection ``` Server example: ```C# TcpListener listener = Tcp.Listen(IPAddress.Any, 80); Console.WriteLine($"Server listening on {listener.ListenIPAddress()}:{listener.ListenPort()}"); while (true) { TcpConnection connection = await listener.AcceptTcpConnectionAsync(); Console.WriteLine($"Accepted connection from {connection.RemoteIPAddress}:{connection.RemotePort} to {connection.LocalIPAddress}:{connection.RemotePort}"); // Do something with the connection } ``` Example for both client and server: ```C# using (TcpConnection connection = ...) { // Configure TCP connection before we use it connection.NoDelay = true; connection.ReceiveBufferSize = MyReceiveBufferSize; connection.SendBufferSize = MySendBufferSize; // Perform reads and writes here // Half-close the connection connection.Shutdown(SocketShutdown.Send); // Read any remaining data from peer } ``` ### Alternative Designs Some of the methods/properties defined on TcpConnection above may make more sense on NetworkStream, since they are not specific to TCP. E.g. Send/ReceiveBufferSize. Since TcpConnection derives from NetworkStream, these will be available to users of TcpConnection either way. ### Risks _No response_
Author: geoffkizer
Assignees: -
Labels: `api-suggestion`, `area-System.Net.Sockets`
Milestone: -
huoyaoyuan commented 2 years ago

NetworkStream is not hard to use in simple scenarios. I'd expect more support for Udp.

filipnavara commented 2 years ago

Side note: I think you should look into what Apple did with the Network.framework APIs. They aimed to simplify similar scenarios but also have a concept of layering to implement things like TLS or application-level protocols. The main reason I am suggesting that though is that on newer Apple platforms the Network.framework is the only supported API [bundled with the system] for establishing TLS (over UDP/TCP) connections. The current SslStream API cannot easily be implemented using the Network.framework APIs because it allows arbitrary underlying transport and the Apple API does not. There are workarounds for client-side scenarios (albeit crude and inefficient, such as creating dummy domain socket and then layering a custom transport over it to redirect the raw TLS stream into appropriate API layer). However, there are no workarounds for server-side scenarios. It would be nice if any new abstraction of TCP connections was built with the restrictions in mind to allow implementation of both TCP and TLS-over-TCP through the Apple APIs.

gfoidl commented 2 years ago

public static class Tcp

Instead of a static class should this be something like public class TcpBuilder : ITcpBuilder? (naming may be better, of course). Motivation against the static class is testability of the components that will use this type.

svick commented 2 years ago
public sealed class TcpConnection : NetworkStream

Note that this violates Framework Design Guidelines, which say:

Base Type Derived/Implementing Type Guideline
System.IO.Stream ✔️ DO add the suffix "Stream."
gfoidl commented 2 years ago

Edit: please ignore this comment

~NetworkStream already exists in the framework for a long time -- so it's too late to change this. Or did you have something different on focus here?~

alexrp commented 2 years ago

NetworkStream already exists in the framework for a long time -- so it's too late to change this. Or did you have something different on focus here?

NetworkStream is fine. The point is that, to comply with the guidelines, this proposal should be TcpConnectionStream, TcpStream, or something along those lines, not TcpConnection.

(Though I tend to agree with @filipnavara that extending NetworkStream is fundamentally not the way to go for a "modern" TCP API.)

FiniteReality commented 2 years ago

IMHO if a "modern" TCP API was added, it should operate similarly to the types in the System.IO.Pipelines namespace. That is, the read/write endpoints should be separate by default, and a "combined" interface (using IDuplexPipe) can be passed around for those needing both read and write access. This "combined" interface could have an API for retrieving a Stream for consumers needing legacy stream-based read-write access (e.g. SSL) but it should default to keeping them separate using the newer, more efficient primitives.

zlatanov commented 2 years ago

@geoffkizer doesn't https://github.com/dotnet/runtime/issues/1793 already cover this?

filipnavara commented 2 years ago

At first sight #1793 is much more like the Apple API model.

antonfirsov commented 2 years ago

Though I mostly agree with the proposal, I think we should investigate synergies with #1793, and potentially merge the two.

If - for some reason - we decide to implement Connection Abstractions for let's say .NET 8, we'll end up having two sets of very similar API's for TCP connection creation, which will generate some confusion.

Alternatively we may decide that we don't want #1793 to be part of the BCL and close it.

scalablecory commented 2 years ago

Can we see some side-by-side examples where this will have the best impact?

GSPP commented 2 years ago

Unfortunately, TcpClient is an awful API.

I find your rant very refreshing 😆. So I'm not the only one who finds TcpClient nasty.


  1. Maybe those IP/port pairs should be represented with IPEndPoint?
  2. I think backlog = 100 shouldn't default to some arbitrary value. The behavior should be what existing APIs do. I believe, they use the OS default.
  3. Do we really want to carry forward the nastiness brought by TcpListener? If we take your criticism seriously (we should) then this class is unsalvagable. Reusing it will just result in an even more involved, weirdly coupled API surface.
  4. The synchronous connect APIs could be made to support cancellation as well. With Socket, you can cancel that by closing the socket. With this new factory-style APIs there is no way to cancel.
  5. Should those connect APIs support a timeout (as TimeSpan, please no more int milliseconds)? Otherwise, you'd have to create a CancellationTokenSource just for the timeout.
tactical-drone commented 2 years ago

The overall Socket API is bad for out of the box use.

These are the kinds of contraptions one has to create on top of .net sockets currently to normalize the design:

https://github.com/tactical-drone/zero/tree/net-standard/zero.core/network/ip

The amount of work seems excessive.

Then there are still other problems. Having to new CancellationTokenSource(timeout).Token every time you want to send with timeout does not make sense for example. So that entire API needs a sanity check I think.

wfurt commented 2 years ago

4. The synchronous connect APIs could be made to support cancellation as well. With Socket, you can cancel that by closing the socket. With this new factory-style APIs there is no way to cancel.

That is not necessarily true. Since we use SafeHandle under the cover Close does not really work while in connect system call - at least on Linux. And AFAIK we did not mix sync methods with CancellationToken so far. For consistency, it may be best if synchronous methods depend on timeout values while *Async use the token. However, through some other discussion I feel there is sentiment for old good timeouts even in Async as the exceptions are sometimes hard to decode and troubleshoot.

One that note, we tend to provide synchronous versions for compatibility. If anything, I would question that for the new API. There is significant complication (at least on Unix) for mixing the modes.

wfurt commented 2 years ago

I did look at the link @tactical-drone but lot of the code does not make sense to me - like Poll before WriteAsync. It would be more useful IMHO to describe what problems you trying to solve.

I did more searching within runtime and found TlsStream that wraps SslStream as NetworkStream and then NetworkStreamWrapper that wraps transparently either one. If that is more common it would support what @filipnavara suggested. I would not mind do blend it with https://github.com/dotnet/runtime/issues/63663.

That other proposal also brings concept of connect policy. People trying to prefer given address family, or some way how to impact interface or address used.

 [Flags]
    public enum ConnectOptions
    {
       Default = 0,
       IPv4Only,
       IPv6Only,
       Parallel, // Happy-Eyeball
       FastOpen,
       ...
    }

I'm wondering if anybody here would find it interesting. For example, the Happy-Eyeball is on todo list for a while. But instead of bolting it to Sockets we could put it here as added value.

wfurt commented 2 years ago

Lastly, the https://github.com/dotnet/runtime/issues/1793 is dead as right now. I think it may be because it tried to solve everything. While related, I see this proposal as way how to make writing simple networking endpoints easier. In order to do that, I think we should focus on simplicity and consistency. If combined with TLS it may go beyond the TCP @geoffkizer envisioned but we can still probably keep it simple.

wfurt commented 2 years ago
    public static ValueTask<TcpConnection> ConnectAsync(string hostname, int port, SslClientAuthenticationOptions? sslOptions = null, CancellationToken cancellationToken = default);
    public ValueTask<TcpConnection> AcceptConnectionAsync(SslServerAuthenticationOptions? sslOptions = null, CancellationToken cancellationToken = default);

that would optionally negotiate TLS if sslOptions is not null. (The server may be little bit more complicated)

antonfirsov commented 1 year ago

A few thoughts:

  1. I agree with https://github.com/dotnet/runtime/issues/63162#issuecomment-1002227389. TcpConnection should be TcpStream to conform to existing conventions.

  2. Instead of exposing TCP settings (NoDelay, SendBufferSize, ReceiveBufferSize ...) on TcpStream, we should consider encapsulating them in a separate TcpConnectionOptions type the same way we do in QUIC. This would achieve API symmetry, discoverability of options and also make sure the socket is configured upfront:


public sealed class TcpConnectionOptions
{
        public IPEndPoint RemoteEndPoint { get; set; }

        public bool NoDelay { get; set; }       
        public int SendBufferSize { get; set; }
        public int ReceiveBufferSize { get; set; }

        // APIs for TCP Keepalive etc.
}

public static class Tcp
{
    ValueTask<TcpStream> ConnectAsync(TcpConnectionOptions options, CancellationToken cancellationToken = default);
}
  1. @wfurt regarding https://github.com/dotnet/runtime/issues/63162#issuecomment-1097539876:

This would require merging System.Net.Sockets.dll with System.Net.Security.dll, which is not possible. Instead, we can consider a variant of #63663, that would provide an easy way to "convert" an existing TcpStream into an SslStream instead of creating it from scratch. This would enable layering things with simple API-s:

namespace System.Net.Security;

public static class NetworkStreamExtensions
{
    public static async ValueTask<SslStream> AuthenticateAsClientSslStream(this NetworkStream networkStream, SslClientAuthenticationOptions options);
}