
[libs] Add `TcpStream::keepalive` and `TcpStream::set_keepalive` #69774

Open yoshuawuyts opened 4 years ago

yoshuawuyts commented 4 years ago

Today an issue was filed in async-std asking how to set the SO_KEEPALIVE option on our TcpStream. Because we've only implemented methods that are available in std as well, we never added these methods. But I think there's a strong case to add these methods to std::net::TcpStream.

TcpStream already natively supports lots of control methods, including set_read_timeout, set_write_timeout, set_ttl, and set_nodelay. set_keepalive would be a method along the same lines, providing a convenient way to set yet another low-level option. And there's a clear need for it: at least 3 other low-level TCP-related libraries expose keepalive and set_keepalive (see the "Prior Art" section for more on this).

Implementation

The implementation can probably be copied entirely from net2::TcpStreamExt. Unlike some of the other methods in that genre, these methods can operate entirely on an already instantiated socket.

Also, among all the existing implementations I've found, there's consensus that the shape of the methods should be as follows:

impl TcpStream {
    fn set_keepalive(&self, keepalive: Option<Duration>) -> Result<()>;
    fn keepalive(&self) -> Result<Option<Duration>>;
}
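
For illustration, here's a minimal sketch of what the setter's semantics could look like on top of the socket2 crate (net2's successor); the helper name, the SockRef borrowing, and the mapping of Some(time) to the idle time before the first probe are my assumptions, not something decided in this proposal. The getter is the harder part (see the later comments about Windows).

use std::io;
use std::net::TcpStream;
use std::time::Duration;

use socket2::{SockRef, TcpKeepalive};

// Sketch only: enable or disable SO_KEEPALIVE on an already-connected stream.
fn set_keepalive(stream: &TcpStream, keepalive: Option<Duration>) -> io::Result<()> {
    let sock = SockRef::from(stream);
    match keepalive {
        Some(time) => {
            // Turn SO_KEEPALIVE on, then set the idle time before the first probe.
            sock.set_keepalive(true)?;
            sock.set_tcp_keepalive(&TcpKeepalive::new().with_time(time))
        }
        // `None` disables keepalive entirely.
        None => sock.set_keepalive(false),
    }
}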

Prior Art

Prior Discussion

I couldn't find any references to TcpStream::{keepalive,set_keepalive} directly but https://github.com/rust-lang/rust/issues/27773 seems to touch on the general topic. Also https://github.com/rust-lang/rust/pull/28339 introduced many of the socket controls mentioned earlier, but doesn't seem to mention keepalive either.


edit: I've since found https://github.com/rust-lang/rust/pull/31945#issuecomment-189784913 and https://github.com/rust-lang/rust/pull/24594 which mention TcpStream::keepalive. In particular https://github.com/rust-lang/rust/pull/31945#issuecomment-189784913 mentions:

We should probably split the keepalive functionality into two methods to mirror how things work, (...) but that should be worked out in the net2 crate.

Discussion seems to have been started in https://github.com/rust-lang-nursery/net2-rs/issues/29, but hasn't progressed since 2016.

Drawbacks

I don't think there are any. Back in 2015 it seems these methods weren't added as part of Rust 1.0 because the libs team wanted to be cautious about what to include. But in the years since, keepalive has proven to be a useful addition that's seen a lot of use throughout the ecosystem, and now might be a good time to consider adding these methods to std.

yoshuawuyts commented 3 years ago

To recap: the current limitation here is that Windows doesn't seem to provide a way to read back the configured keepalive timeout. For example this code:

let stream = TcpStream::connect("104.198.14.52:80")?;
stream.set_keepalive(Some(Duration::from_secs(1)))?;
println!("{:?}", stream.keepalive()?);

Will print None on Windows, and Some(1s) on Linux (repro). This is the reason why this shape of API can't be introduced as-is today, and discussion is mostly about how to work around this (ref).

yoshuawuyts commented 3 years ago

Note that SO_LINGER does seem to work as expected, and afaict there's no good reason not to introduce the obvious API.

ibraheemdev commented 2 years ago

linger and set_linger is now available under #![feature(tcp_linger)] (#88494).

brandonros commented 2 years ago

Did Tokio get rid of set_keepalive for TcpStream on v1.0 on purpose?

gzmorell commented 2 years ago

Hi, set_keepalive is an important feature. It lets a TcpStream detect a disconnection caused by a network problem that occurs before the TcpStream has been properly closed from either side. Without this function the TcpStream will not be closed until the program finishes. In a server that runs forever this can cause problems such as resources that are never freed. Furthermore, if the number of open TcpStreams is limited, this can leave the server unable to accept connections anymore. It is possible to implement some "supervisor" that checks whether there is no traffic and closes the connection in that case, but that feels like reinventing the wheel when the core sockets could provide it. Neither tokio nor std TcpStream implements this function. Socket2 has an implementation, so it is possible to do:

// Accept on a tokio listener, then drop down to socket2 to configure keepalive.
let (stream, _addr) = listener.accept().await?;
let stream: std::net::TcpStream = stream.into_std().unwrap();
let socket: socket2::Socket = socket2::Socket::from(stream);
let keepalive = socket2::TcpKeepalive::new()
                .with_time(Duration::from_secs(4))      // idle time before the first probe
                .with_interval(Duration::from_secs(1))  // interval between subsequent probes
                .with_retries(4);                       // failed probes before the connection is dropped
socket.set_tcp_keepalive(&keepalive)?;
// Convert back into a std stream, then back into a tokio stream.
let stream: std::net::TcpStream = socket.into();
let stream: tokio::net::TcpStream = tokio::net::TcpStream::from_std(stream).unwrap();

instead of

let (stream, _addr) = listener.accept().await?;
let keepalive = TcpKeepalive::new()
                .with_time(Duration::from_secs(4))
                .with_interval(Duration::from_secs(1))
                .with_retries(4);
stream.set_tcp_keepalive(&keepalive)?;

By the way, TcpStream has a lot of functions that are marked as "Only for OS...". Apart from this function not being available on some OSes, is there anything else stopping the implementation?

BarbossHack commented 2 years ago

I really want to see set_keepalive in std::net::TcpStream

But you should also take a look at set_tcp_user_timeout; it's an important feature to add in conjunction with TCP keepalives. You can read this blog post from Cloudflare (or the CodeArcana one) explaining why TCP keepalives aren't enough in most cases, and why TCP_USER_TIMEOUT is also important when using TCP keepalives!

socket2::Socket::set_tcp_user_timeout()
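
A hedged sketch of combining the two on Linux with socket2 (the SockRef borrowing, the specific durations, and the Linux-only availability of set_tcp_user_timeout behind the crate's "all" feature are assumptions on my side):

use std::io;
use std::net::TcpStream;
use std::time::Duration;

use socket2::{SockRef, TcpKeepalive};

// Sketch: keepalive probes detect idle-but-dead peers, while TCP_USER_TIMEOUT
// bounds how long unacknowledged written data may linger before the kernel
// declares the connection dead.
fn configure(stream: &TcpStream) -> io::Result<()> {
    let sock = SockRef::from(stream);
    sock.set_keepalive(true)?;
    sock.set_tcp_keepalive(
        &TcpKeepalive::new()
            .with_time(Duration::from_secs(10))    // idle time before the first probe
            .with_interval(Duration::from_secs(5)) // interval between probes
            .with_retries(3),                      // failed probes before dropping
    )?;
    // Linux-only in socket2; requires the crate's "all" feature.
    sock.set_tcp_user_timeout(Some(Duration::from_secs(30)))
}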

BarbossHack commented 2 years ago

Did Tokio get rid of set_keepalive for TcpStream on v1.0 on purpose?

I found the PR that removed set_keepalive from tokio, io: update to Mio 0.7, which is based on this PR where we can find an explanation:

Temporarily removed methods

The following TcpStream methods were removed due to changes in the Mio API. They will be reintroduced in further changes that either implement these directly in Tokio, or Mio introducing a TcpSocket builder type that can be used by Tokio.

TcpStream::

  • ...
  • keepalive
  • set_keepalive
  • ...

But why were they removed from Mio? The answer can be found in this commit:

Remove TcpSocket type

The socket2 crate provide all the functionality and more. Furthermore supporting all socket options is beyond the scope of Mio.

The easier migration is to the socket2 crate, using the Socket or SockRef types.

The migration for Tokio is tracked in https://github.com/tokio-rs/tokio/issues/4135.

So I think these methods have simply been forgotten since then. The good news is that TCP keepalive support wasn't removed because it would be a bad idea; it was removed because socket2 already supports it better.
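
For what it's worth, with socket2's SockRef and a tokio version whose TcpStream implements AsFd (recent ones do, as far as I know), the keepalive can be set without the std round trip shown earlier; this is only a sketch under those assumptions:

use std::io;
use std::time::Duration;

use socket2::{SockRef, TcpKeepalive};

// Borrow the underlying socket instead of converting tokio -> std -> tokio.
fn enable_keepalive(stream: &tokio::net::TcpStream) -> io::Result<()> {
    let sock = SockRef::from(stream);
    sock.set_keepalive(true)?;
    sock.set_tcp_keepalive(
        &TcpKeepalive::new()
            .with_time(Duration::from_secs(4))
            .with_interval(Duration::from_secs(1)),
    )
}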

gzmorell commented 2 years ago

I really want to see set_keepalive in std::net::TcpStream

But you should also take a look at set_tcp_user_timeout; it's an important feature to add in conjunction with TCP keepalives. You can read this blog post from Cloudflare (or the CodeArcana one) explaining why TCP keepalives aren't enough in most cases, and why TCP_USER_TIMEOUT is also important when using TCP keepalives!

socket2::Socket::set_tcp_user_timeout()

Really interesting, thanks

haddow777 commented 1 year ago

Hello. I would like to add my two cents to the "add keepalive" camp. I've been working in a number of languages recently. C/C++ and Flutter both have keepalive functionality. Heck, Websockets.io for the web and Node have keepalive functionality. Not many network paradigms nowadays call for continuous connections, but in cases where they are desired, keepalive is a vital element.

I have a project that uses TCP to communicate between mobile apps, controllers like an ESP32, and a remote server. Just testing the communication between the app and the controller showed a strong need for keepalive. Maintaining connections across power failures, WiFi disconnects, killed-app commands, etc. wreaked havoc on reconnections until I managed to add keepalive. Right now, the controller is sending log data to a Rust server, which just leaves connections waiting forever on a read, even as the same controller opens a new socket on the server after rebooting from a power failure. Over time, the server will just build up a pile of dead connections without that functionality. I know there are other methodologies, but is that really the point? TCP provides continuous connections, and keepalive is necessary for TCP to function properly in that situation. I don't know about anybody else, but in my mind, if you have a TCP library, it just needs to have keepalive functionality. Anyways, that's my two cents. There is interest in this functionality.

haydenflinner commented 8 months ago

Kind of surprised this is still pending years later, is a PR all that's missing to make this happen?

Qix- commented 7 months ago

I think the question is how to handle the case of Windows. But if there were a way to import a trait from the std::os::unix module - perhaps std::os::unix::net::TcpStreamExt - that had the set_keepalive()/keepalive() methods, that'd be acceptable in a large percentage of cases while the longer-term solution is worked out.

Does that make sense to do? If so, I'd also be interested in a PR.
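
To make the suggestion concrete, a hypothetical shape for such a Unix-only extension trait is sketched below; none of these names exist in std today, this is purely illustrative.

use std::io;
use std::time::Duration;

// Hypothetical: a std::os::unix::net::TcpStreamExt-style extension trait;
// a cfg(unix) impl for std::net::TcpStream would provide the two methods.
pub trait TcpStreamExt {
    fn set_keepalive(&self, keepalive: Option<Duration>) -> io::Result<()>;
    fn keepalive(&self) -> io::Result<Option<Duration>>;
}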