Open programatik29 opened 3 years ago
Related (but orthogonal) to this great enhancement proposal: I see that TCP keepalive is disabled by default in axum-server, because it is disabled by default in hyper. This means dead TCP connections can go undetected and accumulate over time, eventually leading to resource exhaustion with an error similar to:
ERROR main hyper::server::tcp: accept error: Too many open files (os error 24)
To alleviate the issue, one can enable TCP keepalive with code like:
use axum::{routing::get, Router};
use axum_server::AddrIncomingConfig;
use std::{net::SocketAddr, time::Duration};

#[tokio::main]
async fn main() {
    let app = Router::new().route("/", get(|| async { "Hello, world!" }));

    let addr = SocketAddr::from(([127, 0, 0, 1], 3000));
    println!("listening on {}", addr);

    axum_server::bind(addr)
        // Start probing a connection after it has been idle for 60 seconds,
        // so that dead peers are eventually detected and closed.
        .addr_incoming_config(
            AddrIncomingConfig::default()
                .tcp_keepalive(Some(Duration::from_secs(60)))
                .build(),
        )
        .serve(app.into_make_service())
        .await
        .unwrap();
}
So far, testing on a host that used to hit this problem within a day or two shows very promising behavior.
Furthermore, the only TCP keepalive property currently configurable in axum_server for incoming/accepted connections is the Keepalive time, as defined at https://en.wikipedia.org/wiki/Keepalive. The other two properties, Keepalive interval and Keepalive retry, are not currently configurable and are set to None by default.
Specifically, even though all three parameters are supported by the socket2 crate (used by hyper), only Keepalive time is exposed for configuration by the hyper crate, via AddrIncoming.html#method.set_keepalive, which the axum_server crate depends upon.
There appear to be opportunities for improvement in both hyper and axum_server.
Here is a PR for hyper to make all three TCP keepalive parameters configurable:
https://github.com/hyperium/hyper/pull/2991
If it is accepted, I will then post the corresponding PR to make this configurable at the axum-server layer.
Per empirical evidence, enabling TCP keepalive with intervals and retries significantly improves the stability of persistent TCP connections, while reducing the number of open file descriptors by closing dead connections.
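To get a feel for how the three parameters interact, here is a rough sketch of the arithmetic. `KeepaliveParams` and `worst_case_detection` are hypothetical names used only for illustration; they are not part of hyper, axum-server, or socket2. The sketch assumes the usual model: the connection sits idle for the keepalive time, then probes are sent one interval apart, and the connection is dropped once the configured number of probes has gone unanswered. Actual timing depends on the operating system.

```rust
use std::time::Duration;

/// Hypothetical helper (illustration only): the three TCP keepalive
/// parameters discussed in this thread.
#[derive(Clone, Copy, Debug)]
struct KeepaliveParams {
    /// Idle time before the first keepalive probe is sent.
    time: Duration,
    /// Delay between consecutive probes.
    interval: Duration,
    /// Number of unanswered probes tolerated before the connection is dropped.
    retries: u32,
}

impl KeepaliveParams {
    /// Approximate time until a dead peer is detected, assuming the model
    /// described above: `time + interval * retries`.
    fn worst_case_detection(&self) -> Duration {
        self.time + self.interval * self.retries
    }
}

fn main() {
    // Aggressive settings: 10 s idle, 10 s between probes, 2 probes.
    let aggressive = KeepaliveParams {
        time: Duration::from_secs(10),
        interval: Duration::from_secs(10),
        retries: 2,
    };
    // Typical Linux system defaults: 7200 s idle, 75 s interval, 9 probes.
    let linux_defaults = KeepaliveParams {
        time: Duration::from_secs(7200),
        interval: Duration::from_secs(75),
        retries: 9,
    };

    println!("aggressive: {:?}", aggressive.worst_case_detection());
    println!("linux defaults: {:?}", linux_defaults.worst_case_detection());
}
```

Under this model, setting only the keepalive time still leaves the interval and retry count at the OS defaults, which is why exposing all three matters for detecting dead connections quickly.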
FYI https://github.com/hyperium/hyper/pull/2991 was accepted.
It seems AddrIncomingConfig was removed. I don't see any example or documentation for setting the tcp_keepalive duration in the latest version. Was that feature removed, @programatik29?
Relates to: https://github.com/programatik29/axum-server/pulls
> Was that feature removed
Yes, but it can be manually implemented like this:
use axum_server::accept::Accept;
use log::error;
use socket2::{SockRef, TcpKeepalive};
use std::{marker::PhantomData, time::Duration};
use tokio::net::TcpStream;

// Wraps an inner acceptor and configures each accepted socket
// before handing it on.
#[derive(Clone, Copy, Debug, Default)]
pub(crate) struct CustomAcceptor<S, I>(I, PhantomData<S>);

impl<S, I> CustomAcceptor<S, I> {
    pub(crate) fn new(inner: I) -> Self {
        Self(inner, PhantomData)
    }
}

impl<S, I: Accept<TcpStream, S>> axum_server::accept::Accept<TcpStream, S>
    for CustomAcceptor<S, I>
{
    type Stream = <I as Accept<TcpStream, S>>::Stream;
    type Service = <I as Accept<TcpStream, S>>::Service;
    type Future = I::Future;

    fn accept(&self, stream: TcpStream, service: S) -> I::Future {
        // Failures are logged but do not reject the connection.
        if stream.set_nodelay(true).is_err() {
            error!("failed to set TCP nodelay");
        }

        // Set all three keepalive parameters: idle time, probe interval,
        // and number of retries.
        if SockRef::from(&stream)
            .set_tcp_keepalive(
                &TcpKeepalive::new()
                    .with_time(Duration::from_secs(10))
                    .with_interval(Duration::from_secs(10))
                    .with_retries(2),
            )
            .is_err()
        {
            error!("failed to set TCP keepalive");
        }

        // Delegate to the wrapped acceptor.
        self.0.accept(stream, service)
    }
}

fn main() {
    // `bind{_rustls}` stands for either `bind` or `bind_rustls`;
    // arguments are elided.
    let http_server = axum_server::bind{_rustls}(...)
        .map(CustomAcceptor::new)
        .serve(...);
}
Currently there is no way to shut down a single connection other than signaling a global shutdown.
Having this ability would be useful for detecting slow clients and preventing some attacks.