Hmm ... I'm confused. This might not be a Websocket issue. TcpStream is in std::net, and it does close connections (https://github.com/steveklabnik/rust/blob/534fd3a98349ec2250e1515a3cb19dbb98935b98/src/libstd/sys/common/net.rs#L569). So, I don't know why I see so many CLOSE_WAIT sockets. I'm closing this because I think it's most likely irrelevant to this project. Sorry.
Maybe you configured the Websocat server to initiate connection drops? That way it would hold TIME_WAIT sockets even after the application is terminated. For example, I expect TIME_WAIT to accumulate with socat tcp-l:4567,fork,reuseaddr exec:"echo qqq".
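The same pattern can be reproduced in plain Rust; here is a minimal sketch using only std (the port and payload are arbitrary) of a server that writes and then closes first, which is exactly the side that ends up holding TIME_WAIT:

use std::io::Write;
use std::net::TcpListener;

fn main() -> std::io::Result<()> {
    let listener = TcpListener::bind("127.0.0.1:4567")?;
    for stream in listener.incoming() {
        let mut stream = stream?;
        stream.write_all(b"qqq\n")?;
        // `stream` is dropped here, so this side sends FIN first and
        // its socket lands in TIME_WAIT once the teardown completes.
    }
    Ok(())
}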
As for CLOSE_WAIT, maybe you forgot to specify the -E or -u option on the command line? Is there a warning on Websocat startup?
I'm not sure which command-line options you mean. This is a Rust program that uses the websocket crate, roughly in the following way:
let server = websocket::sync::Server::bind("BLA:1234").unwrap();
for connection in server.filter_map(::std::result::Result::ok) {
    match connection.accept() {
        // hand each accepted client over to the worker thread
        Ok(client) => channel.send(client).unwrap(),
        Err(_) => (),
    }
}
where channel has the other endpoint in another thread. And the thread code looks roughly like this:
fn go(rx_channel: Receiver<Client<TcpStream>>) -> Result<()> {
    loop {
        let client = rx_channel.recv()?;
        // some code using client (BLABLA)
    }
}
Since client goes out of scope, I expect the connection to be properly closed/cleaned up, no matter what BLABLA is. After a long time, I see many CLOSE_WAIT connections in the output of ss -t. If you want to see why I'm talking about leaked sockets, see https://blog.cloudflare.com/this-is-strictly-a-violation-of-the-tcp-specification/ for the connection between leaked sockets and CLOSE_WAIT.
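For reference, here is a minimal standalone sketch (only std; the port is arbitrary) of how a socket ends up in CLOSE_WAIT: the peer closes its end, but the local program keeps the accepted socket alive without reading from or closing it:

use std::net::{TcpListener, TcpStream};
use std::{thread, time::Duration};

fn main() -> std::io::Result<()> {
    let listener = TcpListener::bind("127.0.0.1:4568")?;
    let peer = TcpStream::connect("127.0.0.1:4568")?;
    let (accepted, _) = listener.accept()?;
    drop(peer); // the peer sends FIN; `accepted` moves to CLOSE_WAIT
    thread::sleep(Duration::from_secs(60)); // observe with: ss -t
    drop(accepted); // only now does the local side close and send its FIN
    Ok(())
}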
I'm not sure which command-line options you mean.
Sorry, I confused projects. I thought it was Websocat.
// some code using client (BLABLA)
It may depend on how exactly the client is used. Is it something short, or is it a long-lived session?
I expect CLOSE_WAITs to accumulate if we are not reading from the client for a prolonged time and the client goes away.
Thanks. I can reproduce CLOSE_WAIT with Recv-Q > 0 if I insert an infinite loop in BLABLA.
Perhaps my code gets stuck like that sometimes, although I can't see how.
In any case, this is most likely my bug. Thank you very much for your help!
I have a hypothesis: is it possible that recv_message (of websocket::client::sync::Client<TcpStream>) blocks if (a) there is data in the receive buffer but not enough for a full message, and (b) the underlying socket is in CLOSE_WAIT? If so, it would be nice if it errored rather than blocked.
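If that is what happens, one defensive workaround might be a read timeout on the underlying stream, so that a stuck receive eventually errors out. A hedged sketch, assuming stream_ref() is the crate's accessor for the wrapped stream, with an arbitrary 30-second timeout:

use std::net::TcpStream;
use std::time::Duration;
use websocket::sync::Client;

fn arm_read_timeout(client: &Client<TcpStream>) -> std::io::Result<()> {
    // A blocked read now fails with a timeout error instead of hanging
    // forever, and that error should surface from recv_message.
    client.stream_ref().set_read_timeout(Some(Duration::from_secs(30)))
}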
If the socket is in CLOSE_WAIT and you try to receive a message, I expect those attempts to ultimately call recv on that socket, which should return an error that bubbles up to the user of websocket.
If needed, that case can be checked explicitly. I don't generally use websocket::sync, so if there are bugs there, they may go relatively unnoticed.
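Concretely, what I would expect is a receive loop like this sketch (the message handling is elided), where a dead peer makes recv_message return an Err that the caller can act on:

use std::net::TcpStream;
use websocket::sync::Client;

fn pump(mut client: Client<TcpStream>) {
    loop {
        match client.recv_message() {
            Ok(_msg) => { /* handle the message (BLABLA) */ }
            Err(e) => {
                // On a dead peer (e.g. a socket stuck in CLOSE_WAIT) the
                // error should land here rather than the call blocking.
                eprintln!("receive failed: {}", e);
                break;
            }
        }
    }
    // `client` is dropped here, which closes the underlying TcpStream.
}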
Shouldn't TcpStream implement Drop to clean up (close) the underlying socket? Is there any reason not to do so?
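A quick self-contained way to check (a sketch; the port is arbitrary): if Drop does close the socket, the peer's read should return 0 (EOF) as soon as the accepted stream is dropped:

use std::io::Read;
use std::net::{TcpListener, TcpStream};

fn main() -> std::io::Result<()> {
    let listener = TcpListener::bind("127.0.0.1:4569")?;
    let mut peer = TcpStream::connect("127.0.0.1:4569")?;
    let (accepted, _) = listener.accept()?;
    drop(accepted); // if Drop closes the fd, this sends FIN
    let mut buf = [0u8; 16];
    assert_eq!(peer.read(&mut buf)?, 0); // EOF confirms the close
    Ok(())
}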