websockets-rs / rust-websocket

A WebSocket (RFC6455) library written in Rust
http://websockets-rs.github.io/rust-websocket/
MIT License
1.55k stars 223 forks source link

leaking sockets #252

Closed rgrig closed 4 years ago

rgrig commented 4 years ago

Shouldn't TcpStream implement Drop to clean-up (close) the underlying socket? Is there any reason not to do so?

rgrig commented 4 years ago

Hmm ... I'm confused. This might not be a Websocket issue. TcpStream is in std::net, and it does close connections (https://github.com/steveklabnik/rust/blob/534fd3a98349ec2250e1515a3cb19dbb98935b98/src/libstd/sys/common/net.rs#L569). So, I don't know why I see so many CLOSE_WAIT sockets. I'm closing this because I think it's most likely irrelevant to this project. Sorry.

vi commented 4 years ago

Maybe you configured Websocat server to initiate connection drops? This way it would hold TIME_WAIT sockets even after application is terminated.

For example, I expect TIME_WAIT to accumulate with socat tcp-l:4567,fork,reuseaddr exec:"echo qqq".

As for CLOSE_WAIT, maybe you forgot to specify -E or -u option on command line? Is there a warning on Websocat startup?

rgrig commented 4 years ago

I'm not sure what command line options you mention. This is a Rust program that uses the websocket crate, roughly in the following way:

let server = websocket::sync::Server::bind("BLA:1234");
for connection in server.filter_map(::std::result::Result::ok) {
  match connection.accept() {
    Ok(client) => channel.send(client),
    Err(_) => (),
  }
}

where channel has the other endpoint in another thread. And the thread code looks roughly like this:

fn go(rx_channel: Receiver<Client<TcpStream>>) -> Result<()> {
  loop {
    let client = rx_channel.recv()?;
    // some code using client (BLABLA)
  }
}

Since client goes out of scope, I expect the connection to be properly closed/cleaned-up, no matter what BLABLA is.

After a long time, I see many CLOSE_WAIT connections in the output of ss -t.

rgrig commented 4 years ago

If you want to see why I'm talking about leaked sockets, see https://blog.cloudflare.com/this-is-strictly-a-violation-of-the-tcp-specification/ for the connection with CLOSE_WAIT.

vi commented 4 years ago

I'm not sure what command line options you mention.

Sorry, I confused projects. I though it was Websocat.


// some code using client (BLABLA)

It may depend how exactly the client is used. Is it something short or it is a long-lived session?

I expect CLOSE_WAITs to accumulate if we are not reading from client for a prolonged time and clients goes away.

rgrig commented 4 years ago

Thanks. I can reproduce CLOSE_WAIT with Recv-Q>0 if I insert an infinite loop in BLABLA. Perhaps my code gets stuck like that sometimes, although I can't see how.

In any case, this is most likely my bug. Thank you very much for your help!

rgrig commented 4 years ago

I have a hypothesis: is it possible that recv_message (of websocket::client::sync::Client<TcpStream>) blocks if (a) there is data in the receive buffer but not enough for a full message, and (b) the underlying socket is in CLOSE-WAIT? If so, it would be nice if it would error rather than block.

vi commented 4 years ago

If socket is CLOSE_WAIT and you are trying to receive message then I expect those attempst to ultimately call recv from that socket, which should return error, which should bubble up to user of websocket.

If needed that case may be checked explicitly. I don't generally use websocket::sync, so if there are bugs they may go relatively unnoticed.