tokio-rs / tokio

A runtime for writing reliable asynchronous applications with Rust. Provides I/O, networking, scheduling, timers, ...
https://tokio.rs
MIT License
27.05k stars 2.49k forks source link

peek will block after read #3789

Open nujz opened 3 years ago

nujz commented 3 years ago

Version └── tokio v1.6.0 └── tokio-macros v1.1.0 (proc-macro)

Platform Runing in Windows 64-bit x86_64 GNU/Linux Cross-compilation cargo build --release --target x86_64-pc-windows-gnu

Description I tried this code:

use tokio::io::AsyncReadExt;

#[tokio::main]
async fn main() -> std::io::Result<()> {
    let listener = tokio::net::TcpListener::bind("0.0.0.0:2222").await?;

    loop {
        let (mut socket, _) = listener.accept().await?;

        tokio::spawn(async move {
            let mut buf = [0; 1];
            let _ = socket.read(&mut buf).await;
            let _ = socket.peek(&mut buf).await;
        });
    }
}

telnet 127.0.0.1 2222 It will block in socket.peek

Darksonn commented 3 years ago

If zero bytes are available for reading, then peek will wait for more to arrive.

nujz commented 3 years ago

It works well:

use std::io::Read;

fn main() -> std::io::Result<()> {
    let listener = std::net::TcpListener::bind("0.0.0.0:2222")?;

    loop {
        let (mut socket, _) = listener.accept()?;

        std::thread::spawn(move || {
            let mut buf = [0; 1];
            let _ = socket.read(&mut buf);
            let _ = socket.peek(&mut buf);
        });
    }
}
Darksonn commented 3 years ago

Maybe you can be a bit more specific about the behavior you are observing? Does the problem only appear when you cross-compile to windows?

nujz commented 3 years ago

I try the code using stable-x86_64-pc-windows-msvc in windows. The problem also appear.

  1. Run the code, and port 2222 is listened.
  2. Telnet 2222, press any key twice, and it blocking. Normally it should exit.

The non-async code work well.

Darksonn commented 3 years ago

It works fine on my Linux machine. Maybe it's windows-specific? I know that peek on named pipes behaves ... weirdly, maybe the same holds for a tcp stream.

@udoprog thoughts?

udoprog commented 3 years ago

It does seem to block the client from shutting down as claimed (using ncat) - but not consistently. I usually have to try a few client and when stuck it's definitely stuck on peek. I'll have a closer look after fika!

udoprog commented 3 years ago

So when it blocks, we correctly woke up once on read-readiness. But the attempted call to peek sporadically returned:

Err(Os { code: 10035, kind: WouldBlock, message: "A non-blocking socket operation could not be completed immediately." })

Which should be fine. It would cause us to clear readiness and then wake up again since we subscribe for read readiness immediately after. But something about processing the unsuccessful peek disrupted the next readiness event from firing.

udoprog commented 3 years ago

This makes the issue reproducible on Windows (without ever calling peek). Still works for Linux, so it might very well be an issue in mio w/ polling on Windows: https://github.com/udoprog/tokio/commit/c6c495c273aacf5d0d26b2ea2189cfdf2bc4dac9

Will dig deeper tomorrow unless someone beats me to it.

thooton commented 2 years ago

I think the issue is with mio. This can be shown by running

use std::error::Error;

use mio::net::TcpListener;
use mio::{Events, Interest, Poll, Token};

const SERVER: Token = Token(0);
const STREAM: Token = Token(1);

fn main() -> Result<(), Box<dyn Error>> {
    let mut poll = Poll::new()?;
    let mut events = Events::with_capacity(128);

    let addr = "127.0.0.1:43265".parse()?;
    let mut server = TcpListener::bind(addr)?;
    poll.registry()
        .register(&mut server, SERVER, Interest::READABLE)?;

    let mut stream = 'outer: loop {
        poll.poll(&mut events, None)?;

        for event in events.iter() {
            match event.token() {
                SERVER => {
                    let (stream, _) = server.accept()?;
                    break 'outer stream;
                }
                _ => unreachable!(),
            }
        }
    };

    poll.registry()
        .register(&mut stream, STREAM, Interest::READABLE | Interest::WRITABLE)?;

    loop {
        poll.poll(&mut events, None)?;
        for event in events.iter() {
            if event.token() == STREAM {
                println!("{:?}", event);
            }
        }
    }
}

(with mio features os-poll and net)

Connect to 127.0.0.1 43265 and send messages. On Windows, only two events for read readiness will be received, one on the first message and one on disconnect. This explains why peek blocks after correctly waking up once, because it is waiting for a second that never comes. However, on Linux, a read readiness event will be fired after every message, which causes peek to wake up every message and work correctly.

thooton commented 2 years ago

Here's a workaround that I found which fixes the issue: https://github.com/tokio-rs/tokio/commit/6f46d293dc004d6c351d7d6c8641448979a19025

But I'm not sure why it works. I'm guessing that the application needs to explicitly subscribe to further read readiness messages, and while read does that peek does not.

Sytten commented 9 months ago

We are also seeing this issue with the latest tokio on Windows, peek blocks indefinitely after a read even if there is data available. This seems like a serious bug that should be fixed or documented at least.

EDIT: I am pretty sure the issue is that windows is missing a reregistration when seek would block. This is why the patch that was proposed by @thooton (essentially issuing a read) works because the reads does a do_io which reregister interest on WouldBlock. I tried to propose a fix but it is doesn't seem correct, so I closed it.

What I ended up doing is wrap the TcpStream in a BufReader instead so I can peek as much as my buffer, this is a proper workaround until a real fix is found either in mio or tokio.

Thomasdezeeuw commented 9 months ago

I'm not really following this issue, but I can here from https://github.com/tokio-rs/mio/issues/1755. I do want point out that the Mio code in @thooton comment in https://github.com/tokio-rs/tokio/issues/3789#issuecomment-1173036083 is broken. Furthermore the following statement is incorrect.

However, on Linux, a read readiness event will be fired after every message, which causes peek to wake up every message and work correctly.

Per the documentation, https://docs.rs/mio/latest/mio/struct.Poll.html#draining-readiness, Mio only returns 1 event when the I/O source becomes readable, not per message as stated in the comment. Mio is working as intended.

A working implementation should attempt the read call, and continue to do so until it returns a io::ErrorKind::WouldBlock error. Only once that "error" is returned does Mio guarantee that another event is returned.

However an exception to this is peek. As it never actually reads the bytes from the connection it will never generate another event as the bytes are never removed from the connections' queue. For this to work you'll have to actually read the bytes.

Sytten commented 9 months ago

Per the documentation, https://docs.rs/mio/latest/mio/struct.Poll.html#draining-readiness, Mio only returns 1 event when the I/O source becomes readable, not per message as stated in the comment. Mio is working as intended.

The fact remains that the peek on unix platforms works as intended but not on windows so there is a difference somewhere and I could not find platform specific code on the tokio side that would explain it. Thus my assumption that mio was the problem, it was further confirmed when adding a do_io fixed the issue.

A working implementation should attempt the read call, and continue to do so until it returns a io::ErrorKind::WouldBlock error. Only once that "error" is returned does Mio guarantee that another event is returned.

This is indeed what I saw.

However an exception to this is peek. As it never actually reads the bytes from the connection it will never generate another event as the bytes are never removed from the connections' queue. For this to work you'll have to actually read the bytes.

But the peek does receive a io::ErrorKind::WouldBlock here because there is no data to read, so I think it is fair for the consumer in this case to expect an event from mio once there is more data available. Otherwise how is the consumer ever supposed to know? Maybe tokio is not using mio correctly, but from my reading of the code it is pretty aggressive in clearing the readiness and parking the IO thread.

My main grip here is that it is unclear who wants to take responsibility of the issue and it makes the whole thing confusing for a first time contributor. To me linux and macos are working as "expected", but this might actually be an "undesirable" side-effect and the only platform that respects the documentation is actually Windows.