tokio-rs / tokio

A runtime for writing reliable asynchronous applications with Rust. Provides I/O, networking, scheduling, timers, ...
https://tokio.rs
MIT License

UdpSocket::try_recv_buf_from not returning entire datagram #3486

Open ramondeklein opened 3 years ago

ramondeklein commented 3 years ago

I am using Tokio v1.1.0 and am trying to use UdpSocket::try_recv_buf_from with vectored I/O to read a datagram directly into two vectors. I am using the following code:

use bytes::BufMut; // brings `chain_mut` into scope
use tokio::net::UdpSocket;

let udp_socket = UdpSocket::bind(addr).await?;
loop {
    udp_socket.readable().await?;

    let mut key = vec![0u8; 16];
    // Zero-filled so that `value[..]` has the full length; `Vec::with_capacity` leaves the length at 0.
    let mut value = vec![0u8; 65_491];
    let mut buffer = (&mut key[..]).chain_mut(&mut value[..]);

    match udp_socket.try_recv_buf_from(&mut buffer) {
        Ok((len, remote_addr)) => {
            /* process data */
        }
        Err(ref e) if e.kind() == std::io::ErrorKind::WouldBlock => {
            continue;
        }
        Err(e) => {
            return Err(e);
        }
    }
}

The key vector is 16 bytes and the value vector is 65,491 bytes, so together they hold 65,507 bytes (the maximum UDP payload over IPv4) and every UDP datagram should fit. When 12,345 bytes are received, the first 16 bytes should go into the key vector and the remaining 12,329 bytes should be stored in the value vector.

Unfortunately, this doesn't work, and the behaviour differs between Windows and Linux. Windows returns error code 10040 (WSAEMSGSIZE) to indicate that the datagram doesn't fit in the buffer. Linux returns only the first 16 bytes. The implementation of try_recv_buf_from shows why:

pub fn try_recv_buf_from<B: BufMut>(&self, buf: &mut B) -> io::Result<(usize, SocketAddr)> {
    self.io.registration().try_io(Interest::READABLE, || {
        let dst = buf.chunk_mut();
        let dst = unsafe { &mut *(dst as *mut _ as *mut [std::mem::MaybeUninit<u8>] as *mut [u8]) };

        // Safety: We trust `UdpSocket::recv_from` to have filled up `n` bytes in the buffer.
        let (n, addr) = (&*self.io).recv_from(dst)?;
        unsafe { buf.advance_mut(n); }
        Ok((n, addr))
    })
}

buf.chunk_mut() returns only the first contiguous chunk of the buffer (here the 16-byte key), so the single recv_from call tries to read the entire datagram into that chunk. Windows reports an error when a datagram is read into a buffer that is too small. Linux copies the data that fits into the first chunk and discards the rest.
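The truncation can be reproduced with the standard library alone. The following is a minimal sketch, not Tokio's code: the helper name `recv_into_small_buffer` and the loopback setup are assumptions made for illustration, and it uses the blocking `std::net::UdpSocket` instead of Tokio's socket.

```rust
use std::io;
use std::net::UdpSocket;
use std::time::Duration;

/// Hypothetical helper: sends `payload_len` bytes over loopback and receives
/// them into a buffer of `buf_len` bytes, returning the reported length.
fn recv_into_small_buffer(payload_len: usize, buf_len: usize) -> io::Result<usize> {
    let receiver = UdpSocket::bind("127.0.0.1:0")?;
    // Avoid blocking forever if the datagram never arrives.
    receiver.set_read_timeout(Some(Duration::from_secs(2)))?;
    let sender = UdpSocket::bind("127.0.0.1:0")?;
    sender.send_to(&vec![0xABu8; payload_len], receiver.local_addr()?)?;

    let mut buf = vec![0u8; buf_len];
    // A single recv_from call consumes the whole datagram, even if `buf` is too small.
    let (n, _addr) = receiver.recv_from(&mut buf)?;
    Ok(n)
}

fn main() {
    // 100-byte datagram, 16-byte buffer: the same shape as the key-only read above.
    match recv_into_small_buffer(100, 16) {
        Ok(n) => println!("received {n} bytes"),
        Err(e) => println!("recv_from failed: {e}"),
    }
}
```

On Linux I would expect this to report only the bytes that fit in the buffer, and on Windows to fail with WSAEMSGSIZE, matching the behaviour described above.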

Vectored I/O cannot be implemented in a wrapper without copying data; it has to be implemented at the OS level. Both Windows and Linux support vectored I/O when reading datagrams: on Linux-based systems the implementation should call recvmsg(2), and on Windows WSARecvFrom provides the equivalent scatter/gather call.
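To illustrate that the OS can scatter a single datagram across several buffers, here is a rough Unix-only sketch. It is not how Tokio, mio, or libstd do it. It swaps in readv(2) for recvmsg(2) to keep the FFI surface small, which means the sender address is not returned. The function name `recv_scattered` is hypothetical, and the code relies on std's documented guarantee that `IoSliceMut` is ABI-compatible with `struct iovec` on Unix.

```rust
use std::ffi::c_void;
use std::io::{self, IoSliceMut};
use std::net::UdpSocket;
use std::os::raw::c_int;
use std::os::unix::io::AsRawFd;
use std::time::Duration;

unsafe extern "C" {
    // POSIX readv(2): reads into an array of iovec structures.
    fn readv(fd: c_int, iov: *const c_void, iovcnt: c_int) -> isize;
}

/// Hypothetical helper: receives one datagram, filling `key` first and then `value`.
fn recv_scattered(socket: &UdpSocket, key: &mut [u8], value: &mut [u8]) -> io::Result<usize> {
    let mut bufs = [IoSliceMut::new(key), IoSliceMut::new(value)];
    // Safety: `bufs` describes two valid, writable, non-overlapping buffers
    // that stay alive for the duration of the call.
    let n = unsafe {
        readv(
            socket.as_raw_fd(),
            bufs.as_mut_ptr() as *const c_void,
            bufs.len() as c_int,
        )
    };
    if n < 0 {
        Err(io::Error::last_os_error())
    } else {
        Ok(n as usize)
    }
}

fn main() -> io::Result<()> {
    let receiver = UdpSocket::bind("127.0.0.1:0")?;
    receiver.set_read_timeout(Some(Duration::from_secs(2)))?;
    let sender = UdpSocket::bind("127.0.0.1:0")?;
    sender.send_to(&[7u8; 100], receiver.local_addr()?)?;

    let mut key = [0u8; 16];
    let mut value = vec![0u8; 65_491];
    let n = recv_scattered(&receiver, &mut key, &mut value)?;
    println!("received {n} bytes in total");
    Ok(())
}
```

A real implementation would presumably use recvmsg(2) so that the remote address is available, and would need a WSARecvFrom-based counterpart on Windows.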

Arnavion commented 3 years ago

libstd's UdpSocket doesn't expose vectored recv either; the only use of recvmsg in libstd is for UnixDatagram. So vectored recv for UdpSocket would need to be implemented in the whole stack - libstd, then mio, then tokio.

Darksonn commented 3 years ago

Yeah, vectored reads for UdpSocket are currently not supported.

@Thomasdezeeuw What are your thoughts on this?