chriskohlhoff / asio

Asio C++ Library
http://think-async.com/Asio

io_context.run() sometimes blocks for about 20 seconds after socket cancel() on Windows #1274

Open phamelin opened 1 year ago

phamelin commented 1 year ago

Problem description

Using a TCP client, I need to read N bytes and return after some timeout if not all bytes have been received. Here is the code:

#include <array>
#include <boost/asio.hpp>
#include <boost/system/system_error.hpp>
#include <chrono>
#include <iostream>
#include <string>

using boost::asio::ip::tcp;

class Client
{
public:
    void connect(const std::string& host, const std::string& service,
        std::chrono::steady_clock::duration timeout)
    {
        auto endpoints = tcp::resolver(io_context_).resolve(host, service);

        boost::system::error_code error;
        boost::asio::async_connect(socket_, endpoints,
            [&](const boost::system::error_code& result_error,
                const tcp::endpoint& /*result_endpoint*/)
            {
                error = result_error;
            });

        run(timeout);

        if(error)
            throw boost::system::system_error(error);
    }

    size_t read(char* buffer, size_t buffer_size,
        std::chrono::steady_clock::duration timeout)
    {
        boost::system::error_code error;
        size_t n = 0;

        boost::asio::async_read(socket_, boost::asio::buffer(buffer, buffer_size),
            [&](const boost::system::error_code& result_error,
                std::size_t result_n)
            {
                // Ignore the error if the operation was canceled by the timeout
                if(result_error != boost::asio::error::operation_aborted)
                {
                    error = result_error;
                }
                n = result_n;
            });

        run(timeout);

        if(error)
            throw boost::system::system_error(error);

        return n;
    }

private:

    void run(std::chrono::steady_clock::duration timeout)
    {
        io_context_.restart();
        io_context_.run_for(timeout);
        if(!io_context_.stopped())
        {
            auto start = std::chrono::steady_clock::now();
            socket_.cancel();
            io_context_.run();
            auto end = std::chrono::steady_clock::now();
            if((end-start) > timeout)
            {
                std::cout << "cancel took " << std::chrono::duration_cast<std::chrono::milliseconds>(end - start).count()
                    << "ms." << std::endl;
            }
        }
    }

    boost::asio::io_context io_context_;
    tcp::socket socket_{io_context_};
};

//----------------------------------------------------------------------

int main(int argc, char* argv[])
{
    if(argc != 3)
    {
        std::cerr << "Usage: " << argv[0] << " <host> <port>" << std::endl;
        return -1;
    }

    Client c;
    c.connect(argv[1], argv[2], std::chrono::seconds(10));
    std::array<char, 128> buffer;
    unsigned int count=0;
    for(;;)
    {
        size_t n = c.read(&buffer[0], buffer.size(), std::chrono::milliseconds(10));
        if(n > 0)
            std::cout << "[" << count++ << "] " << std::string(&buffer[0], n);
    }

    return 0;
}

This is very similar to the blocking tcp client example, with the following differences:

It works fine on Linux. However, on Windows, io_context.run() occasionally (about once every 2-3 minutes) blocks for about 20 seconds after socket cancel(). I can reproduce the problem only when the TCP server runs on a different machine; localhost to localhost does not seem to exhibit it.

How to reproduce

  1. Start a TCP server on a different machine, e.g.: while sleep 1; do echo "hello"; done | nc -l 20000
  2. On the Windows host, launch the program above, passing the server's host name and port (20000).
  3. Every 2-3 minutes, you should see a message such as "cancel took 20405ms."

Environment

  - Boost Asio version 1.79.0
  - Windows 10, MSVC 2017 (tcp client)
  - Ubuntu 20.04, gcc 9.40 (tcp server)