fpagliughi / sockpp

Modern C++ socket library.
BSD 3-Clause "New" or "Revised" License
769 stars 126 forks source link

Non-blocking accept #88

Open gotnone opened 10 months ago

gotnone commented 10 months ago

Are there any plans to add a non-blocking accept() for the tcp_acceptor class?

I attempted to add this to my own derived class, but I think it may be useful to add this to the main tcp_acceptor class.

namespace sockpp {

struct tcp_acceptor_nb : tcp_acceptor {
public:
  template<typename ...Args>
  tcp_acceptor_nb(Args... args) : tcp_acceptor(std::forward<Args...>(args)...) {}
  stream_socket accept(sock_address *clientAddr = nullptr, std::chrono::microseconds timeout);
};
} // namespace sockpp

I based this off of the non-blocking connect() code, but I am far from an expert at socket programming, there are probably some issues in this implementation.

namespace sockpp {

stream_socket tcp_acceptor_nb::accept(sock_address *clientAddr /*=nullptr*/,
                                      std::chrono::microseconds timeout) {
  if (timeout.count() <= 0)
    return accept(clientAddr);

  sockaddr *p = clientAddr ? clientAddr->sockaddr_ptr() : nullptr;
  socklen_t len = clientAddr ? clientAddr->size() : 0;

  // Out new socket is definitely in blocking mode;
  // make it non-blocking to do this
  set_non_blocking(true);
  socket_t s = check_socket(::accept(handle(), p, clientAddr ? &len : nullptr));

  // TODO: Reimplement with poll() for systems with lots of sockets.
  if (s == INVALID_SOCKET) {
    fmt::print(stderr, "Socket error {}: {}", last_error(), last_error_str());
    if (last_error() == EINPROGRESS || last_error() == EWOULDBLOCK) {
      // Non-blocking connect -- call `select` to wait until the timeout:
      // Note:  Windows returns errors in exceptset so check it too, the
      // logic afterwords doesn't change
      fd_set readset;
      FD_ZERO(&readset);
      FD_SET(handle(), &readset);
      // fd_set writeset = readset;
      fd_set exceptset = readset;
      timeval tv = to_timeval(timeout);
      int n = check_ret(
          ::select(int(handle()) + 1, &readset, nullptr, &exceptset, &tv));

      if (n > 0) {
        // Got a socket event, but it might be an error, so check:
        int err{};
        if (get_option(SOL_SOCKET, SO_ERROR, &err))
          clear(err);
        s = check_socket(::accept(handle(), p, clientAddr ? &len : nullptr));
      } else if (n == 0) {
        clear(ETIMEDOUT);
      }
    }

    if (last_error() != 0) {
      // close();
      s = INVALID_SOCKET;
    }
  }
  // Restore the default (blocking) mode for a new socket.
  auto lasterr = last_error();
  set_non_blocking(false);
  clear(lasterr);
  return stream_socket(s);
}
} // namespace sockpp
fpagliughi commented 9 months ago

Apologies. I could have sworn I responded to this a few weeks ago... Perhaps I have a window with an unsent response buried somewhere on my desktop!

My first question would be... What is the purpose of the non-blocking accept?

The purpose for connect() is not really for non-blocking/async operation, but rather to allow the user to specify a reasonable timeout that is independent of the OS default. Most systems have a pretty long default, like 30sec, which is often too long to wait for some apps.

If we were to add this for accept(), we'd need to isolate the blocking/non-blocking/timeout code to its own class (or similar) and make it externally platform agnostic. Already there's a pending commit to replace the problematic, though portable, select() call with something more efficient on some platforms: https://github.com/fpagliughi/sockpp/commit/0974f860fe40ed83248e51b7085dc6fd4765144c

Trying to do that and optimize it for all systems via epoll(), kqueue(), etc, would become extensive and problematic. That work would be worthy of a library all of its own, like Rust did with the mio library. But if we started isolating the code to implement accept(), there would be immediate requests to extend this library to become fully asynchronous. Like #49 .

So I'm worried about the slippery slope, but am curious about the need for accept() without async read/write. Is a timeout desired, or fully non-blocking brhavior?

gotnone commented 9 months ago

I am currently using a timeout in my extended class to permit two things:

1) Implement watchdog behavior to check if an expected incoming connection has not occurred within a specific time Perhaps this could be implemented in a separate thread by checking a shared state variable that the tcp_acceptor thread updates when the accept() is successful. 2) Permit cleanly shutting down the tcp_acceptor thread using a condition variable I could not find a way to shutdown the tcp_acceptor class from another thread. The tcp_acceptor thread would be blocked on the accept(). This may be user error on my part. In the event that it is platform specific, my application is currently used on Windows.

constexpr auto listen_duration = std::chrono::seconds(15);
std::atomic<bool> running{true};  // Modified by main thread when shutdown signal is handled
tcp_acceptor listener{};
inet_address peer{};
/*...*/
while(running) {
  listener.accept(&peer, listen_duration);
  /* do watchdog checking */
}
/* perform clean shutdown */

For my use case, a timeout is sufficient.

I understand your concern about expanding the library to be fully asynchronous. One of the reasons why I selected this library for my application was because it was relatively lightweight. I had initially looked at using an asio based wrapper, but my compile times went through the roof. Since I only need to listen for a single expected incoming TCP connection I didn't want to pay this price for asio.

fpagliughi commented 9 months ago

Those are both pretty reasonable. But there should be a better way, in general, to do both of those things. Especially the second one. Certainly a timeout would work, but it's not the cleanest solution. You really want something that responds immediately and definitively to say "quit now" rather than polling in a loop.

Ironically, one way is to use a poll()/select() call with a handle-based OS synch object, like a semaphore or eventfd in Linux. Signaling the other object would cause the poll() to return immediately and indicate a shutdown is desired. But these OS signal objects aren't portable solutions.

But I wonder if there is a portable way to knock the socket off the accept() call immediately. I wonder what happens if you shutdown() or close() the socket from another thread? I can't remember if I ever tried that.

fpagliughi commented 9 months ago

But long and short... An overloaded accept() with a timeout is probably pretty reasonable either way. It should go with the new poll() solution in the develop branch. But I'm pressed for time on a number of other open-source projects, and probably won't get a release of this library out until the end of this year, at the earliest.

MrApplejuice commented 6 months ago

Hey all!

I see that the overload suggestion was marked to be added in v0.9 but that main/master has "starting the move toward the v2.0 API". I assume that this will then not receive any work until v2.0 is done, is that correct?

I agree with the conclusion but because I never found a portable solution to the "stopping issue" either, I like the polling pattern @gotnone has posted a lot. Also thanks a lot for the C++ bindings in general! It makes managing the C-resource lifetimes from socket.h a breeze!

fpagliughi commented 6 months ago

I'm on the fence. I just wanted to push out the 1.0 release to preserve a snapshot of the API that's been available for the last few years before a bunch of breaking changes were introduced (which will now be v2.0). So I just tossed out v1.0 rather abruptly without adding any new features or merging the PR's that could have shipped. I didn't want to risk any destabilization at the last minute. The v0.8 line seemed pretty stable.

The plan is to maintain v1.x line for a little while... certainly to fix any bad bugs so people can keep using it indefinitely. But I can probably add some more features if there's a demand.

But to be clear... v2.0 is going to be what the v0.9 release was supposed to be. I just renamed everything to be less confusing. I was hoping to get most of it done over the holiday break, and though I made a lot of progress, the TLS support is nowhere near ready. So I may put out a v2.0 with just the new error handling, and hold off on TLS until v2.1.

MrApplejuice commented 6 months ago

Hey, thank you for the update. I was able to spend some time on the issue I was facing that related unix domain sockets and as it turned out I had to use the poll-implementation you suggested before anyway, sort of: unix domain sockets do not seem to respect or implement the timeout argument from what I saw when experimenting. So I implemented the timeout as part of the third argument to poll(), because that design is still easier than writing all the glue code needed to break out from a pending poll() using an artificial file descriptor.

For tcp-sockets this all was way easier in the past I remember...

Anyway very exciting stuff, I know. But it all made me question the design proposed in this myself. As I discovered, poll(..., timeout) might be a better alternative to implement the timeout-based waiting than to set the timeout directly on the socket. That way, the solution works for unix sockets as well... however, all of these functions are Linux/Unix specific and lack an implementation for windows. And then there is the problem that a little bit of extra work using something like memfd_create (???) to "properly" escape from a polled socket-fd is just around the corner...

The Windows-implementation using the Windows overlapped API socket functions (shudder) and something like WaitForMultipleObjects is probably also just around the corner once the basic scaffolding for this stands.

I think I see the pains now to decide on this for a generic library like sockpp. I am trying to finish up my "holidays project" right now, would you be willing to maybe accept some platform specific patches for this library to get the wagon rolling on "escaping a blocking accept"-call? I think I can salvage my Linux code at least and maybe even get something going for Windows?

Relevant `poll()` code block from my code ~~~~~~~~~~~~~~~~.cpp /* code released into public domain */ /* author's note: this function does not do error checking */ void ConnectionHandler :: threadMain() { struct pollfd polldata; polldata.fd = acceptor->handle(); // acceptor is a std::shared_ptr polldata.events = POLLERR | POLLIN | POLLPRI | POLLHUP; while (alive) { polldata.revents = 0; int pollres = poll(&polldata, 1, 500); // 500 ms delay if (polldata.revents || pollres) { if (polldata.revents & POLLHUP) { // Socket connection died, so we escape the thread alive = false; socket.close(); continue; } else if (sockpp::unix_socket sock = acceptor->accept()) { // do something with the shiny new sock } } } } ~~~~~~~~~~~~~~~~