mTCP: A Highly Scalable User-level TCP Stack for Multicore Systems

High load seems to overload stream queue #200

Closed ChrisWint closed 6 years ago

ChrisWint commented 6 years ago

Hi,

while trying to integrate my application with mTCP I ran into issues when sending many small packets from a TCP client. After a certain load is reached, the connection on the client is lost with "Connection reset by peer" and the server logs

[StreamEnqueue: 194] Exceed capacity of stream queue!
[Handle_TCP_ST_SYN_RCVD: 814] Stream 100: Failed to enqueue to the listen backlog!

Now this is far below any throughput at which I would expect this to happen (200 packets/s without epoll, 5000 packets/s with epoll). The same application run over a default TCP connection is able to handle at least 40000 packets/s. Furthermore, running at a lower throughput for a longer time eventually runs into the same errors.

After these errors first occur, even 1 packet/s connections reproduce the error messages for some packets; it seems as if the queue is not being dequeued/cleared properly.

If it is relevant, here is my mTCP config:

Configurations:
Number of CPU cores available: 1
Number of CPU cores to use: 1
Maximum number of concurrency per core: 10000
Maximum number of preallocated buffers per core: 10000
Receive buffer size: 8192
Send buffer size: 8192
TCP timeout seconds: 30
TCP timewait seconds: 0
eunyoung14 commented 6 years ago

Could you also try to use larger backlog numbers passed to mtcp_listen(), possibly something like 4096? Currently it looks like the accept queue is full; it stores TCP streams that are established but not yet accepted by the application. mTCP might need larger backlogs since it processes packets in batches to minimize I/O and context-switching overheads.
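
For example, a rough sketch of what I mean (the 4096 is just an illustrative value, and the variable names are assumptions):

    /* sketch: pass a larger backlog when setting up the listener */
    int listener = mtcp_socket(mctx, AF_INET, SOCK_STREAM, 0);
    mtcp_bind(mctx, listener, (struct sockaddr *)&server_address, sizeof(server_address));
    mtcp_listen(mctx, listener, 4096);  /* larger accept queue to absorb batched SYNs */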

ChrisWint commented 6 years ago

Hi, thanks for the quick reply

Sadly that does not solve the problem; I was already running with a backlog of 16384. The only thing that improves the situation a bit is reducing the TCP timeout, so it seems as if the connections aren't being closed properly, but I can't find anything in my code that would cause that. I included the relevant parts below. I also reduced the packet processing to a minimal XOR to make sure the server is not taking too long to handle each packet and thereby causing the problem.

Is there any way to check/log what happens to the packets in the backlog?
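
For reference, the timeout reduction I mentioned was a change to mtcp_server.conf along these lines (the exact values here are just illustrative):

    # excerpt from mtcp_server.conf
    tcp_timeout = 10      # lowered from 30; this was the only change that helped a bit
    tcp_timewait = 0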

Code: I removed the irrelevant error handling after each operation (checking return values < 0, etc.) as it is not triggered when running the code.

Server code

    int listener_socket_descriptor = 0, connection_socket_descriptor = 0;
    char recv_buff[buffsize];

    unsigned max_fds = 10000 * 3;
    int core_limit = 1;
    struct mtcp_conf mcfg;
    mtcp_getconf(&mcfg);
    mcfg.num_cores = core_limit;
    mtcp_setconf(&mcfg);

    int ret = mtcp_init("mtcp_server.conf");

    mtcp_getconf(&mcfg);
    mcfg.max_concurrency = max_fds;
    mcfg.max_num_buffers = max_fds;
    mtcp_setconf(&mcfg);

    unsigned core = 0;
    mtcp_core_affinitize(core);
    mctx_t mctx = NULL;
    mctx = mtcp_create_context(core);

    listener_socket_descriptor = mtcp_socket(mctx, AF_INET, SOCK_STREAM, 0);

    mtcp_setsockopt(mctx, listener_socket_descriptor, SOL_SOCKET, SO_REUSEADDR, new int(1), sizeof(int));
    mtcp_bind(mctx, listener_socket_descriptor, reinterpret_cast<struct sockaddr*>(&server_address), sizeof(server_address));
    mtcp_listen(mctx, listener_socket_descriptor, 16384);

    // listen to connecting clients
    while(_is_running) {
        connection_socket_descriptor = mtcp_accept(mctx, listener_socket_descriptor, (struct sockaddr*)NULL, NULL);

        if (_is_running) {
            mtcp_read(mctx, connection_socket_descriptor, recv_buff, sizeof(recv_buff)-1);
            recv_buff[0] = recv_buff[0] ^ 1; //Symbolic for result processing
        }
        mtcp_close(mctx, connection_socket_descriptor);
    }

Client Code

    int socket_descriptor = 0;

    socket_descriptor = socket(AF_INET, SOCK_STREAM, 0);
    connect(socket_descriptor, (struct sockaddr *)&socket_address, sizeof(socket_address));
    write(socket_descriptor, _send_buf, sizeof(_send_buf));
    close(socket_descriptor);
eunyoung14 commented 6 years ago

Hi,

Do you also have an event-driven implementation of the application for mTCP? mTCP currently does not support blocking calls in its API (except for mtcp_epoll_wait()). The blocking features were implemented at first but are not maintained any more. It seems the app could accept flows at first, but fell asleep forever at some point when there was temporarily nothing to accept. We should have warned users about this.

mTCP sockets should be used in an event-driven, nonblocking way. You may find some examples in our Wiki and sample applications.
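
A very rough sketch of the intended pattern, reusing the names from your snippet and assuming a MAX_EVENTS constant (see the sample applications for complete versions):

    /* sketch: event-driven, nonblocking accept loop */
    mtcp_setsock_nonblock(mctx, listener_socket_descriptor);
    struct mtcp_epoll_event ev, events[MAX_EVENTS];
    int epollfd = mtcp_epoll_create(mctx, MAX_EVENTS);
    ev.events = MTCP_EPOLLIN;
    ev.data.sockid = listener_socket_descriptor;
    mtcp_epoll_ctl(mctx, epollfd, MTCP_EPOLL_CTL_ADD, listener_socket_descriptor, &ev);

    while (_is_running) {
        int n = mtcp_epoll_wait(mctx, epollfd, events, MAX_EVENTS, -1);
        for (int i = 0; i < n; i++) {
            if (events[i].data.sockid == listener_socket_descriptor) {
                /* drain the accept queue; the nonblocking listener returns -1 when it is empty */
                int c;
                while ((c = mtcp_accept(mctx, listener_socket_descriptor, NULL, NULL)) >= 0) {
                    mtcp_setsock_nonblock(mctx, c);
                    ev.events = MTCP_EPOLLIN;
                    ev.data.sockid = c;
                    mtcp_epoll_ctl(mctx, epollfd, MTCP_EPOLL_CTL_ADD, c, &ev);
                }
            } else {
                /* read from / close the connection sockets here */
            }
        }
    }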

ChrisWint commented 6 years ago

Ok, thanks. That explains the error in the non-epoll implementation, but not why epoll is failing under high load. The problem with epoll seems to be of a different nature: here the client fails to open new connections because the server is not closing the connections properly (they stay in TIME_WAIT until the timeout), whether they are closed server side, client side, or both. This eventually exhausts all free ports, so no new connections can be opened. I ported the same code to epoll with standard TCP and the problem does not show there; you can find the mTCP epoll code below. This leads me to believe it has to be an issue in mTCP, as the same TCP implementation works fine.

Epoll server (the client is identical to the one above):

    listener_socket_descriptor = mtcp_socket(mctx, AF_INET, SOCK_STREAM, 0);

    mtcp_setsockopt(mctx, listener_socket_descriptor, SOL_SOCKET, SO_REUSEADDR, new int(1), sizeof(int));

    mtcp_bind(mctx, listener_socket_descriptor, reinterpret_cast<struct sockaddr*>(&server_address), sizeof(server_address));

    mtcp_listen(mctx, listener_socket_descriptor, MAX_EVENTS);

    struct mtcp_epoll_event ev, events[MAX_EVENTS];
    int epollfd = mtcp_epoll_create(mctx, MAX_EVENTS);

    ev.events = EPOLLIN;
    ev.data.sockid = listener_socket_descriptor;

    mtcp_epoll_ctl(mctx, epollfd, EPOLL_CTL_ADD, listener_socket_descriptor, &ev);

    int n_fds;

    while (_is_running) {

        n_fds = mtcp_epoll_wait(mctx, epollfd, events, MAX_EVENTS, -1);
        for (int curr_event = 0; curr_event < n_fds; ++curr_event) {
            if (events[curr_event].data.sockid == listener_socket_descriptor) {
                int currentClientFd = mtcp_accept(mctx, listener_socket_descriptor, NULL, NULL);
                if (currentClientFd < 0) {
                    LOG_ERROR("ERROR ON ACCEPT");
                    cleanupAndExit(mctx, listener_socket_descriptor);
                }
                mtcp_setsock_nonblock(mctx, currentClientFd);

                ev.events = EPOLLIN;
                ev.data.sockid = currentClientFd;

                mtcp_epoll_ctl(mctx, epollfd, EPOLL_CTL_ADD, currentClientFd, &ev);
            } else {
                int r = mtcp_read(mctx, events[curr_event].data.sockid, recv_buff, sizeof(recv_buff) - 1);
                if (r == 0) {
                    mtcp_close(mctx, events[curr_event].data.sockid);
                } else {
                    //Tried closing here as well, did not make a difference
                    //mtcp_close(mctx, events[curr_event].data.sockid);
                    recv_buff[0] = recv_buff[0] ^ 142; //symbolic result processing
                }
            }
        }
    }
eunyoung14 commented 6 years ago

Then, do you see new connections being accepted after the timeout set in the configuration? I mean the tcp_timeout option. I think the server should be able to accept connections even though there are still some connections that aren't closed properly. If the number of concurrent connections does not reach max_concurrency, it should still work. (mTCP will show errors if the concurrency reaches the maximum.)

Regarding your application, could you also try to make the listening socket nonblocking?

listener_socket_descriptor = mtcp_socket(mctx, AF_INET, SOCK_STREAM, 0);
mtcp_setsock_nonblock(mctx, listener_socket_descriptor);
...

Thanks

ChrisWint commented 6 years ago

Hey, the problems that arise from this are not really a server-side issue; the server per se is working fine, it is simply not closing connections properly. The problem with the open connections is that they are also still open on the client, thus blocking ports there. If I want to open 65000 connections in a minute (somewhat reasonable for sensor data reporting), the client blocks because it can't find a free port to bind to. But as this happens only with the mTCP server implementation, not with standard TCP, I figured there has to be a problem in mTCP's connection handling, more specifically in the connection teardown. The connection sockets are closed on both sides, so there is no reason for mTCP to leave the connection open.

I will however try setting the socket to nonblocking, but I can't see how this would help close the connection (i.e. send the ACK/FIN messages server side).

ChrisWint commented 6 years ago

Okay, I debugged the packets in detail and it seems there is no fault in mTCP, my bad. In the plain TCP-to-TCP connection the connection gets shut down with an RST from the server, e.g.

Client             Server
FIN,ACK   ->
                 <-  RST,ACK

while mTCP uses the proper teardown. I will rewrite my application to reuse connections to avoid this problem.
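
Roughly what I have in mind for the reworked client: keep one connection open and send many packets over it instead of opening one connection per packet (reusing the names from the client snippet above; num_packets is a placeholder):

    /* sketch: reuse a single TCP connection for many small packets */
    int socket_descriptor = socket(AF_INET, SOCK_STREAM, 0);
    connect(socket_descriptor, (struct sockaddr *)&socket_address, sizeof(socket_address));

    for (int i = 0; i < num_packets; ++i) {
        write(socket_descriptor, _send_buf, sizeof(_send_buf));  /* one small payload per iteration */
    }

    close(socket_descriptor);  /* a single teardown instead of one per packet */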

Thank you so much for your help!

eunyoung14 commented 6 years ago

You're welcome. We also had a function, mtcp_abort(), which immediately closes the connection with an RST, for benchmarking purposes, but it isn't maintained any more. If you are interested, please look at mtcp/src/api.c and mtcp/src/include/mtcp_api.h. I cannot guarantee that it will work, but the implementation still exists in api.c. You can simply add the declaration for mtcp_abort() to mtcp_api.h to bring it back.
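
Assuming the implementation in api.c takes the context and the socket id (please verify the prototype against mtcp/src/api.c), the declaration to add would look something like:

    /* assumed prototype; check it against the definition in mtcp/src/api.c */
    int mtcp_abort(mctx_t mctx, int sockid);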

I'll close this issue since it has been resolved. Please feel free to open this again or create a new issue if you have further questions. Thanks.