mtcp-stack / mtcp

mTCP: A Highly Scalable User-level TCP Stack for Multicore Systems

WIP: CCP Integration #209

Closed fcangialosi closed 5 years ago

fcangialosi commented 5 years ago

(This PR is intended for iterating on the integration of CCP with mTCP, it is not yet ready to be merged.)

This PR includes a few components:

  1. A new apps/perf mTCP application that runs a bulk transfer for a given amount of time. One side is written in C with mTCP sockets and the other in Python using traditional sockets, so the receiver need not set up mTCP/DPDK. Details about this app and how to run it are in apps/perf/README.md.
  2. A few additional features necessary before CCP could be integrated
    • Send rate limiting, implemented as both pacing and a token bucket (see the sketch after this list)
    • SACK receiver support (i.e., decoding SACK information from the header and maintaining a SACK table)
    • A hash map from socket ID to socket object (because CCP identifies flows by sid)
  3. A few tweaks that were necessary to get reasonable behavior with CCP (these should be investigated further before being merged)
    • Fixed TCP fast retransmit (resetting cwnd after exiting fast retransmit)
    • Stopped doing "go-back-N" when a loss occurs; instead, we simply wait for the receiver to catch up
    • Added pacing to mitigate burstiness
  4. The CCP integration itself. (87d84931) All of this code is guarded by a USE_CCP flag so that it can easily be turned on or off at compile time.
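
For a concrete picture of the rate limiting in item 2, here is a minimal token-bucket sketch in C. This is not the PR's code (the type and field names are illustrative; the real implementation lives in the rate-limiting commit): tokens accrue in bytes at the target rate up to a burst cap, and a packet may be sent only if enough tokens are available.

```c
#include <stdbool.h>
#include <stdint.h>

/* Illustrative token bucket: refills at rate_bps bytes/sec, bursts capped at cap bytes. */
typedef struct {
    uint64_t tokens;    /* currently available bytes */
    uint64_t cap;       /* maximum burst size, bytes */
    uint64_t rate_bps;  /* refill rate, bytes per second */
    uint64_t last_us;   /* timestamp of last refill, microseconds */
} token_bucket;

static void tb_refill(token_bucket *tb, uint64_t now_us)
{
    uint64_t elapsed_us = now_us - tb->last_us;
    tb->tokens += elapsed_us * tb->rate_bps / 1000000;
    if (tb->tokens > tb->cap)
        tb->tokens = tb->cap;
    tb->last_us = now_us;
}

/* Returns true and consumes tokens if a packet of `bytes` may be sent now. */
static bool tb_try_send(token_bucket *tb, uint64_t now_us, uint32_t bytes)
{
    tb_refill(tb, now_us);
    if (tb->tokens < bytes)
        return false;
    tb->tokens -= bytes;
    return true;
}
```

Pacing differs in that instead of allowing bursts up to the cap, it spaces individual packet transmissions out at the target rate; the PR implements both.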

Each of the individual changes is in a separate commit, so they should be pretty easy to isolate and understand. Where necessary, I've added more details in the commit messages themselves.

Below I'll describe a bit about CCP and specifically how the integration is implemented.

CCP Architecture Overview

In the CCP architecture, rather than implementing congestion control entirely within the datapath, we split the implementation between the datapath (in this case, the mTCP transport code) and a separate user-space agent. The datapath component is restricted to a simple LISP-like language and is primarily used for collecting statistics and dictating sending behavior, while the user space component can be arbitrarily complex and is written in Rust or Python. Thus, a congestion control algorithm in CCP is actually two programs that work in tandem and communicate asynchronously via IPC.

Serialization and deserialization of messages between CCP and the datapath, as well as the execution engine for the datapath language, are implemented in libccp, a C library we provide, so integrating with CCP is fairly simple. At a high level, the datapath has three responsibilities:

  1. Create an IPC mechanism for communicating with the CCP agent and listen for messages on it. When messages are received, call libccp.ccp_read_msg(msg).
  2. Implement the following seven simple functions and pass these function pointers to libccp.ccp_init on startup (see the sketch after this list):
    • set_cwnd, set_rate_abs, and set_rate_rel allow libccp to modify sending behavior for a given connection.
    • send_msg allows libccp to send a message using the datapath's IPC mechanism.
    • now, since_usecs, and after_usecs provide libccp with a notion of time, used for implementing timers.
  3. On each ACK, update a measurement data structure and then call libccp.ccp_invoke.
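
To make step 2 concrete, here is a rough C sketch of registering the seven callbacks with ccp_init. The callback names come from the list above, but the struct layout and signatures are paraphrased and should be checked against libccp's ccp.h; the stub bodies and the setup_ccp_datapath wrapper are illustrative.

```c
#include <stdint.h>
#include "ccp.h"  /* libccp; exact declarations live in its ccp.h */

/* Sending-behavior hooks: push the agent's decisions into the mTCP stream,
 * e.g. write cwnd into the stream's send variables, or feed the rate limiter. */
static void dp_set_cwnd(struct ccp_datapath *dp, struct ccp_connection *conn, uint32_t cwnd) { /* ... */ }
static void dp_set_rate_abs(struct ccp_datapath *dp, struct ccp_connection *conn, uint32_t rate) { /* ... */ }
static void dp_set_rate_rel(struct ccp_datapath *dp, struct ccp_connection *conn, uint32_t factor) { /* ... */ }

/* IPC hook: write a serialized message to the socket connected to the CCP agent. */
static int dp_send_msg(struct ccp_datapath *dp, struct ccp_connection *conn, char *msg, int len) { return 0; /* ... */ }

/* Time hooks, used by libccp to implement timers. */
static uint64_t dp_now(void) { return 0; /* current time, e.g. microseconds */ }
static uint64_t dp_since_usecs(uint64_t then) { return dp_now() - then; }
static uint64_t dp_after_usecs(uint64_t usecs) { return dp_now() + usecs; }

/* Called once at startup, before any connections exist. */
int setup_ccp_datapath(void)
{
    static struct ccp_datapath dp = {
        .set_cwnd     = dp_set_cwnd,
        .set_rate_abs = dp_set_rate_abs,
        .set_rate_rel = dp_set_rate_rel,
        .send_msg     = dp_send_msg,
        .now          = dp_now,
        .since_usecs  = dp_since_usecs,
        .after_usecs  = dp_after_usecs,
    };
    return ccp_init(&dp);
}
```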

Integration Implementation Details

In core.c:MTCPRunThread, we call ccp.c:setup_ccp_connection to create the IPC socket (for now, I've used unix sockets, though we are also working on a shared-memory IPC interface), and then we create a new thread (core.c:CCPRecvLoopThread) that listens on this socket and calls ccp_read_msg whenever a message arrives.
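
As a rough illustration of that receive loop, the sketch below binds a unix datagram socket and forwards each message to libccp. The socket path, buffer size, and the assumption that ccp_read_msg takes the buffer plus its length are mine; the real code is in core.c:CCPRecvLoopThread.

```c
#include <stdio.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/un.h>
#include <unistd.h>
#include "ccp.h"  /* for ccp_read_msg */

/* Illustrative unix-datagram loop that feeds incoming CCP messages to libccp. */
static void *ccp_recv_loop(void *arg)
{
    char buf[4096];
    struct sockaddr_un addr;
    int fd = socket(AF_UNIX, SOCK_DGRAM, 0);

    (void)arg;
    if (fd < 0)
        return NULL;

    memset(&addr, 0, sizeof(addr));
    addr.sun_family = AF_UNIX;
    strncpy(addr.sun_path, "/tmp/ccp/0/out", sizeof(addr.sun_path) - 1); /* path is illustrative */
    unlink(addr.sun_path);
    if (bind(fd, (struct sockaddr *)&addr, sizeof(addr)) < 0) {
        perror("bind ccp socket");
        return NULL;
    }
    for (;;) {
        ssize_t n = recv(fd, buf, sizeof(buf), 0);
        if (n > 0)
            ccp_read_msg(buf, (int)n); /* hand the serialized message to libccp */
    }
    return NULL;
}
```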

In tcp_stream.c, we call ccp.c:ccp_create to indicate that a new connection has started.

In tcp_in.c, on each ACK we call ccp.c:ccp_cong_control and pass it information about that ACK. Internally, this updates a measurement structure with information from the cur_stream struct and then calls ccp_invoke, which runs the datapath program. We also call ccp.c:ccp_record_event when a triple duplicate ACK is detected, so that we can notify the user space agent about loss.
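
In sketch form, this per-ACK path boils down to copying ACK statistics into libccp's per-connection measurement primitives and then invoking the datapath program. The primitive field names below are paraphrased; libccp's struct ccp_primitives defines the real ones.

```c
#include <stdint.h>
#include "ccp.h"  /* libccp: struct ccp_connection, ccp_invoke */

/* Illustrative per-ACK hook; in the PR this logic lives in ccp.c:ccp_cong_control. */
static void on_ack(struct ccp_connection *conn, uint32_t bytes_acked,
                   uint64_t rtt_sample_us, uint32_t packets_in_flight)
{
    /* Update the measurement structure from the current stream state. */
    conn->prims.bytes_acked       = bytes_acked;
    conn->prims.rtt_sample_us     = rtt_sample_us;
    conn->prims.packets_in_flight = packets_in_flight;

    /* Run the datapath program: it may fold these values into a report for
     * the agent and/or adjust cwnd/rate via the callbacks passed to ccp_init. */
    ccp_invoke(conn);
}
```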

Finally, in timer.c, when a timeout is detected, we call ccp.c:ccp_record_event so that we can also notify the user space agent about timeouts.

Building

I've updated tcp/src/Makefile.in and apps/example/Makefile.in to reflect these changes, but I'm not too experienced with automake, so this might need some tweaking.

Testing CCP Integration

First, you'll need to install Rust. The easiest way is to use rustup.rs:

```
curl https://sh.rustup.rs -sSf | sh -s -- -y -v --default-toolchain nightly
```

Next, you'll need to build a CCP algorithm. Our generic-cong-avoid package implements standard TCP Reno and Cubic, so it is probably the best one to start with. The same steps can be followed to build any of the other algorithms hosted in the ccp-project organization, such as bbr.

```
git clone https://github.com/ccp-project/generic-cong-avoid.git
cd generic-cong-avoid
cargo +nightly build
```

You'll also need to build the perf application:

```
cd mtcp/apps/perf && make
```

Unfortunately, CCP algorithms currently expect that the transport layer is already running when they start up, so for now it is necessary to start the mTCP application and have it wait, then start CCP, and then start the traffic. This problem will disappear once CCP runs as a thread within mTCP.

I've been using the following three steps to run tests:

  1. Start the perf sender (this will wait until you start the receiver, which initiates the connection):

```
cd mtcp/apps/perf
sudo env LD_LIBRARY_PATH=[absolute-path-to-mtcp/src/libccp] ./client wait [port] 30
```

  2. Start CCP (you can replace reno with cubic):

```
cd generic-cong-avoid
sudo ./target/debug/reno --ipc unix
```

  3. Start the perf receiver on another machine, which will start a 30-second bulk transfer:

```
cd mtcp/apps/perf
sudo python recv.py send [sender-ip] [port]
```

eunyoung14 commented 5 years ago

Thanks Frank! I have merged your PR, but disabled CCP by default for now due to some issues with larger concurrency. I'm going to update our configure script to support an enable flag for CCP.