
Flow-IPC: Modern C++ toolkit for fast inter-process communication (IPC)

In this context, IPC means the sharing or transmission of a data structure from one process to another. In C++ systems programming, this is a common activity with significant impact on system performance. E.g., it is used heavily in microservices.

In serious C++ applications, high-performance IPC code tends to be difficult to develop and reuse, and the most obvious and effective technique to combat latency -- avoiding copying -- further increases the difficulty and decreases reusability by an order of magnitude.

This project -- Flow-IPC -- enables C++ code for IPC that is both performant and easy to develop/reuse, with no trade-off between the two.

Flow-IPC is for C++17 (or higher) programs built for Linux that run on x86-64 processors. (Support for macOS/BSD and ARM64 is planned as an incremental task. Adding networked IPC is also a natural next step, depending on demand.)

Documentation

The guided Manual explains how to use Flow-IPC. A comprehensive Reference is inter-linked with that Manual.

The project web site (https://flow-ipc.github.io/) contains links to documentation for each individual release as well.

Please see the Primer below in this README for the specifics of Flow-IPC.

Obtaining the source code

Installation

See INSTALL guide.

Contributing

See CONTRIBUTING guide.


Flow-IPC Primer

Background

Flow-IPC focuses on IPC of data structures (and native sockets a/k/a FDs). I.e., the central scenario is: Process P1 has a data structure X, and it wants process P2 to access it (or a copy thereof) ASAP.

The OS and third parties already provide C++ developers with many tools for and around IPC -- Unix domain sockets, message queues, shared memory, and serialization frameworks such as Cap'n Proto among them.

Conceptually, the IPC op above is not so different from triggering a function call F(X) in a different thread -- but across process boundaries. Unfortunately, in comparison to triggering F(X) in another thread in-process (sketched below), doing so across a process boundary is far harder to code robustly and, with the conventional tools, typically involves copying X at least once -- which is exactly the latency cost Flow-IPC aims to eliminate.
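
For reference, here is the in-process baseline being compared against: handing X to another thread costs essentially nothing, because only a pointer changes hands. (Minimal illustrative sketch; Widget and f() are hypothetical stand-ins.)

  #include <memory>
  #include <thread>
  #include <vector>

  struct Widget { std::vector<char> payload; };  // Stand-in for "data structure X."

  void f(const std::shared_ptr<Widget>& x) { /* ... use *x ... */ }  // The "F" in F(X).

  int main()
  {
    auto x = std::make_shared<Widget>();
    x->payload.resize(1024 * 1024);      // Fill X with a nontrivial amount of data.
    std::thread worker{[x] { f(x); }};   // "Send" X to another thread: only the smart pointer is copied.
    worker.join();
  }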

How does Flow-IPC help?

With Flow-IPC, the above IPC op is easy to code, for any form of "X," whether: blobs, FDs, nested STL-compliant containers, C-style structs with pointers, or Cap'n Proto schema-based structured data.

Moreover, it eliminates all copying of X -- which results in the best possible performance. This is called end-to-end zero-copy.

Example: End-to-end zero-copy performance, Cap'n Proto payload

graph: perf_demo capnp-classic versus capnp-Flow-IPC

The graph above, produced with the included perf_demo tool, is an example of the performance gains you can expect when using Flow-IPC zero-copy transmission. (Here we use Cap'n Proto-described data; native C++ structures have a similar performance profile.) In the graph, we compare the RTTs (latencies) of two techniques, for transmitted payloads of various sizes.

In this example, app 1 is a memory-caching server that has pre-loaded into RAM a few files ranging in size from 100 kB to 1 GB. App 2 (client) requests a file of some size. App 1 (server) responds with a single message containing the file's data structured as a sequence of chunks, each accompanied by that chunk's hash:

  # Cap'n Proto schema (.capnp file; from it the capnp compiler tool generates .h and .c++ source code):

  @0xf8a1b2c3d4e5f607;  # Unique file ID.  (Placeholder for this excerpt; generate your own with `capnp id`.)
  using Cxx = import "/capnp/c++.capnp";
  $Cxx.namespace("perf_demo::schema");

  using Hash = UInt64;  # (Assumed for this excerpt: a 64-bit hash of a blob's contents.)

  struct Body
  {
    union
    {
      getCacheReq @0 :GetCacheReq;
      getCacheRsp @1 :GetCacheRsp;
    }
  }

  struct GetCacheReq
  {
    fileName @0 :Text;
  }
  struct GetCacheRsp
  {
    # We simulate the server returning the file in multiple equally-sized chunks; the chunk size is
    # chosen at the server's discretion.
    struct FilePart
    {
      data @0 :Data;
      dataSizeToVerify @1 :UInt64; # Recipient can verify that `data` blob's size is indeed this.
      dataHashToVerify @2 :Hash;   # Recipient can hash `data` and verify it is indeed this.
    }
    fileParts @0 :List(FilePart);
  }
  # ...

App 2 receives the GetCacheRsp message and prints the round-trip time (RTT): from just before sending GetCacheReq to just after accessing some of the file data (e.g. rsp_root.getFileParts()[0].getDataHashToVerify() to check the first hash). This RTT is the IPC-induced latency: roughly speaking, the time penalty paid for splitting the work into app 1 and app 2 rather than keeping it in a monolithic (single-process) application.
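
In other words, the measurement brackets one request/response pair, including the first access of the received data. A minimal sketch of such a measurement (generic C++; the callable wraps whichever transport -- classic capnp-over-socket or Flow-IPC -- is being benchmarked):

  #include <chrono>
  #include <functional>

  // Measures one round trip: `do_request_and_touch_data` sends GetCacheReq, receives GetCacheRsp,
  // and accesses some of the returned file data, exactly as described above.
  std::chrono::microseconds measure_rtt(const std::function<void ()>& do_request_and_touch_data)
  {
    const auto t0 = std::chrono::steady_clock::now();  // Just before sending the request.
    do_request_and_touch_data();
    return std::chrono::duration_cast<std::chrono::microseconds>
             (std::chrono::steady_clock::now() - t0);  // Just after touching the response data.
  }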

Observations (tested using decent server-grade hardware):

The code for this, when using Flow-IPC, is straightforward. Here's how it might look on the client side:

  // Specify that we *do* want zero-copy behavior, by merely choosing our backing-session type.
  // In other words, setting this alias says, “be fast about Cap’n Proto things.”
  // (Different (subsequent) capnp-serialization-backing and SHM-related behaviors are available;
  // just change this alias’s value. E.g., omit `::shm::classic` to disable SHM entirely; or
  // specify `::shm::arena_lend::jemalloc` to employ jemalloc-based SHM. Subsequent code remains
  // the same! This demonstrates a key design tenet of Flow-IPC.)
  using Session = ipc::session::shm::classic::Client_session<...>;

  // IPC app universe: simple structs naming and describing the 2 apps involved.
  //   - Name the apps, so client knows where to find server, and server knows who can connect to it.
  //   - Specify certain items -- binary location, user/group -- will be cross-checked with the OS for safety.
  //   - Specify a safety/permissions policy, so that internally permissions are set as restrictively as possible,
  //     but not more.
  // The applications should share this code (so the same statement should execute in the server app also).
  const ipc::session::Client_app CLI_APP
    { "cacheCli",                                     // Name.
      "/usr/bin/cache_client.exec", CLI_UID, GID };   // Safety details.
  const ipc::session::Server_app SRV_APP
    { { "cacheSrv", "/usr/bin/cache_server.exec", SRV_UID, GID },
      { CLI_APP.m_name },                             // Which apps may connect?  cacheCli may.
      "",                                             // (Optional path override; disregard.)
      ipc::util::Permissions_level::S_GROUP_ACCESS }; // Safety/permissions selector.
  // ...

  // Open session e.g. near start of program.  A session is the communication context between the processes
  // engaging in IPC.  (You can create communication channels at will from the `session` object.  No more naming!)
  Session session{ CLI_APP, SRV_APP, on_session_closed_func };
  // Ask for 1 communication *channel* to be available on both sides from the very start of the session.
  Session::Channels ipc_raw_channels(1);
  session.sync_connect(session.mdt_builder(), &ipc_raw_channels); // Instantly open session -- and the 1 channel.
  auto& ipc_raw_channel = ipc_raw_channels[0];
  // (Can also instantly open more channel(s) anytime: `session.open_channel(&channel)`.)

  // ipc_raw_channel is a raw (unstructured) channel for blobs (and/or FDs).  We want to speak capnp over it,
  // so we upgrade it to a struc::Channel -- note the capnp-generated `perf_demo::schema::Body` class, as
  // earlier declared in the .capnp schema.
  Session::Structured_channel<perf_demo::schema::Body>
    ipc_channel{ nullptr, std::move(ipc_raw_channel), // "Eat" the raw channel object.
                 ipc::transport::struc::Channel_base::S_SERIALIZE_VIA_SESSION_SHM, &session };
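  // (S_SERIALIZE_VIA_SESSION_SHM: outgoing messages' capnp serializations are placed directly in the
  //  session's SHM arena, so the other process reads that same memory rather than a copy -- this is
  //  the end-to-end zero-copy discussed above.)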
  // Ready to exchange capnp messages via ipc_channel.

  // ...

  // Issue request and process response.  TIMING FOR ABOVE GRAPH STARTS HERE -->
  auto req_msg = ipc_channel.create_msg();
  req_msg.body_root()
    ->initGetCacheReq().setFileName("huge-file.bin"); // Vanilla capnp code: call Cap'n Proto-generated-API: mutators.
  const auto rsp_msg = ipc_channel.sync_request(req_msg); // Send message; get ~instant reply.
  const auto rsp_root = rsp_msg->body_root().getGetCacheRsp(); // More vanilla capnp work: accessors.
  // <-- TIMING FOR ABOVE GRAPH STOPS HERE.
  // ...
  verify_hash(rsp_root, some_file_chunk_idx);

  // ...

  // More vanilla Cap'n Proto accessor code.
  void verify_hash(const perf_demo::schema::GetCacheRsp::Reader& rsp_root, size_t idx)
  {
    const auto file_part = rsp_root.getFileParts()[idx];
    if (file_part.getDataHashToVerify() != compute_hash(file_part.getData()))
    {
      throw Bad_hash_exception(...);
    }
  }
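
Note that compute_hash() above is not a Flow-IPC API; it is simply whatever hash function the two applications agree to apply to each chunk's bytes. Purely for illustration, assuming the Hash field is a 64-bit value, it might be something like FNV-1a:

  #include <cstdint>
  #include <capnp/blob.h>  // capnp::Data

  // Illustrative only: 64-bit FNV-1a over the chunk's bytes.  The real perf_demo may use a different
  // hash; the only requirement is that sender and recipient compute the same function.
  uint64_t compute_hash(const capnp::Data::Reader& blob)
  {
    uint64_t h = 0xcbf29ce484222325ull;  // FNV-1a offset basis.
    for (const auto b : blob)
    {
      h ^= static_cast<uint8_t>(b);
      h *= 0x100000001b3ull;             // FNV-1a prime.
    }
    return h;
  }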

In comparison, without Flow-IPC: to achieve the same thing with end-to-end zero-copy performance, a large amount of difficult code would be required, including management of SHM segments whose names and cleanup have to be coordinated between the 2 applications. Even without zero-copy -- i.e., simply ::write()ing a copy of the capnp serialization of req_msg to a Unix domain socket FD and ::read()ing rsp_msg back from it -- sufficiently robust code would still be non-trivial to write.
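
For a sense of that copy-based baseline, here is roughly what the client-side exchange might look like with stock Cap'n Proto serialization over an already-connected Unix domain socket FD. (Simplified sketch: one message per direction, no framing or error handling, and sock_fd is assumed to be connected beforehand.)

  #include <capnp/message.h>
  #include <capnp/serialize.h>
  // ...plus the capnp-generated header for the schema shown earlier.

  void classic_request_response(int sock_fd)
  {
    // Build the request: this fills a heap-based serialization...
    capnp::MallocMessageBuilder req_msg;
    req_msg.initRoot<perf_demo::schema::Body>()
      .initGetCacheReq().setFileName("huge-file.bin");
    // ...which is then copied into the socket (and copied again by the kernel to the peer).
    capnp::writeMessageToFd(sock_fd, req_msg);

    // Read the entire (possibly huge) response out of the socket -- copying it yet again --
    // before any of it can be accessed.
    capnp::ReaderOptions opts;
    opts.traversalLimitInWords = 1ull << 33;  // Raise the default limit to allow ~GB-sized messages.
    capnp::StreamFdMessageReader rsp_msg{sock_fd, opts};
    const auto rsp_root = rsp_msg.getRoot<perf_demo::schema::Body>().getGetCacheRsp();
    // ...access rsp_root.getFileParts() etc. as before...
  }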

The preceding example was chosen for 2 reasons:

So Flow-IPC is for transmitting Cap'n Proto messages?

Yes... but only among other things!

Flow-IPC provides API entry points at every layer of operation; it is not designed as merely a "black box" of capabilities, and advanced users can work with the lower-level building blocks directly.


What's next?

If the example and/or promises above have piqued your interest:

A small, complete example transmits (with zero-copy) a structured message containing the string Hello, world! and the number 42.

In the Manual, the API Overview / Synopsis summarizes (with code snippets) what is available in Flow-IPC.

These diagrams from the Manual might also be helpful in showing what's available and/or some of what is going on underneath.


Here's a bird's-eye view of Flow-IPC (left) and a compact exploration of a single Flow-IPC channel (right).

graph: left-sessions/channels/arena right-channel capabilities


The following diagram delves deeper, introducing (roughly speaking) the core layer of ipc::transport.

graph: IPC channels (core layer); SHM arenas; and your code