bazelbuild / reclient

Apache License 2.0

Remote Execution Client

This repository contains a client implementation of the Remote Execution API that works with the Remote Execution API SDKs.

Reclient integrates with an existing build system to enable remote execution and caching of build actions.

When used with a server implementation of the Remote Execution API, it helps reduce build times by applying two main techniques:

  1. Distribution of the load by executing individual build actions in parallel on separate remote workers instead of on one build machine so that the build actions that are executed in parallel don’t compete for the same local resources.
  2. RE Server instance-wide cache for build actions, inputs, and artifacts. As a consequence, the results of a build action that was already executed for exactly the same inputs on the same instance of RE Server will be fetched from the cache, even if the action was never executed on the local machine.

Most clients are expected to see a performance improvement in their builds after migrating from local to remote builds. However, builds with a high number of deterministic build actions that can be executed in parallel are expected to see the greatest improvement.

Reclient consists of the following main binaries:

  1. rewrapper - a wrapper that forwards build commands to RBE
  2. reproxy - a process that should be started at the beginning of the build and shut down at the end. It communicates with RBE to execute build actions remotely and/or fetch build artifacts from RE Server's CAS (Content Addressable Storage).
  3. bootstrap - starts and stops reproxy, and aggregates the metrics during the shutdown.
  4. scandeps_server - a standalone process for scanning includes of C(++) compile actions. Started and stopped automatically by reproxy.

Note

This is not an officially supported Google product.

Prerequisites

Building

re-client currently builds and is supported on Linux / Mac / Windows.

Once you've installed Bazel, and are in the re-client repo:

Build the code

To build a complete set of binaries for reclient with a clangscandeps deps scanner:

$ bazelisk build --config=clangscandeps //:artifacts_tar
[...]
Target //:artifacts_tar up-to-date:
  bazel-bin/artifacts.tar

To build a complete set of binaries for reclient with a goma deps scanner:

$ bazelisk build --config=goma //:artifacts_tar
[...]
Target //:artifacts_tar up-to-date:
  bazel-bin/artifacts.tar

Install binaries (linux and mac only)

To install all binaries to $BINDIR:

$ bazelisk run --config=goma //:artifacts_install -- --destdir $BINDIR
[...]
INFO: Running command line: bazel-bin/artifacts_install --destdir $BINDIR

Run unit tests

$ bazelisk test //pkg/... //internal/...
[...]
INFO: Elapsed time: 77.166s, Critical Path: 30.24s
INFO: 472 processes: 472 linux-sandbox.
INFO: Build completed successfully, 504 total actions
//internal/pkg/cli:go_default_test                                       PASSED in 0.2s
//internal/pkg/deps:go_default_test                                      PASSED in 1.2s
//internal/pkg/inputprocessor/action/cppcompile:go_default_test          PASSED in 0.1s
//internal/pkg/inputprocessor/flagsparser:go_default_test                PASSED in 0.1s
//internal/pkg/inputprocessor/pathtranslator:go_default_test             PASSED in 0.1s
//internal/pkg/inputprocessor/toolchain:go_default_test                  PASSED in 0.2s
//internal/pkg/labels:go_default_test                                    PASSED in 0.1s
//internal/pkg/logger:go_default_test                                    PASSED in 0.2s
//internal/pkg/rbeflag:go_default_test                                   PASSED in 0.1s
//internal/pkg/reproxy:go_default_test                                   PASSED in 15.5s
//internal/pkg/rewrapper:go_default_test                                 PASSED in 0.2s
//internal/pkg/stats:go_default_test                                     PASSED in 0.1s
//pkg/cache:go_default_test                                              PASSED in 0.2s
//pkg/cache/singleflightcache:go_default_test                            PASSED in 0.1s
//pkg/filemetadata:go_default_test                                       PASSED in 2.1s
//pkg/inputprocessor:go_default_test                                     PASSED in 0.2s

Executed 16 out of 16 tests: 16 tests pass.

Reclient can be built to use Goma's input processor. Goma's input processor is 3x faster than clang-scan-deps for a typical compile action in Chrome. Build as follows:

bazelisk build //:artifacts_tar --config=goma

Versioning

There are four binaries that are built from this repository and used with Android Platform for build acceleration:

These binaries must be stamped with an appropriate version number before they are dropped into Android source for consumption.

Versioning Guidelines

  1. We will maintain a consistent version across all of the binaries. That means, when there are changes to only one of the binaries, we will increment the version number for all of them.

  2. To be consistent with the Semantic Versioning scheme, the version format is of the form “X.Y.Z.SHA” denoting “MAJOR.MINOR.PATCH.GIT_SHA”.

  3. Updating version numbers:

    MAJOR

    • Declare major version “1” when re-client is feature complete for caching and remote-execution capabilities.
    • Update major version post “1”, when there are breaking changes to interface / behavior of rewrapper tooling. Some examples of this are: changing any of the flag names passed to rewrapper, changing the name of rewrapper binary.

    MINOR - Update minor version when

    • New features are introduced in a backward compatible way. For example, when remote-execution capability is introduced.
    • Major implementation changes without changes to behavior / interface. For example, if the “.deps” file is changed to JSON format.

    PATCH - Update patch version

    • For all other bug fixes only. Feature additions (irrespective of how insignificant they are) should result in a MINOR version change.
    • Any new release to Android Platform of re-client tools should update the PATCH version at minimum.
  4. Release Frequency:

    • Kokoro release workflows can be triggered as often as necessary to generate new release artifacts.

How to update version numbers?

You can update the MAJOR/MINOR/PATCH version numbers by simply changing the version.bzl file present in the root of this repository.

Reclient releases

Reclient binaries are released to CIPD (Chrome Infrastructure Package Deployment) with separate packages for Linux, Mac (amd64 and arm64), and Windows. Each Reclient release provides two sets of binaries per platform, which use different include scanners for C++ build actions: clang-scan-deps and goma. The binaries using the goma include scanner have a version number ending with the “-gomaip” suffix; the ones using clang-scan-deps have no suffix. Clients migrating from Goma should use the releases with the goma include scanner (with the -gomaip suffix).

Downloading Reclient binaries

Reclient binaries can be downloaded using CIPD's Web UI, with a CLI client, or using gclient's configuration.

Downloading binaries with CIPD CLI client

To download Reclient with GomaIP dependency scanner (used for building Chromium):

echo 'infra/rbe/client/${platform}' $RECLIENT_VERSION > /tmp/reclient.ensure
cipd ensure --root $CHECKOUT_DIR --ensure-file /tmp/reclient.ensure

To use Reclient with Clangscandeps (used for Android builds) instead, add -csd suffix to CIPD package:

echo 'infra/rbe/client/${platform}-csd' $RECLIENT_VERSION > /tmp/reclient.ensure

Downloading binaries with gclient

You can configure gclient to download Reclient binaries during the gclient sync phase. Gclient expects a DEPS file in the repository's root directory. The file lists the components that will be checked out during the sync phase. To check out Reclient, the file should have an entry similar to:

vars = {
    ...
    'reclient_version': '<version>',
    ...
}

deps = {
    ...
    '<checkout-directory>': {
        'packages': [
            {
                'package': 'infra/rbe/client/${{platform}}',
                'version': Var('reclient_version'),
            }
        ],
        'dep_type': 'cipd',
    },
}

This will instruct gclient to check out <version> of Reclient from the /infra/rbe/client/<platform> CIPD package into <checkout-directory>. Extracting the version to a variable (as in the example above) is optional, but has the benefit that the default value can be overridden through gclient's custom variables.

Note: The snippet above instructs gclient to download Reclient with the GomaIP dependency processor. If you prefer Reclient with Clangscandeps, set package to infra/rbe/client/${{platform}}-csd.

Using Reclient

Starting and stopping reproxy

Reclient requires reproxy to be started at the beginning of the build and stopped at the end. This is done through the bootstrap binary by executing the following commands:

Start:

bootstrap -re_proxy=$reproxy_location [-cfg=$reproxy_config_location]

Stop:

bootstrap -re_proxy=$reproxy_location -shutdown
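
The start and stop commands above are easiest to keep paired when a small wrapper runs them around the build. The following is a minimal sketch, not part of Reclient: the function name with_reproxy and the variables BOOTSTRAP, REPROXY, and REPROXY_CFG are hypothetical names chosen for this example. Only the -re_proxy, -cfg, and -shutdown flags come from the commands above.

```shell
#!/bin/sh
# Hypothetical helper (not shipped with Reclient): run a build command
# between a reproxy start and a reproxy shutdown.
# BOOTSTRAP, REPROXY and REPROXY_CFG are assumed variable names.
BOOTSTRAP="${BOOTSTRAP:-bootstrap}"
REPROXY="${REPROXY:-reproxy}"

with_reproxy() {
  # Start reproxy; -cfg is passed only when REPROXY_CFG is non-empty.
  "$BOOTSTRAP" -re_proxy="$REPROXY" ${REPROXY_CFG:+-cfg="$REPROXY_CFG"} || return 1

  # Run the wrapped build command and remember its exit status.
  with_reproxy_status=0
  "$@" || with_reproxy_status=$?

  # Shut reproxy down even if the build failed, then report the build status.
  "$BOOTSTRAP" -re_proxy="$REPROXY" -shutdown
  return "$with_reproxy_status"
}
```

For example, with_reproxy ninja -C out/Default would start reproxy, run Ninja, and shut reproxy down afterwards (the out/Default directory is only an illustration).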

Configuration

Each of Reclient's binaries can be configured by command line flags, environment variables, config files, or any combination of these (for example, some flags on the command line and others in the config file or in environment variables). If the same flag is set in more than one place, the order of precedence is as follows (from lowest to highest priority):

  1. Config file
  2. Environment variable
  3. Command line argument

To use a configuration file, specify it with the -cfg=$config_file_location flag. The config file is a list of flag_name=flag_value pairs, each on a new line. Example below:

service=$RE_SERVER_ADDRESS
instance=$RE_SERVER_INSTANCE
server_address=unix:///tmp/reproxy.sock
log_dir=/tmp
output_dir=/tmp
proxy_log_dir=/tmp
depsscanner_address=$scandeps_server_location #distributed with Reclient
use_gce_credentials=true

To configure Reclient with environment variables, the variables should be prefixed with RBE_ (e.g. the value of RBE_service environment variable is used to set the service flag).
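
The precedence rules above can be illustrated with a small lookup function. This is a sketch of the documented order only, not how Reclient itself parses flags: resolve_flag is a hypothetical helper, it relies on printenv to detect exported RBE_ variables, and it does not strip trailing comments from config lines.

```shell
#!/bin/sh
# Illustrative only: mimics the documented precedence
# (config file < environment variable < command line argument).
# resolve_flag is a hypothetical helper, not part of Reclient.
#
# Usage: resolve_flag FLAG_NAME CONFIG_FILE [COMMAND_LINE_VALUE]
resolve_flag() {
  flag_name=$1
  config_file=$2

  # Highest priority: a value given on the command line.
  if [ "$#" -ge 3 ]; then
    printf '%s\n' "$3"
    return 0
  fi

  # Next: an exported environment variable named RBE_<flag_name>.
  if env_value=$(printenv "RBE_${flag_name}"); then
    printf '%s\n' "$env_value"
    return 0
  fi

  # Lowest priority: the last flag_name=flag_value line in the config file.
  # Trailing comments are not stripped in this sketch.
  sed -n "s/^${flag_name}=//p" "$config_file" | tail -n 1
}
```

Calling resolve_flag service reproxy.cfg (reproxy.cfg being a placeholder file name) prints the value from RBE_service if that variable is exported, and otherwise the value from the config file.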

Rewrapper

Full list of rewrapper config flags can be found in docs/cmd-line-flags.md. A few of the most commonly used flags are:

If you are experiencing sporadic timeouts when dialing reproxy, you might consider adding:

Reproxy

The full list of reproxy flags can be found in docs/cmd-line-flags.md. A few of the most commonly used flags are:

Authentication flags

If your RE Server implementation does not use RPC authentication then use one of:

If your RE Server uses RPC authentication then use one of the following flags:

Auxiliary Metadata flag

If you want to collect backend workers' auxiliary metadata (CPU and memory usage per action), you can generate a .pb (or .proto.bin) file that contains the descriptor information reproxy uses at runtime to decode the auxiliary metadata, which is a proto message of type google.protobuf.Any.

Once you have customized the auxiliary_metadata.proto file to your backend worker's specification, compile it to a .pb or .proto.bin file with protoc, and pass the file path to reproxy via the --auxiliary_metadata_path flag or the RBE_auxiliary_metadata_path environment variable. At runtime, reproxy will use this file to parse your backend worker's auxiliary metadata and log the data into the reproxy logs.

cd api/auxiliary_metadata # or where you have the customized `.proto` file

protoc \
--proto_path=. \
--descriptor_set_out=auxiliary_metadata.pb \
auxiliary_metadata.proto

export RBE_auxiliary_metadata_path=~/Workspace/re-client/api/auxiliary_metadata/auxiliary_metadata.pb

# then continue with your regular build with reproxy

or

cd api/auxiliary_metadata # or where you have the customized `.proto` file

protoc \
--proto_path=. \
--descriptor_set_out=auxiliary_metadata.proto.bin \
auxiliary_metadata.proto

export RBE_auxiliary_metadata_path=~/Workspace/re-client/api/auxiliary_metadata/auxiliary_metadata.proto.bin

# then continue with your regular build with reproxy

Note that the backend can give this proto message any name; however, the client-side proto message must be named AuxiliaryMetadata to receive it. For example, in the unit test, the backend sends the proto message with the name WorkerAuxiliaryMetadata, and the client receives it as AuxiliaryMetadata.

Integration with the build system

To execute your build actions remotely through Reclient, the build command should be prepended with:

 $rewrapper [-cfg=$config-file] -exec_root=$checkout-dir --

, where:

When rewrapper is executed, it passes the build command to a running instance of reproxy that:

Before the build is executed, reproxy needs to be started by bootstrap and shut down at the end of the build.

During the run, reproxy writes its application-level logs to the directory specified by the log_dir flag, and logs records about the executed build actions to an RPL file in the directory specified by the proxy_log_dir flag. During reproxy shutdown, bootstrap dumps Reclient-related build metrics to an rbe_metrics.txt file saved at the location specified by bootstrap's output_path flag.
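
The rewrapper prefix described in this section can be sketched as a shell function that a build script calls instead of invoking the compiler directly. This is an assumption-laden illustration rather than an official integration: the function name remote and the variables REWRAPPER, REWRAPPER_CFG, and EXEC_ROOT are hypothetical, and falling back to the current directory for -exec_root is a choice made only for this sketch.

```shell
#!/bin/sh
# Hypothetical wrapper: prepend the rewrapper prefix to a build command.
# REWRAPPER, REWRAPPER_CFG and EXEC_ROOT are assumed variable names.
REWRAPPER="${REWRAPPER:-rewrapper}"

remote() {
  # -cfg is optional, so it is added only when REWRAPPER_CFG is non-empty.
  # Everything after "--" is the original build command, passed unchanged.
  "$REWRAPPER" ${REWRAPPER_CFG:+-cfg="$REWRAPPER_CFG"} \
    -exec_root="${EXEC_ROOT:-$PWD}" -- "$@"
}
```

With reproxy already running, remote clang++ -c foo.cc -o foo.o would hand the compile action to rewrapper (the file names are placeholders).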

GN integration

GN is a meta-build system that generates build files for Ninja. Its configuration files are written in a simple, dynamically typed language. Reclient can be integrated with the build by modifying the GN config files. Because GN's language is flexible, the integration method depends on the project, but it usually involves adding a rewrapper prefix that is controlled by a GN argument, and starting and stopping reproxy before and after the build. The latter might be done by a helper script that wraps the ninja call with reproxy start and stop steps.

CMake integration

You can integrate CMake with Reclient by using the <LANG>_COMPILER_LAUNCHER property. This property is initialized from the value of the CMAKE_<LANG>_COMPILER_LAUNCHER variable if it is set when a target is created. For instance, to use Reclient for C/C++ compile actions, you need to set both CMAKE_C_COMPILER_LAUNCHER and CMAKE_CXX_COMPILER_LAUNCHER to $rewrapper;-cfg=$config-file;-exec_root=$checkout-dir (the property accepts a semicolon-separated list as the launcher command).
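
As a rough sketch of the launcher setup, the two variables can be passed to CMake at configure time. The helper names rewrapper_launcher and configure_with_reclient are hypothetical, and using the checkout directory as the source tree with a build subdirectory is an assumption of this example, not a Reclient requirement.

```shell
#!/bin/sh
# Hypothetical helper: build the semicolon-separated launcher value.
# Arguments: path to rewrapper, path to its config file, checkout directory.
rewrapper_launcher() {
  printf '%s;-cfg=%s;-exec_root=%s\n' "$1" "$2" "$3"
}

# Configure a build tree with the launcher set for both C and C++.
# Using "$3" as the source tree and "$3/build" as the build tree is an
# assumption of this sketch.
configure_with_reclient() {
  launcher=$(rewrapper_launcher "$1" "$2" "$3")
  cmake -S "$3" -B "$3/build" \
    "-DCMAKE_C_COMPILER_LAUNCHER=$launcher" \
    "-DCMAKE_CXX_COMPILER_LAUNCHER=$launcher"
}
```

The launcher value is quoted as a single argument so that the shell does not treat the semicolons as command separators.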

Please note that CMake operates on absolute paths, so you need to ensure that the RE server executes the action on a remote worker in the same directory as on the local build machine (the method depends on your RE Server implementation). Moreover, rewrapper's canonicalize_working_dir flag alters the input paths of build actions, and thus should be disabled for build actions generated by CMake.

Example of CMake build integration with Reclient can be found here.