ninja-build / ninja

a small build system with a focus on speed
https://ninja-build.org/
Apache License 2.0

Add a documentation section about integrating with third-party software #982

Open evmar opened 9 years ago

evmar commented 9 years ago

It would be nice to include info about how to use distcc or ElectricCloud's eMake.

This would serve both as documentation of those tools and also provide links to them for people who use ninja and don't know about them.

flybd5 commented 9 years ago

Accelerating Ninja with Electric Cloud's ElectricAccelerator®

ElectricAccelerator (http://electric-cloud.com/products/electricaccelerator/) is an acceleration platform that optimally parallelizes software tasks across clusters of physical or cloud CPUs. This gives software-driven organizations the ability to speed up any number of concurrent activities so they can deliver better software faster.

ElectricAccelerator offers the following unique capabilities:

[Screenshot listing ElectricAccelerator capabilities; image not reproduced]

ElectricAccelerator works on an agent model. Agents are daemons that run on agent boxes, usually about 1.5 agents per core (1.5 x the number of cores on the box). These agents are managed by a Cluster Manager, which is the gateway to the machines running the agents, collectively referred to as a cluster. eMake and its tools are installed on the build machines owned by developers or build managers, and that is where builds are initiated. Put very simply, eMake talks to the Cluster Manager and asks for agents for a task. Once it receives a list of the agents and their IP addresses, it orchestrates the parallelization of the build.

However, in the case of builds that use Ninja, eMake is not an option because it does not know how to read Ninja build files. Ninja does support parallelization in a manner similar to GMake, with both automatic parallelization based on the number of cores and a -j option that overrides the default, telling Ninja to run up to N processes at once to parallelize a build.

That means that if you want to parallelize using just Ninja, and you want blazing fast speed, you have to be prepared to spend blazing big money for large servers with high numbers of cores.

ElectricAccelerator, however, can give you the ability to pile up large numbers of inexpensive boxes with i7 processors, for example, to build a cluster of unlimited size. What it also provides is an alternate way to access the parallelization infrastructure of the cluster without having to use the eMake tool.

The key to this feature is a tool called Electrify, which is included at no extra charge. This tool can run an executable and monitor it as it spawns processes. These processes can then be intercepted and "sent" to the build cluster for parallel execution.

Now, this sounds simple on its face, but the requirements for successful parallelization are strict and can present challenges. The processes themselves must be just that, processes that are distinct and can run on their own without having to operate under some underlying runtime. This requirement excludes Java threads, because they cannot be separated from their underlying JRE without breaking the thread.

The processes also must not have dependencies between each other that would cause them to step over each other during execution, such as sharing files in read/write mode.

In addition, parallelization itself benefits from tasks that can be broken up into small pieces that run simultaneously. Some large tasks, such as linking libraries or executables or creating .tar or .deb packages, cannot be broken up, so sending them to the cluster does nothing for the performance of the build as a whole. All it does is add the overhead of shipping the task to an agent, creating network traffic, and so on, so these tasks are better off running on the machine from which the build was initiated. They are referred to as local tasks, and in ElectricMake they are designated as such with a #pragma runlocal statement in a Makefile. More on how Electrify can keep those running on the build machine in a minute.
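As a rough illustration of the local/remote split described above, here is a minimal Python sketch that classifies a command line by the name of its executable. The tool list and the `is_local_task` helper are hypothetical and are not part of ElectricAccelerator; the sketch only shows the kind of distinction being made, not how Electrify decides.

```python
import os
import re
import shlex

# Hypothetical list of link and packaging tool names; adjust for your build.
LOCAL_TOOL_PATTERN = re.compile(r"(ld|ld\.gold|ar|tar|dpkg-deb)")


def is_local_task(command_line):
    """Return True if the command's executable looks like a link or packaging tool."""
    argv = shlex.split(command_line)
    if not argv:
        return False
    tool = os.path.basename(argv[0])
    # fullmatch so that e.g. "tarsnap" is not mistaken for "tar"
    return LOCAL_TOOL_PATTERN.fullmatch(tool) is not None


if __name__ == "__main__":
    for cmd in ("/usr/bin/tar -czf chrome.tar.gz out/Release",
                "clang++ -c foo.cc -o foo.o"):
        print(cmd, "->", "local" if is_local_task(cmd) else "remote")
```

Matching on the executable name alone is a simplification: a link step driven through the compiler driver (for example `clang++ -o chrome ...`) would not be caught, so a real filter would probably need to look at the arguments as well.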

Because ElectricCloud has customers who like Ninja and don't want to switch to Makefiles, but who would also like to take advantage of our parallelization tool, I decided to see if I could parallelize Ninja builds with Electrify. The short answer is that it works, and it works very, very well.

The longer answer is that it takes some prep to be able to do this. This is a rough list of what needs to be done and it assumes you already have a cluster of agents and a Cluster Manager installed. If you are using the peer-to-peer ElectricAccelerator Huddle (http://electric-cloud.com/products/electricaccelerator/huddle) it will work as well, but of course it will be limited to the number of cores available in your peer-to-peer networked setup.

  1. Install ElectricAccelerator tools on the machine where you are currently running builds.
  2. Install whatever dependencies your build needs (such as shared libraries) on the machines where the agents reside. For the Google Chrome build, for example, this is done with the build/install-build-deps.sh script. If you don't do this, Electrify will still send those files to the agent when it needs them, but it will be more work for you and more work for the network and agents.
  3. You will need to identify what ElectricAccelerator refers to as the "eMake root" of the build. If you are using eMake already you know what this is -- one or more directories from which eMake will identify files that the agents need to complete an assigned task. In the case of the Google Chrome build, it's the ./chromium directory.
  4. My recommendation is to build a script with the command line you will use to launch the parallelized build. The command line I use to launch the Google Chrome build is this:

electrify --emake-annofile=build-@ECLOUD_BUILD_ID@.anno --emake-annodetail=basic --emake-annoupload=1 --emake-root=<home_dir_where_chrome_build_lives> --emake-cm=<ip_of_your_huddle_main_server> --electrify-allow-regexp=".*" -- ninja-linux64 -j 300 -C out/Release chrome

In order, the arguments mean:

Adjust as necessary. The documentation for all our products is located at http://docs.electric-cloud.com. The documentation for Electrify is in the ElectricAccelerator Electric Make User Guide.
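Since step 4 recommends wrapping the command line in a script, here is a minimal Python sketch of one way to do that. It only reassembles the flags from the command shown above. The function name, the example path and the example IP address are placeholders I made up for illustration, so substitute your own values.

```python
import shlex


def electrify_command(emake_root, cluster_manager, build_args,
                      allow_regexp=".*",
                      annofile="build-@ECLOUD_BUILD_ID@.anno"):
    """Assemble the electrify invocation from the post as an argv list."""
    return [
        "electrify",
        f"--emake-annofile={annofile}",
        "--emake-annodetail=basic",
        "--emake-annoupload=1",
        f"--emake-root={emake_root}",
        f"--emake-cm={cluster_manager}",
        f"--electrify-allow-regexp={allow_regexp}",
        "--",  # everything after this is the wrapped build command
        *build_args,
    ]


if __name__ == "__main__":
    cmd = electrify_command(
        emake_root="/home/builder/chromium",  # placeholder path
        cluster_manager="192.0.2.10",         # placeholder address
        build_args=["ninja-linux64", "-j", "300", "-C", "out/Release", "chrome"],
    )
    # Print a shell-quoted copy for review instead of launching the build.
    print(shlex.join(cmd))
```

Building the command as a list avoids shell-quoting problems with the regular expression. To actually launch the build, pass the list to `subprocess.run(cmd, check=True)` on a machine where the ElectricAccelerator tools are installed.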

A couple more details. I said Electrify can be told to keep selected processes from being sent to the cluster. You do that with the --electrify-not-remote=<x;y;> or --electrify-deny-regexp= command line options, explained in the User Guide.

Also, if you are familiar with Electric Insight and its fantastic visualization capabilities into your builds (see below), you will be disappointed to know that Electrify does not produce the annotations necessary for Insight's more advanced features, such as ElectricSimulator. Note that all of this can be done with the peer-to-peer ElectricAccelerator Huddle product that is currently in open beta.

Finally, if you do have Electric Insight, this is what you want to see as a result of a Ninja build parallelized with Electrify. It's a thing of parallelized beauty, the brick wall. :smile: This is the full clean Google Chrome build, shrunk to a mere 9 minutes and a handful of seconds.

**And the really awesome part of the story? The acceleration was done with agents on virtual machines running on Skytap (http://www.skytap.com) at a fraction of the cost of standing up, maintaining and operating an infrastructure. From a template, I had a 96-core cluster up and building in minutes. How awesome is that!?**

![image](https://cloud.githubusercontent.com/assets/6594563/8421958/433f01a6-1ea3-11e5-9631-50bd334105f2.png)

Any questions? http://ask.electric-cloud.com is your friend. Have fun with your new warp speed engine!

Juan Jimenez
Sr. Solutions Engineer
Electric Cloud, Inc.