realm / realm-core

Core database component for the Realm Mobile Database SDKs
https://realm.io
Apache License 2.0

Move build system to CMake. #1543

Closed simonask closed 7 years ago

simonask commented 8 years ago

As discussed on today's Core meeting, we have a number of significant pain points that may or may not be solved by changing build systems.

The purpose of this ticket is to discover exactly what we would gain and lose by switching to alternatives, and hopefully to gather input from all the people that are affected, including bindings and CI.

The current best candidate for an alternative is CMake, but other suggestions are also welcome (I personally don't know much about them, but Gyp, Gradle, SCons have been mentioned — if someone wants to make a case for those, this is also warmly welcomed).

I want to stress that keeping the current build system is a possible outcome of this, but in a way that entails fixing the pain points listed below at least in part. Please do not hesitate to weigh in with the benefits that we get from the current build system, because far from everybody on the team is aware of them all.

The primary pain points that must be addressed by the outcome of this discussion are:

On the other hand, we do gain at least one major benefit from the current system:

@rrrlasse @finnschiermer @ironage @kspangsege @danielpovlsen @teotwaki @tgoyne @bdash @jpsim @cmelchior @emanuelez @kneth @kristiandupont @fealebenpae

emanuelez commented 8 years ago

Very good initiative.

I would like to state some facts that I believe have some value.

The first one is the size of the current build system:

$ find . -name build.sh -o -name '*.mk' | xargs du -hc
108K    ./build.sh
4.0K    ./src/config.mk
 92K    ./src/generic.mk
4.0K    ./src/project.mk
4.0K    ./test/android/jni/Android.mk
4.0K    ./test/android/jni/Application.mk
4.0K    ./test-installed/generic.mk
4.0K    ./test-installed/project.mk
224K    total

The second one is about divergence. As we know, sync copied its build system from core. We can measure how much build.sh differs right now:

$ diff realm-core/build.sh realm-sync/build.sh | diffstat
unknown |  341 ++++++++++++++++++++++++++++++----------------------------------
1 file changed, 164 insertions(+), 177 deletions(-)

kneth commented 8 years ago

CMake is supported by major shells and editors: https://github.com/Kitware/CMake/tree/master/Auxiliary

simonask commented 8 years ago

@fealebenpae You seemed to have some very useful input regarding this, do you perhaps want to share here? :-)

fealebenpae commented 8 years ago

@simonask I'm high on seeming, but low on substance ;)

I've used CMake to build an Objective-C library precisely because it makes it very simple to produce binaries for iOS, macOS, tvOS and watchOS without having to manage duplicate targets. It also makes it straightforward to produce a static library or Cocoa Touch Framework from the same CMake target.

IMO CMake would be great for Core as it would simplify building across the major toolchains - Xcode, Android makefiles, Visual Studio (:stuck_out_tongue_winking_eye:). I would go as far as to suggest that ObjectStore and the .Net P/Invoke wrappers would benefit from switching to CMake just as much.

The only real problem with CMake is how you integrate it in other projects - you can add the CMake-generated .xcodeproj to your workspace and wire up dependencies, but you'd have to do this every time you regenerate. You can circumvent this by adding an Aggregate target in Xcode which will invoke CMake and pass the correct build settings inferred at build-time - SDKROOT, ARCHS, etc. A similar approach would apply for Gradle and MSBuild as well. In fact, that's how I ended up building WebKit so I could link against JavaScriptCore. Actually, I would direct anyone interested to WebKit as a great example of CMake use.

Oh, and CLion is an excellent IDE built around CMake.

kneth commented 8 years ago

I have found a guide on how to use CMake to build node.js extensions: https://www.npmjs.com/package/cmake-js

simonask commented 8 years ago

Alright, so our efforts to experiment with CMake have resulted in the branch su/cmake-experiment. Please feel free to check it out and experiment! One of the things it does is it uses an Xcode project to build for iOS/watchOS/tvOS instead of Makefiles.

Part of the branch, which is also up for discussion, replaces build.sh with a Rakefile. If you have Ruby installed on your system, you should be able to get up and running with familiar commands such as:

$ rake check-debug
$ rake lcov
$ rake build-cocoa

# See available commands:
$ rake -T

check-debug and friends generate a GNU Makefile build system, and they respond to the normal options such as REALM_MAX_BPNODE_SIZE, REALM_ENABLE_ASSERTIONS, REALM_ENABLE_ENCRYPTION, etc.
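A sketch of what such an invocation could look like, assuming the options are read from the environment exactly as build.sh does (the option names are from this thread; the pass-through mechanism is an assumption):

```shell
# Sketch: options supplied via the environment, build.sh-style,
# when generating and building the Makefile tree.
REALM_MAX_BPNODE_SIZE=1000 REALM_ENABLE_ASSERTIONS=yes rake check-debug
```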

By default, a Makefile system is generated in subdirectories such as build.make.debug, build.make.release, build.make.cover. The Xcode project for building for Apple platforms is placed in build.apple. These can be overridden by providing build_dir=<dir> as a command-line argument to rake.

So far the Rakefile is about 200 lines of code, a factor of 10 smaller than build.sh. Of course, it doesn't yet support everything that build.sh does (including building for Android, which is still pending). Still, using an off-the-shelf dependency resolution algorithm simplifies a lot of things that are quite complex today. On the other hand, we need to carefully examine whether the unfamiliar Ruby syntax is a net win.

danielpovlsen commented 8 years ago

I need parallelism. Maybe it is a silly / tiny thing, but this is currently ~an order of magnitude slower than build.sh.

simonask commented 8 years ago

I should have clarified — this is a 100% standard CMake environment, so if you create Makefiles, it's just make -jX. :-)

The Rakefile could easily be extended with the ability to detect number of cores (in the same way as build.sh) and pass that on to make (and also to set UNITTEST_THREADS).
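A sketch of what that auto-detection could look like, in the same spirit as build.sh (the per-OS commands below are an assumption, not the actual build.sh code):

```shell
# Detect the number of CPU cores portably, then hand it to make.
case "$(uname)" in
  Darwin) num_cores="$(sysctl -n hw.ncpu)" ;;
  Linux)  num_cores="$(grep -c '^processor' /proc/cpuinfo)" ;;
  *)      num_cores=1 ;;
esac
echo "detected $num_cores cores"
```

The Rakefile would then invoke make -j"$num_cores" and export UNITTEST_THREADS="$num_cores".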

fealebenpae commented 8 years ago

For parallelism on CMake I suggest using Ninja. It's easier to set up on Windows than Make, too. But yeah, make -jX works as well.
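For reference, switching generators is stock CMake behaviour; assuming Ninja is installed, it is just:

```shell
# Generate a Ninja build tree instead of Makefiles, then build.
# Ninja picks a reasonable default parallelism by itself.
cmake -G Ninja -DCMAKE_BUILD_TYPE=Debug ..
ninja
```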

rrrlasse commented 8 years ago

Except for minor issues (pthread.c wrapper not being in default search paths, network.cpp included in the project even though it shouldn't be), it worked flawlessly with Visual Studio. Only tried check-debug, though.

morten-krogh commented 8 years ago

Hi

I was reading a bit about build systems. Here is a possibly relevant video about Unity migrating their C++ and C# build system to Gradle.

https://www.youtube.com/watch?v=jmadc8xI_6I&noredirect=1

Their situation seems similar to ours: they have a lot of C++ code and they need to support various platforms.

I do not know if they made the right choice, but the speaker claims so.

The video also talks about some alternatives they considered. They migrated from Jamplus. They had to work together with Gradle to improve the native code support.

simonask commented 8 years ago

@danielpovlsen I have added auto-detection of #CPUs.

@fealebenpae I have verified that building with Ninja also works. We might consider this for CI, although it only makes a minor difference compared to GNU Make on my machine:

make -j8         212.64s user 14.14s system 605% cpu 37.468 total
ninja-build -j8  213.13s user 13.62s system 623% cpu 36.356 total

@morten-krogh Very interesting perspectives, although their problems seem massively more complex than ours. In particular I found her remarks on migrating build systems insightful.

danielpovlsen commented 8 years ago

Cool. However, it did not work on (my MacBook's) OS X. It seems that uname does not return a clean string here, since adding .chomp makes it work (on this machine):

task :guess_operating_system do
    # `uname` output ends in a trailing newline; chomp strips it
    @operating_system = `uname`.chomp
end

danielpovlsen commented 8 years ago

Pushed the fix

ironage commented 8 years ago

Very nice! You have my vote to move to CMake. Here are a couple of minor things I noticed while trying it out:

1. It would be nice for rake to have a default target, so that just running "rake" would output the possible targets like "rake -T" does.
2. Pressing ctrl-c to abort "rake memcheck-debug" does not actually stop memcheck from continuing to run in the background.
3. There are several targets supplied by build.sh which are not yet supported by this new configuration. In particular we would have to look closely at what Jenkins requires, but I suppose it would be trivial to map out what is needed when changing over. I just want to be cautious that we don't lose functionality in the migration.

    jenkins-pull-request:               Run by Jenkins for each pull request whenever it changes
    jenkins-pipeline-unit-tests:        Run by Jenkins as part of the core pipeline whenever master changes
    jenkins-pipeline-coverage:          Run by Jenkins as part of the core pipeline whenever master changes
    jenkins-pipeline-address-sanitizer: Run by Jenkins as part of the core pipeline whenever master changes

simonask commented 8 years ago

@ironage: I added support for most of the Jenkins targets just now, but I think more adjustment is needed to achieve full feature parity. It isn't really clear to me exactly what's needed in Jenkins, but it should be easy enough to figure out. :-)

danielpovlsen commented 8 years ago

As discussed elsewhere, I think we should strive to make this a 99.9% drop-in replacement for build.sh, such that nothing else (CI mainly) would be required to change simultaneously. That is, I prefer to change only one thing at a time. It would also make it easier to revert to the old build system (temporarily) in case this needs more development time.

teotwaki commented 8 years ago

I understand that reaching feature parity would be interesting, but then again, we're trying to solve pain points for CI, so replicating those pain points might be counterproductive.

teotwaki commented 8 years ago

With regards to the migration of CI, I think the easiest thing to do is to leave the Makefiles and build.sh where they are, but commit the CMake stuff alongside them (there is no naming collision AFAICT). When things are merged, we can slowly start bringing CI up to speed using the new system, while keeping the fallback available if things go wrong. I don't believe there would be a point in removing the current build system until everything has been migrated.

danielpovlsen commented 8 years ago

@teotwaki Obviously we are not trying to replicate the pain points. This is not the straw man that you are looking for. ;-) We're trying to have a smooth migration to the new system without bringing down (any part of) CI for extended time periods. In general we want to avoid any interruptions to our development process. If I understand you correctly, you want to do the same, but in a slightly different way. Whichever is the safest path to avoid interruptions, I am all for it. :-)

simonask commented 8 years ago

I agree with @danielpovlsen, and there's a lot of value in being able to switch back and forth without having to make any changes in CI. I think we can quite easily emulate everything that build.sh does, and then enable ourselves to solve the CI pain points later. :)

simonask commented 8 years ago

I have collected the pros/cons that we have discovered so far in our experiments with CMake. However, I am sure there are a lot of features we currently get from generic.mk that we just aren't aware of, so it would be really great to get more people's input.

@kspangsege It would be really great if you could read through the list of pros/cons and see if they match your perspective. We need to uncover any blind spots that we have, and I'm sure there are at least a couple, so that we are able to make an informed decision.

Here is the document: https://github.com/realm/realm-wiki/wiki/Decision-document-build-system

morten-krogh commented 8 years ago

Nice initiative!

I have some more general thoughts about build systems.

Firstly, the build system is really part of a bigger system of tasks that include unit testing, creation of IDE project files, installation, creation of libraries, CI, deployment, and external dependencies. In my opinion it makes sense to unify all of this in some bigger scheme.

The entire chain of building and deployment can be seen as consisting of a lot of operations and dependencies.

Every operation, or task, has a set of inputs and dependencies, and produces a set of outputs. These inputs and outputs can be files, without loss of generality.

The operations can be written as standalone executables. These executables can be written in any language or framework; they are completely independent of each other, and users need not understand their inner workings.

At the highest level, the targets could be organized in a Makefile, a Rakefile, or anything similar. The Makefile would basically only contain phony targets.

The reason that targets would be phony is that dependencies can usually not be separated from the task (recipe, operation) itself. For instance, it is much easier for a running build script to find stale dependencies than it is for Make, which can only use time stamps on files. It is basically impossible, or requires absurd hacks, for Make to detect that a source file has been removed and to make sure that the build directory is not littered with object files from old source files. What file time stamp would Make use to see that a source file is gone? (I am talking about a general make script that does not explicitly list all files.) The need to "clean all" during development is basically a sign of a deficiency in the build system.

So, make should only keep track of dependencies between phony targets and let executables do the real work, IMO.

The system would be used as follows:

make clang-build-debug
make clang-build-release
make clang-link-tests-debug
make gcc-build-debug
make clang-debug-unit-test
make xcode-project-file
make visual-studio-project-file
make gcc-valgrind-check
...
make core-library-for-ios
make core-library-for-android
make jenkins-test

and finally

make deploy-all-of-realm-core // This one

The make file would typically look like

clang-debug-unit-test: clang-link-tests-debug
	executable-that-runs-the-tests-and-outputs-a-report

The make file would be easily readable by everybody, easily extensible, and give anyone a simple overview of the chain from source code to deployment.

The executables are independent and loosely coupled. The coupling consists only of simple contracts, such as the directories and file types of the various steps.

How would the executables look? A nice one would be this:

./build src-dir target-dir compiler-with-flags

for instance

./build src/realm build/clang-debug "clang -Wall -O0"

./build would then take all cpp and hpp files in the src-dir and compile them and leave them in build-dir. There would be one .o file per .cpp file. The build-dir would be made with "mkdir -p" or similar, and .o files would be named and put in sub directories as an exact mirror of src-dir. There would be no stale .o files in build-dir. For instance, one could rename a .cpp file and type make and be done with it. ./build should also guarantee that the compiler flags are correct. In order to avoid unnecessary recompilation, build would need its own cache file, most suitably placed in build-dir.

./build could be made from the current make system or from CMake, or simply by someone writing a reasonably simple script or c++ executable.

Now the make file would look like

clang-build-debug:
	build src/realm build_clang_debug "clang -Wall -O0"

clang-build-release:
	build src/realm build_clang_release "clang -Wall -O3"

etc.

Anyone can use this, change compiler flags, and introduce new build types such as clang-build-debug-2. Compiler flags are documented in a very simple way: they are not hidden in files deep in the directory tree, and they are not transmitted through environment variables.

Am I the only one finding the use of environment variables in the build system a bit ugly? I actually had the CFLAGS environment variable set in my shell from some earlier project, so when I compiled Realm core, I compiled with other flags than the intended ones. Kristian fortunately realized what had happened, but it illustrates the danger of injecting implicit global variables into the build system.

Make might be too simple even for this simple use. A nice feature would be to be able to write

make target_1 -target_2 -target_3

with the meaning that the human guarantees that target_2 and target_3 are up to date.

Another nice feature would be additional arguments to make. For example,

make unit-test test-25 test-27

I am not suggesting that we should implement this right now even though I think it is simpler than it might sound from this long post. With such a system, there is no need for every member of the team to know details of make files or CMake files. Anyone has full power. And new targets can be introduced in a completely loosely coupled way; if someone gets a new IDE and has a perl script that generates project files, it trivially fits into our system.

simonask commented 8 years ago

@morten-krogh Thank you for your input. Two observations:

  1. I think that time has shown that build systems for C++ projects are rarely as simple as they appear. If they were, people wouldn't keep implementing new systems. :)
  2. We have to keep our eyes on the fact that we are not in the business of developing build systems. This means that the benefit that we gain from building basically anything ourselves has to be quite huge to justify spending time on it, compared to using a standard solution.

if someone gets a new IDE and has a perl script that generates project files, it trivially fits into our system.

This is only superficially true. How confident are we that setting up Perl (or even Perl 6) is "trivial" on exotic platforms that we have to support, such as Windows? ;-)

Arbitrary flexibility and simplicity are a contradiction in terms, in my view (this might be controversial, but nevertheless).

teotwaki commented 8 years ago

The problem is that make is not supported on all platforms that we intend to support in the long run, and is too simplistic for any real kind of work when it comes to cross-platform stuff. I've written and maintained very simple and efficient Makefiles in the past (2k LOC, for a 500k LOC source), and was really happy with the results. However, having to encode all the specifics of every supported platform in the Makefiles themselves was a nightmare, and we were only targeting Linux distributions.

I would really dislike a build system where I had to manually tell it which compiler and flags to use. I honestly believe that the UNIX way of doing things is extremely strong, and supports an extremely versatile workflow that fits many, many use cases. I set my own compiler flags regularly, and I expect the build system to honour those choices that I made, because I know better than the build system what I am trying to achieve.

I completely agree with you that it is not every person's job to be intimately familiar with the build system. However, I do believe that everyone on the team can be reasonably expected to have sufficient know-how to maintain the build system every once in a blue moon; if only one or two people are experts, they are a bottleneck. They are a risk to the company. If those people decide to leave the company, we will need to expend a tremendous amount of effort to train new people to be the experts. I don't think this is a viable strategy in the long run.

By using a standard tool, one where solutions can be searched for online, we reduce the amount of risk to the company. Yes, this does mean that some people will need to slightly change their habits, but then again, that's the defining trait of collaboration: adopting a common ground so that everyone can move forward. There is another massive advantage to using a standard tool: everyone in the team benefits from it, on a career level. As an individual, you'll be able to bring your knowledge of a build system to your next job or company, and that knowledge will pay dividends for the rest of your days (or as long as that build system is relevant). This is the complete opposite of using a custom build system, which, by definition, will never be re-used outside of the company (every company has developed its own build system at one point or another, and none of them have seen the light of day).

Am I the only finding the use of environment variables in the build system a bit ugly?

Probably not, although I think that it makes perfect sense when you are trying to indicate how your system behaves.

To give you a simple example, I have the distribution-provided compilers (clang and gcc) in /usr/bin, but I also have ccache aliases in ~/bin, and custom versions of the compilers in ~/bin/$compiler-version. On our production systems, the right version of GCC is installed in /opt/devtoolset-3/root/usr/bin. As a sysadmin, I need to be able to inform the users of the system how the system is set up, because there isn't a one-size-fits-all solution that works across the board. This is literally the problem that environment variables solve. As a sysadmin, I can indicate which compiler is preferred, or where the system compiler is installed, and users can override this definition if they want or need to.
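Concretely, this is the standard CC/CXX convention, which make and CMake both honour (the devtoolset path is the one from this comment; the g++ name alongside it is an assumption):

```shell
# The environment selects the toolchain; a user can override
# whatever default the sysadmin has set.
CC=/opt/devtoolset-3/root/usr/bin/gcc \
CXX=/opt/devtoolset-3/root/usr/bin/g++ \
cmake -DCMAKE_BUILD_TYPE=Release ..
```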

I realise that this is a fairly specific issue, but to me, it is a very real one. This is even more true in a world where we have to test multiple compilers in multiple environments. How would a custom Makefile handle new versions of Android? Do we create new targets? What about the old ones? When does the default change?

We have to keep our eyes on the fact that we are not in the business of developing build systems.

Simon hits the nail on the head for me here. :100:

bmunkholm commented 8 years ago

In progress now to move to CMake.