ldc-developers / ldc2.snap

Snap package definition for LDC, the LLVM-based D compiler

Add CircleCI build config #63

Closed WebDrake closed 5 years ago

WebDrake commented 5 years ago

This patch adds a basic CircleCI config file in line with the suggested setup provided by https://tutorials.ubuntu.com/tutorial/continuous-snap-delivery-from-circle-ci

This version implements only the build stage, and does not attempt to push releases to the snap store; this will still be taken care of for now by the Launchpad build system. Its intended usage is to test PRs and the current state of release branches.

This is a copy of https://github.com/ldc-developers/ldc2.snap/pull/59 rebased on and retargeted at the 1.10 branch to avoid potential CI issues.

WebDrake commented 5 years ago

Whoops, fixed the indentation issue:

diff --git a/.circleci/config.yml b/.circleci/config.yml
index 97eb0d7..892229d 100644
--- a/.circleci/config.yml
+++ b/.circleci/config.yml
@@ -19,7 +19,8 @@ jobs:
         command: snapcraft

     - persist_to_workspace:
-      root: *workdir
+        root: *workdir
+        paths: ['*.snap']

 workflows:
   version: 2

WebDrake commented 5 years ago

Well, this is freaky:

FAILED: /usr/bin/c++   -DGTEST_HAS_RTTI=0 -D_GNU_SOURCE -D__STDC_CONSTANT_MACROS -D__STDC_FORMAT_MACROS -D__STDC_LIMIT_MACROS -Ilib/Support -I/root/workspace/parts/llvm/src/lib/Support -Iinclude -I/root/workspace/parts/llvm/src/include -fPIC -fvisibility-inlines-hidden -Werror=date-time -std=c++11 -Wall -W -Wno-unused-parameter -Wwrite-strings -Wcast-qual -Wno-missing-field-initializers -pedantic -Wno-long-long -Wno-maybe-uninitialized -Wdelete-non-virtual-dtor -Wno-comment -ffunction-sections -fdata-sections -O3 -DNDEBUG    -fno-exceptions -fno-rtti -MMD -MT lib/Support/CMakeFiles/LLVMSupport.dir/Error.cpp.o -MF lib/Support/CMakeFiles/LLVMSupport.dir/Error.cpp.o.d -o lib/Support/CMakeFiles/LLVMSupport.dir/Error.cpp.o -c /root/workspace/parts/llvm/src/lib/Support/Error.cpp
c++: internal compiler error: Killed (program cc1plus)
Please submit a full bug report,
with preprocessed source if appropriate.
See <file:///usr/share/doc/gcc-5/README.Bugs> for instructions.
FAILED: /usr/bin/c++   -DGTEST_HAS_RTTI=0 -D_GNU_SOURCE -D__STDC_CONSTANT_MACROS -D__STDC_FORMAT_MACROS -D__STDC_LIMIT_MACROS -Ilib/Support -I/root/workspace/parts/llvm/src/lib/Support -Iinclude -I/root/workspace/parts/llvm/src/include -fPIC -fvisibility-inlines-hidden -Werror=date-time -std=c++11 -Wall -W -Wno-unused-parameter -Wwrite-strings -Wcast-qual -Wno-missing-field-initializers -pedantic -Wno-long-long -Wno-maybe-uninitialized -Wdelete-non-virtual-dtor -Wno-comment -ffunction-sections -fdata-sections -O3 -DNDEBUG    -fno-exceptions -fno-rtti -MMD -MT lib/Support/CMakeFiles/LLVMSupport.dir/DeltaAlgorithm.cpp.o -MF lib/Support/CMakeFiles/LLVMSupport.dir/DeltaAlgorithm.cpp.o.d -o lib/Support/CMakeFiles/LLVMSupport.dir/DeltaAlgorithm.cpp.o -c /root/workspace/parts/llvm/src/lib/Support/DeltaAlgorithm.cpp
c++: internal compiler error: Killed (program cc1plus)
Please submit a full bug report,
with preprocessed source if appropriate.
See <file:///usr/share/doc/gcc-5/README.Bugs> for instructions.
FAILED: /usr/bin/c++   -DGTEST_HAS_RTTI=0 -D_GNU_SOURCE -D__STDC_CONSTANT_MACROS -D__STDC_FORMAT_MACROS -D__STDC_LIMIT_MACROS -Ilib/Support -I/root/workspace/parts/llvm/src/lib/Support -Iinclude -I/root/workspace/parts/llvm/src/include -fPIC -fvisibility-inlines-hidden -Werror=date-time -std=c++11 -Wall -W -Wno-unused-parameter -Wwrite-strings -Wcast-qual -Wno-missing-field-initializers -pedantic -Wno-long-long -Wno-maybe-uninitialized -Wdelete-non-virtual-dtor -Wno-comment -ffunction-sections -fdata-sections -O3 -DNDEBUG    -fno-exceptions -fno-rtti -MMD -MT lib/Support/CMakeFiles/LLVMSupport.dir/GlobPattern.cpp.o -MF lib/Support/CMakeFiles/LLVMSupport.dir/GlobPattern.cpp.o.d -o lib/Support/CMakeFiles/LLVMSupport.dir/GlobPattern.cpp.o -c /root/workspace/parts/llvm/src/lib/Support/GlobPattern.cpp
c++: internal compiler error: Killed (program cc1plus)
Please submit a full bug report,
with preprocessed source if appropriate.
See <file:///usr/share/doc/gcc-5/README.Bugs> for instructions.

Anyone have any experience of something similar in other CircleCI LLVM builds?

I've tried setting an explicit -j3 in the ninja command to match what's done in https://github.com/ldc-developers/llvm/blob/ldc-release_70/.circleci/config.yml, just in case there's an issue arising from the total number of jobs, but that seems an unlikely culprit.

WebDrake commented 5 years ago

The other setting that I suppose might make a difference is this: https://github.com/ldc-developers/llvm/blob/ldc-release_70/.circleci/config.yml#L48

... which I've never used in snap-package LLVM builds. Could that be the culprit?

WebDrake commented 5 years ago

I've got rid of the ninja changes and instead used a -DCMAKE_CXX_FLAGS=-static-libstdc++ cmake flag to see if that makes the difference. Seems more likely given the kinds of errors we're seeing.

WebDrake commented 5 years ago

Dammit, no:

[ 20%] Building C object projects/compiler-rt/lib/builtins/CMakeFiles/clang_rt.builtins-i386.dir/udivmodsi4.c.o
c++: internal compiler error: Killed (program cc1plus)
Please submit a full bug report,
with preprocessed source if appropriate.
See <file:///usr/share/doc/gcc-5/README.Bugs> for instructions.
utils/TableGen/CMakeFiles/obj.llvm-tblgen.dir/build.make:206: recipe for target 'utils/TableGen/CMakeFiles/obj.llvm-tblgen.dir/CodeGenDAGPatterns.cpp.o' failed
make[2]: *** [utils/TableGen/CMakeFiles/obj.llvm-tblgen.dir/CodeGenDAGPatterns.cpp.o] Error 4
make[2]: *** Waiting for unfinished jobs....

WebDrake commented 5 years ago

It's the exact same g++ version with which I've successfully built the snap package on my own machine,

g++ (Ubuntu 5.4.0-6ubuntu1~16.04.11) 5.4.0 20160609

... so it's very much not clear to me why I'm getting ICEs on CircleCI when everything runs fine locally.

WebDrake commented 5 years ago

Updated to use snapcraft --no-parallel-builds to see if we can reduce resource consumption and so avoid cc1plus getting killed.

WebDrake commented 5 years ago

OK, this looked like it was working, but maybe got killed for running too long (120 minutes seems close enough to the timeout ballpark mentioned by @kinke in the previous PR). It was in the middle of compiler tests when it was killed.

Assuming we can't increase the timeout, my feeling here is that the right thing to do may be to remove or at least reduce the number of tests. This is a shame, as they have helped solve issues in the past, but they really add a LOT to the build time, and I can't help but feel that many of these tests might best be run as acceptance criteria for the packaged compiler, rather than as a precursor to finishing the package.

Thoughts, anyone? @JohanEngelen you and I have talked a lot about the test setup -- could this be a workable approach in principle?

WebDrake commented 5 years ago

I think I've found a compromise, which is to use an environment variable to indicate if the tests should be skipped. This can be used in CircleCI config without disabling the tests in the final Launchpad build that uploads the package.
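For illustration only, the skip-tests switch could look roughly like the sketch below. The variable name `LDC_SNAP_SKIP_TESTS` and the accepted values are assumptions made up for this example, not what the patch actually uses.

```python
import os
from typing import Mapping

# Hypothetical variable name, chosen only for this sketch.
SKIP_TESTS_VAR = "LDC_SNAP_SKIP_TESTS"


def should_run_tests(env: Mapping[str, str] = os.environ) -> bool:
    """Return False only when the skip variable is set to a truthy value."""
    value = env.get(SKIP_TESTS_VAR, "").strip().lower()
    return value not in {"1", "true", "yes", "on"}


if __name__ == "__main__":
    print("run tests" if should_run_tests() else "skip tests")
```

Leaving the variable unset keeps the tests enabled, so a build that knows nothing about the switch (such as the Launchpad one) behaves as before.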

WebDrake commented 5 years ago

... oh FFS, this is really starting to annoy me now; things that AFAICS were working in earlier versions of the CircleCI setup are now failing, for no apparent reason.

The first build where I introduced --no-parallel-builds got as far as tests before falling over; now things are consistently failing during building of the compiler itself. I'm guessing this is out-of-memory stuff (unless a timeout is being hit, but that would be much less time than was allowed previously).

kinke commented 5 years ago

Circle seems to report the full number of host CPUs, not the virtual CPUs we get. [This can be seen in the lit output: -- Testing: 253 tests, 36 threads --]. Ninja uses that number to determine an appropriate number of parallel jobs if you don't specify -j<N> explicitly, and IIRC there's just ~8 GB of memory.
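A rough sketch of how a job count could be capped by what the container is actually granted rather than by the host CPU count. It assumes the limits are exposed through cgroup v1 files (which may not hold on every CI service), and the 1500 MiB per compile job is an illustrative guess, not a measured figure.

```python
import os
from pathlib import Path
from typing import Optional


def cgroup_cpu_limit(quota_us: int, period_us: int) -> Optional[int]:
    """CPUs granted by a cgroup v1 CFS quota, or None if unlimited."""
    if quota_us <= 0 or period_us <= 0:
        return None  # a quota of -1 means "no limit"
    return max(1, quota_us // period_us)


def pick_jobs(host_cpus: int, cpu_limit: Optional[int],
              mem_limit_mib: Optional[int], mib_per_job: int = 1500) -> int:
    """Cap parallel jobs by the CPU quota and by the memory budget."""
    jobs = host_cpus
    if cpu_limit is not None:
        jobs = min(jobs, cpu_limit)
    if mem_limit_mib is not None:
        jobs = min(jobs, mem_limit_mib // mib_per_job)
    return max(1, jobs)  # always allow at least one job


def read_int(path: str) -> Optional[int]:
    """Read an integer from a cgroup file; None if missing or malformed."""
    try:
        return int(Path(path).read_text().strip())
    except (OSError, ValueError):
        return None


if __name__ == "__main__":
    # Assumed cgroup v1 paths; on other layouts these reads return None.
    quota = read_int("/sys/fs/cgroup/cpu/cpu.cfs_quota_us")
    period = read_int("/sys/fs/cgroup/cpu/cpu.cfs_period_us")
    mem = read_int("/sys/fs/cgroup/memory/memory.limit_in_bytes")
    cpu_limit = None
    if quota is not None and period is not None:
        cpu_limit = cgroup_cpu_limit(quota, period)
    mem_mib = mem // 2**20 if mem is not None else None
    print(pick_jobs(os.cpu_count() or 1, cpu_limit, mem_mib))
```

With 36 reported CPUs and a 4096 MiB limit, `pick_jobs(36, None, 4096)` gives 2 under these assumptions, which is a long way from 36.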

kinke commented 5 years ago

The Azure pipelines don't have that problem, and also allow for 6h timeouts, so that you can run all tests again. I'm playing around with it for regular CI (https://github.com/ldc-developers/ldc/pull/2998, https://github.com/ldc-developers/llvm/pull/4), and it's very promising. Give me a few days for completion, then I'll enable it for this snap repo too.

WebDrake commented 5 years ago

@kinke yup, that's why I introduced --no-parallel-builds, which causes snapcraft to invoke cmake with only 1 build job at a time. You see the difference -- without it, -j36 is used (!) and the job is quickly killed, presumably because it breaches memory limits (which IIRC are about 4 GB, not 8).

But looking at the output again, I realized something: because in the LDC build I invoke ninja directly, rather than letting snapcraft do it for me, that constraint isn't respected and ninja eats up all the CPUs it can get. I have a few ideas about how to address that other than just hardcoding -j 1, so I'll try that now.
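One possible shape for this, as a sketch only: pass an explicit job count to ninja taken from the environment. It assumes snapcraft exports `SNAPCRAFT_PARALLEL_BUILD_COUNT` to the build step and sets it to 1 under --no-parallel-builds; that should be checked against the snapcraft version in use. The helper name `ninja_command` is hypothetical.

```python
import os
from typing import List, Mapping


def ninja_command(env: Mapping[str, str] = os.environ,
                  *targets: str) -> List[str]:
    """Build a ninja invocation with an explicit -j taken from the environment."""
    raw = env.get("SNAPCRAFT_PARALLEL_BUILD_COUNT", "1")
    try:
        jobs = max(1, int(raw))
    except ValueError:
        jobs = 1  # fall back to a serial build on malformed input
    return ["ninja", f"-j{jobs}", *targets]


if __name__ == "__main__":
    # Print the command rather than running it, since this is only a sketch.
    print(" ".join(ninja_command()))
```

Defaulting to a single job when the variable is missing errs on the side of a slow build rather than one that gets killed.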

I was just about to ask about azure pipelines: it sounds like a potentially nicer option, so let's follow up when you are comfortable with how you have set things up for everything else. Thanks for offering!

WebDrake commented 5 years ago

For the record, the idea here was to see if there was a convenient way to get the docker image building with snapcraft 3.1+, which properly supports ninja as a cmake backend. This should allow me to simplify the override-build and avoid the unintended bypassing of the --no-parallel-builds flag.

The last patch tries a manual workaround, which is a bit annoying but should hopefully work in the short term.

WebDrake commented 5 years ago

Oh FFS, now it's running the compiler tests even though the environment variable that should skip them is explicitly set.

@kinke is it OK if I disable CircleCI again for this project? I think it's clear by now that it's probably better to try with Azure Pipelines when that's ready, rather than keep trying to add workarounds to support CircleCI.

(Just to know: is stopping the integration better done via "Stop Building" on the CircleCI settings side, or by deactivating the webhook GitHub side?)

Iain Bucław also suggested to me that SemaphoreCI might be a good option. Would there be any objection to me trying to enable that solely for the snap package?

kinke commented 5 years ago

> is stopping the integration better done via "Stop Building" on the CircleCI settings side, or by deactivating the webhook GitHub side?

I don't know, never removed a service so far. ;) - But I guess it doesn't really matter, especially if there's no .yml file.

> SemaphoreCI might be a good option

Semaphore v1 (which we also use) has by far the best single-core performance of all services; I don't remember their timeout. I don't know how long they are going to support v1. v2 isn't a viable alternative for open-source projects (yet), as the max time per month is limited by the free $20 credit (~42 mins per day); see https://github.com/ldc-developers/ldc/pull/2972.
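To put the ~42 mins per day in perspective for this snap, here is a back-of-the-envelope calculation. It assumes a 30-day month and a build that runs about as long as the ~120 minute CircleCI attempt mentioned earlier in the thread; a build on Semaphore could well be faster, so treat the numbers as rough.

```python
MINUTES_PER_DAY = 42   # free allowance quoted above
BUILD_MINUTES = 120    # assumed build length, taken from the CircleCI run
DAYS_PER_MONTH = 30    # assumption

monthly_minutes = MINUTES_PER_DAY * DAYS_PER_MONTH   # total free minutes per month
builds_per_month = monthly_minutes // BUILD_MINUTES  # whole builds that fit
days_per_build = BUILD_MINUTES / MINUTES_PER_DAY     # days of allowance per build

print(monthly_minutes, builds_per_month, round(days_per_build, 2))
```

Under those assumptions a single full build uses close to three days of allowance.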

WebDrake commented 5 years ago

Closing unmerged, as the Azure Pipelines functionality is so obviously better suited to this task.