crystal-lang / crystal

The Crystal Programming Language
https://crystal-lang.org
Apache License 2.0

Continuous Integration Infrastructure #3721

RX14 closed 3 years ago

RX14 commented 7 years ago

Currently, crystal uses Travis CI for continuous integration. This works well, but has some limitations. Travis currently allows us to test on our major architectures: linux 64 and 32 bit, and macos. However, in the past year we have gained ARM support in 32 and 64 bit, as well as support for freebsd/openbsd. These architectures would be difficult to test using travis. Without continuous integration on a target triple, that triple is essentially unsupported and could break at any time. In addition, travis lacks the ability to do automated releases. This makes the release process more error-prone and precludes the ability to do nightly releases.

I have been working on setting up Jenkins as a replacement for travis. Jenkins is a much more flexible system, as it allows connecting your own nodes with their own customised build environment. For example we could test crystal on an actual raspberry pi for every commit. We could also schedule jobs to create nightly builds, and authorised users on the web interface could kick off an automated release process.

Currently I have a test jenkins instance running at https://crystal-ci.rx14.co.uk/; here is a (nearly) passing build. Jenkins builds can be configured by a Jenkinsfile in the repository, like travis. Here's the one I made for crystal. I've documented the setup for the master and slave instances here. Currently I'm thinking of running every slave in qemu/kvm on an x86_64 host for consistency between slaves. Automating slave installs using packer seems trivial.

There are quite a few different options for jenkins slaves however. It's possible to create jenkins slaves on the fly by integrating with different cloud providers. This has the added benefit of the environment being completely from scratch on every build. It also may be cheaper, depending on build length, commit frequency, and hardware constraints. It might also be wise to mix this and the previous approaches, for example using some raspberry pis for arm, a long-running VM for openbsd, and google compute engine for the x86 linux targets (musl in docker?).

Rust seems to use buildbot instead of jenkins, but Jenkins has really surpassed buildbot in the last year in terms of being a modern tool suitable for non-java builds (released 2.0, added jenkinsfile, seamless github integration like travis). I also have 3-4 years of experience working with jenkins, but have never worked with buildbot before.

The problem I have in proceeding is that I don't know the options and preferences @asterite and Manas have in terms of infrastructure and how they would like this set up. I'd like to find that out before I sink too much time into creating qemu VM images to run on a fat VM host.

TL;DR: CI on every target triple? Nightly builds? Yay! Now how do I proceed?

matiasgarciaisaia commented 7 years ago

I think Jenkins is a 64 bit build, and so the issue would be "Cross compiling from x64 to i386 is broken".

bcardiff commented 7 years ago

I would mimic the release process, i.e. using the latest 32 bit release to compile the next 32 bit release. That also matches what the user will be doing if contributing to the compiler from a 32 bit platform.

I don't think the flags (at least the -m32) should be forwarded, because the macro run will execute the compiled program in the original environment. So, in this case, the compiler used for macro run should be set up for 64 bits.

From the Jenkins log I guess you are trying to cross-compile the specs. There the macro run is used, which ends up calling the just cross-compiled compiler with a libcrystal.a for 32 bits. Again, I would just compile the compiler and the specs in 32 bits without cross compilation. But, if not, at least do not cross compile the specs. I think that should work.

RX14 commented 7 years ago

@bcardiff In the current CI there are 2 docker containers, one containing a 64bit filesystem, one containing a 32bit filesystem. This means that the linker itself is 32bit by default, you don't need to pass -m32. It still executes in a 64bit kernel on travis.

In the new environment, I didn't want to spin up 2 VMs for 32 and 64bit, or use containers (added complexity), so I chose to use debian's multiarch features. This seems to work very well, apart from when macro runs are required. It turns out though that this is impossible anyway because libevent isn't correctly packaged for multiarch so 2 VM images are required.

Using a 32bit release of crystal in the 32bit builds would be a good idea. I think that in the future I would like to make downloading a crystal release part of the build process, instead of baking the crystal version into the VM image. But I don't think that should require 2 VMs, or creating a chroot. Crystal should be able to utilise multiarch features to cross-compile for 32bit. Someone will want to do it in the future, so I think we should support it.

I ended up passing -m32 by setting CC="cc -m32", which is probably actually the recommended solution to this problem. However, it seems the macro runs don't pass the target to the new compiler instance. I added a quick commit which probably fixes the problem (https://github.com/RX14/crystal/commit/ad941fb28e4cf057f027821ddf3e0f05dfb02e4f), but it's a hack I can't test, and it's looking more and more like this is a dead end.
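
A minimal sketch of the CC workaround described above, written in Python purely for illustration. The helper name env_for_32bit_build is hypothetical, and the sketch only prepares the environment; it does not claim to reproduce the actual Jenkins setup. The point it shows is that the compiler command and its flag travel together as one value.

```python
import os


def env_for_32bit_build(base_env=None):
    """Return a copy of the environment with CC forcing 32-bit output."""
    # Copy so the caller's environment (or os.environ) is left untouched.
    env = dict(os.environ if base_env is None else base_env)
    # The value contains a space, so it must stay a single string. In a
    # shell this is why the assignment needs quotes: CC="cc -m32".
    env["CC"] = "cc -m32"
    return env


env = env_for_32bit_build({"PATH": "/usr/bin"})
print(env["CC"])
```

The resulting dictionary could then be passed as the env argument of subprocess.run when invoking the build.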

Unfortunately there are no official debian 32bit AMIs, so I'm going to have to think about how else to do this :(

Sorry for the brain dump, I wrote this comment while working on these workarounds.

RX14 commented 7 years ago

Debian don't provide 32bit AMIs; instead there's a message telling you to use multilib, which is what I tried to use in the previous comment and failed. Getting it to work would require compiling the compiler for x86_64 before compiling for 32bit. This is difficult as it would require both 64 and 32bit development versions of the libraries used, and some packages still aren't fully multilib compliant. In general debian's 32bit support seems a huge mess right now.

I could try creating a 32bit AMI using bootstrap-vz but there isn't an official manifest for it. If that doesn't work out, I'll have to think of something else.

RX14 commented 7 years ago

Ok, building a 32bit AMI was easier than expected. I even nearly completed a whole 32bit build: https://jenkins.crystal-lang.org/job/crystal/job/feature%252Fjenkinsfile/18/console. It fails to build the compiler due to some linking errors, however. Any suggestions?

I think the next steps are really to merge the Jenkinsfile and get nightly builds going, gradually expanding the available architectures. After that I think I'll try and get PR builds working, for which I will probably end up building a bot very similar to rust's bors, but using the jenkins api, passing the targets to be built and the git sha as build parameters. Thoughts?
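
To make the bot idea above concrete, here is a rough sketch in Python of how such a bot might construct the request URL for a parameterised build through the Jenkins remote API's buildWithParameters endpoint. The parameter names TARGETS and GIT_SHA, the job name, the target triples and the sha are all assumptions made up for illustration; nothing here reflects how the real job is configured, and the sketch only builds the URL without sending anything.

```python
from urllib.parse import quote, urlencode


def build_trigger_url(base_url, job_name, targets, git_sha):
    """Return the URL a bot would POST to in order to queue a build."""
    # TARGETS and GIT_SHA are hypothetical parameter names; the job would
    # have to declare them as build parameters for Jenkins to accept them.
    params = urlencode({"TARGETS": ",".join(targets), "GIT_SHA": git_sha})
    job_path = quote(job_name, safe="")
    return f"{base_url.rstrip('/')}/job/{job_path}/buildWithParameters?{params}"


url = build_trigger_url(
    "https://jenkins.crystal-lang.org",
    "crystal",                               # assumed job name
    ["x86_64-linux-gnu", "i686-linux-gnu"],  # example target triples
    "abc1234",                               # placeholder commit sha
)
print(url)
```

A real bot would also need authentication and would have to poll the queued build for its result, both of which are left out here.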

RX14 commented 7 years ago

I've added swap space and used --threads 1 to control the memory usage on 32bit. It seems to be working. We now have a full matrix of 32 and 64bit debian using llvm 3.5-4.0. I've created a crystal-nightly job which currently builds my fork nightly.
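
As an illustration of the matrix described above, the following Python sketch enumerates architecture and LLVM version pairs and attaches --threads 1 to the 32bit jobs. The architecture labels and the job dictionary layout are assumptions, and only the two endpoint LLVM versions named in the comment are listed because the intermediate versions are not spelled out; this is not the actual Jenkins configuration.

```python
from itertools import product

ARCHES = ["i386", "x86_64"]     # assumed labels for 32 and 64bit debian
LLVM_VERSIONS = ["3.5", "4.0"]  # endpoints of the range only


def build_matrix(arches, llvm_versions):
    """Return one job description per (architecture, LLVM version) pair."""
    jobs = []
    for arch, llvm in product(arches, llvm_versions):
        job = {"arch": arch, "llvm": llvm, "flags": []}
        if arch == "i386":
            # Mirrors the comment above: limit the compiler to one thread
            # to keep memory usage down on 32bit builds.
            job["flags"] = ["--threads", "1"]
        jobs.append(job)
    return jobs


for job in build_matrix(ARCHES, LLVM_VERSIONS):
    print(job)
```

Choosing which of these combinations run nightly then comes down to filtering this list.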

Next steps are to merge the jenkinsfile to get nightly builds kicked off and get notifications sending on failed builds so that the nightly results don't get ignored. That includes deciding which matrix we should include in the nightly builds.

A weird bug seems to have been exposed though: both 32 and 64bit LLVM 4.0 builds failed with linker errors. A build log is here.

Val commented 7 years ago

A weird bug like https://github.com/crystal-lang/crystal_lib/issues/25 or https://github.com/crystal-lang/crystal/issues/1269 ?

RX14 commented 7 years ago

Arrgh, this is why I need to finish crane and use it on the CI, so that we have a native crystal install and ditch the omnibus.

straight-shoota commented 7 years ago

Is there any progress on this? It would be great to have nightly builds. I think it would help avoid releases that require an immediate followup bugfix release, since changes in master could be more easily tested in the wild ;)

mverzilli commented 7 years ago

Totally agree. One thing that's certainly missing is a wiki page to specify what kind of support we're providing for each target. I'd like to basically "translate" this page to one in the Crystal repo wiki: https://forge.rust-lang.org/platform-support.html

This page should state our initial goals and not where we currently are, so we can use it as guidance to inform this issue.

I won't have time to start that until the end of next week, so if anyone wants to take a stab at it before, we can iterate from there.

Val commented 7 years ago

For the linking problems #4825 ...

rishavs commented 5 years ago

It is probably too late for this, but in case we want to consider Azure DevOps as a cloud-hosted CI option, do let me know. I work in the DevOps team.

j8r commented 3 years ago

We have GitHub Actions and Circle CI. I propose it is time to remove Travis (1 check remaining) and the duplicated Circle CI checks (note: same story for shards). There are currently 36 checks running!

RX14 commented 3 years ago

Yeah, Crystal CI's good now :)