HaxeFoundation / Project-Management

Project management and communication

Replace deployment via travis? #53

Closed Simn closed 8 years ago

Simn commented 8 years ago

We currently use travis to deploy various pages:

Unfortunately, due to the way travis builds work, a busy day on HaxeFoundation/haxe and/or HaxeFoundation/hxcpp can easily delay the deployment for several hours.

I would like to discuss alternative ways to approach this. We already have @waneck's builds.haxe.org which runs independently, so maybe this could be extended to other deployments. Other suggestions are welcome too.

jonasmalacofilho commented 8 years ago

@Simn, for similar reasons (and having already used GitHub hooks in the past), I built a simple but flexible integration/testing/deployment server called Robrt.

It still has some issues (in essence, security and cleanup), but it's open source and will be maintained; my company is using it in two projects(1)(2), besides some other personal repositories.

If it suits the HF needs, I would happily help with the initial setup and maintenance.

Also, contributions are welcome! :smile:

(1) SAPO: a system for monitoring origin–destination surveys (2) An online and collaborative version of the BRT Planning Guide (proof-of-concept)

waneck commented 8 years ago

@jonasmalacofilho, does it run CI builds as well? E.g. does it log somewhere, send emails if there were failures, etc?

As for the issue itself, I know the Travis delay is also making it more difficult to work with. We could rent a dedicated server and run something (maybe Robrt?) to deal with the builds. One nice thing about renting a dedicated server would be that we could solve https://github.com/waneck/hxbuilds/issues/4, provided that we don't allow multiple builds to happen while the benchmarks are running. Per Travis' pricing, we could rent as many as 4-6 dedicated servers with decent specs for the price of the first paid level. The builds would also run much faster in a dedicated environment, with all the prerequisites already downloaded.

@jonasmalacofilho, if this is something you're interested in working on, I can help you secure Robrt so that we can also run PR builds.

jonasmalacofilho commented 8 years ago

@waneck, it keeps the build(1) log and messages Slack and GitHub (commit statuses, to be more precise) with the build status and other relevant information (all messages are configurable); adding support for email would be trivial.
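For context, a commit status is a single authenticated POST to GitHub's statuses endpoint. A minimal sketch of the payload and request; the repo, token, and the `robrt` context name are placeholders, and this is not Robrt's actual code:

```python
import json
import urllib.request

def build_status_payload(state, build_url, context="robrt"):
    """Build the JSON body for GitHub's commit-status API.

    state must be one of: pending, success, error, failure.
    target_url is what makes the build log reachable from the PR page.
    """
    assert state in {"pending", "success", "error", "failure"}
    return {
        "state": state,
        "target_url": build_url,
        "description": f"build {state}",
        "context": context,
    }

def post_status(owner, repo, sha, token, payload):
    # POST /repos/{owner}/{repo}/statuses/{sha} -- token needs repo:status scope
    req = urllib.request.Request(
        f"https://api.github.com/repos/{owner}/{repo}/statuses/{sha}",
        data=json.dumps(payload).encode(),
        headers={"Authorization": f"token {token}",
                 "Accept": "application/vnd.github.v3+json"},
        method="POST",
    )
    return urllib.request.urlopen(req)  # network call; not exercised here
```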

On the other hand, it doesn't use a database so far; it's entirely file based, and in my deployments the logs (and the deployment) are simply served by a suitably configured HTTP server. This means that without the message (either commit status, Slack or email), one wouldn't be able to find the log for the build of a specific commit, since for that you would need the build id. A simple frontend for it would be trivial, but it's not something I'm inclined to implement based solely on my current needs.

Finally, I would very much like your help in securing it.

(1) The build phase happens after the prepare phase (where a Docker image is built and a container based on it is created) and before the export phase (where the server copies all data to be exported to a configured path); the build log is a simple file that can be easily served, but the prepare and export phases only log to the server (journald, if you're running Robrt as a systemd service).

waneck commented 8 years ago

It being file-based isn't a problem at all; in fact, that's how jenkins works, for example. Adding an HTTP server on top of it that only reads from the filesystem and serves the logs in a way they can be more easily explored doesn't sound too hard.
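A minimal sketch of what such a read-only frontend could look like, assuming a hypothetical `<root>/<build-id>/log` layout (the path and layout are invented for illustration, not Robrt's actual scheme):

```python
import functools
import http.server
from pathlib import Path

def latest_builds(root, n=20):
    """Return the n most recently modified build ids under root, newest first."""
    dirs = [d for d in Path(root).iterdir() if d.is_dir()]
    dirs.sort(key=lambda d: d.stat().st_mtime, reverse=True)
    return [d.name for d in dirs[:n]]

def serve(root, port=8080):
    """Serve the log tree as static files; a reverse proxy would work equally well."""
    handler = functools.partial(
        http.server.SimpleHTTPRequestHandler, directory=str(root))
    http.server.ThreadingHTTPServer(("", port), handler).serve_forever()
```

Since the server only reads from the filesystem, the build process never talks to it; the two stay decoupled exactly as in the file-based setup described above.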

As for security, there is a problem with it being docker-based, since the security approach I'm more inclined to use is also based on Linux containers, and they wouldn't allow another Linux container to be created.

andyli commented 8 years ago

I have mixed feelings about this.

On one hand, I acknowledge the limitations of TravisCI/AppVeyor and I do struggle with them quite often. I've emailed Travis to ask about paid plans that offer a larger number of concurrent builds. Sadly they're quite expensive:

The next step up includes 10 extra concurrent builds, effectively raising the limit to 15, for $500/month, paid annually and in advance. We can bump you to 25 for an additional $500/month, basically $500 for every additional 10 builds.

On the other hand, setting up a custom CI server sounds like a lot of work. I know it could be easy to have a simple version running, but getting the details right would take months. Those details include:

Maybe reusing open source software (like jenkins) instead of building our own (Robrt or whatever) will help. But it will take time for us to investigate.

Simn commented 8 years ago

There should be two steps:

  1. Deal with the deployment (short term). Here I'm not opposed to a custom solution like Robrt because it shouldn't be very complicated and we can nicely tailor it to our needs.
  2. Move the general unit testing (medium term): I share Andy's concerns regarding complexity. Using something like Jenkins strikes me as the best approach here.
waneck commented 8 years ago

Does jenkins deal with security at all? I reckon it has access rights management, but I don't think we'd be able to activate it on PRs if we go with a jenkins approach.

ncannasse commented 8 years ago

I'm quite interested in us finding a low-cost, scalable CI solution. I have some projects in mind which involve doing CI on all haxelib submits - and more :)

nadako commented 8 years ago

I remember @underscorediscovery sounded satisfied with https://buildkite.com/, so maybe he can comment on this.

ruby0x1 commented 8 years ago

Yes, I couldn't recommend buildkite enough. It's essentially "bring your own build script" with a lot of flexibility and workflow on top. You control everything, and the agents run on your side (i.e. it works great for NDA/sensitive stuff, so security is no concern). Setup and maintenance are super easy, and the buildkite agent + backend handle communication and artifacts (if you choose). No limitations on anything, everything is open source and everything can be API driven. Upgrading agents usually takes a few minutes (unless opting into new features), and you can have as many of them as you want. They have many larger clients too.

A key point for me is that the agents are cross platform, so I can do Mac, Windows, Linux, Android, iOS, native builds along with web stuff. You can use it really easily with docker as well, making it easier to have consistent builds.

I've used Jenkins, Travis, TeamCity, tried out GitLab CI and a myriad of open source options over the years, and buildkite is the first one that 1) actually took 5 minutes to set up and 2) doesn't need babysitting. I've had things running stable for months and months without even checking.

I've never looked back.

Simn commented 8 years ago

Thanks for the recommendation, I'll check it out.

Simn commented 8 years ago

Unfortunately I'm too much of a Windows user to get anywhere with this. It certainly looks nice though!

ruby0x1 commented 8 years ago

Which part? It's an exe or a batch file... Plus there are docs. https://buildkite.com/docs/agent/windows

Simn commented 8 years ago

Yes I got the agent working, but since we want to run this on a Linux server anyway there's little point in me messing around like that. I'll try on Ubuntu tomorrow.

andyli commented 8 years ago

@Simn You probably registered the Haxe Foundation organization there. Please add me (andy@onthewings.net) in ;)

Simn commented 8 years ago

done

Simn commented 8 years ago

This wouldn't let me sleep, so I got an AWS account and got this running on an instance.

One thing I haven't figured out yet is at which level we would branch our targets. At first I thought we would create a pipeline per target, but pipelines are tied to git repositories and always clone from them. Then I thought it would be a step per target, but even steps clone the git repository from what I can tell.

ruby0x1 commented 8 years ago

I'm not too sure on the specifics of what you need exactly. Since the steps just run commands, and can be waited on by other steps, it seemed to fit my use case that there were N steps for N targets, and a final step to deal with any artifact post-processing before uploading, packaging, etc. I'm not sure why the repo being cloned is a problem; you can also put the pipelines in the repo in config form. From my use, it doesn't clone unless it doesn't have a copy to work from for that pipeline, for that agent.

You can filter both pipelines and steps, using a flexible syntax, and steps can be filtered by agent meta data, so I didn't have much trouble keeping it explicit.

I treated a pipeline as one project (or one result), so I viewed things like doc output, a native library, or some unit tests as separate pipelines. I could imagine that if I wanted isolation for sanity testing I'd go as far as a docker instance per target, which would make sure that each build was a clean container, but I am now just making stuff up based on no real information about the requirements :)

Simn commented 8 years ago

Ideally we would have something like this:

git clone HaxeFoundation/haxe
make
for (target in targets)
    target specific setup (hxcpp etc.)
    run test

That is, we have a common build step for Haxe itself and then some per-target actions. We already have a custom Haxe script for the latter part (tests/RunCi.hxml) which can be configured from environment variables.
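For illustration, the pseudocode above maps roughly onto a single Buildkite pipeline with a `wait` step between the compiler build and the per-target tests. This is only a sketch: the labels and the `TEST` environment variable are assumptions, not the Foundation's actual configuration, and note that steps can land on different agents, so the output of `make` doesn't automatically carry over between steps unless artifacts are uploaded:

```yaml
steps:
  - label: "build haxe"
    command: "make"
  - wait                     # per-target steps start only after the compiler builds
  - label: "test cpp"
    command: "haxe RunCi.hxml"
    env:
      TEST: "cpp"            # hypothetical: the test script picks the target from env
  - label: "test js"
    command: "haxe RunCi.hxml"
    env:
      TEST: "js"
```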

What I set up for now is one pipeline with these steps:

This works, but I still wonder why the last two steps have their own "Preparing build folder" operation which checks the git status. It's not a big deal because that's very fast, it just makes me think that I'm not using this correctly.

Simn commented 8 years ago

The Buildkite people sent me an "Anything I can help with?" email and I asked about this specific aspect. Apparently what I'm doing is correct:

This is the correct behaviour :) "Preparing build folder" can do one of 2 things:

  1. If the code hasn't been checked out, do a git clone, then a git submodule update, then switch the repo to the right commit
  2. If the code has already been checked out, run a git clean, then switch the repo to the right commit

So it doesn't actually clone everything again as I thought it did.
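The clone-or-clean behaviour described above can be sketched like this (a toy illustration with invented paths; it needs `git` on the PATH and is not Buildkite's actual implementation):

```python
import subprocess
from pathlib import Path

def git(*args, cwd=None):
    """Run a git command, raising on failure."""
    subprocess.run(["git", *args], cwd=cwd, check=True,
                   stdout=subprocess.DEVNULL, stderr=subprocess.DEVNULL)

def prepare_build_folder(origin, build, sha):
    """Mirror the two paths: fresh clone on first use, clean + re-checkout after."""
    build = Path(build)
    if not (build / ".git").is_dir():
        git("clone", str(origin), str(build))      # path 1: no checkout yet
    else:
        git("clean", "-fdx", cwd=build)            # path 2: reuse the checkout
        git("fetch", "origin", cwd=build)
    git("checkout", "-f", sha, cwd=build)
    git("submodule", "update", "--init", "--recursive", cwd=build)
```

The second run only removes untracked files and moves HEAD, which is why it is so much faster than a full clone.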

I was talking to @waneck about security and he mentioned https://github.com/waneck/openjail. My current thought is that we should set up a server with openjail and then connect it to buildkite.

Does anyone have any opinions on this? Given that even my crappy AWS instance is much faster than our current CI process, I think we should move forward with this. I don't insist on using buildkite, but I like what it provides so far.

Simn commented 8 years ago

Well, nothing came of this and travis seems to be fine after Andy merged the targets, so I'll go ahead and close this.