travis-ci / beta-features

The perfect place to leave feedback and comments on newly released Beta Features.

Build Stages: Flexible and practical Continuous Delivery pipelines #11

Closed joshk closed 7 years ago

joshk commented 7 years ago

From simple deployment pipelines, to complex testing groups, the world is your CI and CD oyster with Build Stages.

Build Stages allows you and your team to compose groups of Jobs which are only started once the previous Stage has finished.

You can mix Linux and Mac VMs together, or split them into different Stages. Since each Stage is configurable, there are endless Build pipeline possibilities!

This feature will be available for general beta testing soon... watch this space 😄

We love to hear feedback, it's the best way for us to improve and shape Travis CI. Please leave all thoughts/comments/ideas related to this feature here.

Happy Testing!

mheiniger commented 7 years ago

Feature Request: Pause feature

We have the different stages test, stage and production. Between stage and production we need to do some manual tests which can not be automated (for example some user acceptance tests on the UI) and also some stakeholders have to agree.

The reason to use the same build/branch for all stages, is that we use exactly the same artifact (Docker-image) for all stages so we will have exactly the same on production that we tested on the former stages. It would also be a waste of time to build the artifact again from scratch for every stage.

How about

That's the one missing feature, and the reason we're currently doing the test stage on Travis and the stage/prod stages on snap-ci (which is being shut down now).

keradus commented 7 years ago

:-1: for introducing that as a general feature. Your flow is a bit crazy (nothing wrong with it!) and IMO it is too specific to become a regular feature. Create a stage called "wait for manual approval" that calls some external resource you control. In pseudocode: `while (!gotManualApproval) travis_wait 30;` So the build polls the external resource every 30 seconds to see whether the PR was manually approved; once it gets approval, the stage passes and Travis continues to the next stages. You just need to provide an external resource that can say whether the PR has been manually tested/approved yet.

I assume it could also not pass the tests/approval process, so you will have 3-value logic like "pass, fail, waiting"
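
A minimal sketch of the polling stage described above, written as a .travis.yml fragment. Everything specific in it is an assumption for illustration: APPROVAL_URL is a hypothetical endpoint you would host yourself, assumed to return the single word approved, rejected, or pending, and run-tests.sh and deploy.sh are placeholder scripts. It uses a plain sleep loop rather than travis_wait, and it is capped at 60 polls so the job cannot wait forever.

```yaml
jobs:
  include:
    - stage: test
      script: ./run-tests.sh               # placeholder test script
    - stage: wait for manual approval
      script:
        # Poll every 30 seconds, at most 60 times (about 30 minutes).
        # A failed request is treated the same as "pending".
        - |
          status=pending
          for attempt in $(seq 1 60); do
            status="$(curl -fsS "$APPROVAL_URL" || echo pending)"
            [ "$status" = pending ] || break
            sleep 30
          done
          [ "$status" = approved ]
    - stage: production
      script: ./deploy.sh production       # placeholder deploy script
```

The final test is what maps the three states onto a build result: approved passes the stage, while rejected or still pending after the last poll fails it, so the production stage never starts.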

mheiniger commented 7 years ago

@keradus While the flow may be crazy for a small project, it's quite usual in bigger companies, where you can't just deploy everything to production every time a developer thinks his code and his unit tests are ready.

And my aim was to have an easy workflow on one UI, not on external systems.

keradus commented 7 years ago

For you, Travis is an external system. I have worked with more than one global company and never had to do that kind of trick. E.g. make a QA env that QA can deploy anything to at any time, and forbid merging the PR if QA approval is not there. Easy. No need to hack CI and make automated tests blocked by a manual one.

jnewland commented 7 years ago

This feature is super rad! I tried separating the "deploy" step for one of my projects into a stage, but reverted that since the additional container startup time added ~30s or so to the full end-to-end latency of the test and deploy cycle.

keefbaker commented 7 years ago

Having briefly played with this I'm liking it a lot. A shame each stage gets individually queued rather than queued as a group, but I can understand that.

shepmaster commented 7 years ago

S3 probably not viable for me as any PR's wouldn't get the credentials.

This is a really good point. I wouldn't want PRs from untrusted contributors to have access to my S3 credentials, but I do want to be able to pass artifacts from stage to stage. This feels like something that needs to have a good example shown in the documentation.

BanzaiMan commented 7 years ago

@tgolen For individual issues, please open a separate issue in https://github.com/travis-ci/travis-ci/issues/new. Thanks.

aldochristiansense commented 7 years ago

I named my first stage "setup cron", but why does it always run a test stage even though I didn't define one in the first place? Please fix this.

MagineCIBot commented 7 years ago

Yeah, would really love conditional stages. Right now they're pretty heavy, as we take up slots and then exit early in the script if the condition (based on the contents of a commit message) is false.

It would be great if there were a more lightweight environment that decides which stages and scripts run based on conditions.

Otherwise this is great! Much obliged!
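
For readers unfamiliar with the workaround mentioned above, it might look roughly like this. TRAVIS_COMMIT_MESSAGE is an environment variable that Travis sets; the [deploy] marker and deploy.sh are hypothetical names chosen for the sketch. Note that the job still boots and occupies a slot before it can decide to do nothing, which is exactly the cost being described.

```yaml
jobs:
  include:
    - stage: deploy
      script:
        # Only do real work when the commit message contains the marker.
        - |
          if [[ "$TRAVIS_COMMIT_MESSAGE" == *"[deploy]"* ]]; then
            ./deploy.sh
          else
            echo "No [deploy] marker in the commit message, skipping."
          fi
```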

peterjc commented 7 years ago

Are there any examples using the stages on a mixture of Linux and macOS? For example I would want to run the tests on both operating systems, but only run the deployment once - probably on Linux rather than Mac OS X.

https://docs.travis-ci.com/user/build-stages#Examples does not seem to cover this (right now).

harryw commented 7 years ago

A few points of feedback from me:

  1. I think, as several other people have noted, the way that the jobs from build stages and the jobs from default matrix expansion get appended is pretty confusing. It took me a while to realize why the 'test' stage jobs were running first when I'd specified it as my second stage. I'm sympathetic - I can see it may be difficult to find a good way to let everybody express what they want in a backwards-compatible and simple way - but this is probably not quite the right solution. Definitely something to think about improving before this feature gets out of beta and becomes another backwards-compatibility issue.

  2. It seems quite painful to set up the 'cache warming' use case in a realistic way. The example just has 3 downstream jobs doing the same thing. I've spent some time trying to set up a build job in one stage, which stores its results (eg. fetched dependencies and ClojureScript compiled to JS) in the cache, and then run several parallel, partitioned test jobs (converting over from a matrix) in a second stage. This didn't work because I've got the jobs set up with separate environment variables in order to specify which test partition they're running, and so the cache keys differ. I'd guess I can probably put the env vars into the 'script' section instead, although that's really verbose. It would be really helpful to have more control over the cache keys for each job.

  3. Now that we have a sequence of jobs, rather than a concurrent set of jobs, the additional boot time is pretty noticeable. As @jnewland mentioned, it's a nice idea to use these build stages to move duplicated work into a prior stage, but in fact due to the boot time it seems overall faster (though less efficient) to duplicate that work, which is a shame.

  4. This is a really cool feature with a lot of potential, thanks for building it! I would love to be able to set up a directed acyclic graph of jobs with defined inputs/outputs to exchange data between them. That's certainly a much bigger feature, and I don't want to get greedy. This is a great step forward!

tenitski commented 7 years ago

Similarly to @harryw, I have put env vars into the script part to make sure the cache is shared between jobs; however, it makes it hard to figure out which test does what.

[Screenshot]

One way to fix it could be by allowing to specify a job name with name or job:

    - stage: test
      env: APP_ENV=test
      name: php_unit
      script: TEST_SUITE=php_unit ./ci/scripts/docker_test.sh
    # or
    - stage: test
      env: APP_ENV=test
      job: js_unit
      script: TEST_SUITE=js_unit ./ci/scripts/docker_test.sh

webknjaz commented 7 years ago

@peterjc just leave all existing tests unchanged in the standard matrix; they'll be executed first, in parallel, as usual. For deploy, add a corresponding jobs.include entry and it will run after all tests.
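
A sketch of that layout, assuming placeholder scripts run-tests.sh and deploy.sh. The top-level os list expands into the default test stage (one job per operating system), and the single jobs.include entry adds a deploy stage that runs once. The included job sets os to a single value, since included jobs need a complete definition rather than a list.

```yaml
language: generic
os:
  - linux
  - osx
script: ./run-tests.sh        # runs once per OS in the default "test" stage

jobs:
  include:
    - stage: deploy
      os: linux               # single value, not the list above
      script: ./deploy.sh     # runs once, after all test jobs have passed
```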

2ndalpha commented 7 years ago

Do all of you deploy to production straight from a PR?

My current workflow:

  1. PR: build for testing & run tests
  2. If the PR is green, then in GitHub I merge it to master, build for live, deploy

I don't see a way to have different set of jobs for PR and master.

webknjaz commented 7 years ago

This is how it works, but you may try doing conditionals depending on the environment variables travis-ci provides

keradus commented 7 years ago

you don't need to deploy on PRs, here, it's done only for tags: https://github.com/travis-ci/build-stages-demo/blob/deploy-github-releases/.travis.yml#L8

Since this is exactly the deployment section you are used to, you can keep all your conditions there in the traditional way.
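
As a rough illustration of keeping the condition inside the deploy section, the fragment below only deploys for tagged commits. It is a sketch, not a copy of the linked demo file; deploy.sh is a placeholder. The stage's job still runs on every build, but with script: skip it does no work unless the on condition matches.

```yaml
jobs:
  include:
    - stage: deploy
      script: skip            # nothing to test in this job
      deploy:
        provider: script
        script: ./deploy.sh   # placeholder deploy command
        on:
          tags: true          # deploy only when the build is for a tag
```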

leandro-lucarella-sociomantic commented 7 years ago

Hi, I don't understand very well how this interacts with install and other parts of the build, and also matrix build.

I'm using docker to build for multiple ubuntu versions, so was using matrix builds for that. My .travis.yml was basically:

# Don't use any predefined language env
# (see https://github.com/travis-ci/travis-ci/issues/4895#issuecomment-150703192)
language: generic

sudo: required
services: docker

env:
    - DIST=trusty
    - DIST=xenial

install: ci/travis-install.sh
script: ci/travis-script.sh

Where travis-install.sh is basically a docker build and travis-script.sh is docker run build. Now I want to also deploy some files. What I want is a "matrix pipeline", ideally like this:

  trusty
        ,--> build image -> compile ---> build package --.
       /                                                  \
setup {                                                    } deploy
       \                                                  /
        `--> build image -> compile ---> build package --´
  xenial

Where setup is the checkout, build image does the docker build, compile and build package run in a docker container and deploy just upload the resulting files.

I can't find a way to do this... I'm quite new to travis in general, so I don't know if there is a way to fork and join builds like this, by for example passing artefacts around different VMs. If not, this scheme, even when a bit more wasteful, should be enough too:

  trusty
setup ---> build image ---> compile ---> build package ---> deploy

setup ---> build image ---> compile ---> build package ---> deploy
  xenial

But I can't find a way to do that either. First I tried this:

# Don't use any predefined language env
# (see https://github.com/travis-ci/travis-ci/issues/4895#issuecomment-150703192)
language: generic

sudo: required

services: docker

env:
    - DIST=trusty
    - DIST=xenial

install: ci/travis-install.sh

jobs:
    include:
        - stage: compile
          script: ci/dockerize make all

        - stage: package
          script: ci/dockerize make deb

        - stage: deploy
          script: skip
          deploy:
              provider: script
              script: upload files
              skip_cleanup: true
              on: branch

But I ended up with this:

Test
    DIST=xenial
    DIST=trusty
Compile
    DIST=xenial DIST=trusty (whatever this means ???)
Package
    DIST=xenial DIST=trusty
Deploy
    DIST=xenial DIST=trusty

I noticed that each job seems to be run in a completely new, clean environment, and the install script is executed again and again. Also, at the deploy stage there are no files to upload because of this. I tried moving the matrix variables to each stage but I ended up with the same results in terms of stages.

I have a lot of trouble trying to figure out how these stages could work for a typical project with a build -> test -> deploy cycle, as every stage starts in a new environment, and then again I don't see how matrix builds and things like install, after_install, deploy, etc. are thrown into the mix.

I like that you are trying to extend travis with stages, it's very nice, but so far I can't get my head around how this feature was designed and how to use it :)

webknjaz commented 7 years ago

I'm seeing a bug in UI (I guess). See https://github.com/cherrypy/cheroot/pull/29/files

TL;DR

There's the following config present:

  allow_failures:
    - TOXENV=pre-commit-pep257
  jobs:
    ...
    - python: 3.6
      env: TOXENV=pre-commit-pep257
    - stage: upload new version of python package to PYPI (only for tagged commits)
      python: 3.6
      deploy: ...

It's parsed correctly: https://travis-ci.org/cherrypy/cheroot/jobs/234099625/config

I'm not sure whether this should go into separate bug report. /cc: @BanzaiMan @svenfuchs

BanzaiMan commented 7 years ago

@shepmaster @glensc Sorry about the confusion there. I've reproduced the issue with allow_failures, and opened https://github.com/travis-ci/travis-ci/issues/7789. Thanks.

PrototypeAlex commented 7 years ago

It would be great to see the links between jobs and builds in the UI in the running tab.

[Screenshot]

There's a hint with the numbering system but having them tied together as a hierarchical design (some nested design intention) would be great to easily see all jobs running under a build.

Love the build stages though, such a good feature.

GlenDC commented 7 years ago

I love the build stages, works smoothly and nice!

Only one request. Would it be possible to provide a short description, that could be shown above each job, this would make it very clear for anyone which job failed, because now it's just cryptic job 1.1, job 1.2, job 1.3... etc. Right now you have to go in the travis file to know which step is which (as they do appear in order), so could be nice to be able to provide a description.

Either way, seems to work for my Go tests! So thanks 👍

asilversempirical commented 7 years ago

What's the preferred way to run a script, e.g. a deploy, after all jobs and stages are successful? AFAICT the only way to do this is to have a stage just for that. I would expect after_success to do what I want, but that ends up running once per job, not once per build. That seems a bit at odds with the documentation in https://docs.travis-ci.com/user/customizing-the-build#The-Build-Lifecycle.

BanzaiMan commented 7 years ago

@asilversempirical Yes, the documentation is still being updated with these details. Do you mind opening a separate issue about this documentation issue? Thanks.

thibaudcolas commented 7 years ago

I was hoping to leverage build stages to run the job most likely to fail first, then the others, thereby having fewer jobs running for nothing when one (say linting) caught a silly mistake. From what I understand though, the "most important stage" jobs are competing for resources with later stages that were started in other branches (like any Travis jobs without the stages features).

Would there be a way to limit concurrent builds per stage to help with this?

peterjc commented 7 years ago

@thibaudcolas you can use the first stage for linting (e.g. one build) and a second stage for the main testing (e.g. three builds at once).

For a concrete example, https://github.com/biopython/biopython/commit/4154a48c1531582d332bb1b8d4050838bdc6969b uses three builds in the "basics" stage for quick tests including linting, then a "test" stage for the actual seven builds doing functional tests which are much slower: https://travis-ci.org/biopython/biopython/builds/232880583

thibaudcolas commented 7 years ago

Thanks for the example! I think my question comes down to the prioritisation of jobs across two builds. For example, given the two builds below from biopython, I would like to say "I allocate three concurrent jobs to stage Basics, and two concurrent jobs to stage Test" so the high-priority builds are less likely to be held up by low-priority ones. In this example, 3584.1 would start concurrently with 3583.4.

3583 3584

peterjc commented 7 years ago

@thibaudcolas my understanding is that all the jobs in stage N have to finish (and pass) before any of the jobs in stage N+1 are started.

arnaudbrejeon commented 7 years ago

Is http://lint.travis-ci.org supposed to support jobs? I didn't manage to get any file validating properly. I get the following: "unexpected key jobs, dropping"

pawamoy commented 7 years ago

A bit of feedback:

peterjc commented 7 years ago

Good idea from @Pawamoy about setting allow failure as a boolean - that would be much easier with the more complicated .travis.yml resulting from using stages.

pawamoy commented 7 years ago

Also a note about fast_finish:

webknjaz commented 7 years ago

@Pawamoy There's an issue with allow_failures itself (see travis-ci/travis-ci#7789).

In fact you can specify after_success for your specific stage.

lumag commented 7 years ago

I've also hit an issue where the compiler matrix works correctly for the first job but is applied incorrectly to all the other jobs (see https://travis-ci.org/lumag/odp/builds/235764701)

BanzaiMan commented 7 years ago

@lumag Each job included in jobs.include and matrix.include must have a complete definition (https://docs.travis-ci.com/user/customizing-the-build/#Explicitly-Included-Jobs-need-complete-definitions); because they inherit the top-level values that are specified otherwise, you need to specify a usable scalar value for compiler in your case.

lumag commented 7 years ago

@BanzaiMan so, to run the same configuration for different compilers I will have to duplicate the script?

lumag commented 7 years ago

@BanzaiMan anyway, something is wrong here. Because first job in jobs.include actually picks the compiler matrix and splits into two jobs.

BanzaiMan commented 7 years ago

@lumag No, it does not. The first two jobs in the build you pointed to come from the matrix expansion of compiler: [ gcc, clang-3.8 ]. The remaining 4 come from jobs.include.

BanzaiMan commented 7 years ago

@lumag You don't have to duplicate script, since it is not a matrix expansion key. But it will be inherited from the top level, if it is defined there.

lumag commented 7 years ago

@BanzaiMan ah, thanks, I got it now. It seems I will end up with my main testing script used as top level script (which goes through matrix expansion and so builds with different compilers) and additional jobs being part of the jobs.include. Thank you!

skeggse commented 7 years ago

Do build stages support embedded matrix expansion? e.g. should these configurations be equivalent/similar?

# ...

jobs:
  include:
    - script: echo pre-branch
    - stage: example
      script: echo $var
      env:
        matrix:
          - var=3
          - var=4

# ...

jobs:
  include:
    - script: echo pre-branch
    - stage: example
      script: echo $var
      env:
        - var=3
    - script: echo $var
      env:
        - var=4

ujovlado commented 7 years ago

I am wondering if something like this will be supported:

sudo: required

language: bash

services:
  - docker

jobs:
  include:
    - stage: test
      before_install:
        - docker -v
      before_script:
        - docker-compose -v
      script:
        - echo "test 1"
        - echo "test 2"
      branches:
        except:
          - master

    - stage: deploy
      before_install:
        - docker -v
      before_script:
        - docker-compose -v
      script:
        - echo "test 3"
        - echo "test 4"
      branches:
        only:
          - master

It would be nice to allow almost all "root" parameters (like branch filtering, etc.). Use case: run certain jobs for all branches (or all except master) and another set of jobs only on master (deploy, etc.).

EDIT: looks similar to: https://github.com/travis-ci/beta-features/issues/11#issuecomment-290719697

svenfuchs commented 7 years ago

@shepmaster @Pawamoy @peterjc I think it should be sufficient to specify any combination of keys that is unique per job. However, I agree that this is inelegant. I'll add to our list for potential future improvements: "Allow specifying allow_failure: true per job on jobs.include".

@jeffbyrnes No worries, thanks for letting us know. I'll add Silence skipped commands (no log output) to our list for potential future improvements. I am surprised that you bring this up, I would have thought extra verbosity is a good thing here, but we'll re-evaluate our choice.

@ljharb Ok, thank you. I understand that's how you'd envision it working. I was wondering about the motivation behind this, and what kind of usecase you have. Would you be able to point at an existing .travis.yml file? Or elaborate on the use case?

@alorma Thanks for the kind words! Jobs do not share any storage at the moment. In order to share build artifacts (for example a binary compiled in an earlier stage) you can do so using our artifacts feature (see https://docs.travis-ci.com/user/uploading-artifacts/), or manage the process manually, e.g. https://docs.travis-ci.com/user/build-stages/share-files-s3/. We do intend to improve on this in the future.

@peterjs Thanks for the extra input on this! We will think of a way to make this more convenient. With regards to specifying a different os per job, have you tried something like this https://github.com/svenfuchs/test-2/blob/48c5c8ba1c42b58001842ef9b830e5d35f9f5e2f/.travis.yml?

@keradus Thanks for elaborating on your idea. Though it makes sense, it is unlikely that we will change our configuration format to this very soon. We might consider this in a future version of the format, once we have a better way of versioning it in place. Your second question will be solved by allowing the stage order to be specified, as mentioned in https://github.com/travis-ci/beta-features/issues/11#issuecomment-301310134, if I understand you correctly?

@mheiniger Thanks for the input. This feature has been on our list for future improvements. In fact, it was on the very first whiteboard draft that has driven this current implementation.

@keefbaker Stages do not get queued, jobs do, but only once they're allowed to proceed. All jobs on one stage cannot be queued at once due to our concurrency limits.

@aldochristiansense There shouldn't be a test stage, unless you specify matrix expansion keys on the root level of your configuration, in which case this behaviour is expected. Could you point us to a .travis.yml file? (Please email support@travis-ci.org.)

@MagineCIBot @shepmaster @wearhere Could you let us know what conditions you are interested in? What is your usecase? I'd like to collect this information for our next round of evaluations.

@harryw Thanks a lot for your feedback! 1) We are listening. We had evaluated various formats, and analyzed their potential for confusion before we decided to go ahead with this one. We understand that it needs to be improved though, and we have some ideas on the list. 2) Yes, adding more control over the cache keys is on our list, too, and is definitely going to be considered. 3) Yes, in many cases it is. 4) Yes, improving the ability to share arbitrary artifacts between jobs is high on our list.

@tenitski Thanks for the suggestion. This has come up before and we'll consider allowing to name jobs. In the meantime, since you specify your env vars on the job, have you tried setting an env array in jobs.include that specifies both APP_ENV=test and TEST_SUITE=php_unit?

@2ndalpha If you still have issues setting up your deployments, please email support@travis-ci.org.

@leandro-lucarella-sociomantic This kind of pipeline setup currently is not possible with build stages, at least not exactly like this. You can parallelize jobs per stage, but jobs in the next stage only start after all jobs in the previous stage have finished. So in your example the jobs in the compile stage would only start once all of the build image stage has finished. However, is there any reason not to combine build image, compile, and build package into one job? That way you could achieve the flow you're after. If you still have issues getting this to work please do email support@travis-ci.org.
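
A sketch of the combined-job approach suggested above, reusing the commands from the configuration posted earlier in the thread. It is only one possible reading: the matrix over DIST produces two parallel jobs, each of which builds the image, compiles, packages, and then deploys from the same environment, so no files need to travel between jobs. The branch name master is an assumption, and "upload files" is the placeholder command from the original post.

```yaml
language: generic
sudo: required
services: docker

env:
  - DIST=trusty
  - DIST=xenial

install: ci/travis-install.sh      # docker build, once per DIST job

script:
  - ci/dockerize make all          # compile
  - ci/dockerize make deb          # build package

deploy:
  provider: script
  script: upload files             # placeholder from the original post
  skip_cleanup: true               # keep the built files for the upload
  on:
    branch: master                 # assumed branch name
```

With this layout the deploy step runs once per DIST job, which matches the second, more wasteful diagram rather than the fork-and-join one.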

@PrototypeAlex Thanks for the suggestion, and for the kind words. I'd consider that a feature rather separate from the build stages. But I'll add it to our list for future improvements, and we'll consider it separately.

@GlenDC Yes, allowing to add a name, and potentially a description per job, has been requested before, and will be considered.

@thibaudcolas At the moment there is no way to limit concurrency per stage, no. Also, @peterjc's comment in https://github.com/travis-ci/beta-features/issues/11#issuecomment-303233824 is correct. Does this answer your question? If it doesn't, could you please email support@travis-ci.org with your usecase and issue?

@arnaudbrejeon The linter indeed has not been updated with this, yet. It will be replaced with a new version soon that then will support this. Sorry about that.

@Pawamoy Thanks for the feedback, we hear you, and we are considering ways to improve this. In the meantime, have you considered using YAML aliases as a way to reduce verbosity? https://docs.travis-ci.com/user/build-stages/using-yaml-aliases/
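
To illustrate the YAML alias idea in general terms: an anchor (&name) marks a job definition, and a merge key (<<: *name) reuses it while overriding individual keys. The fragment below is a sketch with an assumed anchor name and script path borrowed from earlier in the thread; see the linked documentation page for the officially recommended form.

```yaml
jobs:
  include:
    # Define the job once and mark it with an anchor.
    - &unit_test
      stage: test
      env: TEST_SUITE=php_unit
      script: ./ci/scripts/docker_test.sh
    # Reuse the anchored job, overriding only env.
    - <<: *unit_test
      env: TEST_SUITE=js_unit
```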

@skeggse Thank you, see my response to @keradus above. We might consider this in a future version of the format, once we're able to version things better.

@ujovlado Thanks for the suggestion. Yes, conditional stages and jobs will be considered.

ljharb commented 7 years ago

@svenfuchs for example, i'd love to put each react version in a separate stage here, while still running each react version on each node version in the matrix; but i'd also want to run "every react version in the latest version of node" in a stage on its own first - ie, "react 15,14,13/node 7", then "all the other react 15s", then "all the other react 14s", then "all the other react 13s".

shepmaster commented 7 years ago

Could you let us know what conditions you are interested in? What is your usecase? I'd like to collect this information for our next round of evaluations.

@svenfuchs (for reference, here's my current .travis.yml) In my case, I have 3 stages:

  1. Build 3 docker containers in parallel
  2. Build 2 docker containers in parallel
  3. Build the frontend and backend code in parallel (when I figure out how to pass artifacts to another stage even for PRs, I'll actually run the tests in a 4th stage)

All 5 docker jobs start with code like:

if [[ ("${TRAVIS_PULL_REQUEST}" == "false") &&
          ("${TRAVIS_BRANCH}" == "master") &&
          (-n "${DOCKER_USERNAME}") &&
          (-n "${DOCKER_PASSWORD}") ]]

They basically do nothing when not building the master branch. A given PR thus spins up 5 machines, taking ~190 seconds of compute time, ~100 seconds of real time (not counting waiting in a queue).

roperto commented 7 years ago

I think it goes along with what other people are suggesting. It would be nice to have a different Matrix for pull requests.

The idea is:

Any thoughts on that?

Cheers,

Daniel

colby-swandale commented 7 years ago

@svenfuchs :wave: I'm working on using build stages for Bundler, is there a feature to specify the order of the build stages when using the env matrix? I have seen a few references in this post and in your example repos to specifying the order but i have not been able to find the relevant documentation or get this working in Travis. Thanks.

chrisguitarguy commented 7 years ago

I've got a deploy stage plus matrix expansion. The deploy is a github release.

This is a PHP project with tests running on 5.6 and 7.1 (preparing to migrate to 7.1).

The deploy stage errors because it tries to phpenv ['5.6', '7.1']

[5.6, 7.1] is not pre-installed; installing
7.1].tar.bz2: command not found
Downloading archive: 
0.01s$ curl -s -o archive.tar.bz2 $archive_url && tar xjf archive.tar.bz2 --directory /
curl: no URL specified!
curl: try 'curl --help' or 'curl --manual' for more information
0.01s
0.02s$ phpenv global ["5.6", "7.1"]
rbenv: version `[5.6,' not installed
The command "phpenv global ["5.6", "7.1"]" failed and exited with 1 during .
Your build has been stopped.

I'm guessing that's because you're supposed to set the {language}: {version} in the deploy stage? That should be much more prominent in the docs if so. The only place it's mentioned is here in passing.
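
A sketch of what that fix might look like for this case, assuming a phpunit test command and a GITHUB_TOKEN environment variable (both hypothetical). The top-level php list still expands into two test jobs, while the deploy job pins php to a single version so the list is not passed to phpenv as a whole.

```yaml
language: php
php:
  - '5.6'
  - '7.1'
script: vendor/bin/phpunit        # assumed test command

jobs:
  include:
    - stage: deploy
      php: '7.1'                  # single version instead of the inherited list
      script: skip
      deploy:
        provider: releases        # GitHub Releases
        api_key: $GITHUB_TOKEN    # assumed variable name
        on:
          tags: true
```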

BanzaiMan commented 7 years ago

@chrisguitarguy We are always looking for your ideas to improve our documentation. How would you change the document so that this information can be prominently conveyed without breaking up the flow of the document?

chrisguitarguy commented 7 years ago

@BanzaiMan putting the note on this page would be good.

Looks like the docs have already been updated to include that note as an aside/blockquote/whatever.

[Screenshot]

I like it.