travis-ci / beta-features

The perfect place to leave feedback and comments on newly released Beta Features.

Build Stages: Flexible and practical Continuous Delivery pipelines #11

Closed joshk closed 7 years ago

joshk commented 7 years ago

From simple deployment pipelines to complex testing groups, the world is your CI and CD oyster with Build Stages.

Build Stages allows you and your team to compose groups of Jobs which are only started once the previous Stage has finished.

You can mix Linux and Mac VMs together, or split them into different Stages. Since each Stage is configurable, there are endless Build pipeline possibilities!

This feature will be available for general beta testing soon... watch this space 😄

We love to hear feedback; it's the best way for us to improve and shape Travis CI. Please leave all thoughts/comments/ideas related to this feature here.

Happy Testing!

envygeeks commented 7 years ago

Try prefixing with pre @molovo

molovo commented 7 years ago

@envygeeks Sorry, prefixing what?

BanzaiMan commented 7 years ago

@molovo It is a shortcoming of the current implementation that the jobs defined by matrix expansion (in your case, with node_js) happen at the beginning of the build, and that they are assigned the stage name test. Since the stage named test runs at the beginning of the build stages, the behavior you are seeing is consistent with this.

This is not to say that this is desirable or expected; it is just that it is known.

If you want to accomplish what you desire, you'd need to define the jobs with jobs.include:

jobs:
  include:
    - stage: lint
      node: '8'
      ...
    - stage: test
      node: '6'
      ...
    - stage: test
      node: '7'
      ...
    - stage: test
      node: '8'
      ...
    - stage: coverage
      node: '8'
      ...
molovo commented 7 years ago

@BanzaiMan ah, it makes perfect sense when you explain it like that. Thanks

vb216 commented 7 years ago

I've used stages for each OS target we try to build. By default, if one job in a stage fails, no further stages run, which makes sense, but it's quite beneficial for us to pick up any failures in the later build stages too, as the whole build takes quite a few hours.

I marked every job as allow_failure, which does get everything built, but then the overall build is marked as a success (e.g. https://travis-ci.org/bytedeco/javacpp-presets/builds/243974354).

Is there a way to solve this case?

webknjaz commented 7 years ago

@vb216 you may add them all to the same build stage, but set some environment variable so you can easily recognise the different envs. This is the only (if ugly) way to achieve what you need. AFAICS the current version of build stages is not designed to run independent stages in parallel.
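
A minimal sketch of that workaround (the stage name, TARGET values, and build.sh script are placeholders, not from this thread):

jobs:
  include:
    # all OS targets share a single stage, so one failing job does not prevent the others from running
    - stage: build
      os: linux
      env: TARGET=linux-x86_64
      script: ./build.sh "$TARGET"
    - stage: build
      os: osx
      env: TARGET=macosx-x86_64
      script: ./build.sh "$TARGET"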

vb216 commented 7 years ago

Thanks @webknjaz, I'll go back to that method. I'd been using it before and it does the job; it's just nicer to have the logical split into stages (and in theory it could let me do some smarter per-OS uploads at the end of each stage).

webknjaz commented 7 years ago

@BanzaiMan I've got another visual bug:

This build is running OS X stage jobs at the time I'm writing this message.

However, at the top of this web page it says #241 passed; the text is colored yellow, while the icon before it is green. Obviously Travis CI cannot know in advance whether my build will succeed.

My guess is that it might be related to the fast_finish option working incorrectly.

keradus commented 7 years ago

https://github.com/travis-ci/beta-features/issues/11#issuecomment-305775668 (18 days ago):

Currently it is not possible to alter the build configuration itself based on the build request's properties (e.g., branch, tag, event type ("push" vs "pull request" vs "cron")). There is an internal ticket to track this common request, but we have no ETA.

In this particular case, it means that all build stages will be configured. And it is executed if the build proceeds to that stage.

@BanzaiMan, this would be a really nice option. The separate stage for deployment is really great, but using it is currently a waste of CPU and time: most of the time the deployment job starts, fetches the repo, does what is needed (e.g. installs apt packages by default, runs yarn if there is a lock file, and so on, which cannot be controlled from the .travis.yml file), and then realises that there will be no deployment because of the conditions inside on. While I'm really glad that this request has already been approved, is there any chance of at least knowing the ETA?

BanzaiMan commented 7 years ago

@keradus We still do not have an ETA.

SuzanneSoy commented 7 years ago

When I'm logged in, if I go to https://travis-ci.org/ I do not see the names of the stages above the jobs. Simply clicking on "current" for the currently running job correctly shows the stage names.

Screenshot of https://travis-ci.org/ :

The jobs appear one after the other, without being split into "sections", one for each stage, as if stages were not used.

Screenshot of https://travis-ci.org/jsmaniac/phc-thesis :

The jobs appear separated by the name of the stage they belong to

SuzanneSoy commented 7 years ago

The following trick can be used to preserve artifacts between stages: enable caching, and store the built artifacts inside the cached directory.

Don't forget to clear the cache directory at the end of the last stage, so that it does not waste space on Travis's infrastructure. You might want to also clear the cache directory at the beginning of the first stage, in case some stale data was not discarded, e.g. after a partially failed build.

It is possible to check in the second and subsequent stages that the cache was populated by the same build by storing "$TRAVIS_BUILD_ID" in the cache and then comparing the value.

language: c
sudo: false

cache:
  directories:
    - ~/mycache

jobs:
  include:
    - stage: phase1
      install:
      - mv ~/mycache ~/mycache-discard
      - mkdir ~/mycache
      - echo "$TRAVIS_BUILD_ID"
      - echo "$TRAVIS_BUILD_ID" > ~/mycache/travis_build_id
      - echo "build stuff" > ~/mycache/artifact.bin
      script:
      - true
    - stage: phase2
      install:
      - if test "$TRAVIS_BUILD_ID" != `cat ~/mycache/travis_build_id`; then travis_terminate 1; fi
      - rm -f ~/mycache/travis_build_id
      - echo "Do something with ~/mycache/artifact.bin, e.g. publish it to gh-pages"
      - mv ~/mycache ~/mycache-discard
      - mkdir ~/mycache # clear the cache
      script:
      - true

Note that the Travis documentation seems to hint that the cache is not updated for cron builds or manually restarted builds. The trick should hopefully work for pull requests, though. Also, I'm not sure how this will behave when a stage contains multiple concurrent jobs which could, in principle, all populate the cache. Apparently, it is possible to explicitly specify the cache name with an environment variable such as CACHE_NAME=FOOBAR.

SkySkimmer commented 7 years ago

@jsmaniac jobs with different environments get different caches; as far as I can tell, the CACHE_NAME trick just forces this differentiation. In particular, setting the same CACHE_NAME in two jobs that differ in some other environment variable does not allow them to share a cache.

SuzanneSoy commented 7 years ago

@SkySkimmer Thanks, good to know!

The trick I mentioned above should therefore still be applicable by carefully giving the same environment to the two jobs that must communicate (and then overwriting the env vars within the job itself if necessary). This obviously won't work if the build and deploy phases use different languages or, e.g., different versions of Ruby, though.
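
A rough sketch of that pairing, following the same caching setup as the example above (the CACHE_NAME value and the build/deploy scripts are purely illustrative):

language: c
sudo: false

cache:
  directories:
    - ~/mycache

jobs:
  include:
    # both jobs declare an identical environment, so they should resolve to the same cache
    - stage: build
      env: CACHE_NAME=handoff
      script: ./build.sh    # writes its artifacts into ~/mycache
    - stage: deploy
      env: CACHE_NAME=handoff
      script: ./deploy.sh   # reads the artifacts back from ~/mycache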

junosuarez commented 7 years ago

Echoing feedback from @2ndalpha and others above, I'd love an easy way to run a final "release" stage only after a merge to master. Currently I have a script that guards this, but it still adds 30-40 seconds to each PR build while the job is queued, the VM boots up, env dependencies are installed, etc.
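
For reference, a guard of that kind typically looks something like the sketch below (release.sh and the stage name are placeholders); the VM still has to boot before the check runs, which is exactly the overhead being described:

jobs:
  include:
    - stage: release
      # only actually release on direct pushes to master; PR builds boot the VM and then bail out
      script: >
        if [ "$TRAVIS_BRANCH" = "master" ] && [ "$TRAVIS_PULL_REQUEST" = "false" ];
        then ./release.sh;
        else echo "Not a push to master, skipping release";
        fi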

pawamoy commented 7 years ago

In the last stage, there is no indication of which jobs were set to allow failure.

If the build has passed, then we know that the failed jobs were set to allow failure, but if the build has failed, then we cannot distinguish between jobs that were allowed to fail and those that weren't.

Could failed jobs that are allowed to fail get a distinct color other than red :smiley: ? Not yellow, since that's taken for "running", but why not orange (if it is color-blindness friendly)? This could also apply to builds not using stages, as a visual indicator.

ELD commented 7 years ago

What's the state of multi-OS builds with build stages? Is it possible to run only one build stage on both Linux and macOS?

webknjaz commented 7 years ago

@ELD it's possible to achieve this using YAML anchors

ELD commented 7 years ago

@webknjaz: Could you show or point me to an example?

webknjaz commented 7 years ago

@ELD you'd still have to add lots of jobs.include entries, but anchors will help you have less code duplication: https://github.com/cherrypy/cheroot/blob/master/.travis.yml
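
For illustration, a minimal sketch of the anchor pattern (the key, stage, and script names are placeholders; this assumes Travis tolerates the extra root-level key, which only exists to hold the anchor):

_shared_job: &shared_job
  stage: test
  script: ./run-tests.sh

jobs:
  include:
    # the YAML merge key (<<) copies the shared definition into each per-OS entry
    - <<: *shared_job
      os: linux
    - <<: *shared_job
      os: osx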

ELD commented 7 years ago

@webknjaz Awesome, thanks!

bsideup commented 7 years ago

It would be nice to skip a stage with branch conditions, i.e. not run the "deploy" stage if it's a PR (it takes ages to allocate every stage, and then we just "exit" because the branch condition doesn't pass).

seocam commented 7 years ago

It would be nice to skip a stage with branch conditions, i.e. not run the "deploy" stage if it's a PR (it takes ages to allocate every stage, and then we just "exit" because the branch condition doesn't pass).

That would be really useful!

keradus commented 7 years ago

@bsideup, it is an already-approved feature, yet not estimated: https://github.com/travis-ci/beta-features/issues/11#issuecomment-309740722

leighmcculloch commented 7 years ago

I'm using build stages and they're great 🎉 . Thank you!

Now that Travis CI knows how the build is broken down, would it be possible for it to report a status back to GitHub for each stage, instead of a single status for the entire build? This would give greater visibility in GitHub PRs of what has passed or failed, without needing to click through.

The use case where I see this being useful is a project where I have three build stages: analysis, tests, and api-backwards-compatibility-check. It'd be great if the GitHub PR showed exactly which stage failed, and specifically that the api-backwards-compatibility-check stage, which is allowed to fail, failed on a PR. If tests and analysis passed but the stage that checks API backwards compatibility failed (which is allowed), we'd get visibility of that on PRs.

szpak commented 7 years ago

After cancelling all the tests in one stage, I would expect the following stage to be cancelled as well. However, the next stage is executed normally in my project. Is this a known issue, or is there something wrong with my configuration?

Screenshot: travis-2nd-stage-not-cancelled

szpak commented 7 years ago

With after_success: skip defined at the stage level (to skip that phase), it ends up with:

$ skip
No command 'skip' found, did you mean:
 Command 'sip' from package 'sip-dev' (main)
skip: command not found

true is an alternative, but as skip is mentioned in the documentation for script, it could also be supported for the other elements.

BanzaiMan commented 7 years ago

@szpak Since after_success is optional, after_success: skip doesn't make sense. Just omit it.

keradus commented 7 years ago

It is inherited from the global scope if it's defined outside the jobs definition.

szpak commented 7 years ago

@BanzaiMan My case is the one @keradus predicted. With after_success defined for the default "test" stage, it needs to be overridden to disable it in the other stages.

keradus commented 7 years ago

@szpak, you could define the default test job inside jobs (so not globally, thus no inheritance), e.g. https://github.com/FriendsOfPHP/PHP-CS-Fixer/blob/master/.travis.yml#L45
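
In other words, something along these lines (the scripts are placeholders): nothing is defined at the global level, so the deploy job has no inherited after_success to override.

jobs:
  include:
    - stage: test
      script: ./run-tests.sh
      after_success: ./upload-coverage.sh   # attached only to this job
    - stage: deploy
      script: ./deploy.sh
      # no after_success here and none at the global level, so there is nothing to disable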

szpak commented 7 years ago

@keradus I didn't know that syntax. I will give it a try. Thanks.

webknjaz commented 7 years ago

@szpak @keradus I think I use after_success: skip (also for overriding an inherited value) in some of my repos and it works. Perhaps something has broken lately.

alexfmpe commented 7 years ago

I've been experimenting with stages, trying to get mac/linux builds. I ended up with something that looks like it works on Travis, but stays endlessly queued. Also, hovering over the apple/penguin icons gives an [object Object] tooltip rather than the usual linux/osx. EDIT: it errored with an empty log after ~6 hours.

A bunch of my .travis.yml attempts with aliases weren't recognised by Travis. I don't mean parsing errors; they simply didn't trigger a build.

webknjaz commented 7 years ago

@alexteves you have duplicate os keys in each stage; keys must be unique in mappings. You need to create additional stage entries so that each one has one specific os entry.
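
So, roughly, each stage/os combination becomes its own entry (the stage names and scripts below are placeholders):

jobs:
  include:
    # one entry per stage/os pair; a single entry cannot carry two os keys
    - stage: build
      os: linux
      script: ./build.sh
    - stage: build
      os: osx
      script: ./build.sh
    - stage: package
      os: linux
      script: ./package.sh
    - stage: package
      os: osx
      script: ./package.sh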

alexfmpe commented 7 years ago

That's unfortunate; I was hoping to avoid a combinatorial explosion.

Say, is it currently possible to specify the Linux dist inside stages? Trying it directly yielded multi-OS stages again. I had to resort to matrix defaults like this to get the intended behavior, but if one wanted to specify two distros this way, it couldn't be done, given that only one would be inherited, right?

EDIT: it turns out it's not using trusty as requested, but precise. Is there any way to choose the distro in a stages build?

webknjaz commented 7 years ago

I think specifying a root-level dist: trusty worked for me, but under certain conditions it got reset to precise. Matrix expansion does not work inside stages; it only generates the list of jobs for the test stage.
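
For what it's worth, a sketch of that setup (job contents are placeholders), with the distribution requested once at the root and inherited by the stage jobs:

dist: trusty      # root-level default; reportedly reset to precise under some conditions
language: c

jobs:
  include:
    - stage: build
      script: ./build.sh
    - stage: deploy
      script: ./deploy.sh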

backspace commented 7 years ago

@jsmaniac, thanks for letting us know about the rendering bug on the root route. It has since been fixed and deployed, so build stages should properly render regardless of what route you’re on.

stianst commented 7 years ago

I'm trying out build stages for our project now and it's solving a lot of problems for us. It's really nice and I especially love how simple it is.

I've got two requests:

vb216 commented 7 years ago

@stianst is that approach to shared caches definitely working for you? I need to produce something similar for a compiled job output that other jobs depend on, and for populating the .m2 cache; it would save a lot of time and complexity.

I thought cache locations were built from a unique key derived from the job variables, so since the env var you have is different, they wouldn't share a cache?

stianst commented 7 years ago

@vb216 you're right, it's not working. It would have been really nice if it did, but for now I'm going back to building the whole thing for each job, which is annoying.

bkimminich commented 7 years ago

To get a build setup like this image, it seems I need a .travis.yml like this:

language: node_js
node_js:
- 4
- 6
- 7
before_install:
- rm -rf node_modules
script:
- npm test
jobs:
  include:
    - stage: test e2e
      script: npm run e2e
      node_js: 4
    - stage: test e2e
      script: npm run e2e
      node_js: 6
    - stage: test e2e
      script: npm run e2e
      node_js: 7
    - stage: coverage
      script: npm run coverage
      node_js: 6
    - stage: deploy
      script: skip
      node_js: 6
      deploy:
        provider: npm
        email: XXXXXX
        api_key:
          secure: XXXXXX=
        on:
          tags: true
          repo: bkimminich/juice-shop-ctf
sudo: false

where it would be way nicer to have the test e2e stage declared like this:

    - stage: test e2e
      script: npm run e2e
      node_js:
      - 4
      - 6
      - 7
keradus commented 7 years ago

Already raised, @bkimminich, at https://github.com/travis-ci/beta-features/issues/11#issuecomment-301875447; check out the answer for it.

rmehner commented 7 years ago

Hey there,

As tweeted here, it would be super nice if it were possible to skip stages on certain branches. Something like this:

jobs:
  include:
    - stage: test
      rvm: 2.3.1
      script: bundle exec rspec
    - stage: test
      rvm: 2.4.1
      script: bundle exec rspec
    - stage: deploy
      rvm: 2.3.1
      branches:
        - master
        - production

My use case is that I want to test whether something breaks in the latest version of Ruby, while still keeping my main test suite in line with the version that runs in the respective production environment, and only deploying with that version. However, the deploy stage takes a while to run and install, and I don't need it to run on any branch other than master or production.

I know there are workarounds, but I'd like the stages feature to support this natively (I only want to deploy if all test stages are green).

keradus commented 7 years ago

functionality already requested and approved, yet no ETA: https://github.com/travis-ci/beta-features/issues/11#issuecomment-305775668

peshay commented 7 years ago

I have an issue with my Travis syntax when I try to integrate this new feature:

language: python
python:
- '2.7'
- '3.3'
- '3.4'
- '3.5'
- '3.6'
- 3.7-dev
- nightly
install:
- pip install -r requirements.txt
- python setup.py -q install

jobs:
  include:
    - stage: Tests
      script: nosetests -v --with-coverage
      after_success: codecov
    - stage: Releases
      before_deploy: tar -czf tpmstore-$TRAVIS_TAG.tar.gz tpmstore/*.py
      deploy:
        provider: releases
        api_key:
          secure: <long string>
        file: tpmstore-$TRAVIS_TAG.tar.gz
        on:
          repo: peshay/tpmstore
          branch: master
          tags: true
    - 
      deploy:
        - provider: pypi
          distributions: sdist
          user: peshay
          password:
            secure: <long string>
          on:
            branch: master
            tags: true
            condition: $TRAVIS_PYTHON_VERSION = "2.7"
        - provider: pypi
          distributions: sdist
          server: https://test.pypi.org/legacy/
          user: peshay
          password:
            secure: <long string>
          on:
            branch: master
            tags: false
            condition: $TRAVIS_PYTHON_VERSION = "2.7"
keradus commented 7 years ago

perhaps describing the issue you are facing would help

peshay commented 7 years ago

The travis linter simply fails, and I don't see why: unexpected key jobs, dropping

maciejtreder commented 7 years ago

Sharing files between stages should definitely be done differently than via external systems. In GitLab CI it is done really simply, with the 'artifacts' property in the YAML; it should be the same here.

BanzaiMan commented 7 years ago

The linter is sadly out of date at the moment. Many of the recent keys are not recognized. We have plans to improve this aspect of our services, but it will be a little while.

Sharing storage has been raised as a missing feature many times before, and we recognize that it is critical.