Spiking Out a New Type of CI

leefaus commented 8 years ago

Why

The term Continuous Integration dates back to 1991, and the practice was later popularized as part of the XP (eXtreme Programming) process for building applications. As build servers became all the rage, Hudson was developed by Sun's Kohsuke Kawaguchi as a product to automate the CI process outside of the developer's machine. It was a polling-based system that worked off the main line code branches as code was committed to the repository. Today CI is an integral part of development and deployment best practices inside many organizations.

As source code control systems have matured, the CI process has not. Many products in use today have a larger code base than the projects built back in the late 1990s. We have also introduced DVCS as a way to allow developers to experiment and collaborate on changesets off the mainline code base, which provides an opportunity for CI to evolve.

How

Many DVCS systems now support something called Webhooks, which notify another system about changes in the existing system. This allows CI to be proactive in the build process: it gets notified that new code has been added to the repository instead of polling the system for changes. Also, DVCS systems work with changesets (diffs) of the code from the mainline, instead of the full codebase in its current state. This allows a CI system to build a changeset before a merge into the mainline takes place, so a review can take place to ensure the code should be merged.

With the Webhook notification and callback system in place in these DVCS systems, we can incrementally perform independent build processes and retrieve an individual status for each one. In the current CI process, this is normally an all-or-nothing approach that requires a developer to log into the CI server, view the build logs, and determine where the error may have occurred. Imagine having your review process set up to analyze the code for technical debt as a first step. If a changeset has too many additions, you could report a failed status without going into a lengthy build process. The same could be said for an organization's development best practices: you could analyze the code via linting and provide feedback to the coder on the change's adherence to company or organizational policies.
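The "too many additions" gate described above can be sketched in a few lines of Python. This is only an illustration, assuming the webhook delivers a pull request payload that carries an `additions` count (as GitHub's pull request events do). The `MAX_ADDITIONS` threshold and the `review_changeset_size` helper are hypothetical names, not part of any existing tool.

```python
# Hypothetical threshold: changesets that add more lines than this fail fast.
MAX_ADDITIONS = 500


def review_changeset_size(payload, max_additions=MAX_ADDITIONS):
    """Return a (state, description) pair for a pull request webhook payload.

    Assumes the payload carries payload["pull_request"]["additions"].
    """
    pull_request = payload.get("pull_request")
    if pull_request is None or "additions" not in pull_request:
        # The payload is not what this step expects: an error, not a failure.
        return ("error", "payload has no pull_request.additions field")

    additions = pull_request["additions"]
    if additions > max_additions:
        return ("failure", f"{additions} additions exceeds limit of {max_additions}")
    return ("success", f"{additions} additions within limit of {max_additions}")
```

The returned state could then be posted back to the DVCS as the status of this one review step, without starting any of the longer-running steps.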
Because of this fundamental change in process, we recommend replacing the term integration with review, because we are working on a changeset and not on its integration into the mainline. In GitHub specifically, we could do reviews as part of the Pull Request process. We now have an automated code review process that eliminates standing weekly reviews as part of an Agile development process. Developers, team leads, and project managers will always have visibility into the quality of a pull request (set of changes) before deciding whether to move these changes into the main line (master) branch.

Once a merge into the mainline takes place, we should think about traditional CI processes. We know that we don't need to redo the quality checks; we only need to test the integration of the new code against the changes that others have made. This normally requires teams to create multiple build processes in a traditional CI setup. Should this integration succeed, we can tag the main line branch, create a formal release, and publish this release to a production infrastructure.

What

SquirrelCI is a new breed of CI server. It should have inbound endpoints for the different review steps:

Static Code Analysis
Coding Standards and Best Practices
Technical Debt Evaluation
Code Coverage
Build based on 12 factor application best practices
Others...

These steps can be defined in a custom branch in the DVCS system that lists out the steps in which you want the workflow to execute.

1.static-code-analysis.yml
2.code-coverage.yml
3.technical-debt.yml
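One way to read the listing above is that the numeric prefix of each file name fixes the execution order. Under that assumption (the issue does not spell it out), a minimal Python sketch for ordering the step files might look like this. `ordered_steps` and `STEP_NAME` are hypothetical names.

```python
import re

# Assumed naming convention: "<number>.<step-name>.yml"
STEP_NAME = re.compile(r"^(\d+)\..+\.yml$")


def ordered_steps(filenames):
    """Return the step files sorted by numeric prefix, ignoring other files."""
    steps = []
    for name in filenames:
        match = STEP_NAME.match(name)
        if match:
            steps.append((int(match.group(1)), name))
    # Sort numerically so that step 10 runs after step 2, not before it.
    steps.sort()
    return [name for _, name in steps]
```

Sorting on the parsed integer rather than on the raw file name avoids the usual lexical-ordering surprise once a workflow grows past nine steps.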

The CI server will be aware of this branch and clone it as part of the review process. The CI server will use a work scheduler to run through each of the steps independently and report a status of success, failure, or error back to the DVCS server for each one. Should a step integrate with a third party product, the commands that execute the third party product will be entered into the file that describes the step.

execute: git clone <<repository>> review
execute: cd review
execute: checkmarx analyze-security .
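A step file like the one above could be read and run roughly as follows. This is a sketch under assumptions: the only syntax handled is the `execute:` prefix shown in the example, and `parse_steps` and `run_step` are hypothetical helpers. A non-zero exit is treated as a failure, while a timeout or a step that cannot be started is treated as an error.

```python
import subprocess


def parse_steps(text):
    """Extract the shell commands from lines that start with 'execute:'."""
    commands = []
    for line in text.splitlines():
        line = line.strip()
        if line.startswith("execute:"):
            commands.append(line[len("execute:"):].strip())
    return commands


def run_step(commands, timeout=600, runner=subprocess.run):
    """Run the commands of one step and map the outcome to a status string."""
    if not commands:
        return "error"  # nothing to run: the step file is unusable
    # Join with && so that one shell runs everything: 'cd review' stays in
    # effect for later commands, and the first failing command stops the step.
    script = " && ".join(commands)
    try:
        result = runner(script, shell=True, capture_output=True,
                        text=True, timeout=timeout)
    except (OSError, subprocess.TimeoutExpired):
        return "error"  # the step could not be run to completion
    return "success" if result.returncode == 0 else "failure"
```

Because each step returns its own status, the scheduler can report every step to the DVCS individually instead of one all-or-nothing build result.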

All of the scheduled steps will be executed in a container such as Docker. For full stack applications, we would also store things like a Dockerfile or Puppet scripts in this branch that describe how to stand up an integrated application stack for testing and review. You would link this file in the yml file using a relative path so it can be easily found.
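As an illustration of running a step inside a container, the sketch below only builds the `docker run` command line; it does not start anything. The image name, the `/workspace` mount point, and the `docker_command` helper are assumptions made for the example.

```python
def docker_command(image, workspace, script):
    """Build the argv for running one step's script in a throwaway container."""
    return [
        "docker", "run",
        "--rm",                                  # remove the container afterwards
        "--volume", f"{workspace}:/workspace",   # mount the cloned review branch
        "--workdir", "/workspace",               # run the script from the mount
        image,
        "sh", "-c", script,
    ]
```

The resulting list can be handed to whatever process runner the scheduler uses, so that each review step gets a clean, disposable environment.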
Should the last step include a *.deployment.yml file, the CI server would use a rules server to evaluate the criteria needed to actually deploy the application to a testing environment. If the success criteria are not met, this step would be skipped. Scenarios for this might be that the build is not working off the main line branch, the technical debt step failed, or the code does not meet organizational standards. If a deployment off the main line branch is done, a tag and a release are created. You could even go as far as taking the binary deliverables and pushing them back to the DVCS with commands like git lfs.
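The deployment rules above could be evaluated along these lines. This is a minimal sketch, not a rules server: `ReviewResult` and `deployment_decision` are hypothetical names, and the main line is assumed to be called `master` as in the text.

```python
from dataclasses import dataclass, field


@dataclass
class ReviewResult:
    branch: str
    # Maps a step name to "success", "failure", or "error".
    step_states: dict = field(default_factory=dict)


def deployment_decision(result, mainline="master"):
    """Return (deploy, reasons): deploy only if no skip reason is found."""
    reasons = []
    if result.branch != mainline:
        reasons.append(f"branch {result.branch!r} is not the main line")
    for step, state in sorted(result.step_states.items()):
        if state != "success":
            reasons.append(f"step {step!r} reported {state}")
    return (not reasons, reasons)
```

Returning the reasons alongside the decision lets the server explain in its status message why a deployment step was skipped.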

This is a spike of thoughts I have seen working with development teams and automation for code reviews and releases. This process is not unique to web and full stack applications. You could also use these steps for building mobile applications and other products.

chrisalong commented 8 years ago

:+1:

ghost commented 8 years ago

Very good article. You really identify what needs to happen with continuous development in order to move forward more successfully as software practices and tools evolve. More companies should be focusing on this.

matthewmccullough commented 8 years ago

Very clear write-up on the state and future of CI. This is the type of artifact that:

Would provide benefit to the Sales and Sales Engineering team in their repo as an artifact that the entire department could use in conversing with customers.
Could be made into a short deck that the department could use in 2nd-call pitches about the benefit of the GitHub platform and integrations compared to busy polling.
Could fit as part of the standard SE demo to offer a complete picture of the value of the GitHub platform.

I'd recommend coordination with @chrisalong and @johnagan to bring this in to the SE repo and find cases where we could start using it.

gitaboard / squirrelci

Spiking Out a New Type of CI #4
