This repository is not currently maintained. We encourage you to explore it, fork it, or otherwise use it as inspiration for your own metrics instrumentation.
Through six years of research, the DevOps Research and Assessment (DORA) team has identified four key metrics that indicate the performance of software delivery. Four Keys allows you to collect data from your development environment (such as GitHub or GitLab) and compiles it into a dashboard displaying these key metrics.
These four key metrics are:

- Deployment Frequency
- Lead Time for Changes
- Time to Restore Services
- Change Failure Rate
Use Four Keys if you want to measure your team's software delivery performance. Four Keys works well with projects that have deployments; projects with releases and no deployments (for example, libraries) do not work well, because of how GitHub and GitLab present their data about releases.
For a quick baseline of your team's software delivery performance, you can also use the DORA DevOps Quick Check. The quick check also suggests DevOps capabilities you can work on to improve your performance; the Four Keys project itself can help you improve several of these capabilities.
The design of the Four Keys system is shown in the architecture diagram in the repository. The code is organized into the following directories:
- `bq-workers/`: the worker services that consume events from Pub/Sub and insert them into BigQuery.
- `dashboard/`: the dashboard that displays the four key metrics.
- `data-generator/`: the script for generating mock events (see below).
- `event-handler/`: the event handler, which is the public service that accepts incoming webhooks.
- `queries/`: the SQL files behind the derived BigQuery views.
- `setup/`: the setup script, the Terraform entry point, and `new_source.sh` for adding new event sources.
- `shared/`: a shared module for inserting data into BigQuery, used by the `bq-workers` services.
- `terraform/`: the Terraform modules used by the setup.
The project uses Python 3 and supports data extraction for Cloud Build and GitHub events.
NOTE: Make sure you don't use "Squash Merging" when merging back into trunk. Squash merging breaks the link between the merge commit into trunk and the commits from the branch you developed on, so it is not possible to measure Lead Time for Changes on those commits. You can disable this feature in your repository's settings.
The setup script includes an option to generate mock data, which you can use to play with and test the Four Keys project. The data generator creates mock GitHub events, which are ingested into the `events_raw` table with the source "githubmock".
To run the data generator outside of the setup script:

1. Ensure that you've saved your webhook URL and secret in your environment variables:

   ```sh
   export WEBHOOK={your event handler URL}
   export SECRET={your event-handler secret}
   ```

2. Run the following command:

   ```sh
   python3 data-generator/generate_data.py --vc_system=github
   ```
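If you want to send a single mock event by hand instead, the sketch below shows one way to do it. It assumes the event handler authenticates webhooks with a GitHub-style HMAC-SHA1 signature in the `X-Hub-Signature` header and that a `Mock` header marks the event as mock data, as the data generator does; verify both assumptions against `event-handler/` and `data-generator/generate_data.py` before relying on them.

```python
# A minimal sketch of sending one mock "push" event by hand.
# Assumptions (verify against this repo before relying on them):
#   - the event handler checks a GitHub-style HMAC-SHA1 signature
#     in the X-Hub-Signature header, keyed with $SECRET;
#   - a "Mock: True" header marks the event as mock data.
import hashlib
import hmac
import json
import os

import requests

webhook_url = os.environ["WEBHOOK"]
secret = os.environ["SECRET"].encode("utf-8")

# Hypothetical minimal payload; real GitHub push events carry more fields.
payload = {"head_commit": {"id": "abc123", "timestamp": "2023-01-01T00:00:00Z"}}
body = json.dumps(payload).encode("utf-8")

signature = "sha1=" + hmac.new(secret, body, hashlib.sha1).hexdigest()
headers = {
    "Content-Type": "application/json",
    "X-Github-Event": "push",
    "X-Hub-Signature": signature,
    "Mock": "True",
}

response = requests.post(webhook_url, data=body, headers=headers)
print(response.status_code, response.text)
```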
You can see these events being run through the pipeline by querying the `events_raw` table directly in BigQuery:

```sql
SELECT * FROM four_keys.events_raw WHERE source = 'githubmock';
```
The scripts consider some events to be "changes", "deploys", and "incidents". You may want to reclassify one or more of these events, for example, if you want to use a label other than "incident" for your incidents. Reclassifying an event requires no changes to the architecture or code of the project; you only update the BigQuery views.
Update the view in BigQuery for the relevant derived table:

- `four_keys.changes`
- `four_keys.deployments`
- `four_keys.incidents`

To update a view, we recommend editing the corresponding `.sql` file in the `queries` folder, rather than editing the view in the BigQuery UI.
Once you've edited the SQL, run `terraform apply` to update the view that populates the table:

```sh
cd ./setup && terraform apply
```
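As an optional sanity check after applying, you can print the SQL that now backs a view. This sketch assumes the `google-cloud-bigquery` client library is installed and that you are authenticated against the project that hosts the `four_keys` dataset:

```python
# Print the SQL currently backing the four_keys.incidents view, to
# confirm that `terraform apply` picked up your edited query.
from google.cloud import bigquery

client = bigquery.Client()
view = client.get_table("four_keys.incidents")  # any of the three views
print(view.view_query)
```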
Note: these views feed the `changes`, `deployments`, and `incidents` tables.

To add other event sources:

1. Add the source to `AUTHORIZED_SOURCES` in `sources.py`.
2. Run the `new_source.sh` script in the `setup` directory. This script creates a Pub/Sub topic, a Pub/Sub subscription, and the new service using the `new_source_template`.
3. Update `main.py` in the new service to parse the data properly; a sketch of such a service follows below.

If you add a common data source, please submit a pull request so that others may benefit from the functionality.
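Here is a minimal sketch of what `main.py` in a new service might look like. It is modeled on the existing `bq-workers`: it assumes events arrive as Pub/Sub push messages and that the `shared` module provides an `insert_row_into_bigquery()` helper, as the other workers use. The field parsing and the `newsource` name are hypothetical; adapt them to your source's payload.

```python
# A minimal sketch of a new worker's main.py, modeled on the existing
# bq-workers. The payload fields and source name below are hypothetical.
import base64
import json
import os

import shared  # the shared/ module in this repository
from flask import Flask, request

app = Flask(__name__)


@app.route("/", methods=["POST"])
def index():
    envelope = request.get_json()
    if not envelope or "message" not in envelope:
        return "Bad Pub/Sub message", 400

    msg = envelope["message"]
    payload = json.loads(base64.b64decode(msg["data"]).decode("utf-8"))
    attributes = msg.get("attributes", {})

    # Build a row matching the events_raw schema described below.
    event = {
        "source": "newsource",                      # hypothetical source name
        "event_type": attributes.get("event_type", "unknown"),
        "id": payload.get("id"),
        "metadata": json.dumps(payload),
        "time_created": payload.get("created_at"),  # adjust to your payload
        "signature": attributes.get("signature"),
        "msg_id": msg.get("messageId"),
    }

    shared.insert_row_into_bigquery(event)
    return "", 204


if __name__ == "__main__":
    app.run(host="0.0.0.0", port=int(os.environ.get("PORT", 8080)))
```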
This project uses nox to manage tests. The `noxfile` defines what tests run on the project. It's set up to run all the `pytest` files in all the directories, as well as run a linter on all directories.
To run nox:

1. Ensure that nox is installed:

   ```sh
   pip install nox
   ```

2. Use the following command to run nox:

   ```sh
   python3 -m nox
   ```

To list all the test sessions in the noxfile, use the following command:

```sh
python3 -m nox -l
```

Once you have the list of test sessions, you can run a specific session with:

```sh
python3 -m nox -s "{name_of_session}"
```

The name of the session will look something like `py-3.6(folder='.....')`.
`four_keys.events_raw`

| Field Name | Type | Notes |
|---|---|---|
| source | STRING | e.g., github |
| event_type | STRING | e.g., push |
| id* | STRING | ID of the development object, e.g., bug ID, commit ID, PR ID |
| metadata | JSON | Body of the event |
| time_created | TIMESTAMP | The time the event was created |
| signature | STRING | Encrypted signature key from the event; this is the unique key for the table |
| msg_id | STRING | Message ID from Pub/Sub |

*The ID is generated by the original system, such as GitHub.
This table will be used to create the following three derived tables:
`four_keys.deployments`

Note: deployments and changes have a many-to-one relationship, and the table contains only successful deployments.

| Field Name | Type | Notes |
|---|---|---|
| 🔑deploy_id | string | ID of the deployment; foreign key to id in events_raw |
| changes | array of strings | List of IDs associated with the deployment, e.g., commit IDs, bug IDs |
| time_created | timestamp | Time the deployment was completed |
`four_keys.changes`

| Field Name | Type | Notes |
|---|---|---|
| 🔑change_id | string | ID of the change; foreign key to id in events_raw |
| time_created | timestamp | time_created from events_raw |
| change_type | string | The event type |
`four_keys.incidents`

| Field Name | Type | Notes |
|---|---|---|
| 🔑incident_id | string | ID of the failure incident |
| changes | array of strings | List of deployment IDs that caused the failure |
| time_created | timestamp | Minimum timestamp from changes |
| time_resolved | timestamp | Time the incident was resolved |
The dashboard displays all four metrics with daily system data, as well as a current snapshot of the last 90 days. The key metric definitions and a description of the color coding are below.
For a deeper understanding of the metrics and intent of the dashboard, see the 2019 State of DevOps Report.
For details about how Four Keys calculates each metric in this dashboard, see the Four Keys Metrics calculation doc.
This Four Keys project defines the key metrics as follows:
- Deployment Frequency: how often a team successfully releases to production.
- Lead Time for Changes: the amount of time it takes a commit to get into production.
- Time to Restore Services: how long it takes to restore service after a failure in production.
- Change Failure Rate: the percentage of deployments that cause a failure in production.
For more information on how these metrics are calculated, see METRICS.md.
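As a rough illustration only (the authoritative calculations live in the `queries/` folder and METRICS.md), the following sketch counts distinct deployments per week from the `four_keys.deployments` table with the BigQuery Python client:

```python
# Count distinct deployments per week; a crude stand-in for the
# dashboard's real Deployment Frequency calculation.
from google.cloud import bigquery

client = bigquery.Client()
query = """
    SELECT
      TIMESTAMP_TRUNC(time_created, WEEK) AS week,
      COUNT(DISTINCT deploy_id) AS deployments
    FROM four_keys.deployments
    GROUP BY week
    ORDER BY week
"""
for row in client.query(query).result():
    print(row.week, row.deployments)
```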
The dashboard has color coding to show the performance of each metric. Green is strong performance, yellow is moderate performance, and red is poor performance. Below is the description of the data that corresponds to the color for each metric.
The data ranges used for this color coding roughly follow the ranges for elite, high, medium, and low performers that are described in the 2019 State of DevOps Report.
- Deployment Frequency
- Lead Time for Changes
- Time to Restore Services
- Change Failure Rate
The following chart is from the 2019 State of DevOps Report and shows the ranges of each key metric for the different categories of performers.
Disclaimer: This is not an officially supported Google product.