Apache License 2.0

GitHub Metrics Aggregator

GitHub Metrics Aggregator (GMA) is a GitHub app that ingests events from the GitHub API and creates dashboards about velocity and productivity.

It consists of two components: a webhook service and a retry service. The webhook service ingests GitHub webhook event payloads and posts every request to a Pub/Sub topic for ingestion and aggregation into BigQuery. The retry service runs on a configurable cadence and redelivers events that the webhook service failed to process.

Architecture

[Figure: Architecture]

Setup

What to expect

We recommend using the abc CLI to render templates for setting up GMA. The setup is split into three parts:

  1. Provision infrastructure with Terraform
  2. Add secret values via Secret Manager
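  3. Build and deploy the service with GitHub workflows

After setup, expect a layout like the following (shown here with github-metrics as the custom name):

.github/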
  workflows/
    deploy-github-metrics.yaml
github-metrics/
  infra/
    main.tf
    outputs.tf
    terraform.tf
  deployments/
    Dockerfile
    deploy.sh

Pre-requisites

Create a GitHub App

Follow the directions in these GitHub instructions. Uncheck everything and fill in the required fields that remain. Make sure to uncheck the Active checkbox in the Webhook section so that you don't have to supply a webhook yet; the webhook is created when you deploy the Terraform module in the next section. Create a private key and download it for an upcoming step. Once the GitHub App is created, take note of the GitHub App ID.

Grant GitHub App permissions

Grant any of the following permissions (or more) according to your requirements:

Provision the infrastructure

Run the following command after replacing the input values.

abc templates render \
  -input=custom_name=GMA-CUSTOM-NAME \
  -input=project_id=GMA-PROJECT-ID \
  -input=automation_service_account_email=CI-SERVICE-ACCOUNT \
  -input=domain=GMA-DOMAIN \
  -input=terraform_state_bucket=TERRAFORM-BUCKET-NAME \
  -input=github_app_id=GMA-GITHUB-APP-ID \
  github.com/abcxyz/github-metrics-aggregator/abc.templates/infra@v0.0.24

This should render the following Terraform files.

GMA-CUSTOM-NAME/
  infra/
    main.tf
    outputs.tf
    terraform.tf

Run Terraform init:

terraform -chdir=GMA-CUSTOM-NAME/infra init -backend=false

Then apply the Terraform. Take note of the following values from the generated output:

Update DNS

Create an A record pointing your custom domain to the gclb_external_ip_address.

Update the GitHub App

In the GitHub App settings,

  1. Check the Active checkbox in the Webhook section
  2. Set the webhook URL: https://GMA-DOMAIN/webhook
  3. Save changes

Add Secret Values to Secret Manager

Create webhook secret

Run the following command to generate a random string to be used as the GitHub webhook secret:

openssl rand -base64 32

Save this value for the "Upload the secrets" step.
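
If openssl is not available, an equivalent value can be produced with Python's standard library. This is a minimal sketch rather than part of GMA; the function name generate_webhook_secret is hypothetical.

```python
import base64
import secrets


def generate_webhook_secret(num_bytes: int = 32) -> str:
    """Return num_bytes of cryptographically secure randomness, base64-encoded.

    Produces the same kind of value as `openssl rand -base64 32`,
    without the trailing newline.
    """
    return base64.b64encode(secrets.token_bytes(num_bytes)).decode("ascii")
```

Calling generate_webhook_secret() returns a 44-character string for the default 32 bytes.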

The Terraform module will have created a Secret Manager secret in the project provided with the name github-webhook-secret.

Create GitHub private key secret

The Terraform module will have created a Secret Manager secret in the project provided with the name github-private-key.

In the GitHub App settings, under the Private Keys section,

  1. Click Generate a private key. This will add a .pem file to your Downloads. The contents of the .pem file are the private_key, including the BEGIN and END lines. It should look something like this:

    -----BEGIN RSA PRIVATE KEY-----
    SOME-SUPER-SECRET-
    SHHHHHHHHHHHHHH-
    KEEP-THIS-A-SECRET
    -----END RSA PRIVATE KEY-----

Copy the contents of the file (pbcopy is macOS-specific):

pbcopy < location/to/private/key.private-key.pem

Upload the secrets

Navigate to the Secret Manager page in the Google Cloud console and add a new version containing each generated value to its corresponding secret ID.
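
The same upload can be scripted instead of using the console. The sketch below assumes the google-cloud-secret-manager package is installed and that Application Default Credentials have permission to add secret versions; the helper names secret_path and add_secret_version are hypothetical and not part of GMA.

```python
def secret_path(project_id: str, secret_id: str) -> str:
    """Return the resource name Secret Manager uses for a secret."""
    return f"projects/{project_id}/secrets/{secret_id}"


def add_secret_version(project_id: str, secret_id: str, value: bytes) -> str:
    """Add value as a new version of an existing secret; return the version name."""
    # Imported lazily so secret_path works even without the client library.
    from google.cloud import secretmanager

    client = secretmanager.SecretManagerServiceClient()
    version = client.add_secret_version(
        request={
            "parent": secret_path(project_id, secret_id),
            "payload": {"data": value},
        }
    )
    return version.name
```

For example, add_secret_version("GMA-PROJECT-ID", "github-webhook-secret", b"...") would store the webhook secret, and passing the bytes of the downloaded .pem file with "github-private-key" would store the private key.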

Deploy the service

NOTE: Before going through the following steps, ensure that your GMA Cloud Run service agent can read from your Artifact Registry (GAR) repository. The service agent has the form service-GMA-PROJECT-NUM@serverless-robot-prod.iam.gserviceaccount.com.

Run the following command after replacing the input values.

abc templates render \
  -input=wif_provider=CI-WIF-PROVIDER \
  -input=wif_service_account=CI-SERVICE-ACCOUNT \
  -input=project_id=GMA-PROJECT-ID \
  -input=full_image_name=us-docker.pkg.dev/GAR-PROJECT-ID/GAR-REPOSITORY/gma-server \
  -input=region=REGION \
  -input=webhook_service_name=GMA-WEBHOOK-SERVICE-NAME \
  -input=retry_service_name=GMA-RETRY-SERVICE-NAME \
  -input=custom_name=GMA-CUSTOM-NAME \
  github.com/abcxyz/github-metrics-aggregator/abc.templates/deployments@v0.0.24

This should generate the following files:

.github/
  workflows/
    deploy-GMA-CUSTOM-NAME.yaml
GMA-CUSTOM-NAME/
  deployments/
    Dockerfile
    deploy.sh

Merge these files into the main branch of your repository. This should trigger the deploy-GMA-CUSTOM-NAME.yaml workflow, which builds your GMA image, uploads it to GAR, and deploys it to the Cloud Run services.

Alternatively, you can run the workflow manually if necessary.

Looker Studio

Template Dashboard

abcxyz provides a template Looker Studio Dashboard. To use it, add the following configuration to the GMA-CUSTOM-NAME/infra/main.tf file:

module "GMA_CUSTOM_NAME" {
  # ...hidden properties
  # ...

  github_metrics_dashboard = {
    enabled = true # set this to true (defaults to false)
    viewers = []   # add viewers, such as "group:<group-email>"
  }
}

After applying these changes with Terraform, copy the value of github_metrics_looker_studio_report_link from the output values and navigate to the link in your browser.

This will give you a preview of the dashboard. On the top right, click Edit and Share. Verify the data, then proceed to save. This will complete the process to link your datasource to the Looker Studio report template.

Custom Dashboard

To make use of the events data, we recommend creating a view per event type. This lets you create a Looker Studio data source per event type for use in dashboards.

Example


SELECT
  received,
  event,
  JSON_VALUE(payload, "$.organization.login") owner,
  JSON_VALUE(payload, "$.organization.id") owner_id,
  JSON_VALUE(payload, "$.repository.name") repo,
  JSON_VALUE(payload, "$.repository.id") repo_id,
  JSON_VALUE(payload, "$.repository.full_name") repo_full_name,
  JSON_VALUE(payload, "$.repository.visibility") repo_visibility,
  JSON_VALUE(payload, "$.sender.login") sender,
  JSON_VALUE(payload, "$.action") action,
  JSON_VALUE(payload, "$.pull_request.id") id,
  JSON_VALUE(payload, "$.pull_request.title") title,
  JSON_VALUE(payload, "$.pull_request.state") state,
  JSON_VALUE(payload, "$.pull_request.url") url,
  JSON_VALUE(payload, "$.pull_request.html_url") html_url,
  JSON_VALUE(payload, "$.pull_request.base.ref") base_ref,
  JSON_VALUE(payload, "$.pull_request.head.ref") head_ref,
  JSON_VALUE(payload, "$.pull_request.user.login") author,
  JSON_VALUE(payload, "$.pull_request.user.id") author_id,
  TIMESTAMP(JSON_VALUE(payload, "$.pull_request.created_at")) created_at,
  TIMESTAMP(JSON_VALUE(payload, "$.pull_request.closed_at")) closed_at,
  JSON_VALUE(payload, "$.pull_request.merged") merged,
  JSON_VALUE(payload, "$.pull_request.merge_commit_sha") merge_commit,
  TIMESTAMP(JSON_VALUE(payload, "$.pull_request.merged_at")) merged_at,
  JSON_VALUE(payload, "$.pull_request.merged_by.login") merged_by,
  TIMESTAMP_DIFF(TIMESTAMP(JSON_VALUE(payload, "$.pull_request.closed_at")), TIMESTAMP(JSON_VALUE(payload, "$.pull_request.created_at")), SECOND) open_duration_s,
  PARSE_JSON(payload) payload
FROM
  `YOUR_PROJECT_ID.github_webhook.events`
WHERE
  event = "pull_request";
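
Before creating a view, it can help to check the JSON paths against a sample payload locally. The following sketch mimics the relevant behavior of JSON_VALUE (a scalar at the path, or nothing for objects, arrays, and missing keys) and recomputes open_duration_s. The helper names and the sample payload are hypothetical, and unlike BigQuery the helper does not convert scalars to strings.

```python
from datetime import datetime
from typing import Any, Optional


def json_value(payload: dict, path: str) -> Optional[Any]:
    """Follow a dotted path such as "pull_request.user.login".

    Returns None for missing keys and for non-scalar results.
    """
    node: Any = payload
    for key in path.split("."):
        if not isinstance(node, dict) or key not in node:
            return None
        node = node[key]
    return None if isinstance(node, (dict, list)) else node


def parse_ts(value: Optional[str]) -> Optional[datetime]:
    """Parse an ISO 8601 timestamp such as 2024-01-01T00:00:00Z."""
    if value is None:
        return None
    return datetime.fromisoformat(value.replace("Z", "+00:00"))


def open_duration_s(payload: dict) -> Optional[int]:
    """Seconds between created_at and closed_at, or None while the PR is open."""
    created = parse_ts(json_value(payload, "pull_request.created_at"))
    closed = parse_ts(json_value(payload, "pull_request.closed_at"))
    if created is None or closed is None:
        return None
    return int((closed - created).total_seconds())


# Hypothetical, heavily trimmed pull_request payload.
sample = {
    "action": "closed",
    "pull_request": {
        "created_at": "2024-01-01T00:00:00Z",
        "closed_at": "2024-01-01T01:30:00Z",
        "user": {"login": "octocat"},
    },
}
print(json_value(sample, "pull_request.user.login"))  # octocat
print(open_duration_s(sample))  # 5400
```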

Environment Variables

Webhook Service

Retry Service

Testing Locally

Creating GitHub HMAC Signature

echo -n `cat testdata/issues.json` | openssl sha256 -hmac "test-secret"

# Output:
08a88fe31f89ab81a944e51e51f55ebf9733cb958dd83276040fd496e5be396a

Use this value in the X-Hub-Signature-256 request header as follows:

X-Hub-Signature-256: sha256=08a88fe31f89ab81a944e51e51f55ebf9733cb958dd83276040fd496e5be396a
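
For reference, the same HMAC-SHA256 computation can be sketched in Python. The signature covers the exact bytes of the request body, so any whitespace difference changes the result. The function names sign_payload and verify_signature are hypothetical; this illustrates the scheme and is not GMA's own verification code.

```python
import hashlib
import hmac


def sign_payload(secret: str, payload: bytes) -> str:
    """Return the X-Hub-Signature-256 header value for payload."""
    digest = hmac.new(secret.encode("utf-8"), payload, hashlib.sha256).hexdigest()
    return f"sha256={digest}"


def verify_signature(secret: str, payload: bytes, header_value: str) -> bool:
    """Compare a received header value with the expected one in constant time."""
    expected = sign_payload(secret, payload)
    return hmac.compare_digest(expected.encode("utf-8"), header_value.encode("utf-8"))
```

A receiver would call verify_signature with the raw request body and the header it received, and reject the request when the result is False.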

Example Request

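# Collapse the JSON file to a single line, as in the signature example above
PAYLOAD=$(echo -n `cat testdata/issues.json`)
GITHUB_WEBHOOK_SECRET="test-secret"

# Some openssl versions prefix the digest with "(stdin)= "; sed keeps only the hex
SIGNATURE=$(echo -n "$PAYLOAD" | openssl sha256 -hmac "$GITHUB_WEBHOOK_SECRET" | sed 's/^.*= //')

# Quote "$PAYLOAD" so the shell passes the body to curl as a single argument
curl \
  -H "Content-Type: application/json" \
  -H "X-GitHub-Delivery: $(uuidgen)" \
  -H "X-GitHub-Event: issues" \
  -H "X-Hub-Signature-256: sha256=${SIGNATURE}" \
  -d "$PAYLOAD" \
  http://localhost:8080/webhook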

# Output
Ok
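
The same test request can be sent from Python using only the standard library, which avoids shell quoting issues. This is a sketch under the assumption that the service listens on http://localhost:8080/webhook as in the curl example; build_webhook_request and send are hypothetical helper names.

```python
import hashlib
import hmac
import urllib.request
import uuid


def build_webhook_request(
    url: str, payload: bytes, secret: str, event: str = "issues"
) -> urllib.request.Request:
    """Build a signed POST that mimics a GitHub webhook delivery."""
    digest = hmac.new(secret.encode("utf-8"), payload, hashlib.sha256).hexdigest()
    headers = {
        "Content-Type": "application/json",
        "X-GitHub-Delivery": str(uuid.uuid4()),  # random delivery ID, like uuidgen
        "X-GitHub-Event": event,
        "X-Hub-Signature-256": f"sha256={digest}",
    }
    return urllib.request.Request(url, data=payload, headers=headers, method="POST")


def send(request: urllib.request.Request) -> str:
    """Send the request and return the response body as text."""
    with urllib.request.urlopen(request, timeout=10) as response:
        return response.read().decode("utf-8")
```

With the service running locally, read testdata/issues.json as bytes, pass it to build_webhook_request together with "test-secret", and hand the result to send. The payload is signed and sent byte for byte, so no whitespace normalization is needed.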