hashicorp / ghaction-terraform-provider-release

Reusable GitHub Action Workflows for releasing HashiCorp, partner, and community Terraform Providers
Mozilla Public License 2.0

hashicorp: Consider Post-Promotion Registry API Version Checking #47

Closed bflad closed 8 months ago

bflad commented 1 year ago

Description

One challenge for HashiCorp provider releases at the moment is that a background task is responsible for synchronizing releases.hashicorp.com product versions and ingesting any new versions into the public Registry. This process is triggered on demand by the hc-releases promotion command, with an hourly cron as a fallback, but either or both of those can fail. There is no external way to check on those tasks. As a result, even when everything is working as it should, an indeterminate amount of time can pass between the provider release workflow completing and the provider version actually being accessible to practitioners.

Proposal

After the hc-releases promotion call, start a loop that polls the Registry API for the newly released provider version. Once the version appears, the loop should exit cleanly so that the release workflow succeeds. Until then, it should keep retrying. Eventually, it would hit a GitHub Actions timeout, which implicitly fails the workflow.

Locally, here is a quick shell script (curl and jq are installed on public runner images already):

while true; do
  if [ "$(curl --silent https://registry.terraform.io/v1/providers/hashicorp/cloudinit/versions | jq -r '.versions[] | select(.version == "2.3.3") | .version')" = "2.3.3" ]; then
    echo "$(date): hashicorp/cloudinit: Version 2.3.3 available in Registry"
    break
  else
    echo "$(date): hashicorp/cloudinit: Version 2.3.3 not available in Registry"
    sleep 15
  fi
done

Using a known good version (2.3.2 in place of 2.3.3), exits successfully:

Thu Mar  2 18:06:13 PST 2023: hashicorp/cloudinit: Version 2.3.2 available in Registry

Using a known bad version, loops forever:

Thu Mar  2 18:07:02 PST 2023: hashicorp/cloudinit: Version 2.3.3 not available in Registry
Thu Mar  2 18:07:20 PST 2023: hashicorp/cloudinit: Version 2.3.3 not available in Registry
...

With proper name and version substitution in a run step, this should hopefully do the trick:

jobs:
  Release:
    # ...
    steps:
      # ...
      -
        name: Promote
        uses: hashicorp/actions-hc-releases-promote@811b5f3b44787dbb5c3aa21d3fe8a4e844eb41ed # v1.0.0
        with:
          product-name: ${{ github.event.repository.name }}
          version: ${{ inputs.product-version }}
          hc-releases-host: ${{ secrets.hc-releases-host-prod }}
          hc-releases-key: ${{ secrets.hc-releases-key-prod }}
          hc-releases-source_env_key: ${{ secrets.hc-releases-key-staging }}
          hc-releases-terraform-registry-sync-token: ${{ secrets.hc-releases-terraform-registry-sync-token }}
      -
        name: Wait for Registry availability
        run: |
          # Registry provider addresses are NAMESPACE/NAME; this assumes the hashicorp namespace.
          REGISTRY_NAME="hashicorp/$(echo -n "${{ github.event.repository.name }}" | sed 's,terraform-provider-,,')"
          REGISTRY_VERSION="$(echo -n "${{ inputs.product-version }}" | sed 's,^v,,')"
          while true; do
            if [ "$(curl --silent "https://registry.terraform.io/v1/providers/${REGISTRY_NAME}/versions" | jq -r ".versions[] | select(.version == \"${REGISTRY_VERSION}\") | .version")" = "${REGISTRY_VERSION}" ]; then
              echo "$(date): ${REGISTRY_NAME}: Version ${REGISTRY_VERSION} available in Registry"
              break
            else
              echo "$(date): ${REGISTRY_NAME}: Version ${REGISTRY_VERSION} not available in Registry"
              sleep 15
            fi
          done
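
Interpolating the version into the jq filter means nesting escaped quotes inside a command substitution, which is easy to get wrong. As a rough sketch of one alternative, the check could pass the version through jq's --arg option and rely on --exit-status for the result. The function name registry_has_version is hypothetical, and the function assumes the Registry versions JSON arrives on stdin:

```shell
#!/usr/bin/env bash
# Sketch: exit 0 when the Registry "versions" JSON document read from stdin
# lists the given version, non-zero otherwise (including malformed input).
# The function name is hypothetical.
registry_has_version() {
  local version="$1"
  # --arg binds $v as a jq string, so no shell quoting lands inside the filter.
  # --exit-status makes jq exit non-zero when the filter result is false.
  jq --exit-status --arg v "$version" \
    'any(.versions[]; .version == $v)' >/dev/null
}
```

Something like `curl --silent https://registry.terraform.io/v1/providers/hashicorp/cloudinit/versions | registry_has_version 2.3.2` could then replace the string comparison inside the `if`.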

For future consideration, these (or any) release process failures could be funneled to a shared Slack channel for increased visibility.

bflad commented 8 months ago

Doing this automatically was considered undesirable by some developers, so closing this out. Anyone interested in this sort of checking can implement something similar in their calling workflows.
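
For a calling workflow that would rather fail on its own schedule than wait for the GitHub Actions timeout, a bounded retry helper is one possible shape. This is only a sketch: retry_until and its two leading arguments (maximum attempts, seconds between attempts) are hypothetical names, and the command to retry is whatever check the caller supplies.

```shell
#!/usr/bin/env bash
# Sketch: run a command until it succeeds or the attempts run out.
# Usage: retry_until MAX_ATTEMPTS INTERVAL_SECONDS COMMAND [ARGS...]
# retry_until and its argument names are hypothetical.
retry_until() {
  local max_attempts="$1" interval="$2"
  shift 2
  local attempt
  for ((attempt = 1; attempt <= max_attempts; attempt++)); do
    if "$@"; then
      echo "$(date): succeeded on attempt ${attempt}"
      return 0
    fi
    echo "$(date): attempt ${attempt}/${max_attempts} failed"
    # Do not sleep after the final attempt.
    if ((attempt < max_attempts)); then
      sleep "$interval"
    fi
  done
  return 1
}
```

With the Registry check wrapped in a caller-defined function (say, a hypothetical check_registry), `retry_until 40 15 check_registry` would poll every 15 seconds for roughly ten minutes and then fail the step with a non-zero exit status.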
