vmware-labs / distribution-tooling-for-helm

Helm Distribution plugin is a set of utilities and a Helm plugin that make offline work with Helm charts easier. It is meant for creating reproducible, relocatable packages of Helm charts that can be moved between registries without hassle. This is particularly useful for distributing Helm charts into air-gapped environments.
Apache License 2.0

transient unwrap error using multi-arch images #88

Closed: migmartri closed this issue 2 months ago

migmartri commented 2 months ago

Describe the bug

Since we added multi-arch images, we've noticed transient errors during unwrap.

As you can see in the image below, the images were pushed but the Images.lock verification complained.

[image: screenshot not preserved]

Retrying the process seems to make it work.

We are using GitHub as the origin and Azure Container Registry as the destination.

Reproduction steps

  1. Unwrap a Chart that contains multi-arch images

Expected behavior

Not to fail

Additional context

No response

juan131 commented 2 months ago

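@migmartri I'll take a look into this. A couple of questions:

  • Are you able to reproduce it in other container registries or only in Azure?

  • What are the affected charts you mentioned?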

migmartri commented 2 months ago

Hi @juan131, thanks for getting back to me

@migmartri I'll take a look into this. A couple of questions:

  • Are you able to reproduce it in other container registries or only in Azure?

So far we've only run the tool against Azure, so I can't say whether this is an issue with other registries.

  • What are the affected charts you mentioned?

We use the Chainloop chart plus a proprietary chart. This is an example of how we unwrap:

helm dt unwrap chainloop*.wrap.tgz oci://chainloop.azurecr.io \
             --push-chart-url oci://chainloop.azurecr.io/chart --yes

Some interesting bits about the behavior

 crane manifest ghcr.io/chainloop-dev/chainloop/control-plane:v0.96.7
{
   "schemaVersion": 2,
   "mediaType": "application/vnd.docker.distribution.manifest.list.v2+json",
   "manifests": [
      {
         "mediaType": "application/vnd.docker.distribution.manifest.v2+json",
         "size": 1581,
         "digest": "sha256:933398f3b58ae1cf24a4348f8cb43af1feb13b86944dc63f2163522831503482",
         "platform": {
            "architecture": "amd64",
            "os": "linux"
         }
      },
      {
         "mediaType": "application/vnd.docker.distribution.manifest.v2+json",
         "size": 1581,
         "digest": "sha256:8b6da80d4c6ac2158f4ca80c91cd6c52a43bb5cebefbee7f45a3ebd56b0c8727",
         "platform": {
            "architecture": "arm64",
            "os": "linux"
         }
      }
   ]
}

My guess is that Azure Container Registry takes some time to propagate the pushed images; note that we have geo-replication enabled. This could probably be fixed by adding a retry mechanism to that check.
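The retry idea can also be tried from outside the tool. The following is a minimal sketch of a generic retry wrapper with exponential backoff. The function name `retry` and its arguments are hypothetical and not part of helm dt. It re-runs the whole command, which matches the observation that retrying the process works, but it is not the in-tool retry of the verification check suggested above.

```shell
#!/usr/bin/env bash

# retry MAX_ATTEMPTS INITIAL_DELAY_SECONDS COMMAND [ARGS...]
# Hypothetical helper: runs COMMAND until it succeeds or MAX_ATTEMPTS
# is reached, doubling the delay after every failed attempt.
retry() {
  local max_attempts="$1" delay="$2"
  shift 2
  local attempt=1
  while true; do
    # Success on any attempt ends the loop immediately.
    if "$@"; then
      return 0
    fi
    if [ "$attempt" -ge "$max_attempts" ]; then
      echo "retry: giving up after ${attempt} attempts: $*" >&2
      return 1
    fi
    echo "retry: attempt ${attempt} failed, sleeping ${delay}s" >&2
    sleep "$delay"
    attempt=$((attempt + 1))
    delay=$((delay * 2))
  done
}
```

For example, `retry 3 10 helm dt unwrap chainloop*.wrap.tgz oci://chainloop.azurecr.io --push-chart-url oci://chainloop.azurecr.io/chart --yes` would attempt the unwrap up to three times, waiting 10 and then 20 seconds between attempts.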

migmartri commented 2 months ago

Hi @juan131

I might have found the source of the issue, and it might not be in the tool itself. I think the problem was that another pipeline of ours was re-pushing the images and overwriting the manifest index, so I'll close this issue. Thanks for your support.
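One way to spot this kind of overwrite is to compare the digest of the manifest index at the source with the one at the destination. This is a minimal sketch, assuming `crane` is installed and authenticated against both registries, and assuming the relocated image is expected to keep the source index digest. The function name `same_digest` is hypothetical.

```shell
#!/usr/bin/env bash

# same_digest SRC_REF DST_REF
# Hypothetical helper: compares the digests that `crane digest` reports
# for two image references.
# Returns 0 on match, 1 on mismatch, 2 if a digest cannot be fetched.
same_digest() {
  local src dst
  src="$(crane digest "$1")" || return 2
  dst="$(crane digest "$2")" || return 2
  if [ "$src" = "$dst" ]; then
    echo "match: $src"
  else
    echo "mismatch: source=$src destination=$dst" >&2
    return 1
  fi
}
```

Called with `ghcr.io/chainloop-dev/chainloop/control-plane:v0.96.7` as the source and the corresponding relocated reference in the destination registry, a mismatch would suggest that something re-pushed the image after the unwrap.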

BTW, the reason we have that other pipeline is to make sure our images are tagged with latest, as explained in this other issue: https://github.com/vmware-labs/distribution-tooling-for-helm/issues/81

juan131 commented 2 months ago

Thanks so much for the update @migmartri