coder / envbuilder

Build development environments from a Dockerfile on Docker, Kubernetes, and OpenShift. Enable developers to modify their development environment quickly.
Apache License 2.0

issue pushing image: BLOB_UNKNOWN: Manifest references unknown blob(s) #385

Open bpmct opened 1 month ago

bpmct commented 1 month ago

envbuilder image: v1.0.3

I'm using Envbuilder and just saw this error when attempting to push an image with this template. This is the first time I tried it. Any ideas?

#2: ENTRYPOINT ["/.envbuilder/bin/envbuilder"]
#2: Taking snapshot of files...
#2: Pushing layer us-central1-docker.pkg.dev/coder-dogfood-v2/envbuilder-cache/coder-dogfood:54caa3ffffdf4ccf8167b88fb35d86690df3c33b209f38144d0c1a568e129d69 to cache now
#2: Pushing image to us-central1-docker.pkg.dev/coder-dogfood-v2/envbuilder-cache/coder-dogfood:54caa3ffffdf4ccf8167b88fb35d86690df3c33b209f38144d0c1a568e129d69
#2: Pushed us-central1-docker.pkg.dev/coder-dogfood-v2/envbuilder-cache/coder-dogfood@sha256:38521b4520703bca0f790759b278cf7897e4b855d910c5fc537a5831fbcaac4b
#2: Pushed us-central1-docker.pkg.dev/coder-dogfood-v2/envbuilder-cache/coder-dogfood@sha256:21095fa572a3c6de7369ceeb83c9a4ee8481a95652af28c5e008e83608510204
#2: 🏗️ Built image! [1m57.651985166s]
#3: 🏗️ Pushing image...
#3: Pushing image to us-central1-docker.pkg.dev/coder-dogfood-v2/envbuilder-cache/coder-dogfood
restore mount /usr/src/linux-headers-5.15.0-97-generic
restore mount /usr/src/linux-hwe-5.15-headers-5.15.0-97
restore mount /lib/modules/5.15.0-97-generic
Restored DOCKER_CONFIG to 
error: do push: failed to push to destination us-central1-docker.pkg.dev/coder-dogfood-v2/envbuilder-cache/coder-dogfood: PUT https://us-central1-docker.pkg.dev/v2/coder-dogfood-v2/envbuilder-cache/coder-dogfood/manifests/latest: BLOB_UNKNOWN: Manifest references unknown blob(s): sha256:4f4fb700ef54461cfa02571ae0db9a0dc1e0cdb5577484a6d75e68dc38e8acc1

johnstcn commented 1 month ago

cc @mafredri

johnstcn commented 1 month ago

I wonder if one of the blobs we previously pushed has been deleted due to age? Need to figure out how to reliably reproduce.

bpmct commented 1 month ago

related: i'm not sure if a failed push should prevent the user from entering their workspace. can this fail gracefully while also alerting devs/admins of the problem?

johnstcn commented 1 month ago

related: i'm not sure if a failed push should prevent the user from entering their workspace

It definitely shouldn't!

EDIT: Actually, there may be situations where it's warranted. If you plan to run envbuilder in a CI/CD workflow, then you would want to know that pushing the image failed. We would need to be able to distinguish whether pushing the image is a requirement or just a nice side effect.
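To make the distinction above concrete, here is a minimal sketch of a wrapper that treats a failed push as fatal only when the caller asks for that. It is not envbuilder's actual behavior: the function name `push_or_warn` and the `PUSH_REQUIRED` variable are hypothetical, chosen only for illustration.

```shell
# Hypothetical wrapper (not part of envbuilder): run the given push command
# and decide how to treat a failure.
#   PUSH_REQUIRED=1  -> a failed push is an error (e.g. in CI/CD)
#   anything else    -> a failed push is logged as a warning and ignored
push_or_warn() {
  if "$@"; then
    return 0
  fi
  if [ "${PUSH_REQUIRED:-0}" = "1" ]; then
    echo "error: image push failed" >&2
    return 1
  fi
  echo "warning: image push failed; continuing without a pushed image" >&2
  return 0
}
```

For example, `PUSH_REQUIRED=1 push_or_warn docker push localhost:5000/cache:latest` would fail a CI job when the push fails, while the same call without `PUSH_REQUIRED` would only print a warning and let the workspace start.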

MartinLoeper commented 4 weeks ago

I am also experiencing a similar issue: error: do push: failed to push to destination harbor.harbor.svc.cluster.local/envbuilder-cache/test-repo2: PUT https://harbor.harbor.svc.cluster.local/v2/envbuilder-cache/test-repo2/manifests/latest: MANIFEST_BLOB_UNKNOWN: blob unknown to registry; sha256:4f4fb700ef54461cfa02571ae0db9a0dc1e0cdb5577484a6d75e68dc38e8acc1

I wonder why this is happening with Harbor. There is definitely no GC going on in my default Harbor setup. Does anyone have an idea what might be going on here?

johnstcn commented 4 weeks ago

I can reproduce this locally as well with registry:2 as the registry with the same Docker image as Ben above (mcr.microsoft.com/vscode/devcontainers/universal:focal).

I can also reproduce this issue with codercom/code-server:latest.

Strangely enough, I can see the blob on the registry filesystem before pushing, and it's effectively empty (1024 zero bytes once decompressed):

sudo zcat .registry-cache/docker/registry/v2/blobs/sha256/4f/4f4fb700ef54461cfa02571ae0db9a0dc1e0cdb5577484a6d75e68dc38e8acc1/data | hexdump -C
00000000  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  |................|
*
00000400
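Because the registry stores blobs by content address, one further sanity check is to confirm that the file on disk hashes to the digest in its path. The sketch below assumes GNU coreutils' `sha256sum` is available; `blob_matches_digest` is a hypothetical helper name. Note that the digest covers the stored (compressed) bytes, so the file is hashed as-is rather than through `zcat`.

```shell
# Hypothetical helper: succeeds if the file's SHA-256 equals the expected digest.
# Usage: blob_matches_digest <file> <sha256:hex>
blob_matches_digest() {
  file=$1
  expected=${2#sha256:}                        # strip the "sha256:" prefix if present
  actual=$(sha256sum "$file" | cut -d' ' -f1)  # first field is the hex digest
  [ "$actual" = "$expected" ]
}
```

It could be called with the `data` file shown above and `sha256:4f4fb700ef54461cfa02571ae0db9a0dc1e0cdb5577484a6d75e68dc38e8acc1` as the second argument.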
johnstcn commented 4 weeks ago

Querying the registry v2 HTTP API locally, I can see that it is indeed missing despite the layers being on disk:

curl -I http://localhost:5000/v2/cache/blobs/sha256%3A4f4fb700ef54461cfa02571ae0db9a0dc1e0cdb5577484a6d75e68dc38e8acc1

returned an HTTP 404 Not Found response.
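The same existence check can be wrapped in a small function so it is easy to repeat for other digests. This is a sketch under assumptions: `blob_url` and `blob_exists` are hypothetical helper names, and the digest is passed unencoded (`sha256:...`) rather than with `%3A` as in the command above.

```shell
# Hypothetical helpers for the registry v2 HTTP API.
# blob_url <registry-base-url> <repository> <digest>
blob_url() {
  # "${1%/}" drops a trailing slash from the base URL, if any.
  printf '%s/v2/%s/blobs/%s\n' "${1%/}" "$2" "$3"
}

# blob_exists <registry-base-url> <repository> <digest>
# Succeeds only when the HEAD request gets a successful response;
# --fail makes curl exit non-zero on a 404.
blob_exists() {
  curl --silent --fail --head --output /dev/null "$(blob_url "$1" "$2" "$3")"
}
```

For the local setup in this comment, the call would look like `blob_exists http://localhost:5000 cache sha256:4f4fb700ef54461cfa02571ae0db9a0dc1e0cdb5577484a6d75e68dc38e8acc1`.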

After manually uploading the blob via cURL:

curl -v -XPOST http://localhost:5000/v2/cache/blobs/uploads/ 2>&1 | grep Location:
curl -v -XPUT 'http://localhost:5000/v2/cache/blobs/uploads/<uuid from above>?_state=<state from above>&digest=sha256%3A4f4fb700ef54461cfa02571ae0db9a0dc1e0cdb5577484a6d75e68dc38e8acc1' --upload-file .registry-cache/docker/registry/v2/blobs/sha256/4f/4f4fb700ef54461cfa02571ae0db9a0dc1e0cdb5577484a6d75e68dc38e8acc1/data

the previous HEAD request returned a 200 OK response.
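The two manual cURL steps above can be combined into one function that reads the `Location` header from the POST and reuses it for the PUT. This is only a sketch: `upload_url_with_digest` and `push_blob` are hypothetical names, and it assumes the registry returns an absolute URL in `Location`, as the commands above imply.

```shell
# upload_url_with_digest <location> <digest>
# Appends the digest as a query parameter, using "&" if the upload URL
# already has a query string (e.g. "?_state=...") and "?" otherwise.
upload_url_with_digest() {
  case $1 in
    *\?*) printf '%s&digest=%s\n' "$1" "$2" ;;
    *)    printf '%s?digest=%s\n' "$1" "$2" ;;
  esac
}

# push_blob <registry-base-url> <repository> <digest> <file>
push_blob() {
  # Step 1: start an upload session and extract the Location header.
  location=$(curl --silent --fail --include --request POST \
      "${1%/}/v2/$2/blobs/uploads/" \
    | tr -d '\r' \
    | awk 'tolower($1) == "location:" { print $2 }')
  [ -n "$location" ] || return 1
  # Step 2: finish the upload with a single PUT carrying the blob.
  curl --silent --fail --request PUT --upload-file "$4" \
    "$(upload_url_with_digest "$location" "$3")"
}
```

The arguments mirror the manual commands: the registry base URL (`http://localhost:5000`), the repository (`cache`), the digest, and the path to the blob's `data` file.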

Based on the above, it looks like we're just failing to upload that particular blob. Looking further.

MartinLoeper commented 3 days ago

Hey guys, I am waiting for this PR because I could not make caching work in my envbuilder test setup for Coder. What is the current state of this issue? Is it a known bug with a solution in the making, or are there only workarounds like the force flag?

mafredri commented 1 day ago

Hey @MartinLoeper, sorry you're running into this issue. Are you looking to cache the built image (can be used via docker run ...) or are you only looking to produce and use build cache?

If it's the latter, you can simply use ENVBUILDER_PUSH_IMAGE=0. If it's the former, we are still investigating the root cause and are aiming to resolve it next week. The issue seems to be related to us enabling force push metadata in order to do cache probing. However, Kaniko skips over certain layers that are not necessary for the build, so they're never downloaded and can't be pushed. But even when we force Kaniko to download all layers, you'll still run into this issue; that's the part we're still figuring out.
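As a rough illustration of the cache-only setup, the environment for the envbuilder container might look like the fragment below. Only `ENVBUILDER_PUSH_IMAGE=0` comes from this comment; `ENVBUILDER_CACHE_REPO` is assumed from envbuilder's documented options, and its value here simply reuses the local registry from this thread as a placeholder.

```shell
# Sketch of a cache-only configuration (values are placeholders).
# Assumption: ENVBUILDER_CACHE_REPO points at the repository used for build cache.
export ENVBUILDER_CACHE_REPO="localhost:5000/cache"
# From the comment above: keep using the build cache, but skip pushing the final image.
export ENVBUILDER_PUSH_IMAGE=0
```

How these variables reach the container (for example `docker run -e ...` or a workspace template) depends on the deployment.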

In the meantime, I think I have a jank workaround for you. It'd be nice if you could try it out and confirm whether or not it works.

Given the following:

The workaround is the following:

docker pull codercom/code-server:latest
docker tag codercom/code-server:latest localhost:5000/cache:latest
docker push localhost:5000/cache:latest

This will populate the missing blobs in the registry and allow the envbuilder build to succeed.