awslabs / soci-snapshotter

A containerd snapshotter plugin which enables standard OCI images to be lazily loaded without requiring a build-time conversion step.
Apache License 2.0

[Bug] ImagePullBackoff when pulling an image from private ECR with SOCI Index being present. #1084

Open debajyoti-truefoundry opened 7 months ago

debajyoti-truefoundry commented 7 months ago

Description

I have set up CRI-based authentication to make Soci Snapshotter work with our private ECR Registry. We use EKS.

I noticed that one of the containers was stuck in the ImagePullBackoff state. This was the event.

Failed to pull image "111111111.dkr.ecr.eu-west-1.amazonaws.com/tfy-truefoundry-billing-server-1egxs:75159d954d0b3aed9dfc9ca7df2654b5220f174f-77": [failed to pull and unpack image "111111111.dkr.ecr.eu-west-1.amazonaws.com/tfy-truefoundry-billing-server-1egxs:75159d954d0b3aed9dfc9ca7df2654b5220f174f-77": failed to resolve reference "111111111.dkr.ecr.eu-west-1.amazonaws.com/tfy-truefoundry-billing-server-1egxs:75159d954d0b3aed9dfc9ca7df2654b5220f174f-77": unexpected status from HEAD request to https://111111111.dkr.ecr.eu-west-1.amazonaws.com/v2/tfy-truefoundry-billing-server-1egxs/manifests/75159d954d0b3aed9dfc9ca7df2654b5220f174f-77: 403 Forbidden, rpc error: code = Unavailable desc = failed to pull and unpack image "111111111.dkr.ecr.eu-west-1.amazonaws.com/tfy-truefoundry-billing-server-1egxs:75159d954d0b3aed9dfc9ca7df2654b5220f174f-77": failed to prepare extraction snapshot "extract-639718280-mVNW sha256:e3d7a120de01b04d536f8f8479b4f942dc3a4c6c52a4070e73c39e70f5a298b0": layer "551" unavailable: unavailable]

These are the logs from the Containerd, Soci snapshotter and Kubelet.

[root@ip-10-2-45-158 /]# journalctl -u containerd | grep "extract-639718280-mVNW"
Feb 22 08:40:19 ip-10-2-45-158.eu-west-1.compute.internal containerd[3145]: time="2024-02-22T08:40:19.668913836Z" level=error msg="PullImage \"111111111.dkr.ecr.eu-west-1.amazonaws.com/tfy-truefoundry-billing-server-1egxs:75159d954d0b3aed9dfc9ca7df2654b5220f174f-77\" failed" error="rpc error: code = Unavailable desc = failed to pull and unpack image \"111111111.dkr.ecr.eu-west-1.amazonaws.com/tfy-truefoundry-billing-server-1egxs:75159d954d0b3aed9dfc9ca7df2654b5220f174f-77\": failed to prepare extraction snapshot \"extract-639718280-mVNW sha256:e3d7a120de01b04d536f8f8479b4f942dc3a4c6c52a4070e73c39e70f5a298b0\": layer \"551\" unavailable: unavailable"

[root@ip-10-2-45-158 /]# journalctl -u soci-snapshotter | grep "extract-639718280-mVNW"
Feb 22 08:40:19 ip-10-2-45-158.eu-west-1.compute.internal soci-snapshotter-grpc[2797]: {"key":"k8s.io/620/extract-639718280-mVNW sha256:e3d7a120de01b04d536f8f8479b4f942dc3a4c6c52a4070e73c39e70f5a298b0","level":"info","msg":"preparing filesystem mount at mountpoint=/var/lib/soci-snapshotter-grpc/snapshotter/snapshots/551/fs","parent":"sha256:649fd1aff815275641e05b690d7ddb2755b6d9ae6da2c6c32baac61709e4b647","time":"2024-02-22T08:40:19.642853068Z"}
Feb 22 08:40:19 ip-10-2-45-158.eu-west-1.compute.internal soci-snapshotter-grpc[2797]: {"image":"111111111.dkr.ecr.eu-west-1.amazonaws.com/tfy-truefoundry-billing-server-1egxs:75159d954d0b3aed9dfc9ca7df2654b5220f174f-77","key":"k8s.io/620/extract-639718280-mVNW sha256:e3d7a120de01b04d536f8f8479b4f942dc3a4c6c52a4070e73c39e70f5a298b0","layerDigest":"sha256:cf35370accea37c1d01c15961f837154edafcf047ec934bacc449aa497ea0282","level":"info","mountpoint":"/var/lib/soci-snapshotter-grpc/snapshotter/snapshots/551/fs","msg":"skipping mounting layer as FUSE mount: no ztoc for layer","parent":"sha256:649fd1aff815275641e05b690d7ddb2755b6d9ae6da2c6c32baac61709e4b647","time":"2024-02-22T08:40:19.642922896Z"}
Feb 22 08:40:19 ip-10-2-45-158.eu-west-1.compute.internal soci-snapshotter-grpc[2797]: {"error":"skipping mounting layer sha256:cf35370accea37c1d01c15961f837154edafcf047ec934bacc449aa497ea0282 as FUSE mount: no ztoc for layer","key":"k8s.io/620/extract-639718280-mVNW sha256:e3d7a120de01b04d536f8f8479b4f942dc3a4c6c52a4070e73c39e70f5a298b0","level":"warning","msg":"failed to prepare remote snapshot","parent":"sha256:649fd1aff815275641e05b690d7ddb2755b6d9ae6da2c6c32baac61709e4b647","remote-snapshot-prepared":"false","time":"2024-02-22T08:40:19.642959154Z"}

[root@ip-10-2-45-158 /]# journalctl -u kubelet | grep "extract-639718280-mVNW"
Feb 22 08:40:19 ip-10-2-45-158.eu-west-1.compute.internal kubelet[3287]: E0222 08:40:19.669367    3287 remote_image.go:180] "PullImage from image service failed" err="rpc error: code = Unavailable desc = failed to pull and unpack image \"111111111.dkr.ecr.eu-west-1.amazonaws.com/tfy-truefoundry-billing-server-1egxs:75159d954d0b3aed9dfc9ca7df2654b5220f174f-77\": failed to prepare extraction snapshot \"extract-639718280-mVNW sha256:e3d7a120de01b04d536f8f8479b4f942dc3a4c6c52a4070e73c39e70f5a298b0\": layer \"551\" unavailable: unavailable" image="111111111.dkr.ecr.eu-west-1.amazonaws.com/tfy-truefoundry-billing-server-1egxs:75159d954d0b3aed9dfc9ca7df2654b5220f174f-77"

I restarted the Pod and got the same error again.

I drained the Node where the Pod was scheduled to move it to another Node.

Surprisingly it worked this time. The new Node, where the Pod got scheduled, also had Soci Snapshotter.

Steps to reproduce the bug

No response

Describe the results you expected

No ImagePullBackoff

Host information

  1. OS: Amazon Linux 2 (AL2)
  2. Snapshotter Version: soci-snapshotter-grpc version v0.5.0 77f218d0b11c88d17a832ad7456dbd2165d19c86
  3. Containerd Version: containerd github.com/containerd/containerd 1.7.11 64b8a811b07ba6288238eefc14d898ee0b5b99ba

Any additional context or information about the bug

No response

debajyoti-truefoundry commented 7 months ago

I noticed similar issues when using Docker Hub as a registry. In this case, I did not have the SOCI Index for the image present in the repository.

Pod Event:
Failed to pull image "docker.io/truefoundrycloud/tfy-truefoundry-abcd-onboarding-svc-8eavt:e2fcd343b85e5a48c73e8b0a9421a4969b5ac9a8-18": rpc error: code = Unavailable desc = failed to pull and unpack image "docker.io/truefoundrycloud/tfy-truefoundry-abcd-onboarding-svc-8eavt:e2fcd343b85e5a48c73e8b0a9421a4969b5ac9a8-18": failed to prepare extraction snapshot "extract-779240847-Oy6- sha256:ae5b1556685ad8a7631a85f3e89607495b2d4ad931bcaf2c89acf6ea14471378": layer "3496" unavailable: unavailable
Error: ErrImagePull

[root@ip-10-2-17-73 /]# journalctl -u kubelet | grep "extract-779240847-Oy6"
Feb 22 06:46:17 ip-10-2-17-73.eu-west-1.compute.internal kubelet[3842]: E0222 06:46:17.804623    3842 remote_image.go:180] "PullImage from image service failed" err="rpc error: code = Unavailable desc = failed to pull and unpack image \"docker.io/truefoundrycloud/tfy-truefoundry-abcd-onboarding-svc-8eavt:e2fcd343b85e5a48c73e8b0a9421a4969b5ac9a8-18\": failed to prepare extraction snapshot \"extract-779240847-Oy6- sha256:ae5b1556685ad8a7631a85f3e89607495b2d4ad931bcaf2c89acf6ea14471378\": layer \"3496\" unavailable: unavailable" image="docker.io/truefoundrycloud/tfy-truefoundry-abcd-onboarding-svc-8eavt:e2fcd343b85e5a48c73e8b0a9421a4969b5ac9a8-18"

[root@ip-10-2-17-73 /]# journalctl -u containerd | grep "extract-779240847-Oy6"
Feb 22 06:46:17 ip-10-2-17-73.eu-west-1.compute.internal containerd[3677]: time="2024-02-22T06:46:17.804322115Z" level=error msg="PullImage \"docker.io/truefoundrycloud/tfy-truefoundry-abcd-onboarding-svc-8eavt:e2fcd343b85e5a48c73e8b0a9421a4969b5ac9a8-18\" failed" error="rpc error: code = Unavailable desc = failed to pull and unpack image \"docker.io/truefoundrycloud/tfy-truefoundry-abcd-onboarding-svc-8eavt:e2fcd343b85e5a48c73e8b0a9421a4969b5ac9a8-18\": failed to prepare extraction snapshot \"extract-779240847-Oy6- sha256:ae5b1556685ad8a7631a85f3e89607495b2d4ad931bcaf2c89acf6ea14471378\": layer \"3496\" unavailable: unavailable"

[root@ip-10-2-17-73 /]# journalctl -u soci-snapshotter | grep "extract-779240847-Oy6"
Feb 22 06:46:17 ip-10-2-17-73.eu-west-1.compute.internal soci-snapshotter-grpc[3323]: {"key":"k8s.io/3573/extract-779240847-Oy6- sha256:ae5b1556685ad8a7631a85f3e89607495b2d4ad931bcaf2c89acf6ea14471378","level":"info","msg":"preparing filesystem mount at mountpoint=/var/lib/soci-snapshotter-grpc/snapshotter/snapshots/3496/fs","parent":"sha256:7cb425d35fb31431c07c144f811e9aae2e16b39afbc5ff4ece873589fba5d29f","time":"2024-02-22T06:46:17.782280909Z"}
Feb 22 06:46:17 ip-10-2-17-73.eu-west-1.compute.internal soci-snapshotter-grpc[3323]: {"error":"unable to fetch SOCI artifacts: cannot fetch list of referrers: no existing referrers","key":"k8s.io/3573/extract-779240847-Oy6- sha256:ae5b1556685ad8a7631a85f3e89607495b2d4ad931bcaf2c89acf6ea14471378","level":"warning","msg":"failed to prepare remote snapshot","parent":"sha256:7cb425d35fb31431c07c144f811e9aae2e16b39afbc5ff4ece873589fba5d29f","remote-snapshot-prepared":"false","time":"2024-02-22T06:46:17.782349136Z"}
debajyoti-truefoundry commented 7 months ago

This is also happening intermittently. I cannot reproduce this consistently.

debajyoti-truefoundry commented 7 months ago

I am also noticing a lot of logs like the ones below:

Feb 22 14:45:04 ip-10-2-17-73.eu-west-1.compute.internal soci-snapshotter-grpc[3323]: {"error":"failed(layer:\"sha256:1c74526957fc2157e8b0989072dc99b9582b398c12d1dcd40270fd76231bab0c\", ref:\"111111.dkr.ecr.eu-west-1.amazonaws.com/tfy-srihari-ws-emotion-class-svc-kg6df:21907524966ac31c03786bf29313489274f1694d-1\"): unable to create remote fetcher: failed to redirect: request to registry failed: unknown \"unknown\": giving up request after 1 attempt(s): context canceled (host \"111111.dkr.ecr.eu-west-1.amazonaws.com\", ref:\"111111.dkr.ecr.eu-west-1.amazonaws.com/tfy-srihari-ws-emotion-class-svc-kg6df:21907524966ac31c03786bf29313489274f1694d-1\", digest:\"sha256:1c74526957fc2157e8b0989072dc99b9582b398c12d1dcd40270fd76231bab0c\"): failed to refresh connection","key":"sha256:8eaf2218e5b78e1a434f669af263ecf4bfa31c5615b8db65beabb8595d3db19b","level":"warning","mount-point":"/var/lib/soci-snapshotter-grpc/snapshotter/snapshots/118/fs","msg":"layer is unavailable","time":"2024-02-22T14:45:04.290019320Z"}
Feb 22 14:45:04 ip-10-2-17-73.eu-west-1.compute.internal soci-snapshotter-grpc[3323]: {"error":"unable to create remote fetcher: failed to redirect: request to registry failed: unknown \"unknown\": giving up request after 1 attempt(s): context canceled (host \"111111.dkr.ecr.eu-west-1.amazonaws.com\", ref:\"111111.dkr.ecr.eu-west-1.amazonaws.com/tfy-aditya-ws-abc-usage-tracker-zh31f:5a8ad2dfd46440ee0eee86586874582fb84e6d27-7\", digest:\"sha256:aa8e0026efede8b3da7364fd0ec879657b2c9be209b5cc1e2ec83bed6dfcf6a9\")","key":"sha256:8eaf2218e5b78e1a434f669af263ecf4bfa31c5615b8db65beabb8595d3db19b","level":"warning","mountpoint":"/var/lib/soci-snapshotter-grpc/snapshotter/snapshots/323/fs","msg":"failed to refresh the layer \"sha256:aa8e0026efede8b3da7364fd0ec879657b2c9be209b5cc1e2ec83bed6dfcf6a9\" from \"111111.dkr.ecr.eu-west-1.amazonaws.com/tfy-aditya-ws-abc-usage-tracker-zh31f:5a8ad2dfd46440ee0eee86586874582fb84e6d27-7\"","time":"2024-02-22T14:45:04.290009572Z"}
Feb 22 14:45:04 ip-10-2-17-73.eu-west-1.compute.internal soci-snapshotter-grpc[3323]: {"error":"failed(layer:\"sha256:aa8e0026efede8b3da7364fd0ec879657b2c9be209b5cc1e2ec83bed6dfcf6a9\", ref:\"111111.dkr.ecr.eu-west-1.amazonaws.com/tfy-aditya-ws-abc-usage-tracker-zh31f:5a8ad2dfd46440ee0eee86586874582fb84e6d27-7\"): unable to create remote fetcher: failed to redirect: request to registry failed: unknown \"unknown\": giving up request after 1 attempt(s): context canceled (host \"111111.dkr.ecr.eu-west-1.amazonaws.com\", ref:\"111111.dkr.ecr.eu-west-1.amazonaws.com/tfy-aditya-ws-abc-usage-tracker-zh31f:5a8ad2dfd46440ee0eee86586874582fb84e6d27-7\", digest:\"sha256:aa8e0026efede8b3da7364fd0ec879657b2c9be209b5cc1e2ec83bed6dfcf6a9\"): failed to refresh connection","key":"sha256:8eaf2218e5b78e1a434f669af263ecf4bfa31c5615b8db65beabb8595d3db19b","level":"warning","mountpoint":"/var/lib/soci-snapshotter-grpc/snapshotter/snapshots/323/fs","msg":"check failed","time":"2024-02-22T14:45:04.290061220Z"}
Feb 22 14:45:04 ip-10-2-17-73.eu-west-1.compute.internal soci-snapshotter-grpc[3323]: {"error":"failed(layer:\"sha256:aa8e0026efede8b3da7364fd0ec879657b2c9be209b5cc1e2ec83bed6dfcf6a9\", ref:\"111111.dkr.ecr.eu-west-1.amazonaws.com/tfy-aditya-ws-abc-usage-tracker-zh31f:5a8ad2dfd46440ee0eee86586874582fb84e6d27-7\"): unable to create remote fetcher: failed to redirect: request to registry failed: unknown \"unknown\": giving up request after 1 attempt(s): context canceled (host \"111111.dkr.ecr.eu-west-1.amazonaws.com\", ref:\"111111.dkr.ecr.eu-west-1.amazonaws.com/tfy-aditya-ws-abc-usage-tracker-zh31f:5a8ad2dfd46440ee0eee86586874582fb84e6d27-7\", digest:\"sha256:aa8e0026efede8b3da7364fd0ec879657b2c9be209b5cc1e2ec83bed6dfcf6a9\"): failed to refresh connection","key":"sha256:8eaf2218e5b78e1a434f669af263ecf4bfa31c5615b8db65beabb8595d3db19b","level":"warning","mount-point":"/var/lib/soci-snapshotter-grpc/snapshotter/snapshots/323/fs","msg":"layer is unavailable","time":"2024-02-22T14:45:04.290095138Z"}
Feb 22 14:45:04 ip-10-2-17-73.eu-west-1.compute.internal soci-snapshotter-grpc[3323]: {"error":"unable to create remote fetcher: failed to redirect: request to registry failed: unknown \"unknown\": giving up request after 1 attempt(s): context canceled (host \"111111.dkr.ecr.eu-west-1.amazonaws.com\", ref:\"111111.dkr.ecr.eu-west-1.amazonaws.com/tfy-srihari-ws-emotion-class-svc-kg6df:21907524966ac31c03786bf29313489274f1694d-1\", digest:\"sha256:1b13d4e1a46e5e969702ec92b7c787c1b6891bff7c21ad378ff6dbc9e751d5d4\")","key":"sha256:8eaf2218e5b78e1a434f669af263ecf4bfa31c5615b8db65beabb8595d3db19b","level":"warning","mountpoint":"/var/lib/soci-snapshotter-grpc/snapshotter/snapshots/116/fs","msg":"failed to refresh the layer \"sha256:1b13d4e1a46e5e969702ec92b7c787c1b6891bff7c21ad378ff6dbc9e751d5d4\" from \"111111.dkr.ecr.eu-west-1.amazonaws.com/tfy-srihari-ws-emotion-class-svc-kg6df:21907524966ac31c03786bf29313489274f1694d-1\"","time":"2024-02-22T14:45:04.290119127Z"}
Feb 22 14:45:04 ip-10-2-17-73.eu-west-1.compute.internal soci-snapshotter-grpc[3323]: {"error":"failed(layer:\"sha256:1b13d4e1a46e5e969702ec92b7c787c1b6891bff7c21ad378ff6dbc9e751d5d4\", ref:\"111111.dkr.ecr.eu-west-1.amazonaws.com/tfy-srihari-ws-emotion-class-svc-kg6df:21907524966ac31c03786bf29313489274f1694d-1\"): unable to create remote fetcher: failed to redirect: request to registry failed: unknown \"unknown\": giving up request after 1 attempt(s): context canceled (host \"111111.dkr.ecr.eu-west-1.amazonaws.com\", ref:\"111111.dkr.ecr.eu-west-1.amazonaws.com/tfy-srihari-ws-emotion-class-svc-kg6df:21907524966ac31c03786bf29313489274f1694d-1\", digest:\"sha256:1b13d4e1a46e5e969702ec92b7c787c1b6891bff7c21ad378ff6dbc9e751d5d4\"): failed to refresh connection","key":"sha256:8eaf2218e5b78e1a434f669af263ecf4bfa31c5615b8db65beabb8595d3db19b","level":"warning","mountpoint":"/var/lib/soci-snapshotter-grpc/snapshotter/snapshots/116/fs","msg":"check failed","time":"2024-02-22T14:45:04.290136467Z"}
Feb 22 14:45:04 ip-10-2-17-73.eu-west-1.compute.internal soci-snapshotter-grpc[3323]: {"error":"failed(layer:\"sha256:1b13d4e1a46e5e969702ec92b7c787c1b6891bff7c21ad378ff6dbc9e751d5d4\", ref:\"111111.dkr.ecr.eu-west-1.amazonaws.com/tfy-srihari-ws-emotion-class-svc-kg6df:21907524966ac31c03786bf29313489274f1694d-1\"): unable to create remote fetcher: failed to redirect: request to registry failed: unknown \"unknown\": giving up request after 1 attempt(s): context canceled (host \"111111.dkr.ecr.eu-west-1.amazonaws.com\", ref:\"111111.dkr.ecr.eu-west-1.amazonaws.com/tfy-srihari-ws-emotion-class-svc-kg6df:21907524966ac31c03786bf29313489274f1694d-1\", digest:\"sha256:1b13d4e1a46e5e969702ec92b7c787c1b6891bff7c21ad378ff6dbc9e751d5d4\"): failed to refresh connection","key":"sha256:8eaf2218e5b78e1a434f669af263ecf4bfa31c5615b8db65beabb8595d3db19b","level":"warning","mount-point":"/var/lib/soci-snapshotter-grpc/snapshotter/snapshots/116/fs","msg":"layer is unavailable","time":"2024-02-22T14:45:04.290146660Z"}

Is this normal even after setting up CRI-based authentication?

debajyoti-truefoundry commented 7 months ago

This is something I started noticing on 0.5.0 with CRI-based authentication. I had soci-snapshotter 0.4.0 in the cluster for some time and had not faced this issue before.

When a new pod with a newly built image from the same Docker Hub registry got scheduled onto this node again, I faced the same problem. I had no SOCI index in the repository for that image. Once I drained the problematic node and the pod got scheduled to another new node, with soci-snapshotter running there, the issue was not there anymore.

Sometimes, the snapshotter can get into this bad state after running for some time. I should have restarted the snapshotter, rather than draining the node, to see if the issue was resolved, but it slipped my mind.

turan18 commented 7 months ago

@debajyoti-truefoundry Hey, thanks for letting us know. We are looking into this. Could you provide a full dump of the snapshotter and containerd logs for a pull of an image with a SOCI index and for one without?

debajyoti-truefoundry commented 7 months ago

Please find the snapshotter log attached. @turan18

I did not have containerd logs with me. I will attach it as soon as this issue starts happening again. soci-snapshotter.log

debajyoti-truefoundry commented 7 months ago

Please find the snapshotter log attached. @turan18

I did not have containerd logs with me. I will attach it as soon as this issue starts happening again. soci-snapshotter.log

This is the full dump of all the logs. I have not filtered it for a particular image, but the pod was stuck on this node due to the same issue.

debajyoti-truefoundry commented 7 months ago

@turan18 Adding logs from containerd and soci-snapshotter.

containerd.log soci-snapshotter.log

turan18 commented 7 months ago

When a new pod with a newly built image from the same Docker Hub registry got scheduled onto this node again, I faced the same problem. I had no SOCI index in the repository for that image. Once I drained the problematic node and the pod got scheduled to another new node, with soci-snapshotter running there, the issue was not there anymore.

Sometimes, the snapshotter can get into this bad state after running for some time. I should have restarted the snapshotter, rather than draining the node, to see if the issue was resolved, but it slipped my mind.

Hey @debajyoti-truefoundry, so a couple of things:

  1. CRI credentials are stored in memory and so they do not persist over node reboot.

  2. CRI credentials are static. If you were to run a pod with a set of images and then attempt to run the same pod with the same set of images on the same node, after some time (enough for the credentials to expire; which should be 12 hours in the case of ECR), the snapshotter would still attempt to use the in-memory expired CRI credentials from the first pod run.

The error you are seeing: failed to prepare extraction snapshot "extract-639718280-mVNW sha256:e3d7a120de01b04d536f8f8479b4f942dc3a4c6c52a4070e73c39e70f5a298b0": layer "551" unavailable: unavailable, is thrown when the snapshotter cannot establish a connection with the remote registry. In this case, it appears that the issue is related to authentication, as we see tons of 403 responses from ECR. The snapshotter most likely attempted to use an older/expired set of CRI credentials for the image tfy-truefoundry-billing-server-1egxs. This is also why you were unable to reproduce the error on newer nodes, since the images were re-pulled and the CRI credentials were re-populated.

That is to say, unfortunately, CRI credentials are not a stable mechanism for getting credentials to the snapshotter, and we do not recommend using them in a production environment. For the time being, we recommend disabling CRI credentials and setting up a docker credential helper on the nodes (see: ecr-credential-helper for ECR), if you haven't done so already.
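For context on why the credential-helper route avoids the expiry problem: the idea is that whoever needs credentials shells out to a docker-credential-<name> binary speaking the Docker credential-helper protocol, so a valid token can be obtained whenever one is needed rather than only at pull time. Below is a minimal, illustrative Go sketch of that protocol; the helper name "ecr-login" comes from the amazon-ecr-credential-helper project, and the registry host is made up.

package main

import (
	"bytes"
	"encoding/json"
	"fmt"
	"log"
	"os/exec"
	"strings"
)

// helperCreds mirrors the JSON a Docker credential helper prints for the "get" subcommand.
type helperCreds struct {
	ServerURL string
	Username  string
	Secret    string
}

// getCreds runs "docker-credential-<helper> get" with the registry host on stdin
// and decodes the JSON response. The helper returns a currently valid secret on
// each invocation, which is what makes this approach robust against expiry.
func getCreds(helper, registry string) (*helperCreds, error) {
	cmd := exec.Command("docker-credential-"+helper, "get")
	cmd.Stdin = strings.NewReader(registry)
	var out bytes.Buffer
	cmd.Stdout = &out
	if err := cmd.Run(); err != nil {
		return nil, fmt.Errorf("credential helper %q failed: %w", helper, err)
	}
	var c helperCreds
	if err := json.Unmarshal(out.Bytes(), &c); err != nil {
		return nil, err
	}
	return &c, nil
}

func main() {
	// "ecr-login" is the binary name installed by amazon-ecr-credential-helper;
	// the registry host below is illustrative only.
	c, err := getCreds("ecr-login", "111111111.dkr.ecr.eu-west-1.amazonaws.com")
	if err != nil {
		log.Fatal(err)
	}
	fmt.Println("username:", c.Username, "secret length:", len(c.Secret))
}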

Apologies for the trouble.

debajyoti-truefoundry commented 7 months ago

Thanks, @turan18, for your detailed response.

In this case, it appears that the issue is related to authentication as we see tons of 403 responses from ECR. The snapshotter most likely attempted to use an older/expired set of CRI credentials for the image tfy-truefoundry-billing-server-1egxs. This is also why you were unable to reproduce the error on newer nodes, since the images were re-pulled and the CRI credentials were re-populated.

We do not have any credential refresher in our cluster and keep the credentials as a kubernetes.io/dockerconfigjson type Secret.

I assume the CRI credential is basically what we have in the Secret; it is not a different token derived from the given credentials with a different lifetime. Please correct me if this assumption is wrong.

So, if the credential had expired, the new node should also have used the expired credential and given an ErrImagePull. But I am not observing this; somehow, it works on the new node.

We faced the same issue with a private image on Docker Hub, for which we were using a long-lived credential that could not have expired. Again, we noticed the same behaviour: once I drained the node and the pod got scheduled on a new node, the issue did not happen again.

turan18 commented 7 months ago

Hey @debajyoti-truefoundry sorry for the delay.

I assume the CRI credential is basically what we have in the Secret; it is not a different token derived from the given credentials with a different lifetime. Please correct me if this assumption is wrong.

All EKS worker nodes configure the kubelet to talk to an ECR credential provider binary, which dynamically fetches ECR credentials. The credential helper binary comes pre-installed on all EKS nodes (see: https://github.com/awslabs/amazon-eks-ami/blob/8d7b5f89f511ef018905c8e24a6c1917e3b8bbdb/scripts/install-worker.sh#L391-L405). Now, although you have pod secrets configured, the kubelet is most likely exec'ing the credential helper binary to get credentials, instead of calling out to the API server to get your pod secrets. Generally, pod secrets are not a good mechanism for storing short-lived secrets, like ECR credentials, unless you wish to incur the burden of updating them yourself.

So, if the credential expired, even on the new node, it should have used the expired credential and given an ErrImagePull. But I am not observing this. Somehow, it works on the new node.

The reason it works on the new node is because the images are re-pulled using the credentials obtained by the image credential helper binary automatically configured on the node.

https://github.com/awslabs/soci-snapshotter/issues/1084#issuecomment-1959009826 for which we were using a long-lived credential that could not have expired. Again, we noticed the same behaviour once I drained the node and the pod got scheduled on a new node; the issue did not happen again.

Yeah, I initially said: "then attempt to run the same pod with the same set of images on the same node, after some time (enough for the credentials to expire; which should be 12 hours in the case of ECR), the snapshotter would still attempt to use the in-memory expired CRI credentials from the first pod run." In reality, it is a little more nuanced than that.

Containerd solely refers to layers/"snapshots" by the digest of their uncompressed content (it's actually a bit more complicated than that). What this means is that shared layers, starting from the base layer, between two different images will still have the same snapshot key/"digest". When containerd attempts to unpack a layer to disk, it first checks to see if the layer has already been unpacked, using the digest of the uncompressed layer as the key. If it sees that the key exists, then it will move on to the next layer. The problem with this is that it doesn't exactly fit the model of SOCI. To the SOCI snapshotter, layers are not just content but a set of configurations that are needed to fetch content. This includes things like the ECR repository URL and, more importantly, the credentials to the ECR repository/registry. What this then effectively means is that shared layers, regardless of what image they are a part of, will use the same underlying SOCI configuration, meaning the same set of credentials.

This is exactly what's happening in your case. The image that resides in Docker Hub still shares layers, starting from the base layer, with the images previously pulled from ECR with SOCI. Because of this, containerd assumes that there is no point in downloading/unpacking the layer, since, in its eyes, the content already exists. So the layer in the Docker Hub image ends up sharing the same SOCI mount point, and thus the same set of expired credentials, as the layer in the ECR image that was pulled with SOCI.
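To make the shared-layer mechanics concrete, here is a small illustrative Go sketch using the OCI image-spec identity helpers (all layer digests are made up): two images that share base layers produce identical chain IDs for those layers, and containerd keys snapshots by chain ID regardless of which registry or credentials were used for the pull.

package main

import (
	"fmt"
	"strings"

	"github.com/opencontainers/go-digest"
	"github.com/opencontainers/image-spec/identity"
)

// fakeDiffID builds a syntactically valid, made-up uncompressed layer digest.
func fakeDiffID(c string) digest.Digest {
	return digest.Digest("sha256:" + strings.Repeat(c, 64))
}

func main() {
	// Hypothetical DiffIDs: both images share the same two base layers.
	shared := []digest.Digest{fakeDiffID("a"), fakeDiffID("b")}
	ecrImage := append(append([]digest.Digest{}, shared...), fakeDiffID("c"))
	hubImage := append(append([]digest.Digest{}, shared...), fakeDiffID("d"))

	// identity.ChainIDs mutates its argument, so pass it copies.
	ecrChain := identity.ChainIDs(append([]digest.Digest{}, ecrImage...))
	hubChain := identity.ChainIDs(append([]digest.Digest{}, hubImage...))

	// The chain IDs for the shared base layers are identical, so containerd treats
	// those snapshots as already prepared, and they keep whatever SOCI mount (and
	// credentials) they were first prepared with.
	for i := range shared {
		fmt.Printf("layer %d identical across images: %v\n", i, ecrChain[i] == hubChain[i])
	}
}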

Now, although there isn't an easy way to address the shared layers problem between images, at least without making some changes in upstream containerd, we can still make some changes to the SOCI snapshotter, like avoiding making any requests to the registry after all the content for the layer has been downloaded. This still isn't a universal solution, however, as there may be cases where, for some reason, the SOCI snapshotter couldn't fetch all the content in the background and will need to do so on demand at some point after the credentials have expired.

Currently, the best mechanism, and the one that we recommend, for getting credentials through to the snapshotter is setting up a credential provider on the node. This way, on credential expiration, the snapshotter can retrieve a fresh set of credentials.

Apologies if any of this was confusing; I'm happy to clarify any questions that may arise.

debajyoti-truefoundry commented 7 months ago

sorry for the delay.

No worries. I appreciate the detailed response.

Now, although you have pod secrets configured, the kubelet is most likely exec'ing the credential helper binary to get credentials, instead of calling out to the API server to get your pod secrets.

Right. I tested this. Seems like that is what is happening. We use imagePullSecrets anyway, and I have no way to verify whether this binary is used or the imagePullSecret is used. 😅

[root@ip-10-2-28-241 /]# journalctl -u kubelet | grep image-credential
Mar 01 04:37:55 ip-10-2-28-241.eu-west-1.compute.internal kubelet[3063]: I0301 04:37:55.777992    3063 flags.go:64] FLAG: --image-credential-provider-bin-dir="/etc/eks/image-credential-provider"
Mar 01 04:37:55 ip-10-2-28-241.eu-west-1.compute.internal kubelet[3063]: I0301 04:37:55.777996    3063 flags.go:64] FLAG: --image-credential-provider-config="/etc/eks/image-credential-provider/config.json"
[root@ip-10-2-28-241 /]# cat req.json 
{ "apiVersion": "credentialprovider.kubelet.k8s.io/v1", "kind": "CredentialProviderRequest", "image": "1111111111.dkr.ecr.eu-west-1.amazonaws.com/tfy-truefoundry-sfy-manifest-service-v5nza:c4c2732e1d8639e70bb2845f93213c381b533dcf-221" }
[root@ip-10-2-28-241 /]# cat req.json | /etc/eks/image-credential-provider/ecr-credential-provider 
I0301 11:34:37.717329  244493 main.go:124] Getting creds for private image 1111111111.dkr.ecr.eu-west-1.amazonaws.com/tfy-truefoundry-sfy-manifest-service-v5nza:c4c2732e1d8639e70bb2845f93213c381b533dcf-221
{"kind":"CredentialProviderResponse","apiVersion":"credentialprovider.kubelet.k8s.io/v1","cacheKeyType":"Registry","cacheDuration":"6h0m0s","auth":{"1111111111.dkr.ecr.eu-west-1.amazonaws.com":{"username":"AWS","password":"redacted"}}}

What this then effectively means is that shared layers, regardless of what image they are a part of, will use the same underlying SOCI configuration, meaning the same set of credentials. Now, although there isn't an easy way to address the shared layers problem between images, at least without making some changes in upstream containerd, we can still make some changes to the SOCI snapshotter, like avoiding making any requests to the registry after all the content for the layer has been downloaded.

Right, this seems to be a big issue and should be documented. I went through https://github.com/containerd/stargz-snapshotter/blob/main/docs/overview.md#cri-based-authentication, and this was not documented there. CRI-based Auth can only work reliably if the cluster has private images only from a single registry, ignoring the credential expiry issue.

Currently, the best mechanism and the one that we recommend for getting credentials through to the snapshotter, is setting up a credential provider on the node. This way, on credential expiration, the snapshotter can retrieve a fresh set of credentials.

Do you have any documentation for this? How do I integrate different credential providers for registries from different providers (Docker Hub, ECR)?

debajyoti-truefoundry commented 7 months ago

@turan18 Is this limitation present in other snapshotters too?

https://github.com/containerd/nydus-snapshotter https://github.com/containerd/stargz-snapshotter

debajyoti-truefoundry commented 7 months ago

https://github.com/containerd/stargz-snapshotter/issues/1583 Is this related?

I also want to understand whether https://github.com/awslabs/soci-snapshotter/pull/1035 helps resolve this issue.

turan18 commented 6 months ago

Right, this seems to be a big issue and should be documented. I went through https://github.com/containerd/stargz-snapshotter/blob/main/docs/overview.md#cri-based-authentication, and this was not documented there. CRI-based Auth can only work reliably if the cluster has private images only from a single registry, ignoring the credential expiry issue.

CRI creds work fine, even with shared layers, as long as the registry credentials are not expired. The issue is with credential rotation.

Do you have any documentation for this? How do I integrate different credential providers for registries from different providers (Docker Hub, ECR)?

Unfortunately, we do not right now. You can follow the documentation on the ecr-credential-helper repository on how to get the helper installed and set up on the node. I'm not too sure how Docker Hub handles credentials. If they are long-lived/do not expire, you can run docker login on the node with your username/id and password (see: https://docs.docker.com/reference/cli/docker/login/).

Is this limitation present in other snapshotters too?

https://github.com/containerd/nydus-snapshotter https://github.com/containerd/stargz-snapshotter

Yes. Both stargz and nydus also rely on the same mechanisms to get credentials (credential helper or CRI creds). There are some ongoing discussions upstream about making the process of obtaining credentials in snapshotters more streamlined, but there isn't anything concrete just yet (see: https://github.com/containerd/containerd/issues/6251, https://github.com/containerd/containerd/issues/5251).

https://github.com/containerd/stargz-snapshotter/issues/1583 Is this related?

Yes, it looks like that's the same issue and the proposed fix was what I mentioned earlier, which is to ensure we do not make requests once the content for the image is fully downloaded on disk. This is a viable workaround for the time being and something we will work on as well.

I also want to understand whether https://github.com/awslabs/soci-snapshotter/pull/1035 helps resolve this issue.

Not really. All that will do is ensure that images that cannot be lazy loaded at all are managed/unpacked by containerd. If the image has already been pulled with the SOCI snapshotter, like in your case, and you attempt to run another image with shared layers after the credentials have expired, containerd will still see the content as pulled and it will use the same underlying SOCI configuration.

debajyoti-truefoundry commented 6 months ago

CRI creds work fine, even with shared layers, as long as the registry credentials are not expired. The issue is with credential rotation.

Right. So there are two issues.

  1. ECR creds expire within 6/12 hours, so the credential that the soci-snapshotter is caching will eventually expire.
  2. Shared layers between images present in DockerHub and ECR.

Image pulls from ECR were failing due to 1. Image pulls from Docker Hub were failing due to 1 and 2. As I have mentioned above, the credentials we set for Docker Hub never expire.

This is exactly what's happening in your case. The image that resides in Docker Hub still shares layers, starting from the base layer, with the images previously pulled from ECR with SOCI. Because of this, containerd assumes that there is no point in downloading/unpacking the layer, since, in its eyes, the content already exists. So the layer in the Docker Hub image ends up sharing the same SOCI mount point, and thus the same set of expired credentials, as the layer in the ECR image that was pulled with SOCI.

In this case, a layer from the Docker Hub image shares the SOCI mount point of a layer from ECR with expired credentials. In containerd's eyes, these are logically the same layer, so any extra API call to the registry will fail.

Yes, it looks like that's the same issue and the proposed fix was what I mentioned earlier, which is to ensure we do not make requests once the content for the image is fully downloaded on disk. This is a viable workaround for the time being and something we will work on as well.

Will you be able to give a timeline for this? This will open up adoption for us.

Not really. All that will do is ensure that images that cannot be lazy loaded at all are managed/unpacked by containerd.

Understood. I brought this up because my mental model was that if there is any error from the soci-snapshotter, it can defer the process to containerd.

debajyoti-truefoundry commented 6 months ago

Unfortunately, we do not right now. You can follow the documentation on the ecr-credential-helper repository on how to get the helper installed and set up on the node.

Is this even solving the issue at the soci-snapshotter layer?

I can see that the authn details are cached here: https://github.com/awslabs/soci-snapshotter/blob/1d4d288d008d5f9b67f7179e25e2b6f22e8d1365/service/keychain/cri/v1alpha/cri.go#L80 and the cache is populated here: https://github.com/awslabs/soci-snapshotter/blob/1d4d288d008d5f9b67f7179e25e2b6f22e8d1365/service/keychain/cri/v1alpha/cri.go#L130

So, the cached credentials are only updated when an image is pulled. Looking at https://github.com/containerd/stargz-snapshotter/issues/1583, it seems credentials will only get updated if we always pull the image (imagePullPolicy: Always).
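A deliberately simplified sketch (not the actual soci-snapshotter code) of what that caching behaviour looks like: credentials are captured only when a PullImage request flows through the proxied CRI endpoint and are stored per image reference, so nothing refreshes them until the image is pulled again.

package main

import (
	"fmt"
	"sync"
)

// authConfig is a stand-in for the username/password pair extracted from a
// CRI PullImage request.
type authConfig struct {
	Username string
	Password string
}

// criKeychain is a simplified model of a CRI-based keychain: it only learns
// credentials as a side effect of image pulls and never refreshes them.
type criKeychain struct {
	mu    sync.Mutex
	creds map[string]authConfig // image reference -> credentials seen at pull time
}

func newCRIKeychain() *criKeychain {
	return &criKeychain{creds: map[string]authConfig{}}
}

// onPullImage is called when a PullImage request passes through the proxied
// CRI endpoint; this is the only place the cache is ever written.
func (k *criKeychain) onPullImage(imageRef string, auth authConfig) {
	k.mu.Lock()
	defer k.mu.Unlock()
	k.creds[imageRef] = auth
}

// resolve is what the snapshotter uses later, e.g. when it needs to re-fetch a
// span of a lazily loaded layer. If the cached credentials have expired in the
// meantime, the registry returns 403 and the layer becomes "unavailable".
func (k *criKeychain) resolve(imageRef string) (authConfig, bool) {
	k.mu.Lock()
	defer k.mu.Unlock()
	auth, ok := k.creds[imageRef]
	return auth, ok
}

func main() {
	kc := newCRIKeychain()
	kc.onPullImage("111111111.dkr.ecr.eu-west-1.amazonaws.com/app:tag",
		authConfig{Username: "AWS", Password: "token-valid-for-12h"})

	// Hours later, a lazy read needs the registry again; the keychain can only
	// return the (possibly expired) credentials captured at pull time.
	auth, ok := kc.resolve("111111111.dkr.ecr.eu-west-1.amazonaws.com/app:tag")
	fmt.Println(ok, auth.Username)
}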

Now, although you have pod secrets configured, the kubelet is most likely exec'ing the credential helper binary to get credentials, instead of calling out to the API server to get your pod secrets.

Kubelet is already using a credential helper.

Yes, it looks like that's the same issue and the proposed fix was what I mentioned earlier, which is to ensure we do not make requests once the content for the image is fully downloaded on disk. This is a viable workaround for the time being and something we will work on as well.

The only fix here seems to be, "Do not connect to the registry again for layer A if soci-snapshotter has already set it up". Can soci-snapshotter work with that assumption?

sondavidb commented 6 months ago

The only fix here seems to be, "Do not connect to the registry again for layer A if soci-snapshotter has already set it up". Can soci-snapshotter work with that assumption?

It should. I've opened an issue and will try to get a PR out soon to test this.

debajyoti-truefoundry commented 6 months ago

It should. I've opened an issue (https://github.com/awslabs/soci-snapshotter/issues/1116) and will try to get a PR out soon to test this.

Thanks @sondavidb. I am worried about another scenario.

  1. Image A is cached on Node N. Assuming that it was already set up using the soci-snapshotter before and at present, there are no running containers in Node N using Image A.
  2. A Pod having a container with Image A gets scheduled on Node N.
  3. Assuming that the credentials of Image A's registry have expired or Kubelet is not passing credentials, given that Image A is already cached in the Node, will the soci-snapshotter still work?

steved commented 6 months ago

@sondavidb I'm having trouble reproducing since the expiry time is 12h for an ECR token, but even with the amazon-ecr-credential-helper we're seeing that after the authorization expires it's never refreshed.

These are the logs in question. It's different from what's in this issue, so I'm happy to open a new one if needed.

Mar 28 17:20:52 ip-10-0-1-2.us-west-2.compute.internal soci-snapshotter-grpc[690]: {"level":"info","msg":"Received status code: 403 Forbidden. Authorizing...","time":"2024-03-28T17:20:52.858737822Z"}
Mar 28 17:20:52 ip-10-0-1-2.us-west-2.compute.internal soci-snapshotter-grpc[690]: {"error":"failed to refresh URL: status 403 Forbidden","key":"sha256:[sha256]","level":"warning","mountpoint":"/var/lib/soci-snapshotter-grpc/snapshotter/snapshots/51/fs","msg":"failed to connect to blob","time":"2024-03-28T17:20:52.859196457Z"}
Mar 28 17:20:52 ip-10-0-1-2.us-west-2.compute.internal soci-snapshotter-grpc[690]: {"error":"unable to create remote fetcher: failed to redirect: request to registry failed: unknown \"unknown\": giving up request after 1 attempt(s): context canceled (host \"0123456789012.dkr.ecr.us-west-2.amazonaws.com\", ref:\"0123456789012.dkr.ecr.us-west-2.amazonaws.com/repo/image:tag\", digest:\"sha256:[sha256]\")","key":"sha256:[sha256]","level":"warning","mountpoint":"/var/lib/soci-snapshotter-grpc/snapshotter/snapshots/52/fs","msg":"failed to refresh the layer \"sha256:[sha256]\" from \"0123456789012.dkr.ecr.us-west-2.amazonaws.com/repo/image:tag\"","time":"2024-03-28T17:20:52.858771180Z"}

(same instance, different occurrence)

Mar 28 15:41:12 ip-10-0-1-2.us-west-2.compute.internal soci-snapshotter-grpc[690]: {"error":"unable to create remote fetcher: failed to redirect: request to registry failed: failed to handle challenge: failed to find supported auth scheme: not implemented (host \"0123456789012.dkr.ecr.us-west-2.amazonaws.com\", ref:\"0123456789012.dkr.ecr.us-west-2.amazonaws.com/repo/image:tag\", digest:\"sha256:[sha256]\")","key":"sha256:[sha256]","level":"warning","mountpoint":"/var/lib/soci-snapshotter-grpc/snapshotter/snapshots/79/fs","msg":"failed to refresh the layer \"sha256:[sha256]\" from \"0123456789012.dkr.ecr.us-west-2.amazonaws.com/repo/image:tag\"","time":"2024-03-28T15:41:12.206163948Z"}

Restarting soci-snapshotter.service fixes the issue. Pods are running with pullPolicy: Always.

# uname -a
Linux [snip] 6.5.0-1016-aws #16~22.04.1-Ubuntu SMP Wed Mar 13 18:54:49 UTC 2024 x86_64 x86_64 x86_64 GNU/Linux
# containerd --version
containerd github.com/containerd/containerd v1.7.13 7c3aca7a610df76212171d200ca3811ff6096eb8
# soci-snapshotter-grpc --version
soci-snapshotter-grpc version v0.5.0 77f218d0b11c88d17a832ad7456dbd2165d19c86
# kubelet --version
Kubernetes v1.28.5-eks-5e0fdde

sondavidb commented 6 months ago

Sorry for the delay in response @debajyoti-truefoundry

Thanks @sondavidb. I am worried about another scenario. Image A is cached on Node N. Assuming that it was already set up using the soci-snapshotter before and at present, there are no running containers in Node N using Image A. A Pod having a container with Image A gets scheduled on Node N. Assuming that the credentials of Image A's registry have expired or Kubelet is not passing credentials, given that Image A is already cached in the Node, will the soci-snapshotter still work?

As of now, no, I don't have a reason to believe it would work, for the same reason that we always check the registry. With the fix from stargz I think this might work, but I wonder if it poses any security threats. I have made a PR for this in the meantime: #1147

sondavidb commented 6 months ago

I'm having trouble reproducing since the expiry time is 12h for an ECR token, but even with the amazon-ecr-credential-helper we're seeing that after the authorization expires it's never refreshed.

We don't do any form of credential rotation. Whatever credentials are pulled with the image are the ones we use for the lifecycle of the container.

steved commented 6 months ago

I'm having trouble reproducing since the expiry time is 12h for an ECR token, but even with the amazon-ecr-credential-helper we're seeing that after the authorization expires it's never refreshed.

We don't do any form of credential rotation. Whatever credentials are pulled with the image are the ones we use for the lifecycle of the container.

Setting aside K8s, it does seem like https://github.com/awslabs/soci-snapshotter/pull/945 made an attempt to do so. I split out this conversation to a new issue (https://github.com/awslabs/soci-snapshotter/issues/1148) so I don't muddle it more, apologies!

sondavidb commented 5 months ago

Hey @debajyoti-truefoundry, we just released v0.6.0; wondering if you could confirm that this mitigates this particular issue?

debajyoti-truefoundry commented 5 months ago

Hey @sondavidb, thanks for the ping. I will try it again on our cluster sometime in the next two weeks.

For me, #1148 and #1093 will also be crucial for adoption. But let me try to carve out some time to try out v0.6.0.

sondavidb commented 5 months ago

Thanks!

I got a little sidetracked on the investigation for #1148, but I'm taking a look. I will confirm that I can repro the solution, though.