Open dprotaso opened 2 years ago
cc @jwcesign
https://github.com/google/go-containerregistry/pull/1280 was supposed to handle the case where auth is correctly configured for `registry.com/foo` and `registry.com/bar`, and possibly also just `registry.com` -- if your request is for `/foo`, you should get creds for `/foo`, i.e., as specific as possible. If your request is for `/baz`, you get creds for `registry.com`. This assumes that auth is configured correctly in all cases, and we should just be as specific as possible in selecting among those valid creds.
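The "as specific as possible" selection can be sketched as longest-prefix matching over the configured auth keys. This is a minimal illustration, not ggcr's actual keychain code; the map keys and credential names are made up:

```go
package main

import (
	"fmt"
	"strings"
)

// creds maps an auth-config key (a registry, optionally with a repo path
// prefix) to a credential. Keys and values are hypothetical.
var creds = map[string]string{
	"registry.com":     "root-cred",
	"registry.com/foo": "foo-cred",
	"registry.com/bar": "bar-cred",
}

// resolve picks the most specific credential whose key is a path prefix of
// the requested repository, falling back to the bare registry entry.
func resolve(repo string) (string, bool) {
	best, cred, found := "", "", false
	for key, c := range creds {
		if (repo == key || strings.HasPrefix(repo, key+"/")) && len(key) > len(best) {
			best, cred, found = key, c, true
		}
	}
	return cred, found
}

func main() {
	for _, repo := range []string{"registry.com/foo", "registry.com/baz"} {
		c, _ := resolve(repo)
		fmt.Printf("%s -> %s\n", repo, c)
	}
	// registry.com/foo -> foo-cred (most specific match)
	// registry.com/baz -> root-cred (falls back to the registry entry)
}
```

Note this only picks *one* credential; it says nothing about what happens when that one credential is invalid, which is the gap discussed below.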
If there's a bug in that logic though, we should fix it -- the kaniko issue that inspired that change (https://github.com/GoogleContainerTools/kaniko/issues/687) was reopened because the behavior persisted.
This doesn't account for the case where there are two creds configured both for the same registry(+repo), but one is invalid and the other works. It sounds like kubelet will collect all matching auths and try them in a loop, and ggcr (currently) doesn't -- it will choose one at ~random and use it, even if it fails and the other would have worked.
It will be hard, but not impossible, for pkg/v1/remote to handle this fallback scenario, and try creds for `/foo`, then fall back to creds for `/` if it gets a 403 response. That should match kubelet's behavior linked from the Knative issue above.
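The fallback loop could look roughly like the following. This is a simulated sketch of the kubelet-style behavior (try every matching credential, most specific first, moving on after a 401/403), not ggcr's actual API; `Authenticator`, `tryPull`, and the credential names are all hypothetical:

```go
package main

import "fmt"

// Authenticator is a stand-in for ggcr's authn.Authenticator.
type Authenticator struct {
	Name  string
	Valid bool // whether the registry would accept this credential
}

// tryPull simulates one pull attempt: 200 on success, 401 when the
// credential is rejected.
func tryPull(a Authenticator) int {
	if a.Valid {
		return 200
	}
	return 401
}

// pullWithFallback tries each credential in order (most specific first)
// and falls back to the next match on an auth failure, mirroring
// kubelet's loop over all matching imagePullSecrets.
func pullWithFallback(auths []Authenticator) (string, int) {
	last := 401
	for _, a := range auths {
		if code := tryPull(a); code == 200 {
			return a.Name, code
		} else {
			last = code
		}
	}
	return "", last
}

func main() {
	// Ordered most specific ("/foo") to least specific ("/").
	auths := []Authenticator{
		{Name: "creds-for-/foo", Valid: false},
		{Name: "creds-for-/", Valid: true},
	}
	name, code := pullWithFallback(auths)
	fmt.Println(name, code) // the "/" creds succeed after "/foo" fails
}
```

The key difference from ggcr's current behavior is that an auth failure continues the loop instead of aborting the resolve.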
I tested as follows: first, create the secrets:
```shell
root@cesign [12:38:20 AM] [+47.0°C] [~/git/group-service-acceleration] [main *]
-> # kubectl create secret docker-registry jwcesign \
     --docker-server=https://index.docker.io/v1/ \
     --docker-username=jwcesign \
     --docker-password=xxx \
     --docker-email=jwcesign@163.com
root@cesign [12:40:14 AM] [+45.0°C] [~/git/group-service-acceleration] [main *]
-> # kubectl create secret docker-registry cesign \
     --docker-server=https://index.docker.io/v1/ \
     --docker-username=cesign \
     --docker-password=xxx \
     --docker-email=jwcesign@gmail.com
```
Then create the service:
```shell
root@cesign [12:38:59 AM] [+47.0°C] [~/git/group-service-acceleration/config] [main *]
-> # cat group-service-c.yaml
apiVersion: serving.knative.dev/v1
kind: Service
metadata:
  name: group-service-c
spec:
  template:
    metadata:
      annotations:
        autoscaling.knative.dev/minScale: "1"
        autoscaling.knative.dev/maxScale: "1"
    spec:
      containerConcurrency: 1
      timeoutSeconds: 30
      imagePullSecrets:
      - name: jwcesign
      - name: cesign
      containers:
      - image: cesign/secret-test:latest
        env:
        - name: NOW_SERVICE
          value: "group-service-c"
        - name: NEXT_SERVICE
          value: ""
```
It fails with a 401 when resolving, but with `registries-skipping-tag-resolving` set, it works.
So from this testing, I think the fix is to loop over the matching creds during resolving.
> It will be hard, but not impossible, for pkg/v1/remote to handle this fallback scenario, and try creds for `/foo` then fallback to creds for `/` if it gets a 403 response. That should match kubelet's behavior linked from the Knative issue above.
I think even with this implementation, the problem can't be fixed.
cc @imjasonh @dprotaso
> It sounds like kubelet will collect all matching auths and try them in a loop, and ggcr (currently) doesn't -- it will choose one at ~random and use it, even if it fails and the other would have worked.
Correct, see here.
This issue is stale because it has been open for 90 days with no activity. It will automatically close after 30 more days of inactivity. Keep fresh with the 'lifecycle/frozen' label.
lifecycle/frozen
**Describe the bug**
See: https://github.com/knative/serving/issues/13126
**To Reproduce**
See the linked issue.