Closed VladAlexF closed 3 months ago
+1
Additionally, a similar problem occurs after adding an API Gateway custom resource (CR) when the images are in a private registry:
apiVersion: gateway.networking.k8s.io/v1beta1
kind: Gateway
metadata:
  name: api-gateway
  namespace: consul
spec:
  gatewayClassName: consul
  listeners:
    ...
Once you add it, it creates its own ServiceAccount and a Deployment that references that ServiceAccount to run the pods of the given API Gateway. That ServiceAccount is also missing imagePullSecrets, so the init container (consul-connect-inject-init) can't pull its image from the private registry.
Init Containers:
  consul-connect-inject-init:
    Container ID:
    Image:          <private-registry>/consul-k8s-control-plane:1.4.1
    Image ID:
    Port:           <none>
    Host Port:      <none>
    Command:
      /bin/sh
      -ec
      consul-k8s-control-plane connect-init -pod-name=${POD_NAME} -pod-namespace=${POD_NAMESPACE} \
        -gateway-kind="api-gateway" \
        -log-json=false \
        -service-account-name="my-own-api-gateway" \
        -service-name="my-own-api-gateway"
    State:          Waiting
      Reason:       ImagePullBackOff
    Ready:          False
The ServiceAccount passed as -service-account-name="my-own-api-gateway" does not contain imagePullSecrets.
Affected version consul chart: 1.18.1
Affected version consul chart: 1.19.0 too.
I was able to mostly recreate this locally. However, the gateway-resources-job was able to start because other pods had already pulled the image. Are you seeing the same behavior, where the job eventually runs? It looks like this error is generally getting swallowed. Do you have a specific imagePullPolicy or something similar set?
~ Edited to say: I think we should fix this. I just want to know whether it is blocking, since my cluster seems to come up fine.
Events from the gateway-resources-job:

Type | Reason | Age | From | Message
---|---|---|---|---
Normal | Scheduled | 10s | default-scheduler | Successfully assigned consul/consul-gateway-resources-gsgbw to kind-control-plane
Normal | Pulling | 9s | kubelet | Pulling image "private-repo-image"
Warning | Failed | 5s | kubelet | Failed to pull image "private-repo-image": rpc error: code = Unknown desc = failed to pull and unpack image "private-repo-image": failed to resolve reference "private-repo-image": pull access denied, repository does not exist or may require authorization: server message: insufficient_scope: authorization failed
Warning | Failed | 5s | kubelet | Error: ErrImagePull
Normal | Pulled | 4s | kubelet | Container image "private-repo-image" already present on machine
Normal | Created | 4s | kubelet | Created container gateway-resources
Normal | Started | 3s | kubelet | Started container gateway-resources
My case:
global:
  datacenter: mycenter
  name: consul
  image: myregistry.azurecr.io/repo/release/consul:1.19.1
  imageK8S: myregistry.azurecr.io/repo/release/consul-k8s-control-plane:1.5.1
  imageConsulDataplane: myregistry.azurecr.io/repo/release/consul-dataplane:1.5.1
  imagePullSecrets:
    - name: myregistry.azurecr.io-access
Job/server-consul-gateway-resources can't pull the image from the private registry:
$ k get job -A | grep consul
tool   server-consul-gateway-resources   0/1   30m     30m
tool   server-consul-server-acl-init     1/1   2m21s   30m
$ k describe pod server-consul-gateway-resources-st4dp -n tool
Events:
  Type     Reason     Age                   From               Message
  ----     ------     ----                  ----               -------
  Normal   Scheduled  12m                   default-scheduler  Successfully assigned common/server-consul-gateway-resources-st4dp to aks-nodepool1-mynodepool
  Warning  Failed     11m (x6 over 12m)     kubelet            Error: ImagePullBackOff
  Normal   Pulling    10m (x4 over 12m)     kubelet            Pulling image "myregistry.azurecr.io/repo/release/consul-k8s-control-plane:1.5.1"
  Warning  Failed     10m (x4 over 12m)     kubelet            Failed to pull image "myregistry.azurecr.io/repo/release/consul-k8s-control-plane:1.5.1": failed to pull and unpack image "myregistry.azurecr.io/repo/release/consul-k8s-control-plane:1.5.1": failed to resolve reference "myregistry.azurecr.io/repo/release/consul-k8s-control-plane:1.5.1": failed to authorize: failed to fetch anonymous token: unexpected status from GET request to https://myregistry.azurecr.io/oauth2/token?scope=repository%3Arepo%2Frelease%2Fconsul-k8s-control-plane%3Apull&service=myregistry.azurecr.io: 401 Unauthorized
  Warning  Failed     10m (x4 over 12m)     kubelet            Error: ErrImagePull
  Normal   BackOff    2m11s (x45 over 12m)  kubelet            Back-off pulling image "myregistry.azurecr.io/repo/release/consul-k8s-control-plane:1.5.1"
Do you have a specific imagePullPolicy or something set? No.
Hi, one more update. I was able to recreate this issue completely by setting the imagePullPolicy to "Always". Is it possible your app/cluster sets it to "Always" by default? As a workaround until our fix is in, it may help to set it to "IfNotPresent".
In version 1.19.2 you fixed the bug only for pulling the consul-k8s-control-plane image, which the init container needs.
The API Gateway CR itself still has to pull its consul-dataplane Docker image, and the bug remains for that image. When we deploy a CR with kind: Gateway, the Deployment is created correctly and points to the right ServiceAccount, but that ServiceAccount, named <gateway-name>-gateway, does not contain imagePullSecrets in its definition.
Command to find it:
kubectl get serviceAccount my-api-gateway -n ns
Invalid serviceAccount definition:
apiVersion: v1
kind: ServiceAccount
metadata:
  labels:
    component: api-gateway
    gateway.consul.hashicorp.com/created: "1725890955"
    gateway.consul.hashicorp.com/managed: "true"
    gateway.consul.hashicorp.com/name: int-mesh-gateway
    gateway.consul.hashicorp.com/namespace: test
  name: my-api-gateway
  namespace: test
  ownerReferences:
    - apiVersion: gateway.networking.k8s.io/v1beta1
      blockOwnerDeletion: true
      controller: true
      kind: Gateway
      name: my-api-gateway
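For comparison, a ServiceAccount whose pods can pull from a private registry carries a top-level imagePullSecrets list. The following is only a sketch of what the generated object would need to contain, not the chart's or controller's actual output. It assumes the pull secret is the myregistry.azurecr.io-access secret from the Helm values quoted earlier in this thread and that this secret exists in the gateway's namespace. The controller-set labels and ownerReferences are omitted for brevity.

```yaml
# Sketch only: the fields the gateway ServiceAccount would need so that
# its pods can pull from a private registry.
# Assumption: the secret name below comes from global.imagePullSecrets in
# the Helm values and the secret exists in the "test" namespace.
apiVersion: v1
kind: ServiceAccount
metadata:
  name: my-api-gateway
  namespace: test
imagePullSecrets:
  - name: myregistry.azurecr.io-access
```

Kubernetes copies a ServiceAccount's imagePullSecrets into pods that use it only when the pod is created, so existing gateway pods would need to be recreated to pick up the change. Because this ServiceAccount is labelled as managed by the gateway controller, a manual edit like this may be overwritten on reconciliation. Treat it as an illustration of the missing field rather than a confirmed workaround.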
I have been waiting for this fix for a long time. It is a pity that you fixed it only for the init container and not for the API gateway container.
Logs in 1.19.2 version:
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Normal Scheduled 14m default-scheduler Successfully assigned spoc/my-api-gateway-58c4f65b9c-82ktg to aks-nodepool1-myVM
Normal Pulled 14m kubelet Pulling image "myregistry.azurecr.io/repo/release/consul-k8s-control-plane:1.5.3"
Normal Created 14m kubelet Created container consul-connect-inject-init
Normal Started 14m kubelet Started container consul-connect-inject-init
Normal Pulling 13m (x4 over 14m) kubelet Pulling image "myregistry.azurecr.io/repo/release/consul-dataplane:1.5.3"
Warning Failed 13m (x4 over 14m) kubelet Failed to pull image "myregistry.azurecr.io/repo/release/consul-dataplane:1.5.3": failed to pull and unpack image "myregistry.azurecr.io/repo/release/consul-dataplane:1.5.3": failed to resolve reference "myregistry.azurecr.io/repo/release/consul-dataplane:1.5.3": failed to authorize: failed to fetch anonymous token: unexpected status from GET request to https://myregistry.azurecr.io/oauth2/token?scope=repository%3Arelease%2Fconsul-dataplane%3Apull&service=myregistry.azurecr.io: 401 Unauthorized
Warning Failed 13m (x4 over 14m) kubelet Error: ErrImagePull
Warning Failed 13m (x5 over 14m) kubelet Error: ImagePullBackOff
Normal BackOff 4m39s (x42 over 14m) kubelet Back-off pulling image "myregistry.azurecr.io/repo/release/consul-dataplane:1.5.3"
Please test this properly, end to end. To do so, install the chart with all images hosted in a private registry that requires authorization, then deploy a YAML file with a custom CR definition for the API gateway.
I finally found the error:
https://github.com/hashicorp/consul-k8s/blob/v1.5.3/control-plane/gateways/serviceaccount.go
The ServiceAccount definition omits the imagePullSecrets even when they are set in the Helm chart values under:
global:
  image: myacr.azurecr.io/release/consul:1.19.2
  imageK8S: myacr.azurecr.io/release/consul-k8s-control-plane:1.5.3
  imageConsulDataplane: myacr.azurecr.io/release/consul-dataplane:1.5.3
  imagePullSecrets:
    - name: mspodemo.azurecr.io-access
I have filed this as a new bug report, since this issue is already closed.
Overview of the Issue
consul-k8s/charts/consul/templates/gateway-resources-serviceaccount.yaml is missing imagePullSecrets, which breaks the usage of private docker registries, as the Gateway Resources Job cannot pull the consul-k8s-control-plane image from private registries without these secrets.
Note, other service accounts do include the imagePullSecrets, and therefore other pods can successfully pull from the private registry.
Reproduction Steps
values.yaml file:
<release-name>-gateway-resources job cannot launch containers, as it cannot pull the image from the private registry, due to missing imagePullSecrets on the service account the job uses.
Logs
The container cannot produce logs as it doesn't start, so kubernetes events for the pod from command kubectl -n consul describe pod consul-gateway-resources-2fz5z are provided:
Expected behavior
The helm install can successfully pull images from the private registry, and run the gateway-resources job.
Environment details