Mirantis / cri-dockerd

dockerd as a compliant Container Runtime Interface for Kubernetes
https://mirantis.github.io/cri-dockerd/
Apache License 2.0

how to config private registry login when using in k8s #97

Closed · lidh15 closed this issue 2 years ago

lidh15 commented 2 years ago

In k8s with containerd, registry login is handled by the CRI plugin, and the credentials are configured in containerd's config.toml.

But when using k8s with Docker and cri-dockerd, it cannot find the login credentials, which seem to be managed by Docker rather than cri-dockerd.
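For reference, a containerd CRI registry credential entry in config.toml looks roughly like the sketch below; the registry host and credentials are placeholders:

# /etc/containerd/config.toml (containerd 1.x CRI plugin, illustrative values)
[plugins."io.containerd.grpc.v1.cri".registry.configs."registry.example.com".auth]
  username = "myuser"
  password = "mypassword"

The question is what the equivalent is when cri-dockerd is the runtime.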

evol262 commented 2 years ago

This is documented here. The credentials are passed in by k8s.
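Presumably this refers to the standard Kubernetes mechanism of storing registry credentials in a Secret and referencing it via imagePullSecrets; a minimal sketch with placeholder names and registry:

# create a docker-registry Secret holding the credentials
kubectl create secret docker-registry regcred \
  --docker-server=registry.example.com \
  --docker-username=myuser \
  --docker-password=mypassword

# reference the Secret from the Pod spec
apiVersion: v1
kind: Pod
metadata:
  name: private-image-demo
spec:
  imagePullSecrets:
  - name: regcred
  containers:
  - name: app
    image: registry.example.com/myorg/myimage:latest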

jhill-afs commented 1 year ago

This is marked as closed, but I am experiencing a similar/related issue. I am attempting to deploy a cluster using kubeadm, with the kubelet configured to use cri-dockerd, and to pull the control plane images (e.g. kube-apiserver) from a private registry. I attempted to configure the credentials both in the root Docker config (/root/.docker/config.json) and in the containerd configuration (/etc/containerd/config.toml), but the images were still not pulled. Once I manually pulled the images via docker pull as root, the control plane containers started successfully.

The documented "fix" linked above does not work in this case as I do not yet have a running cluster in which to configure the secret.

evol262 commented 1 year ago

There's not enough information here, @jhill-afs. Can you post your ClusterConfiguration?

Can you post logs?

The documented "fix" above is not the same as "bootstrap a cluster from a private registry". Lots more information is needed to begin to help here.

jhill-afs commented 1 year ago

Here is my cluster configuration:

apiVersion: kubeadm.k8s.io/v1beta3
kind: ClusterConfiguration
controlPlaneEndpoint: "10.1.5.11:6443"
apiServer:
  extraArgs:
    # anonymous-auth: "false" omitting for now https://github.com/kubernetes/kubeadm/issues/1105
    authorization-mode: "Node,RBAC"
    enable-admission-plugins: AlwaysPullImages,EventRateLimit,NodeRestriction #,ImagePolicyWebhook,SecurityContextDeny,PodSecurityPolicy
    profiling: "false"
    audit-log-path: /etc/kubernetes/apiserver/audit.log
    audit-log-maxage: "30"
    audit-log-maxbackup: "10"
    audit-log-maxsize: "100"
    audit-policy-file: /etc/kubernetes/apiserver/audit-policy.yaml
    audit-log-format: json # for fluentd
    encryption-provider-config: /etc/kubernetes/apiserver/encryptionconf.yaml
    tls-cipher-suites: TLS_ECDHE_ECDSA_WITH_AES_128_GCM_SHA256,TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256,TLS_ECDHE_ECDSA_WITH_CHACHA20_POLY1305,TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384,TLS_ECDHE_RSA_WITH_CHACHA20_POLY1305,TLS_ECDHE_ECDSA_WITH_AES_256_GCM_SHA384
    admission-control-config-file: /etc/kubernetes/apiserver/admissioncontrol.conf
  extraVolumes:
  - name: configsvol
    hostPath: /etc/kubernetes/apiserver
    mountPath: /etc/kubernetes/apiserver
    readOnly: false # for now for audit log write
    pathType: Directory
  timeoutForControlPlane: 4m0s
certificatesDir: /etc/kubernetes/pki
controllerManager:
  extraArgs:
    profiling: "false"
    use-service-account-credentials: "true"
    feature-gates: "RotateKubeletServerCertificate=true" # supposed to be default but failing benchmark
scheduler:
  extraArgs:
    profiling: "false"
clusterName: kubernetes
dns:
  imageTag: "1.10.0"
  imageRepository: "registry1.dso.mil/ironbank/opensource/coredns"
etcd:
  local:
    imageTag: "v3.5.6"
    imageRepository: "registry1.dso.mil/ironbank/opensource/etcd"
kubernetesVersion: "v1.25.5"
imageRepository: "registry1.dso.mil/ironbank/opensource/kubernetes"
networking:
  podSubnet: "10.66.0.0/16" # matches weave subnet
---
apiVersion: kubeproxy.config.k8s.io/v1alpha1
kind: KubeProxyConfiguration
mode: ipvs
metricsBindAddress: 0.0.0.0:10249
---
apiVersion: kubelet.config.k8s.io/v1beta1
kind: KubeletConfiguration
serverTLSBootstrap: true
readOnlyPort: 0
streamingConnectionIdleTimeout: 10s
protectKernelDefaults: true
makeIPTablesUtilChains: true
featureGates:
  SeccompDefault: true
seccompDefault: true
eventRecordQPS: 10
tlsCipherSuites: ["TLS_ECDHE_ECDSA_WITH_AES_128_GCM_SHA256","TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256","TLS_ECDHE_ECDSA_WITH_CHACHA20_POLY1305","TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384","TLS_ECDHE_RSA_WITH_CHACHA20_POLY1305","TLS_ECDHE_ECDSA_WITH_AES_256_GCM_SHA384","TLS_RSA_WITH_AES_256_GCM_SHA384","TLS_RSA_WITH_AES_128_GCM_SHA256"]
cgroupDriver: systemd
---
apiVersion: kubeadm.k8s.io/v1beta3
kind: InitConfiguration
nodeRegistration:
  criSocket: "unix:///var/run/cri-dockerd.sock"
patches:
  directory: /etc/kubernetes/patches
---
apiVersion: kubeadm.k8s.io/v1beta3
kind: JoinConfiguration
patches:
  directory: /etc/kubernetes/patches

Here is the error I am seeing:

syslog:Jan 11 21:34:37 vm-controller1 kubelet[22950]: E0111 21:34:37.626471   22950 pod_workers.go:965] "Error syncing pod, skipping" err="failed to \"StartContainer\" for \"kube-controller-manager\" with ImagePullBackOff: \"Back-off pulling image \\\"registry1.dso.mil/ironbank/opensource/kubernetes/kube-controller-manager:v1.25.5\\\"\"" pod="kube-system/kube-controller-manager-vm-controller1" podUID=185e63dbc5a393d809a068f7c0947f05
syslog:Jan 11 21:34:39 vm-controller1 kubelet[22950]: E0111 21:34:39.607229   22950 pod_workers.go:965] "Error syncing pod, skipping" err="failed to \"StartContainer\" for \"etcd\" with ImagePullBackOff: \"Back-off pulling image \\\"registry1.dso.mil/ironbank/opensource/etcd/etcd:v3.5.6\\\"\"" pod="kube-system/etcd-vm-controller1" podUID=ff619fb2938a9eb94f51b77078a0c7e9

The pause image is pulled from the upstream registry, and the pods then hang with only the pause containers running:

root@vm-controller1:/etc/kubernetes# docker ps | grep k8s
64b974855464        registry.k8s.io/pause:3.6                                                 "/pause"                 8 minutes ago       Up 8 minutes                            k8s_POD_kube-scheduler-vm-controller1_kube-system_b90c3363f76e7c21a35a634dda857599_0
d45ab4a8a048        registry.k8s.io/pause:3.6                                                 "/pause"                 8 minutes ago       Up 8 minutes                            k8s_POD_kube-controller-manager-vm-controller1_kube-system_185e63dbc5a393d809a068f7c0947f05_0
62521f85b165        registry.k8s.io/pause:3.6                                                 "/pause"                 8 minutes ago       Up 8 minutes                            k8s_POD_kube-apiserver-vm-controller1_kube-system_8a3ecdc0918307e65451e86c9c80bfee_0
7889c41f7399        registry.k8s.io/pause:3.6                                                 "/pause"                 8 minutes ago       Up 8 minutes                            k8s_POD_etcd-vm-controller1_kube-system_ff619fb2938a9eb94f51b77078a0c7e9_0
root@vm-controller1:/etc/kubernetes# 

If I manually pull the images, the control plane containers can start:

root@vm-controller1:/etc/kubernetes# docker pull registry1.dso.mil/ironbank/opensource/kubernetes/kube-scheduler:v1.25.5
v1.25.5: Pulling from ironbank/opensource/kubernetes/kube-scheduler
0e0c4af1097a: Pull complete 
a87b3924b83a: Pull complete 
838076275382: Pull complete 
Digest: sha256:ec35eb15c1612738e5b3ff2525ad9a636e397c2762e333237d93d81eef042dbf
Status: Downloaded newer image for registry1.dso.mil/ironbank/opensource/kubernetes/kube-scheduler:v1.25.5
registry1.dso.mil/ironbank/opensource/kubernetes/kube-scheduler:v1.25.5
root@vm-controller1:/etc/kubernetes# docker ps | grep k8s
b4447e9d0ef7        8b1eedf9d2b4                                                              "kube-scheduler --au…"   1 second ago        Up 1 second                             k8s_kube-scheduler_kube-scheduler-vm-controller1_kube-system_b90c3363f76e7c21a35a634dda857599_7
64b974855464        registry.k8s.io/pause:3.6                                                 "/pause"                 11 minutes ago      Up 11 minutes                           k8s_POD_kube-scheduler-vm-controller1_kube-system_b90c3363f76e7c21a35a634dda857599_0
evol262 commented 1 year ago

I'd suggest opening an upstream issue, because pre-pulling is their recommendation.
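For reference, kubeadm can pre-pull the control plane images itself; assuming the ClusterConfiguration above is saved as kubeadm-config.yaml (a placeholder name), the sketch below shows the usual steps. Note that these pulls go through the configured CRI socket, i.e. cri-dockerd here, so the credential question discussed below still applies.

# verify which images kubeadm expects, including the private repository overrides
kubeadm config images list --config kubeadm-config.yaml

# pre-pull them on each control plane node before kubeadm init / join
kubeadm config images pull --config kubeadm-config.yaml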

stephan2012 commented 1 year ago

To bootstrap a Kubernetes cluster in an air-gapped scenario, there is no way around pre-pulling the container images. However, this is not enough, because cri-dockerd itself is responsible for pulling the pause (infra) image and requires credentials, too; otherwise, the kubelet cannot start Pods. The solution is to create /.docker/config.json (yes, in the root directory of the filesystem).
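A minimal /.docker/config.json is sketched below; the registry host is a placeholder and the auth value is the base64 encoding of "myuser:mypassword":

{
  "auths": {
    "registry.example.com": {
      "auth": "bXl1c2VyOm15cGFzc3dvcmQ="
    }
  }
}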

Storing the credentials there also means the container images can always be pulled again if they have all been removed from the node.

I know this issue is closed, but hopefully, this information helps.

flavienbwk commented 1 year ago

Thank you @stephan2012. Creating the /.docker/config.json file works. Do you know where this path can be configured with the cri-dockerd command?

stephan2012 commented 1 year ago

Do you know where this path can be configured with the cri-dockerd command ?

The path was hard-coded if I remember it right. I'd be happy if someone proved me wrong. ;-)