rancher / rke

Rancher Kubernetes Engine (RKE), an extremely simple, lightning fast Kubernetes distribution that runs entirely within containers.
Apache License 2.0

rke up pulling from docker.io when default private registry configured #1850

Closed sourcedelica closed 3 years ago

sourcedelica commented 4 years ago

RKE version: v1.0.0

Docker version: (docker version,docker info preferred)

$ docker version
Client:
 Version:           18.09.4
 API version:       1.39
 Go version:        go1.10.8
 Git commit:        d14af54266
 Built:             Wed Mar 27 18:34:51 2019
 OS/Arch:           linux/amd64
 Experimental:      false

Server: Docker Engine - Community
 Engine:
  Version:          18.09.4
  API version:      1.39 (minimum version 1.12)
  Go version:       go1.10.8
  Git commit:       d14af54
  Built:            Wed Mar 27 18:04:46 2019
  OS/Arch:          linux/amd64
  Experimental:     false

$ docker info
Containers: 41
 Running: 11
 Paused: 0
 Stopped: 30
Images: 199
Server Version: 18.09.4
Storage Driver: overlay
 Backing Filesystem: xfs
 Supports d_type: true
Logging Driver: json-file
Cgroup Driver: cgroupfs
Plugins:
 Volume: local
 Network: bridge host macvlan null overlay
 Log: awslogs fluentd gcplogs gelf journald json-file local logentries splunk syslog
Swarm: active
 NodeID: yn622bqmivg6sud8gxi9q9cqb
 Is Manager: true
 ClusterID: l55jll70cxk3ochobfok2cp8n
 Managers: 3
 Nodes: 3
 Default Address Pool: 10.0.0.0/8
 SubnetSize: 24
 Orchestration:
  Task History Retention Limit: 5
 Raft:
  Snapshot Interval: 10000
  Number of Old Snapshots to Retain: 0
  Heartbeat Tick: 1
  Election Tick: 10
 Dispatcher:
  Heartbeat Period: 5 seconds
 CA Configuration:
  Expiry Duration: 3 months
  Force Rotate: 0
 Autolock Managers: false
 Root Rotation In Progress: false
 Node Address: 172.16.41.254
 Manager Addresses:
  10.170.228.65:2377
  10.170.228.66:2377
  172.16.41.254:2377
Runtimes: runc
Default Runtime: runc
Init Binary: docker-init
containerd version: bb71b10fd8f58240ca47fbb579b9d1028eea7c84
runc version: 2b18fe1d885ee5083ef9f0838fee39b62d653e30
init version: fec3683
Security Options:
 seccomp
  Profile: default
Kernel Version: 3.10.0-862.6.3.el7.x86_64
Operating System: Red Hat Enterprise Linux Server 7.5 (Maipo)
OSType: linux
Architecture: x86_64
CPUs: 8
Total Memory: 62.76GiB
Name: usdlvfal500
ID: CEMD:HBWT:FNQI:NRLV:KD2M:DGQX:FHHT:4OID:DDM6:BYCU:BPZW:X53N
Docker Root Dir: /app/docker
Debug Mode (client): false
Debug Mode (server): false
HTTP Proxy: http://xxxxx:xxxxx@hfcproxy.mycorp.com:8080
HTTPS Proxy: http://xxxxx:xxxxx@hfcproxy.mycorp.com:8080
No Proxy: localhost,localhost.localdomain,mycorp.com,twusgrid01.mycorp.com,artifactory.mycorp.com,usdlvart01
Registry: https://index.docker.io/v1/
Labels:
Experimental: false
Insecure Registries:
 usdlvart01:8081
 127.0.0.0/8
Live Restore Enabled: false
Product License: Community Engine

Operating system and kernel: (cat /etc/os-release, uname -r preferred)

$ cat /etc/os-release
NAME="Red Hat Enterprise Linux Server"
VERSION="7.5 (Maipo)"
ID="rhel"
ID_LIKE="fedora"
VARIANT="Server"
VARIANT_ID="server"
VERSION_ID="7.5"
PRETTY_NAME="Red Hat Enterprise Linux Server 7.5 (Maipo)"
ANSI_COLOR="0;31"
CPE_NAME="cpe:/o:redhat:enterprise_linux:7.5:GA:server"
HOME_URL="https://www.redhat.com/"
BUG_REPORT_URL="https://bugzilla.redhat.com/"

REDHAT_BUGZILLA_PRODUCT="Red Hat Enterprise Linux 7"
REDHAT_BUGZILLA_PRODUCT_VERSION=7.5
REDHAT_SUPPORT_PRODUCT="Red Hat Enterprise Linux"
REDHAT_SUPPORT_PRODUCT_VERSION="7.5"

$ uname -r
3.10.0-862.6.3.el7.x86_64

Type/provider of hosts: (VirtualBox/Bare-metal/AWS/GCE/DO) Bare-metal

cluster.yml file:

$ cat cluster.yml
# If you intend to deploy Kubernetes in an air-gapped environment,
# please consult the documentation on how to configure custom RKE images.
nodes:
- address: 172.16.41.254
  port: "22"
  internal_address: ""
  role:
  - controlplane
  - worker
  - etcd
  hostname_override: usdlvfal500
  user: epederson
  docker_socket: /var/run/docker.sock
  ssh_key: ""
  ssh_key_path: ~/.ssh/id_rsa
  ssh_cert: ""
  ssh_cert_path: ""
  labels: {}
  taints: []
- address: 10.170.228.65
  port: "22"
  internal_address: ""
  role:
  - controlplane
  - worker
  - etcd
  hostname_override: usdlpfal501
  user: epederson
  docker_socket: /var/run/docker.sock
  ssh_key: ""
  ssh_key_path: ~/.ssh/id_rsa
  ssh_cert: ""
  ssh_cert_path: ""
  labels: {}
  taints: []
- address: 10.170.228.66
  port: "22"
  internal_address: ""
  role:
  - controlplane
  - worker
  - etcd
  hostname_override: usdlpfal502
  user: epederson
  docker_socket: /var/run/docker.sock
  ssh_key: ""
  ssh_key_path: ~/.ssh/id_rsa
  ssh_cert: ""
  ssh_cert_path: ""
  labels: {}
  taints: []
services:
  etcd:
    image: ""
    extra_args: {}
    extra_binds: []
    extra_env: []
    external_urls: []
    ca_cert: ""
    cert: ""
    key: ""
    path: ""
    uid: 0
    gid: 0
    snapshot: null
    retention: ""
    creation: ""
    backup_config: null
  kube-api:
    image: ""
    extra_args: {}
    extra_binds: []
    extra_env: []
    service_cluster_ip_range: 10.43.0.0/16
    service_node_port_range: ""
    pod_security_policy: false
    always_pull_images: false
    secrets_encryption_config: null
    audit_log: null
    admission_configuration: null
    event_rate_limit: null
  kube-controller:
    image: ""
    extra_args: {}
    extra_binds: []
    extra_env: []
    cluster_cidr: 10.42.0.0/16
    service_cluster_ip_range: 10.43.0.0/16
  scheduler:
    image: ""
    extra_args: {}
    extra_binds: []
    extra_env: []
  kubelet:
    image: ""
    extra_args: {}
    extra_binds: []
    extra_env: []
    cluster_domain: cluster.local
    infra_container_image: ""
    cluster_dns_server: 10.43.0.10
    fail_swap_on: false
    generate_serving_certificate: false
  kubeproxy:
    image: ""
    extra_args: {}
    extra_binds: []
    extra_env: []
network:
  plugin: canal
  options: {}
  node_selector: {}
authentication:
  strategy: x509
  sans: []
  webhook: null
addons: ""
addons_include: []
system_images:
  etcd: rancher/coreos-etcd:v3.3.15-rancher1
  alpine: rancher/rke-tools:v0.1.51
  nginx_proxy: rancher/rke-tools:v0.1.51
  cert_downloader: rancher/rke-tools:v0.1.51
  kubernetes_services_sidecar: rancher/rke-tools:v0.1.51
  kubedns: rancher/k8s-dns-kube-dns:1.15.0
  dnsmasq: rancher/k8s-dns-dnsmasq-nanny:1.15.0
  kubedns_sidecar: rancher/k8s-dns-sidecar:1.15.0
  kubedns_autoscaler: rancher/cluster-proportional-autoscaler:1.7.1
  coredns: rancher/coredns-coredns:1.6.2
  coredns_autoscaler: rancher/cluster-proportional-autoscaler:1.7.1
  kubernetes: rancher/hyperkube:v1.16.3-rancher1
  flannel: rancher/coreos-flannel:v0.11.0-rancher1
  flannel_cni: rancher/flannel-cni:v0.3.0-rancher5
  calico_node: rancher/calico-node:v3.8.1
  calico_cni: rancher/calico-cni:v3.8.1
  calico_controllers: rancher/calico-kube-controllers:v3.8.1
  calico_ctl: ""
  calico_flexvol: rancher/calico-pod2daemon-flexvol:v3.8.1
  canal_node: rancher/calico-node:v3.8.1
  canal_cni: rancher/calico-cni:v3.8.1
  canal_flannel: rancher/coreos-flannel:v0.11.0
  canal_flexvol: rancher/calico-pod2daemon-flexvol:v3.8.1
  weave_node: weaveworks/weave-kube:2.5.2
  weave_cni: weaveworks/weave-npc:2.5.2
  pod_infra_container: rancher/pause:3.1
  ingress: rancher/nginx-ingress-controller:nginx-0.25.1-rancher1
  ingress_backend: rancher/nginx-ingress-controller-defaultbackend:1.5-rancher1
  metrics_server: rancher/metrics-server:v0.3.4
  windows_pod_infra_container: rancher/kubelet-pause:v0.1.3
ssh_key_path: ~/.ssh/id_rsa
ssh_cert_path: ""
ssh_agent_auth: false
authorization:
  mode: rbac
  options: {}
ignore_docker_version: false
kubernetes_version: ""
private_registries:
    - url: artifactory.mycorp.com
      username: epederson
      password: password
      is_default: true
ingress:
  provider: ""
  options: {}
  node_selector: {}
  extra_args: {}
  dns_policy: ""
  extra_envs: []
  extra_volumes: []
  extra_volume_mounts: []
cluster_name: ""
cloud_provider:
  name: ""
prefix_path: ""
addon_job_timeout: 0
bastion_host:
  address: ""
  port: ""
  user: ""
  ssh_key: ""
  ssh_key_path: ""
  ssh_cert: ""
  ssh_cert_path: ""
monitoring:
  provider: ""
  options: {}
  node_selector: {}
restore:
  restore: false
  snapshot_name: ""
dns: null

Steps to Reproduce: Run rke up with the cluster.yml above, which sets is_default: true under private_registries.

Results: RKE ignores the private registry and tries to pull from docker.io instead. This environment has no access to the outside world.

INFO[0135] Pulling image [rancher/hyperkube:v1.16.3-rancher1] on host [10.170.228.66], try  #3

WARN[0150] Can't pull Docker image [rancher/hyperkube:v1.16.3-rancher1] on host [10.170.228.66]: Error response from daemon: Get https://registry-1.docker.io/v2/: net/http: request canceled while waiting for connection (Client.Timeout exceeded while awaiting headers)
DEBU[0150] [pre-deploy] Can't pull Docker image [rancher/hyperkube:v1.16.3-rancher1] on host [10.170.228.66]: Error response from daemon: Get https://registry-1.docker.io/v2/: net/http: request canceled while waiting for connection (Client.Timeout exceeded while awaiting headers)
WARN[0150] [pre-deploy] Can't pull Docker image [rancher/hyperkube:v1.16.3-rancher1] on host [10.170.228.66]: Error response from daemon: Get https://registry-1.docker.io/v2/: net/http: request canceled while waiting for connection (Client.Timeout exceeded while awaiting headers)
FATA[0150] [Error response from daemon: Get https://registry-1.docker.io/v2/: net/http: request canceled while waiting for connection (Client.Timeout exceeded while awaiting headers)]

sourcedelica commented 4 years ago

The docs for Default Registry say:

In this example RKE will use registry.com as the default registry for all system images, e.g. rancher/rke-tools:v0.1.14 will become registry.com/rancher/rke-tools:v0.1.14

and Air-gapped Setups says:

Prior to v0.1.10, you had to configure your private registry credentials and update the names of all the system images in the cluster.yml so that the image names would have the private registry URL appended before each image name.

Neither matches what I see: with is_default: true set, the images are still pulled from docker.io.

If I change the image names in cluster.yml to prepend my private registry, then it does work.

papdaniel commented 4 years ago

@sourcedelica I didn't understand it either, so I checked the source code. If system_images in cluster.yml is filled in, rke uses those images exactly as written and ignores private_registries. So if you use this:

system_images:
  etcd: rancher/coreos-etcd:v3.4.3-rancher1

It will pull it from docker.io regardless of private_registries. If you remove all system images from cluster.yml and define private_registries, rke will pull the default images (the ones listed by rke config --system-images) from your registry.

So private_registries alone cannot be used with non-default images. If you want to use different images from a private registry, you have to put them into system_images with the full path, like: my.private.registry/rancher/coreos-etcd:v3.4.3-rancher1

Thulium-Drake commented 4 years ago

The documentation then is really confusing. In my current setup I have a Foreman server acting as the registry. First I configured the Foreman server as the private registry and edited system_images to use the naming scheme Foreman uses, but I kept getting errors saying that the image either didn't exist or that the registry required authentication.

@papdaniel @sourcedelica your comments actually helped me get it up and running, thanks!

But either the docs need to be fixed, or the code :-)

EDIT: oh, and it also seems pulling from authenticated private repos does not work then...

immanuelfodor commented 4 years ago

Maybe related to https://github.com/rancher/rke/issues/2228 ?

stale[bot] commented 3 years ago

This issue/PR has been automatically marked as stale because it has not had activity (commit/comment/label) for 60 days. It will be closed in 14 days if no further activity occurs. Thank you for your contributions.

HectorB-2020 commented 1 year ago

@Thulium-Drake, @carloscarnero and @papdaniel, may I ask whether you reached a full understanding of how the private registry works in an air-gapped environment? I've been studying the documentation, and its current version says:

Default Registry: As of v0.1.10, RKE supports specifying a default registry from the list of private registries to be used with all system images. In this example, RKE will use registry.com as the default registry for all system images, e.g. rancher/rke-tools:v0.1.14 will become registry.com/rancher/rke-tools:v0.1.14.

Air-gapped Setups: By default, all system images are being pulled from DockerHub. If you are on a system that does not have access to DockerHub, you will need to create a private registry that is populated with all the required system images.

As of v0.1.10, you have to configure your private registry credentials, but you can specify this registry as a default registry so that all system images are pulled from the designated private registry. You can use the command rke config --system-images to get the list of default system images to populate your private registry.
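To populate the private registry from that list, each image has to be pulled, retagged and pushed. The following is a hedged Python sketch that only builds and prints the docker commands rather than running them. mirror_commands is a hypothetical helper, the two images are a sample from the cluster.yml in this issue, and the sketch assumes the registry stores images under the same rancher/... path.

```python
def mirror_commands(images, registry):
    """Build docker pull/tag/push commands that copy `images` to `registry`.

    The commands have to be run on a machine that can reach both
    DockerHub and the private registry.
    """
    commands = []
    for image in images:
        target = f"{registry}/{image}"
        commands.append(f"docker pull {image}")
        commands.append(f"docker tag {image} {target}")
        commands.append(f"docker push {target}")
    return commands


# Sample image references; in practice the list would come from
# the output of `rke config --system-images`.
sample_images = [
    "rancher/hyperkube:v1.16.3-rancher1",
    "rancher/rke-tools:v0.1.51",
]

for command in mirror_commands(sample_images, "artifactory.mycorp.com"):
    print(command)
```

Printing instead of executing keeps the sketch safe to run anywhere; the output can be reviewed and then piped to a shell on a host with registry access.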

That gave me the impression that I don't have to adjust the system_images section in my cluster.yml. But when I examine docker inspect and kubectl get po -o yaml on a recently deployed cluster, neither mentions my registry anywhere.

So I'm perplexed: do I have to point to my registry explicitly, or am I merely missing some obvious command-line parameter for this case? BTW, I've got a couple of other questions.

  1. Have you worked with plain http (without the 's')? Is insecure-registries sufficient here?
  2. Is it possible to use the well-known Docker Registry v2 without authentication? The documentation seems to insist on credentials: "you have to configure your private registry credentials".