rancher / rke

Rancher Kubernetes Engine (RKE) is an extremely simple, lightning-fast Kubernetes distribution that runs entirely within containers.

Upgrade breaks Ingress functionality #2975

Closed: paddy-hack closed this issue 2 years ago

paddy-hack commented 2 years ago

RKE version: Upgrading from v1.3.10 to v1.3.12

Docker version: (docker version, docker info preferred)

docker info
[rancher@rancher ~]$ docker info
Client:
 Debug Mode: false
Server:
 Containers: 0
  Running: 0
  Paused: 0
  Stopped: 0
 Images: 0
 Server Version: 19.03.15
 Storage Driver: overlay2
  Backing Filesystem: extfs
  Supports d_type: true
  Native Overlay Diff: true
 Logging Driver: json-file
 Cgroup Driver: cgroupfs
 Plugins:
  Volume: local
  Network: bridge host ipvlan macvlan null overlay
  Log: awslogs fluentd gcplogs gelf journald json-file local logentries splunk syslog
 Swarm: inactive
 Runtimes: runc
 Default Runtime: runc
 Init Binary: docker-init
 containerd version: ea765aba0d05254012b0b9e595e995c09186427f
 runc version: dc9208a3303feef5b3839f4323d9beb36df0a9dd
 init version: fec3683
 Security Options:
  seccomp
   Profile: default
 Kernel Version: 4.14.138-rancher
 Operating System: RancherOS v1.5.8
 OSType: linux
 Architecture: x86_64
 CPUs: 1
 Total Memory: 1.949GiB
 Name: rancher
 ID: 32VX:IHO5:3RYD:AFET:33PV:ZV7Y:7WX6:SSGO:RJ5B:LUYX:ZRHQ:YU5R
 Docker Root Dir: /var/lib/docker
 Debug Mode: false
 HTTP Proxy: http://proxy.example.org:8080
 No Proxy: localhost,127.0.0.1,::1,.example.org
 Registry: https://index.docker.io/v1/
 Labels:
 Experimental: false
 Insecure Registries:
  127.0.0.0/8
 Registry Mirrors:
  http://docker-registry.mirror.example.org/
 Live Restore Enabled: false
 Product License: Community Engine

docker version
[rancher@rancher ~]$ docker version
Client: Docker Engine - Community
 Version:           19.03.15
 API version:       1.40
 Go version:        go1.13.15
 Git commit:        99e3ed8
 Built:             Sat Jan 30 03:11:43 2021
 OS/Arch:           linux/amd64
 Experimental:      false
Server: Docker Engine - Community
 Engine:
  Version:          19.03.15
  API version:      1.40 (minimum version 1.12)
  Go version:       go1.13.15
  Git commit:       99e3ed8
  Built:            Sat Jan 30 03:18:13 2021
  OS/Arch:          linux/amd64
  Experimental:     false
 containerd:
  Version:          v1.3.9
  GitCommit:        ea765aba0d05254012b0b9e595e995c09186427f
 runc:
  Version:          1.0.0-rc10
  GitCommit:        dc9208a3303feef5b3839f4323d9beb36df0a9dd
 docker-init:
  Version:          0.18.0
  GitCommit:        fec3683

Operating system and kernel: (cat /etc/os-release, uname -r preferred)

Using RancherOS v1.5.8

uname -r
[rancher@rancher ~]$ uname -r
4.14.138-rancher
cat /etc/os-release
[rancher@rancher ~]$ cat /etc/os-release
NAME="RancherOS"
VERSION=v1.5.8
ID=rancheros
ID_LIKE=
VERSION_ID=v1.5.8
PRETTY_NAME="RancherOS v1.5.8"
HOME_URL="http://rancher.com/rancher-os/"
SUPPORT_URL="https://forums.rancher.com/c/rancher-os"
BUG_REPORT_URL="https://github.com/rancher/os/issues"
BUILD_ID=

Type/provider of hosts: (VirtualBox/Bare-metal/AWS/GCE/DO)

Using QEMU (qemu-system-x86_64), version 7.0.0, on Devuan GNU/Linux daedalus. The node VMs are started with

sudo kvm -m 2048 -smp 1 -rtc base=utc,clock=rt -cpu host \
    -nic tap,mac=DE:AD:BE:EF:11:6$i,model=virtio-net-pci,helper=/usr/lib/qemu/qemu-bridge-helper \
    -hda image-$i.qcow2
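
Here $i is the per-node index, so launching both VMs might look like the sketch below (the loop, the node count, and the image names image-1.qcow2/image-2.qcow2 are assumptions, not part of the original report):

for i in 1 2; do    # hypothetical node indices
    sudo kvm -m 2048 -smp 1 -rtc base=utc,clock=rt -cpu host \
        -nic tap,mac=DE:AD:BE:EF:11:6$i,model=virtio-net-pci,helper=/usr/lib/qemu/qemu-bridge-helper \
        -hda image-$i.qcow2 &
done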

Cluster info

cluster.yml
nodes:
  - address: 10.11.17.208
    user: rancher
    role:
      - controlplane
      - etcd
  - address: 10.11.17.209
    user: rancher
    role:
      - worker
manifest.yml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: welcome
spec:
  replicas: 1
  selector:
    matchLabels:
      app: welcome-nginx
  template:
    metadata:
      name: welcome
      labels:
        app: welcome-nginx
    spec:
      containers:
        - image: nginx:1.21.6-alpine
          name: nginx
          ports:
            - name: http
              containerPort: 80
              protocol: TCP
---
apiVersion: v1
kind: Service
metadata:
  name: welcome
spec:
  ports:
    - port: 80
      targetPort: 80
  selector:
    app: welcome-nginx
---
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: welcome
spec:
  rules:
    - host: welcome.example.org
      http:
        paths:
          - path: /
            pathType: Prefix
            backend:
              service:
                name: welcome
                port:
                  number: 80
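
The three objects chain together: the Ingress routes welcome.example.org to the welcome Service, which selects the nginx Pod. A quick way to confirm the chain resolves (plain kubectl, nothing RKE-specific):

kubectl -n welcome get endpoints welcome    # should list the Pod IP, e.g. 10.42.1.7:80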

Steps to Reproduce:

rke_linux-amd64-v1.3.10 up --ssh-agent-auth
export KUBECONFIG=kube_config_cluster.yml
kubectl create ns welcome
kubectl -n welcome apply -f manifest.yml
kubectl -n welcome get ingress
wget --no-proxy -qO- welcome.example.org
Log of the RKE v1.3.10 invocation
INFO[0000] Running RKE version: v1.3.10
INFO[0000] Initiating Kubernetes cluster
INFO[0000] [certificates] GenerateServingCertificate is disabled, checking if there are unused kubelet certificates
INFO[0000] [certificates] Generating Kubernetes API server certificates
INFO[0000] [certificates] Generating admin certificates and kubeconfig
INFO[0000] [certificates] Generating kube-etcd-10-11-17-208 certificate and key
INFO[0000] Successfully Deployed state file at [./cluster.rkestate]
INFO[0000] Building Kubernetes cluster
INFO[0000] [dialer] Setup tunnel for host [10.11.17.209]
INFO[0000] [dialer] Setup tunnel for host [10.11.17.208]
INFO[0000] [network] Deploying port listener containers
INFO[0000] Pulling image [rancher/rke-tools:v0.1.80] on host [10.11.17.208], try #1
INFO[0164] Image [rancher/rke-tools:v0.1.80] exists on host [10.11.17.208]
INFO[0165] Starting container [rke-etcd-port-listener] on host [10.11.17.208], try #1
INFO[0165] [network] Successfully started [rke-etcd-port-listener] container on host [10.11.17.208]
INFO[0166] Image [rancher/rke-tools:v0.1.80] exists on host [10.11.17.208]
INFO[0166] Starting container [rke-cp-port-listener] on host [10.11.17.208], try #1
INFO[0167] [network] Successfully started [rke-cp-port-listener] container on host [10.11.17.208]
INFO[0167] Pulling image [rancher/rke-tools:v0.1.80] on host [10.11.17.209], try #1
INFO[0349] Image [rancher/rke-tools:v0.1.80] exists on host [10.11.17.209]
INFO[0350] Starting container [rke-worker-port-listener] on host [10.11.17.209], try #1
INFO[0350] [network] Successfully started [rke-worker-port-listener] container on host [10.11.17.209]
INFO[0350] [network] Port listener containers deployed successfully
INFO[0350] [network] Running control plane -> etcd port checks
INFO[0350] [network] Checking if host [10.11.17.208] can connect to host(s) [10.11.17.208] on port(s) [2379], try #1
INFO[0351] Image [rancher/rke-tools:v0.1.80] exists on host [10.11.17.208]
INFO[0351] Starting container [rke-port-checker] on host [10.11.17.208], try #1
INFO[0351] [network] Successfully started [rke-port-checker] container on host [10.11.17.208]
INFO[0351] Removing container [rke-port-checker] on host [10.11.17.208], try #1
INFO[0352] [network] Running control plane -> worker port checks
INFO[0352] [network] Checking if host [10.11.17.208] can connect to host(s) [10.11.17.209] on port(s) [10250], try #1
INFO[0352] Image [rancher/rke-tools:v0.1.80] exists on host [10.11.17.208]
INFO[0352] Starting container [rke-port-checker] on host [10.11.17.208], try #1
INFO[0352] [network] Successfully started [rke-port-checker] container on host [10.11.17.208]
INFO[0352] Removing container [rke-port-checker] on host [10.11.17.208], try #1
INFO[0352] [network] Running workers -> control plane port checks
INFO[0352] [network] Checking if host [10.11.17.209] can connect to host(s) [10.11.17.208] on port(s) [6443], try #1
INFO[0352] Image [rancher/rke-tools:v0.1.80] exists on host [10.11.17.209]
INFO[0353] Starting container [rke-port-checker] on host [10.11.17.209], try #1
INFO[0353] [network] Successfully started [rke-port-checker] container on host [10.11.17.209]
INFO[0353] Removing container [rke-port-checker] on host [10.11.17.209], try #1
INFO[0353] [network] Checking KubeAPI port Control Plane hosts
INFO[0353] [network] Removing port listener containers
INFO[0353] Removing container [rke-etcd-port-listener] on host [10.11.17.208], try #1
INFO[0354] [remove/rke-etcd-port-listener] Successfully removed container on host [10.11.17.208]
INFO[0354] Removing container [rke-cp-port-listener] on host [10.11.17.208], try #1
INFO[0354] [remove/rke-cp-port-listener] Successfully removed container on host [10.11.17.208]
INFO[0354] Removing container [rke-worker-port-listener] on host [10.11.17.209], try #1
INFO[0354] [remove/rke-worker-port-listener] Successfully removed container on host [10.11.17.209]
INFO[0354] [network] Port listener containers removed successfully
INFO[0354] [certificates] Deploying kubernetes certificates to Cluster nodes
INFO[0354] Checking if container [cert-deployer] is running on host [10.11.17.209], try #1
INFO[0354] Checking if container [cert-deployer] is running on host [10.11.17.208], try #1
INFO[0354] Image [rancher/rke-tools:v0.1.80] exists on host [10.11.17.209]
INFO[0354] Image [rancher/rke-tools:v0.1.80] exists on host [10.11.17.208]
INFO[0355] Starting container [cert-deployer] on host [10.11.17.209], try #1
INFO[0355] Starting container [cert-deployer] on host [10.11.17.208], try #1
INFO[0355] Checking if container [cert-deployer] is running on host [10.11.17.209], try #1
INFO[0355] Checking if container [cert-deployer] is running on host [10.11.17.208], try #1
INFO[0360] Checking if container [cert-deployer] is running on host [10.11.17.209], try #1
INFO[0360] Checking if container [cert-deployer] is running on host [10.11.17.208], try #1
INFO[0360] Removing container [cert-deployer] on host [10.11.17.208], try #1
INFO[0360] Removing container [cert-deployer] on host [10.11.17.209], try #1
INFO[0360] [reconcile] Rebuilding and updating local kube config
INFO[0360] Successfully Deployed local admin kubeconfig at [./kube_config_cluster.yml]
WARN[0360] [reconcile] host [10.11.17.208] is a control plane node without reachable Kubernetes API endpoint in the cluster
WARN[0360] [reconcile] no control plane node with reachable Kubernetes API endpoint in the cluster found
INFO[0360] [certificates] Successfully deployed kubernetes certificates to Cluster nodes
INFO[0360] [file-deploy] Deploying file [/etc/kubernetes/audit-policy.yaml] to node [10.11.17.208]
INFO[0360] Image [rancher/rke-tools:v0.1.80] exists on host [10.11.17.208]
INFO[0361] Starting container [file-deployer] on host [10.11.17.208], try #1
INFO[0361] Successfully started [file-deployer] container on host [10.11.17.208]
INFO[0361] Waiting for [file-deployer] container to exit on host [10.11.17.208]
INFO[0361] Waiting for [file-deployer] container to exit on host [10.11.17.208]
INFO[0361] Removing container [file-deployer] on host [10.11.17.208], try #1
INFO[0361] [remove/file-deployer] Successfully removed container on host [10.11.17.208]
INFO[0361] [/etc/kubernetes/audit-policy.yaml] Successfully deployed audit policy file to Cluster control nodes
INFO[0361] [reconcile] Reconciling cluster state
INFO[0361] [reconcile] This is newly generated cluster
INFO[0361] Pre-pulling kubernetes images
INFO[0361] Pulling image [rancher/hyperkube:v1.22.9-rancher1] on host [10.11.17.208], try #1
INFO[0361] Pulling image [rancher/hyperkube:v1.22.9-rancher1] on host [10.11.17.209], try #1
INFO[1587] Image [rancher/hyperkube:v1.22.9-rancher1] exists on host [10.11.17.208]
INFO[1602] Image [rancher/hyperkube:v1.22.9-rancher1] exists on host [10.11.17.209]
INFO[1602] Kubernetes images pulled successfully
INFO[1602] [etcd] Building up etcd plane..
INFO[1602] Image [rancher/rke-tools:v0.1.80] exists on host [10.11.17.208]
INFO[1603] Starting container [etcd-fix-perm] on host [10.11.17.208], try #1
INFO[1603] Successfully started [etcd-fix-perm] container on host [10.11.17.208]
INFO[1603] Waiting for [etcd-fix-perm] container to exit on host [10.11.17.208]
INFO[1603] Waiting for [etcd-fix-perm] container to exit on host [10.11.17.208]
INFO[1603] Removing container [etcd-fix-perm] on host [10.11.17.208], try #1
INFO[1603] [remove/etcd-fix-perm] Successfully removed container on host [10.11.17.208]
INFO[1603] Pulling image [rancher/mirrored-coreos-etcd:v3.5.3] on host [10.11.17.208], try #1
INFO[1822] Image [rancher/mirrored-coreos-etcd:v3.5.3] exists on host [10.11.17.208]
INFO[1823] Starting container [etcd] on host [10.11.17.208], try #1
INFO[1823] [etcd] Successfully started [etcd] container on host [10.11.17.208]
INFO[1823] [etcd] Running rolling snapshot container [etcd-snapshot-once] on host [10.11.17.208]
INFO[1823] Image [rancher/rke-tools:v0.1.80] exists on host [10.11.17.208]
INFO[1824] Starting container [etcd-rolling-snapshots] on host [10.11.17.208], try #1
INFO[1824] [etcd] Successfully started [etcd-rolling-snapshots] container on host [10.11.17.208]
INFO[1829] Image [rancher/rke-tools:v0.1.80] exists on host [10.11.17.208]
INFO[1830] Starting container [rke-bundle-cert] on host [10.11.17.208], try #1
INFO[1830] [certificates] Successfully started [rke-bundle-cert] container on host [10.11.17.208]
INFO[1830] Waiting for [rke-bundle-cert] container to exit on host [10.11.17.208]
INFO[1830] [certificates] successfully saved certificate bundle [/opt/rke/etcd-snapshots//pki.bundle.tar.gz] on host [10.11.17.208]
INFO[1830] Removing container [rke-bundle-cert] on host [10.11.17.208], try #1
INFO[1830] Image [rancher/rke-tools:v0.1.80] exists on host [10.11.17.208]
INFO[1831] Starting container [rke-log-linker] on host [10.11.17.208], try #1
INFO[1831] [etcd] Successfully started [rke-log-linker] container on host [10.11.17.208]
INFO[1831] Removing container [rke-log-linker] on host [10.11.17.208], try #1
INFO[1831] [remove/rke-log-linker] Successfully removed container on host [10.11.17.208]
INFO[1831] Image [rancher/rke-tools:v0.1.80] exists on host [10.11.17.208]
INFO[1832] Starting container [rke-log-linker] on host [10.11.17.208], try #1
INFO[1832] [etcd] Successfully started [rke-log-linker] container on host [10.11.17.208]
INFO[1832] Removing container [rke-log-linker] on host [10.11.17.208], try #1
INFO[1832] [remove/rke-log-linker] Successfully removed container on host [10.11.17.208]
INFO[1832] [etcd] Successfully started etcd plane.. Checking etcd cluster health
INFO[1832] [etcd] etcd host [10.11.17.208] reported healthy=true
INFO[1832] [controlplane] Building up Controller Plane..
INFO[1832] Checking if container [service-sidekick] is running on host [10.11.17.208], try #1
INFO[1832] Image [rancher/rke-tools:v0.1.80] exists on host [10.11.17.208]
INFO[1833] Image [rancher/hyperkube:v1.22.9-rancher1] exists on host [10.11.17.208]
INFO[1833] Starting container [kube-apiserver] on host [10.11.17.208], try #1
INFO[1833] [controlplane] Successfully started [kube-apiserver] container on host [10.11.17.208]
INFO[1833] [healthcheck] Start Healthcheck on service [kube-apiserver] on host [10.11.17.208]
INFO[1841] [healthcheck] service [kube-apiserver] on host [10.11.17.208] is healthy
INFO[1841] Image [rancher/rke-tools:v0.1.80] exists on host [10.11.17.208]
INFO[1841] Starting container [rke-log-linker] on host [10.11.17.208], try #1
INFO[1842] [controlplane] Successfully started [rke-log-linker] container on host [10.11.17.208]
INFO[1842] Removing container [rke-log-linker] on host [10.11.17.208], try #1
INFO[1842] [remove/rke-log-linker] Successfully removed container on host [10.11.17.208]
INFO[1842] Image [rancher/hyperkube:v1.22.9-rancher1] exists on host [10.11.17.208]
INFO[1842] Starting container [kube-controller-manager] on host [10.11.17.208], try #1
INFO[1842] [controlplane] Successfully started [kube-controller-manager] container on host [10.11.17.208]
INFO[1842] [healthcheck] Start Healthcheck on service [kube-controller-manager] on host [10.11.17.208]
INFO[1848] [healthcheck] service [kube-controller-manager] on host [10.11.17.208] is healthy
INFO[1848] Image [rancher/rke-tools:v0.1.80] exists on host [10.11.17.208]
INFO[1848] Starting container [rke-log-linker] on host [10.11.17.208], try #1
INFO[1849] [controlplane] Successfully started [rke-log-linker] container on host [10.11.17.208]
INFO[1849] Removing container [rke-log-linker] on host [10.11.17.208], try #1
INFO[1849] [remove/rke-log-linker] Successfully removed container on host [10.11.17.208]
INFO[1849] Image [rancher/hyperkube:v1.22.9-rancher1] exists on host [10.11.17.208]
INFO[1849] Starting container [kube-scheduler] on host [10.11.17.208], try #1
INFO[1849] [controlplane] Successfully started [kube-scheduler] container on host [10.11.17.208]
INFO[1849] [healthcheck] Start Healthcheck on service [kube-scheduler] on host [10.11.17.208]
INFO[1854] [healthcheck] service [kube-scheduler] on host [10.11.17.208] is healthy
INFO[1854] Image [rancher/rke-tools:v0.1.80] exists on host [10.11.17.208]
INFO[1855] Starting container [rke-log-linker] on host [10.11.17.208], try #1
INFO[1855] [controlplane] Successfully started [rke-log-linker] container on host [10.11.17.208]
INFO[1855] Removing container [rke-log-linker] on host [10.11.17.208], try #1
INFO[1856] [remove/rke-log-linker] Successfully removed container on host [10.11.17.208]
INFO[1856] [controlplane] Successfully started Controller Plane..
INFO[1856] Using proxy environment variable http_proxy with value [http://proxy.example.org:8080]
INFO[1856] Using proxy environment variable https_proxy with value [http://proxy.example.org:8080]
INFO[1856] Using proxy environment variable no_proxy with value [localhost,127.0.0.1,::1,10.0.0.0/8,.example.org]
INFO[1856] [authz] Creating rke-job-deployer ServiceAccount
INFO[1856] [authz] rke-job-deployer ServiceAccount created successfully
INFO[1856] [authz] Creating system:node ClusterRoleBinding
INFO[1856] [authz] system:node ClusterRoleBinding created successfully
INFO[1856] [authz] Creating kube-apiserver proxy ClusterRole and ClusterRoleBinding
INFO[1856] [authz] kube-apiserver proxy ClusterRole and ClusterRoleBinding created successfully
INFO[1856] Successfully Deployed state file at [./cluster.rkestate]
INFO[1856] [state] Saving full cluster state to Kubernetes
INFO[1856] [state] Successfully Saved full cluster state to Kubernetes ConfigMap: full-cluster-state
INFO[1856] [worker] Building up Worker Plane..
INFO[1856] Checking if container [service-sidekick] is running on host [10.11.17.208], try #1
INFO[1856] [sidekick] Sidekick container already created on host [10.11.17.208]
INFO[1856] Image [rancher/hyperkube:v1.22.9-rancher1] exists on host [10.11.17.208]
INFO[1856] Starting container [kubelet] on host [10.11.17.208], try #1
INFO[1856] Image [rancher/rke-tools:v0.1.80] exists on host [10.11.17.209]
INFO[1856] [worker] Successfully started [kubelet] container on host [10.11.17.208]
INFO[1856] [healthcheck] Start Healthcheck on service [kubelet] on host [10.11.17.208]
INFO[1856] Starting container [nginx-proxy] on host [10.11.17.209], try #1
INFO[1857] [worker] Successfully started [nginx-proxy] container on host [10.11.17.209]
INFO[1857] Image [rancher/rke-tools:v0.1.80] exists on host [10.11.17.209]
INFO[1857] Starting container [rke-log-linker] on host [10.11.17.209], try #1
INFO[1858] [worker] Successfully started [rke-log-linker] container on host [10.11.17.209]
INFO[1858] Removing container [rke-log-linker] on host [10.11.17.209], try #1
INFO[1858] [remove/rke-log-linker] Successfully removed container on host [10.11.17.209]
INFO[1858] Checking if container [service-sidekick] is running on host [10.11.17.209], try #1
INFO[1858] Image [rancher/rke-tools:v0.1.80] exists on host [10.11.17.209]
INFO[1858] Image [rancher/hyperkube:v1.22.9-rancher1] exists on host [10.11.17.209]
INFO[1858] Starting container [kubelet] on host [10.11.17.209], try #1
INFO[1858] [worker] Successfully started [kubelet] container on host [10.11.17.209]
INFO[1858] [healthcheck] Start Healthcheck on service [kubelet] on host [10.11.17.209]
INFO[1861] [healthcheck] service [kubelet] on host [10.11.17.208] is healthy
INFO[1861] Image [rancher/rke-tools:v0.1.80] exists on host [10.11.17.208]
INFO[1862] Starting container [rke-log-linker] on host [10.11.17.208], try #1
INFO[1862] [worker] Successfully started [rke-log-linker] container on host [10.11.17.208]
INFO[1862] Removing container [rke-log-linker] on host [10.11.17.208], try #1
INFO[1862] [remove/rke-log-linker] Successfully removed container on host [10.11.17.208]
INFO[1862] Image [rancher/hyperkube:v1.22.9-rancher1] exists on host [10.11.17.208]
INFO[1863] Starting container [kube-proxy] on host [10.11.17.208], try #1
INFO[1863] [worker] Successfully started [kube-proxy] container on host [10.11.17.208]
INFO[1863] [healthcheck] Start Healthcheck on service [kube-proxy] on host [10.11.17.208]
INFO[1864] [healthcheck] service [kubelet] on host [10.11.17.209] is healthy
INFO[1864] Image [rancher/rke-tools:v0.1.80] exists on host [10.11.17.209]
INFO[1864] Starting container [rke-log-linker] on host [10.11.17.209], try #1
INFO[1865] [worker] Successfully started [rke-log-linker] container on host [10.11.17.209]
INFO[1865] Removing container [rke-log-linker] on host [10.11.17.209], try #1
INFO[1865] [remove/rke-log-linker] Successfully removed container on host [10.11.17.209]
INFO[1865] Image [rancher/hyperkube:v1.22.9-rancher1] exists on host [10.11.17.209]
INFO[1865] Starting container [kube-proxy] on host [10.11.17.209], try #1
INFO[1865] [worker] Successfully started [kube-proxy] container on host [10.11.17.209]
INFO[1865] [healthcheck] Start Healthcheck on service [kube-proxy] on host [10.11.17.209]
INFO[1868] [healthcheck] service [kube-proxy] on host [10.11.17.208] is healthy
INFO[1868] Image [rancher/rke-tools:v0.1.80] exists on host [10.11.17.208]
INFO[1869] Starting container [rke-log-linker] on host [10.11.17.208], try #1
INFO[1869] [worker] Successfully started [rke-log-linker] container on host [10.11.17.208]
INFO[1869] Removing container [rke-log-linker] on host [10.11.17.208], try #1
INFO[1869] [remove/rke-log-linker] Successfully removed container on host [10.11.17.208]
INFO[1870] [healthcheck] service [kube-proxy] on host [10.11.17.209] is healthy
INFO[1870] Image [rancher/rke-tools:v0.1.80] exists on host [10.11.17.209]
INFO[1871] Starting container [rke-log-linker] on host [10.11.17.209], try #1
INFO[1871] [worker] Successfully started [rke-log-linker] container on host [10.11.17.209]
INFO[1871] Removing container [rke-log-linker] on host [10.11.17.209], try #1
INFO[1872] [remove/rke-log-linker] Successfully removed container on host [10.11.17.209]
INFO[1872] [worker] Successfully started Worker Plane..
INFO[1872] Image [rancher/rke-tools:v0.1.80] exists on host [10.11.17.208]
INFO[1872] Image [rancher/rke-tools:v0.1.80] exists on host [10.11.17.209]
INFO[1872] Starting container [rke-log-cleaner] on host [10.11.17.209], try #1
INFO[1872] Starting container [rke-log-cleaner] on host [10.11.17.208], try #1
INFO[1872] [cleanup] Successfully started [rke-log-cleaner] container on host [10.11.17.208]
INFO[1872] [cleanup] Successfully started [rke-log-cleaner] container on host [10.11.17.209]
INFO[1872] Removing container [rke-log-cleaner] on host [10.11.17.208], try #1
INFO[1872] Removing container [rke-log-cleaner] on host [10.11.17.209], try #1
INFO[1873] [remove/rke-log-cleaner] Successfully removed container on host [10.11.17.209]
INFO[1873] [remove/rke-log-cleaner] Successfully removed container on host [10.11.17.208]
INFO[1873] [sync] Syncing nodes Labels and Taints
INFO[1873] [sync] Successfully synced nodes Labels and Taints
INFO[1873] [network] Setting up network plugin: canal
INFO[1873] [addons] Saving ConfigMap for addon rke-network-plugin to Kubernetes
INFO[1873] [addons] Successfully saved ConfigMap for addon rke-network-plugin to Kubernetes
INFO[1873] [addons] Executing deploy job rke-network-plugin
INFO[1908] [addons] Setting up coredns
INFO[1908] [addons] Saving ConfigMap for addon rke-coredns-addon to Kubernetes
INFO[1908] [addons] Successfully saved ConfigMap for addon rke-coredns-addon to Kubernetes
INFO[1908] [addons] Executing deploy job rke-coredns-addon
INFO[1918] [addons] CoreDNS deployed successfully
INFO[1918] [dns] DNS provider coredns deployed successfully
INFO[1918] [addons] Setting up Metrics Server
INFO[1918] [addons] Saving ConfigMap for addon rke-metrics-addon to Kubernetes
INFO[1918] [addons] Successfully saved ConfigMap for addon rke-metrics-addon to Kubernetes
INFO[1918] [addons] Executing deploy job rke-metrics-addon
INFO[1923] [addons] Metrics Server deployed successfully
INFO[1923] [ingress] Setting up nginx ingress controller
INFO[1923] [ingress] removing admission batch jobs if they exist
INFO[1923] [addons] Saving ConfigMap for addon rke-ingress-controller to Kubernetes
INFO[1923] [addons] Successfully saved ConfigMap for addon rke-ingress-controller to Kubernetes
INFO[1923] [addons] Executing deploy job rke-ingress-controller
INFO[1933] [ingress] removing default backend service and deployment if they exist
INFO[1934] [ingress] ingress controller nginx deployed successfully
INFO[1934] [addons] Setting up user addons
INFO[1934] [addons] no user addons defined
INFO[1934] Finished building Kubernetes cluster successfully

Adjust the IP addresses in cluster.yml to match your VMs. After the rke up invocation, wait for all pods to be Running or Completed. Also wait for the Ingress to get an IP address; it should get the IP address of the sole worker node.
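
One way to watch for both conditions (plain kubectl, nothing RKE-specific):

kubectl get pods --all-namespaces --watch    # wait until everything is Running or Completed
kubectl -n welcome get ingress --watch       # wait until ADDRESS shows the worker node's IP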

$ kubectl -n welcome get ingress
NAME      CLASS   HOSTS                 ADDRESS        PORTS   AGE
welcome   nginx   welcome.example.org   10.11.17.209   80      2m21s
kubectl -n welcome describe ingress
$ kubectl -n welcome describe ingress 
Name:             welcome
Labels:           <none>
Namespace:        welcome
Address:          10.11.17.209
Default backend:  default-http-backend:80 (<error: endpoints "default-http-backend" not found>)
Rules:
  Host                 Path  Backends
  ----                 ----  --------
  welcome.example.org  
                       /   welcome:80 (10.42.1.7:80)
Annotations:           <none>
Events:
  Type    Reason  Age                From                      Message
  ----    ------  ----               ----                      -------
  Normal  Sync    16m (x2 over 17m)  nginx-ingress-controller  Scheduled for sync

Add that IP address to your /etc/hosts with a hostname of welcome.example.org. After that, the wget command returns the Nginx welcome page.
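
For example, using the address from the output above, the /etc/hosts entry would be:

10.11.17.209    welcome.example.org
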
So far, so good.

Now upgrade to RKE v1.3.12, wait for all pods to be Running or Completed, and re-run the wget command. It should return the Nginx welcome page again.

rke_linux-amd64-v1.3.12 up --ssh-agent-auth
wget --no-proxy -qO- welcome.example.org
Log of the RKE v1.3.12 invocation
INFO[0000] Running RKE version: v1.3.12
INFO[0000] Initiating Kubernetes cluster
INFO[0000] [certificates] GenerateServingCertificate is disabled, checking if there are unused kubelet certificates
INFO[0000] [certificates] Generating admin certificates and kubeconfig
INFO[0000] Successfully Deployed state file at [./cluster.rkestate]
INFO[0000] Building Kubernetes cluster
INFO[0000] [dialer] Setup tunnel for host [10.11.17.209]
INFO[0000] [dialer] Setup tunnel for host [10.11.17.208]
INFO[0000] [network] No hosts added existing cluster, skipping port check
INFO[0000] [certificates] Deploying kubernetes certificates to Cluster nodes
INFO[0000] Checking if container [cert-deployer] is running on host [10.11.17.209], try #1
INFO[0000] Checking if container [cert-deployer] is running on host [10.11.17.208], try #1
INFO[0000] Image [rancher/rke-tools:v0.1.80] exists on host [10.11.17.209]
INFO[0000] Image [rancher/rke-tools:v0.1.80] exists on host [10.11.17.208]
INFO[0001] Starting container [cert-deployer] on host [10.11.17.208], try #1
INFO[0001] Starting container [cert-deployer] on host [10.11.17.209], try #1
INFO[0001] Checking if container [cert-deployer] is running on host [10.11.17.208], try #1
INFO[0001] Checking if container [cert-deployer] is running on host [10.11.17.209], try #1
INFO[0006] Checking if container [cert-deployer] is running on host [10.11.17.208], try #1
INFO[0006] Checking if container [cert-deployer] is running on host [10.11.17.209], try #1
INFO[0006] Removing container [cert-deployer] on host [10.11.17.208], try #1
INFO[0006] Removing container [cert-deployer] on host [10.11.17.209], try #1
INFO[0006] [reconcile] Rebuilding and updating local kube config
INFO[0006] Successfully Deployed local admin kubeconfig at [./kube_config_cluster.yml]
INFO[0006] [reconcile] host [10.11.17.208] is a control plane node with reachable Kubernetes API endpoint in the cluster
INFO[0006] [certificates] Successfully deployed kubernetes certificates to Cluster nodes
INFO[0006] [file-deploy] Deploying file [/etc/kubernetes/audit-policy.yaml] to node [10.11.17.208]
INFO[0007] Image [rancher/rke-tools:v0.1.80] exists on host [10.11.17.208]
INFO[0007] Starting container [file-deployer] on host [10.11.17.208], try #1
INFO[0007] Successfully started [file-deployer] container on host [10.11.17.208]
INFO[0007] Waiting for [file-deployer] container to exit on host [10.11.17.208]
INFO[0007] Waiting for [file-deployer] container to exit on host [10.11.17.208]
INFO[0007] Container [file-deployer] is still running on host [10.11.17.208]: stderr: [], stdout: []
INFO[0008] Waiting for [file-deployer] container to exit on host [10.11.17.208]
INFO[0008] Removing container [file-deployer] on host [10.11.17.208], try #1
INFO[0008] [remove/file-deployer] Successfully removed container on host [10.11.17.208]
INFO[0008] [/etc/kubernetes/audit-policy.yaml] Successfully deployed audit policy file to Cluster control nodes
INFO[0008] [reconcile] Reconciling cluster state
INFO[0008] [reconcile] Check etcd hosts to be deleted
INFO[0008] [reconcile] Check etcd hosts to be added
INFO[0008] [reconcile] Rebuilding and updating local kube config
INFO[0008] Successfully Deployed local admin kubeconfig at [./kube_config_cluster.yml]
INFO[0008] [reconcile] host [10.11.17.208] is a control plane node with reachable Kubernetes API endpoint in the cluster
INFO[0008] [reconcile] Reconciled cluster state successfully
INFO[0008] max_unavailable_worker got rounded down to 0, resetting to 1
INFO[0008] Setting maxUnavailable for worker nodes to: 1
INFO[0008] Setting maxUnavailable for controlplane nodes to: 1
INFO[0008] Pre-pulling kubernetes images
INFO[0008] Pulling image [rancher/hyperkube:v1.23.7-rancher1] on host [10.11.17.208], try #1
INFO[0008] Pulling image [rancher/hyperkube:v1.23.7-rancher1] on host [10.11.17.209], try #1
INFO[0416] Image [rancher/hyperkube:v1.23.7-rancher1] exists on host [10.11.17.208]
INFO[0450] Image [rancher/hyperkube:v1.23.7-rancher1] exists on host [10.11.17.209]
INFO[0450] Kubernetes images pulled successfully
INFO[0450] [etcd] Building up etcd plane..
INFO[0451] Image [rancher/rke-tools:v0.1.80] exists on host [10.11.17.208]
INFO[0451] Starting container [etcd-fix-perm] on host [10.11.17.208], try #1
INFO[0452] Successfully started [etcd-fix-perm] container on host [10.11.17.208]
INFO[0452] Waiting for [etcd-fix-perm] container to exit on host [10.11.17.208]
INFO[0452] Waiting for [etcd-fix-perm] container to exit on host [10.11.17.208]
INFO[0452] Removing container [etcd-fix-perm] on host [10.11.17.208], try #1
INFO[0452] [remove/etcd-fix-perm] Successfully removed container on host [10.11.17.208]
INFO[0452] [etcd] Running rolling snapshot container [etcd-snapshot-once] on host [10.11.17.208]
INFO[0452] Removing container [etcd-rolling-snapshots] on host [10.11.17.208], try #1
INFO[0452] [remove/etcd-rolling-snapshots] Successfully removed container on host [10.11.17.208]
INFO[0452] Image [rancher/rke-tools:v0.1.80] exists on host [10.11.17.208]
INFO[0453] Starting container [etcd-rolling-snapshots] on host [10.11.17.208], try #1
INFO[0453] [etcd] Successfully started [etcd-rolling-snapshots] container on host [10.11.17.208]
INFO[0458] Image [rancher/rke-tools:v0.1.80] exists on host [10.11.17.208]
INFO[0458] Starting container [rke-bundle-cert] on host [10.11.17.208], try #1
INFO[0459] [certificates] Successfully started [rke-bundle-cert] container on host [10.11.17.208]
INFO[0459] Waiting for [rke-bundle-cert] container to exit on host [10.11.17.208]
INFO[0459] [certificates] successfully saved certificate bundle [/opt/rke/etcd-snapshots//pki.bundle.tar.gz] on host [10.11.17.208]
INFO[0459] Removing container [rke-bundle-cert] on host [10.11.17.208], try #1
INFO[0459] Image [rancher/rke-tools:v0.1.80] exists on host [10.11.17.208]
INFO[0459] Starting container [rke-log-linker] on host [10.11.17.208], try #1
INFO[0460] [etcd] Successfully started [rke-log-linker] container on host [10.11.17.208]
INFO[0460] Removing container [rke-log-linker] on host [10.11.17.208], try #1
INFO[0460] [remove/rke-log-linker] Successfully removed container on host [10.11.17.208]
INFO[0460] Image [rancher/rke-tools:v0.1.80] exists on host [10.11.17.208]
INFO[0461] Starting container [rke-log-linker] on host [10.11.17.208], try #1
INFO[0461] [etcd] Successfully started [rke-log-linker] container on host [10.11.17.208]
INFO[0461] Removing container [rke-log-linker] on host [10.11.17.208], try #1
INFO[0461] [remove/rke-log-linker] Successfully removed container on host [10.11.17.208]
INFO[0461] [etcd] Successfully started etcd plane.. Checking etcd cluster health
INFO[0461] [etcd] etcd host [10.11.17.208] reported healthy=true
INFO[0461] [controlplane] Now checking status of node 10.11.17.208, try #1
INFO[0462] [controlplane] Processing controlplane hosts for upgrade 1 at a time
INFO[0462] Processing controlplane host 10.11.17.208
INFO[0462] [controlplane] Now checking status of node 10.11.17.208, try #1
INFO[0462] [controlplane] Getting list of nodes for upgrade
INFO[0462] Upgrading controlplane components for control host 10.11.17.208
INFO[0462] Checking if container [service-sidekick] is running on host [10.11.17.208], try #1
INFO[0462] [sidekick] Sidekick container already created on host [10.11.17.208]
INFO[0462] Checking if container [kube-apiserver] is running on host [10.11.17.208], try #1
INFO[0462] Image [rancher/hyperkube:v1.23.7-rancher1] exists on host [10.11.17.208]
INFO[0462] Checking if container [old-kube-apiserver] is running on host [10.11.17.208], try #1
INFO[0462] Stopping container [kube-apiserver] on host [10.11.17.208] with stopTimeoutDuration [5s], try #1
INFO[0467] Waiting for [kube-apiserver] container to exit on host [10.11.17.208]
INFO[0467] Renaming container [kube-apiserver] to [old-kube-apiserver] on host [10.11.17.208], try #1
INFO[0467] Starting container [kube-apiserver] on host [10.11.17.208], try #1
INFO[0468] [controlplane] Successfully updated [kube-apiserver] container on host [10.11.17.208]
INFO[0468] Removing container [old-kube-apiserver] on host [10.11.17.208], try #1
INFO[0468] [healthcheck] Start Healthcheck on service [kube-apiserver] on host [10.11.17.208]
INFO[0479] [healthcheck] service [kube-apiserver] on host [10.11.17.208] is healthy
INFO[0479] Image [rancher/rke-tools:v0.1.80] exists on host [10.11.17.208]
INFO[0479] Starting container [rke-log-linker] on host [10.11.17.208], try #1
INFO[0480] [controlplane] Successfully started [rke-log-linker] container on host [10.11.17.208]
INFO[0480] Removing container [rke-log-linker] on host [10.11.17.208], try #1
INFO[0480] [remove/rke-log-linker] Successfully removed container on host [10.11.17.208]
INFO[0480] Checking if container [kube-controller-manager] is running on host [10.11.17.208], try #1
INFO[0480] Image [rancher/hyperkube:v1.23.7-rancher1] exists on host [10.11.17.208]
INFO[0480] Checking if container [old-kube-controller-manager] is running on host [10.11.17.208], try #1
INFO[0480] Stopping container [kube-controller-manager] on host [10.11.17.208] with stopTimeoutDuration [5s], try #1
INFO[0480] Waiting for [kube-controller-manager] container to exit on host [10.11.17.208]
INFO[0480] Renaming container [kube-controller-manager] to [old-kube-controller-manager] on host [10.11.17.208], try #1
INFO[0480] Starting container [kube-controller-manager] on host [10.11.17.208], try #1
INFO[0480] [controlplane] Successfully updated [kube-controller-manager] container on host [10.11.17.208]
INFO[0480] Removing container [old-kube-controller-manager] on host [10.11.17.208], try #1
INFO[0480] [healthcheck] Start Healthcheck on service [kube-controller-manager] on host [10.11.17.208]
INFO[0486] [healthcheck] service [kube-controller-manager] on host [10.11.17.208] is healthy
INFO[0486] Image [rancher/rke-tools:v0.1.80] exists on host [10.11.17.208]
INFO[0486] Starting container [rke-log-linker] on host [10.11.17.208], try #1
INFO[0487] [controlplane] Successfully started [rke-log-linker] container on host [10.11.17.208]
INFO[0487] Removing container [rke-log-linker] on host [10.11.17.208], try #1
INFO[0487] [remove/rke-log-linker] Successfully removed container on host [10.11.17.208]
INFO[0487] Checking if container [kube-scheduler] is running on host [10.11.17.208], try #1
INFO[0487] Image [rancher/hyperkube:v1.23.7-rancher1] exists on host [10.11.17.208]
INFO[0487] Checking if container [old-kube-scheduler] is running on host [10.11.17.208], try #1
INFO[0487] Stopping container [kube-scheduler] on host [10.11.17.208] with stopTimeoutDuration [5s], try #1
INFO[0487] Waiting for [kube-scheduler] container to exit on host [10.11.17.208]
INFO[0487] Renaming container [kube-scheduler] to [old-kube-scheduler] on host [10.11.17.208], try #1
INFO[0487] Starting container [kube-scheduler] on host [10.11.17.208], try #1
INFO[0487] [controlplane] Successfully updated [kube-scheduler] container on host [10.11.17.208]
INFO[0487] Removing container [old-kube-scheduler] on host [10.11.17.208], try #1
INFO[0487] [healthcheck] Start Healthcheck on service [kube-scheduler] on host [10.11.17.208]
INFO[0493] [healthcheck] service [kube-scheduler] on host [10.11.17.208] is healthy
INFO[0493] Image [rancher/rke-tools:v0.1.80] exists on host [10.11.17.208]
INFO[0493] Starting container [rke-log-linker] on host [10.11.17.208], try #1
INFO[0493] [controlplane] Successfully started [rke-log-linker] container on host [10.11.17.208]
INFO[0494] Removing container [rke-log-linker] on host [10.11.17.208], try #1
INFO[0494] [remove/rke-log-linker] Successfully removed container on host [10.11.17.208]
INFO[0494] Upgrading workerplane components for control host 10.11.17.208
INFO[0494] Checking if container [service-sidekick] is running on host [10.11.17.208], try #1
INFO[0494] [sidekick] Sidekick container already created on host [10.11.17.208]
INFO[0494] Checking if container [kubelet] is running on host [10.11.17.208], try #1
INFO[0494] Image [rancher/hyperkube:v1.23.7-rancher1] exists on host [10.11.17.208]
INFO[0494] Checking if container [old-kubelet] is running on host [10.11.17.208], try #1
INFO[0494] Stopping container [kubelet] on host [10.11.17.208] with stopTimeoutDuration [5s], try #1
INFO[0494] Waiting for [kubelet] container to exit on host [10.11.17.208]
INFO[0494] Renaming container [kubelet] to [old-kubelet] on host [10.11.17.208], try #1
INFO[0494] Starting container [kubelet] on host [10.11.17.208], try #1
INFO[0494] [worker] Successfully updated [kubelet] container on host [10.11.17.208]
INFO[0494] Removing container [old-kubelet] on host [10.11.17.208], try #1
INFO[0494] [healthcheck] Start Healthcheck on service [kubelet] on host [10.11.17.208]
INFO[0500] [healthcheck] service [kubelet] on host [10.11.17.208] is healthy
INFO[0500] Image [rancher/rke-tools:v0.1.80] exists on host [10.11.17.208]
INFO[0500] Starting container [rke-log-linker] on host [10.11.17.208], try #1
INFO[0500] [worker] Successfully started [rke-log-linker] container on host [10.11.17.208]
INFO[0501] Removing container [rke-log-linker] on host [10.11.17.208], try #1
INFO[0501] [remove/rke-log-linker] Successfully removed container on host [10.11.17.208]
INFO[0501] Checking if container [kube-proxy] is running on host [10.11.17.208], try #1
INFO[0501] Image [rancher/hyperkube:v1.23.7-rancher1] exists on host [10.11.17.208]
INFO[0501] Checking if container [old-kube-proxy] is running on host [10.11.17.208], try #1
INFO[0501] Stopping container [kube-proxy] on host [10.11.17.208] with stopTimeoutDuration [5s], try #1
INFO[0501] Waiting for [kube-proxy] container to exit on host [10.11.17.208]
INFO[0501] Renaming container [kube-proxy] to [old-kube-proxy] on host [10.11.17.208], try #1
INFO[0501] Starting container [kube-proxy] on host [10.11.17.208], try #1
INFO[0501] [worker] Successfully updated [kube-proxy] container on host [10.11.17.208]
INFO[0501] Removing container [old-kube-proxy] on host [10.11.17.208], try #1
INFO[0501] [healthcheck] Start Healthcheck on service [kube-proxy] on host [10.11.17.208]
INFO[0502] [healthcheck] service [kube-proxy] on host [10.11.17.208] is healthy
INFO[0502] Image [rancher/rke-tools:v0.1.80] exists on host [10.11.17.208]
INFO[0502] Starting container [rke-log-linker] on host [10.11.17.208], try #1
INFO[0502] [worker] Successfully started [rke-log-linker] container on host [10.11.17.208]
INFO[0502] Removing container [rke-log-linker] on host [10.11.17.208], try #1
INFO[0503] [remove/rke-log-linker] Successfully removed container on host [10.11.17.208]
INFO[0503] [controlplane] Now checking status of node 10.11.17.208, try #1
INFO[0503] [controlplane] Successfully upgraded Controller Plane..
INFO[0503] Using proxy environment variable http_proxy with value [http://proxy.example.org:8080]
INFO[0503] Using proxy environment variable https_proxy with value [http://proxy.example.org:8080]
INFO[0503] Using proxy environment variable no_proxy with value [localhost,127.0.0.1,::1,10.0.0.0/8,.example.org]
INFO[0503] [authz] Creating rke-job-deployer ServiceAccount
INFO[0503] [authz] rke-job-deployer ServiceAccount created successfully
INFO[0503] [authz] Creating system:node ClusterRoleBinding
INFO[0503] [authz] system:node ClusterRoleBinding created successfully
INFO[0503] [authz] Creating kube-apiserver proxy ClusterRole and ClusterRoleBinding
INFO[0503] [authz] kube-apiserver proxy ClusterRole and ClusterRoleBinding created successfully
INFO[0503] Successfully Deployed state file at [./cluster.rkestate]
INFO[0503] [state] Saving full cluster state to Kubernetes
INFO[0503] [state] Successfully Saved full cluster state to Kubernetes ConfigMap: full-cluster-state
INFO[0503] [worker] Now checking status of node 10.11.17.209, try #1
INFO[0503] [worker] Upgrading Worker Plane..
INFO[0503] Now checking and upgrading worker components on nodes with only worker role 1 at a time
INFO[0503] [workerplane] Processing host 10.11.17.209
INFO[0503] [worker] Now checking status of node 10.11.17.209, try #1
INFO[0503] [worker] Getting list of nodes for upgrade
INFO[0503] Image [rancher/rke-tools:v0.1.80] exists on host [10.11.17.209]
INFO[0504] Starting container [rke-log-linker] on host [10.11.17.209], try #1
INFO[0504] [worker] Successfully started [rke-log-linker] container on host [10.11.17.209]
INFO[0504] Removing container [rke-log-linker] on host [10.11.17.209], try #1
INFO[0504] [remove/rke-log-linker] Successfully removed container on host [10.11.17.209]
INFO[0504] Checking if container [service-sidekick] is running on host [10.11.17.209], try #1
INFO[0504] [sidekick] Sidekick container already created on host [10.11.17.209]
INFO[0504] Checking if container [kubelet] is running on host [10.11.17.209], try #1
INFO[0504] Image [rancher/hyperkube:v1.23.7-rancher1] exists on host [10.11.17.209]
INFO[0504] Checking if container [old-kubelet] is running on host [10.11.17.209], try #1
INFO[0504] Stopping container [kubelet] on host [10.11.17.209] with stopTimeoutDuration [5s], try #1
INFO[0505] Waiting for [kubelet] container to exit on host [10.11.17.209]
INFO[0505] Renaming container [kubelet] to [old-kubelet] on host [10.11.17.209], try #1
INFO[0505] Starting container [kubelet] on host [10.11.17.209], try #1
INFO[0505] [worker] Successfully updated [kubelet] container on host [10.11.17.209]
INFO[0505] Removing container [old-kubelet] on host [10.11.17.209], try #1
INFO[0505] [healthcheck] Start Healthcheck on service [kubelet] on host [10.11.17.209]
INFO[0510] [healthcheck] service [kubelet] on host [10.11.17.209] is healthy
INFO[0510] Image [rancher/rke-tools:v0.1.80] exists on host [10.11.17.209]
INFO[0511] Starting container [rke-log-linker] on host [10.11.17.209], try #1
INFO[0511] [worker] Successfully started [rke-log-linker] container on host [10.11.17.209]
INFO[0512] Removing container [rke-log-linker] on host [10.11.17.209], try #1
INFO[0512] [remove/rke-log-linker] Successfully removed container on host [10.11.17.209]
INFO[0512] Checking if container [kube-proxy] is running on host [10.11.17.209], try #1
INFO[0512] Image [rancher/hyperkube:v1.23.7-rancher1] exists on host [10.11.17.209]
INFO[0512] Checking if container [old-kube-proxy] is running on host [10.11.17.209], try #1
INFO[0512] Stopping container [kube-proxy] on host [10.11.17.209] with stopTimeoutDuration [5s], try #1
INFO[0512] Waiting for [kube-proxy] container to exit on host [10.11.17.209]
INFO[0512] Renaming container [kube-proxy] to [old-kube-proxy] on host [10.11.17.209], try #1
INFO[0512] Starting container [kube-proxy] on host [10.11.17.209], try #1
INFO[0512] [worker] Successfully updated [kube-proxy] container on host [10.11.17.209]
INFO[0512] Removing container [old-kube-proxy] on host [10.11.17.209], try #1
INFO[0512] [healthcheck] Start Healthcheck on service [kube-proxy] on host [10.11.17.209]
INFO[0517] [healthcheck] service [kube-proxy] on host [10.11.17.209] is healthy
INFO[0517] Image [rancher/rke-tools:v0.1.80] exists on host [10.11.17.209]
INFO[0518] Starting container [rke-log-linker] on host [10.11.17.209], try #1
INFO[0518] [worker] Successfully started [rke-log-linker] container on host [10.11.17.209]
INFO[0518] Removing container [rke-log-linker] on host [10.11.17.209], try #1
INFO[0518] [remove/rke-log-linker] Successfully removed container on host [10.11.17.209]
INFO[0518] [worker] Now checking status of node 10.11.17.209, try #1
INFO[0518] [worker] Successfully upgraded Worker Plane..
INFO[0518] Image [rancher/rke-tools:v0.1.80] exists on host [10.11.17.209]
INFO[0518] Image [rancher/rke-tools:v0.1.80] exists on host [10.11.17.208]
INFO[0519] Starting container [rke-log-cleaner] on host [10.11.17.208], try #1
INFO[0519] Starting container [rke-log-cleaner] on host [10.11.17.209], try #1
INFO[0519] [cleanup] Successfully started [rke-log-cleaner] container on host [10.11.17.209]
INFO[0519] Removing container [rke-log-cleaner] on host [10.11.17.209], try #1
INFO[0519] [cleanup] Successfully started [rke-log-cleaner] container on host [10.11.17.208]
INFO[0519] Removing container [rke-log-cleaner] on host [10.11.17.208], try #1
INFO[0520] [remove/rke-log-cleaner] Successfully removed container on host [10.11.17.209]
INFO[0520] [remove/rke-log-cleaner] Successfully removed container on host [10.11.17.208]
INFO[0520] [sync] Syncing nodes Labels and Taints
INFO[0520] [sync] Successfully synced nodes Labels and Taints
INFO[0520] [network] Setting up network plugin: canal
INFO[0520] [addons] Saving ConfigMap for addon rke-network-plugin to Kubernetes
INFO[0520] [addons] Successfully saved ConfigMap for addon rke-network-plugin to Kubernetes
INFO[0520] [addons] Executing deploy job rke-network-plugin
INFO[0535] [addons] Setting up coredns
INFO[0535] [addons] Saving ConfigMap for addon rke-coredns-addon to Kubernetes
INFO[0535] [addons] Successfully saved ConfigMap for addon rke-coredns-addon to Kubernetes
INFO[0535] [addons] Executing deploy job rke-coredns-addon
INFO[0545] [addons] CoreDNS deployed successfully
INFO[0545] [dns] DNS provider coredns deployed successfully
INFO[0545] [addons] Setting up Metrics Server
INFO[0545] [addons] Saving ConfigMap for addon rke-metrics-addon to Kubernetes
INFO[0545] [addons] Successfully saved ConfigMap for addon rke-metrics-addon to Kubernetes
INFO[0545] [addons] Executing deploy job rke-metrics-addon
INFO[0555] [addons] Metrics Server deployed successfully
INFO[0555] [ingress] Setting up nginx ingress controller
INFO[0555] [ingress] removing admission batch jobs if they exist
INFO[0555] [addons] Saving ConfigMap for addon rke-ingress-controller to Kubernetes
INFO[0556] [addons] Successfully saved ConfigMap for addon rke-ingress-controller to Kubernetes
INFO[0556] [addons] Executing deploy job rke-ingress-controller
INFO[0571] [ingress] removing default backend service and deployment if they exist
INFO[0571] [ingress] ingress controller nginx deployed successfully
INFO[0571] [addons] Setting up user addons
INFO[0571] [addons] no user addons defined
INFO[0571] Finished building Kubernetes cluster successfully

Results:

The wget invocation does not return the Nginx welcome page. Instead it just sits there; I eventually killed it after about 20 minutes.

$ wget --no-proxy -qO- welcome.example.org
^C
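
(To distinguish a hang from an outright refusal without waiting, something like curl with a timeout could be used instead:)

curl --noproxy '*' -m 5 http://welcome.example.org/    # gives up after 5s instead of hanging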

The Ingress hasn't changed, except of course for its age, which now exceeds that of all the Pods that rke up redeployed.

$  kubectl -n welcome get ingress
NAME      CLASS   HOSTS                 ADDRESS        PORTS   AGE
welcome   nginx   welcome.example.org   10.11.17.209   80      42m
kubectl -n welcome describe ingress
Name:             welcome
Labels:           <none>
Namespace:        welcome
Address:          10.11.17.209
Default backend:  default-http-backend:80 (<error: endpoints "default-http-backend" not found>)
Rules:
  Host                 Path  Backends
  ----                 ----  --------
  welcome.example.org  
                       /   welcome:80 (10.42.1.7:80)
Annotations:           <none>
Events:
  Type    Reason  Age                From                      Message
  ----    ------  ----               ----                      -------
  Normal  Sync    48m (x2 over 49m)  nginx-ingress-controller  Scheduled for sync
  Normal  Sync    14m (x2 over 14m)  nginx-ingress-controller  Scheduled for sync

I can access the Nginx welcome page without trouble from Pods in the cluster's default namespace. All of the following return the expected page:

wget -qO- welcome.welcome  # by cluster-internal hostname
wget -qO- 10.42.1.7        # by Pod IP address
wget -qO- 10.43.73.238     # by Service ClusterIP
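
For reference, the Pod IP and Service ClusterIP used above can be looked up with standard kubectl commands:

kubectl -n welcome get pods -o wide    # shows the Pod IP (10.42.1.7 here)
kubectl -n welcome get svc welcome     # shows the ClusterIP (10.43.73.238 here)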

If I change the Service to use a NodePort, accessing the assigned port on any of the cluster's nodes also works as expected.
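
A minimal sketch of that change (the patch approach is mine; the nodePort is auto-assigned unless pinned explicitly):

kubectl -n welcome patch svc welcome -p '{"spec": {"type": "NodePort"}}'
kubectl -n welcome get svc welcome              # shows the assigned port as 80:<nodePort>/TCP
wget --no-proxy -qO- 10.11.17.209:<nodePort>    # <nodePort> is a placeholder for the assigned port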

Removing the namespace and redeploying, like so

kubectl delete ns welcome
kubectl create ns welcome
kubectl -n welcome apply -f manifest.yml

does not change the situation either.

paddy-hack commented 2 years ago

Curiously, rebooting the worker node fixes the Ingress ... :open_mouth:

I have no idea why the Ingress becomes non-functional as a result of the upgrade, nor how to prevent that from happening, but it seems rebooting worker nodes after the upgrade makes the issue go away.

Additional testing with a multi-worker cluster has shown that you can reboot the worker nodes one at a time, waiting for each rebooted node to become Ready again before moving on. This also worked for me in a multi-master cluster.

You may not need to reboot all worker nodes; even rebooting only the worker nodes not associated with the Ingress fixed things for me. A sketch of the one-node-at-a-time procedure follows.
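
(Assuming SSH access as the rancher user and that the Kubernetes node names match the addresses from cluster.yml; the loop body is mine, not from the original report:)

for node in 10.11.17.209; do    # list every worker address here
    ssh rancher@$node sudo reboot
    sleep 60                    # give the node time to actually go down
    kubectl wait --for=condition=Ready node/$node --timeout=10m
done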

timarandras commented 2 years ago

I've just experienced the same issue. Freshly installed RKE with a basic/default config. Everything was working fine; I could create resources using kubectl (pods, svc, ingress, etc.). rke up also created the necessary nginx-ingress-controller on my (only) worker node.

So far so good.

Then created a pod: k run nginx --image=nginx

Exposed it via ClusterIP (and also with NodePort to ensure it works without ingress)

Edited svc.yaml to include nodePort: 30080, then k apply -f svc.yaml

Check that NodePort works: curl <WORKER IP>:30080 // OK!
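
(Since the expose command itself isn't shown, those steps might have looked something like this; the Service name and svc.yaml layout are assumptions:)

k expose pod nginx --port=80          # ClusterIP Service named nginx
k get svc nginx -o yaml > svc.yaml    # then edit in type: NodePort and nodePort: 30080
k apply -f svc.yaml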

Then, created the ingress: k create ingress nginx --rule=/*=nginx:80

Check that I can reach the application using the ingress: curl <IP or DNS of WORKER>/ // ERROR: Connection refused

I've also checked that the nginx configuration inside nginx-ingress-controller got updated: k exec -it -n ingress-nginx nginx-ingress-controller-5564f -- bash. I found that /etc/nginx/nginx.conf had been updated with a server {} section matching the ingress resource created earlier. Good.

Also checked that curl-ing from another Pod to port 80 of the nginx-ingress-controller-5564f pod works, and it did.

It seemed that everything was in place, but despite the hostPort settings in the nginx-ingress-controller DaemonSet configuration, the hostPorts (80 and 443 by default) had not been exposed on the host/worker node.
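
(One way to verify that symptom on the worker node, as a sketch: with hostPorts wired up by the CNI portmap plugin, nothing actually listens on the port itself; the forwarding is done with NAT rules, and CNI-HOSTPORT-DNAT is the chain name the upstream portmap plugin uses:)

sudo ss -tlnp | grep -E ':(80|443) '         # usually empty even when hostPorts work
sudo iptables -t nat -S CNI-HOSTPORT-DNAT    # hostPort DNAT rules should show up here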

After restarting the worker node hosting the ingress controller, the issue was resolved; curl-ing from outside the cluster worked. Strange.

lopf commented 2 years ago

We've just experienced the same going from RKE 1.3.8 on K8s 1.21 to RKE 1.3.12 and K8s 1.22.

aendi123 commented 2 years ago

Just had the same problem going from RKE v1.3.11 with K8s 1.23.6 to RKE v1.3.12 with K8s 1.23.7.

wzrdtales commented 1 year ago

There are hostPorts created now. Weirdly, they are only reachable from the node itself, not from anywhere else...

wzrdtales commented 1 year ago

Restarting doesn't help at all.