GunSik2 / k8s-ai

ai/bigdata/gpu examples with k8s

rke + rancher #6

Open GunSik2 opened 3 years ago

GunSik2 commented 3 years ago

Env : Ubuntu 16.04 (Docker 20.10 does not support Ubuntu 16.04)

br_netfilter

cat <<EOF | sudo tee /etc/modules-load.d/k8s.conf
br_netfilter
EOF

cat <<EOF | sudo tee /etc/sysctl.d/k8s.conf
net.bridge.bridge-nf-call-ip6tables = 1
net.bridge.bridge-nf-call-iptables = 1
EOF
sudo sysctl --system

lsmod | grep br_netfilter
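
To confirm that the settings above actually took effect, the live values can be compared against the expected ones. The following is a minimal sketch, not part of any tool: `check_bridge_sysctls` is a hypothetical helper name, and the optional directory argument exists only so the check can be pointed at a test directory on machines where the module is not loaded.

```shell
#!/bin/sh
# Sketch: verify that both bridge netfilter sysctls are set to 1.
# check_bridge_sysctls is a hypothetical helper name. The directory argument
# defaults to the real /proc location used by br_netfilter.
check_bridge_sysctls() {
    dir="${1:-/proc/sys/net/bridge}"
    rc=0
    for key in bridge-nf-call-iptables bridge-nf-call-ip6tables; do
        file="$dir/$key"
        if [ ! -r "$file" ]; then
            # The files only exist once the br_netfilter module is loaded.
            echo "MISSING $key (is br_netfilter loaded?)"
            rc=1
        elif [ "$(cat "$file")" != "1" ]; then
            echo "WRONG   $key = $(cat "$file")"
            rc=1
        else
            echo "OK      $key = 1"
        fi
    done
    return "$rc"
}
```

Calling `check_bridge_sysctls` with no argument reads the live kernel values; a non-zero return code means at least one setting is missing or not 1.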

Firewall

#master
sudo ufw enable
sudo ufw allow 6443/tcp
sudo ufw allow 2379:2380/tcp
sudo ufw allow 10250/tcp
sudo ufw allow 10251/tcp
sudo ufw allow 10252/tcp
sudo ufw status

#worker
sudo ufw enable
sudo ufw allow 10250/tcp
sudo ufw allow 30000:32767/tcp
sudo ufw status
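
The per-role rules above can also be generated instead of typed by hand. This is only a sketch: `ufw_rules` is a hypothetical helper, and the port lists are copied from the commands in this section without adding any others. It prints the commands rather than running them, so they can be reviewed first.

```shell
#!/bin/sh
# Sketch: print the ufw commands for a node role (dry run, nothing is applied).
# ufw_rules is a hypothetical helper; the ports are the ones listed above.
ufw_rules() {
    case "$1" in
        master) ports="6443/tcp 2379:2380/tcp 10250/tcp 10251/tcp 10252/tcp" ;;
        worker) ports="10250/tcp 30000:32767/tcp" ;;
        *) echo "usage: ufw_rules master|worker" >&2; return 1 ;;
    esac
    # Relies on default word splitting to iterate over the port list.
    for p in $ports; do
        echo "sudo ufw allow $p"
    done
}
```

For example, `ufw_rules worker` prints the two `sudo ufw allow` lines for a worker node; after checking them, `ufw_rules worker | sh` would apply them.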

Docker : #11

sudo mkdir -p /etc/docker
cat <<EOF | sudo tee /etc/docker/daemon.json
{
  "exec-opts": ["native.cgroupdriver=systemd"],
  "log-driver": "json-file",
  "log-opts": {
    "max-size": "100m"
  },
  "storage-driver": "overlay2"
}
EOF

# curl https://releases.rancher.com/install-docker/20.10.sh | sh
curl https://releases.rancher.com/install-docker/19.03.sh | sh
sudo usermod -aG docker YOUR_USERNAME
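
Once Docker is running, it is worth checking that the daemon picked up `daemon.json` (if the file was changed after installation, Docker needs a restart, e.g. `sudo systemctl restart docker`). A minimal sketch follows; `check_docker_settings` is a hypothetical helper name, and the expected values are the ones from the `daemon.json` above.

```shell
#!/bin/sh
# Sketch: confirm the running Docker daemon uses the settings from daemon.json.
# check_docker_settings is a hypothetical helper name.
check_docker_settings() {
    # .CgroupDriver and .Driver are fields of `docker info`;
    # .Driver is the storage driver.
    driver="$(docker info --format '{{.CgroupDriver}}')" || return 1
    storage="$(docker info --format '{{.Driver}}')" || return 1
    echo "cgroup driver:  $driver"
    echo "storage driver: $storage"
    [ "$driver" = "systemd" ] && [ "$storage" = "overlay2" ]
}
```

The function returns non-zero if either value differs from `systemd` / `overlay2` or if `docker info` itself fails (for example when the current user is not yet in the `docker` group).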

Creating the RKE Cluster

# Copy the public key to the target server so that it is authorized
ssh-copy-id -i ~/.ssh/id_rsa.pub ubuntu@10.0.1.200

# Test
ssh -i ~/.ssh/id_rsa ubuntu@10.0.1.200

- rke config

$ cat rancher-cluster.yml
nodes:

services:
  etcd:
    backup_config:
      interval_hours: 12
      retention: 6

$ rke up --config rancher-cluster.yml

- test

export KUBECONFIG=kube_config_rancher-cluster.yml
kubectl cluster-info
kubectl get pod -A
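
Right after `rke up` the nodes may still be starting, so the test can wait for them before listing pods. The sketch below assumes `KUBECONFIG` is already exported as above; `wait_for_cluster` is a hypothetical helper, and the 300s default timeout is an arbitrary choice, not a value from these notes.

```shell
#!/bin/sh
# Sketch: block until every node reports Ready, then list all pods.
# wait_for_cluster is a hypothetical helper; the default timeout is arbitrary.
wait_for_cluster() {
    timeout="${1:-300s}"
    # `kubectl wait` exits non-zero if the condition is not met in time.
    kubectl wait --for=condition=Ready nodes --all --timeout="$timeout" || return 1
    kubectl get pod -A
}
```

Usage: `wait_for_cluster` or, with a shorter limit, `wait_for_cluster 60s`.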


Install Rancher : see #5 (Rancher installation)

Reference
- https://rancher.com/docs/rancher/v2.5/en/installation/other-installation-methods/behind-proxy/launch-kubernetes/
- https://rancher.com/docs/rancher/v2.5/en/installation/other-installation-methods/behind-proxy/install-rancher/
- https://majjangjjang.tistory.com/144

GunSik2 commented 3 years ago

Error : cannot connect to kubelet on port 10248

ERRO[0097] Failed to upgrade worker components on NotReady hosts, error: [Failed to verify healthcheck: Failed to check http://localhost:10248/healthz for service [kubelet] on host [14.xx.xx.xxx]: Get "http://localhost:10248/healthz": Unable to access the service on localhost:10248. The service might be still starting up. Error: ssh: rejected: connect failed (Connection refused), log:     /workspace/src/k8s.io/kubernetes/_output/dockerized/go/src/k8s.io/kubernetes/vendor/k8s.io/apimachinery/pkg/util/wait/wait.go:571 +0x8c] 
ERRO[0122] Host 14.xx.xx.xxx  failed to report Ready status with error: [controlplane] Error getting node 14.xx.xx.xxx:  "14.xx.xx.xxx" not found 
ERRO[0147] Failed to upgrade hosts: 14.xx.xx.xxx with error [[controlplane] Error getting node 14.xx.xx.xxx:  "14.xx.xx.xxx" not found] 
FATA[0147] [controlPlane] Failed to upgrade Control Plane: [[[controlplane] Error getting node 14.xx.xx.xxx:  "14.xx.xx.xxx" not found]] 

$ sudo netstat -ntpl | grep kub
tcp        0      0 127.0.0.1:10248         0.0.0.0:*               LISTEN      902179/kubelet
tcp        0      0 127.0.0.1:10249         0.0.0.0:*               LISTEN      903347/kube-proxy
tcp        0      0 127.0.0.1:10251         0.0.0.0:*               LISTEN      903015/kube-schedul
tcp        0      0 127.0.0.1:10252         0.0.0.0:*               LISTEN      903016/kube-control
tcp        0      0 127.0.0.1:10256         0.0.0.0:*               LISTEN      903347/kube-proxy
tcp6       0      0 :::10250                :::*                    LISTEN      902179/kubelet
tcp6       0      0 :::6443                 :::*                    LISTEN      902724/kube-apiserv
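
The netstat output shows the kubelet health port bound to 127.0.0.1:10248, so on this node the port is listening locally even though RKE reported it as unreachable. To make that check repeatable, the sketch below extracts the local address a given port is listening on from `netstat -ntpl` output read on stdin. `listen_addr` is a hypothetical helper name, and it only reports where the port is bound; it does not diagnose the cause of the error.

```shell
#!/bin/sh
# Sketch: print the local address a TCP port is listening on, reading
# `netstat -ntpl` output from stdin. listen_addr is a hypothetical helper.
listen_addr() {
    port="$1"
    awk -v port="$port" '
        $6 == "LISTEN" {
            # Column 4 is the local address, e.g. 127.0.0.1:10248 or :::10250.
            n = split($4, parts, ":")
            if (parts[n] == port) {
                # Strip the trailing ":<port>" to leave only the address.
                print substr($4, 1, length($4) - length(port) - 1)
                found = 1
            }
        }
        END { exit(found ? 0 : 1) }
    '
}
```

Usage: `sudo netstat -ntpl | listen_addr 10248`. An address of `127.0.0.1` means the port is reachable only from the node itself, `::` or `0.0.0.0` means all interfaces, and a non-zero exit status means nothing is listening on that port.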