
kubernetes高可用集群布署.mht (Kubernetes high-availability cluster deployment) #14

Open zouhuigang opened 6 years ago

zouhuigang commented 6 years ago

链接:https://pan.baidu.com/s/1c2DAoIO 密码:vg3q

zouhuigang commented 6 years ago
Kubernetes high-availability cluster deployment

System: CentOS 7. All nodes run Kubernetes 1.5, installed via yum.

Prerequisite: the cluster is already running correctly without any problems.

The approach follows kubeadm and kargo: three masters provide high availability, and every node runs HAProxy as a load-balancing reverse proxy in front of the three kube-apiservers on port 8080. kube-apiserver is a stateless service.

Note: I previously used nginx as the reverse proxy for the three apiservers, and pod/container creation became very slow (3-5 minutes), probably a bug. I recommend HAProxy instead; it runs very smoothly.

kube-controller-manager and kube-scheduler are stateful services: only one instance is active at a time. The three masters hold an election among themselves, and one of them takes the leader role.

The node layout is as follows:

cat /etc/hosts

master

192.168.1.61 master1.txg.com #512M
192.168.1.62 master2.txg.com #512M
192.168.1.63 master3.txg.com #512M

Master packages

[root@master1 kubernetes]# rpm -qa|grep kube
kubernetes-client-1.5.2-0.2.gitc55cf2b.el7.x86_64
kubernetes-master-1.5.2-0.2.gitc55cf2b.el7.x86_64
flannel-0.7.0-1.el7.x86_64

etcd-server

192.168.1.65 etcd1.txg.com #512M
192.168.1.66 etcd2.txg.com #512M
192.168.1.67 etcd3.txg.com #512M

Nodes

192.168.1.68 node1.txg.com #4G
192.168.1.69 node2.txg.com #4G
192.168.2.68 node3.txg.com #4G
192.168.2.69 node4.txg.com #4G

Node packages

[root@node4 ~]# rpm -qa|egrep 'kube|docker'
kubernetes-client-1.5.2-0.5.gita552679.el7.x86_64
docker-common-1.12.6-11.el7.centos.x86_64
docker-1.12.6-11.el7.centos.x86_64
kubernetes-node-1.5.2-0.5.gita552679.el7.x86_64
docker-client-1.12.6-11.el7.centos.x86_64
flannel-0.7.0-1.el7.x86_64

[root@node4 ~]# uname -a
Linux node4.txg.com 3.10.0-514.6.2.el7.x86_64 #1 SMP Thu Feb 23 03:04:39 UTC 2017 x86_64 x86_64 x86_64 GNU/Linux

Edit the configuration files on the master servers. Mine are under /etc/kubernetes/:

[root@master1 kubernetes]# pwd
/etc/kubernetes
[root@master1 kubernetes]# ls
apiserver  config  controller-manager  scheduler  ssl  sslbk

1. Edit the controller-manager and scheduler configuration files: add --address=127.0.0.1 --leader-elect=true inside KUBE_CONTROLLER_MANAGER_ARGS="":

KUBE_CONTROLLER_MANAGER_ARGS="--address=127.0.0.1 --leader-elect=true --service-account-private-key-file=/etc/kubernetes/ssl/apiserver-key.pem --cluster-signing-cert-file=/etc/kubernetes/ssl/ca.pem --cluster-signing-key-file=/etc/kubernetes/ssl/ca-key.pem --root-ca-file=/etc/kubernetes/ssl/ca.pem"

Change the scheduler to:

KUBE_SCHEDULER_ARGS="--address=127.0.0.1 --leader-elect=true"

This enables leader election among the masters. The master configuration is now done.
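If you prefer to script this edit, here is a minimal sketch (my own addition, assuming the stock /etc/kubernetes/controller-manager and /etc/kubernetes/scheduler files and the certificate paths shown above); run it on each master:

# Back up the originals, then rewrite the ARGS lines in place.
cp /etc/kubernetes/controller-manager /etc/kubernetes/controller-manager.bak
cp /etc/kubernetes/scheduler /etc/kubernetes/scheduler.bak
sed -i 's|^KUBE_CONTROLLER_MANAGER_ARGS=.*|KUBE_CONTROLLER_MANAGER_ARGS="--address=127.0.0.1 --leader-elect=true --service-account-private-key-file=/etc/kubernetes/ssl/apiserver-key.pem --cluster-signing-cert-file=/etc/kubernetes/ssl/ca.pem --cluster-signing-key-file=/etc/kubernetes/ssl/ca-key.pem --root-ca-file=/etc/kubernetes/ssl/ca.pem"|' /etc/kubernetes/controller-manager
sed -i 's|^KUBE_SCHEDULER_ARGS=.*|KUBE_SCHEDULER_ARGS="--address=127.0.0.1 --leader-elect=true"|' /etc/kubernetes/scheduler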

Sync the configuration files from master1 to the master2 and master3 nodes.
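One way to do the sync, as a sketch (assuming root SSH access from master1 to the other two masters):

# Run on master1: copy the edited files to master2 and master3.
for m in master2.txg.com master3.txg.com; do
    scp /etc/kubernetes/controller-manager /etc/kubernetes/scheduler root@$m:/etc/kubernetes/
done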

2. Install HAProxy on all nodes: yum install haproxy

Configure haproxy.cfg to listen on port 5002 and reverse-proxy to the kube-apiservers on port 8080.

[root@node4 ~]# cat /etc/haproxy/haproxy.cfg
#---------------------------------------------------------------------
# Example configuration for a possible web application.  See the
# full configuration options online.
#
#   http://haproxy.1wt.eu/download/1.4/doc/configuration.txt
#
#---------------------------------------------------------------------

#---------------------------------------------------------------------
# Global settings
#---------------------------------------------------------------------
global
    # to have these messages end up in /var/log/haproxy.log you will
    # need to:
    #
    # 1) configure syslog to accept network log events.  This is done
    #    by adding the '-r' option to the SYSLOGD_OPTIONS in
    #    /etc/sysconfig/syslog
    #
    # 2) configure local2 events to go to the /var/log/haproxy.log
    #   file. A line like the following can be added to
    #   /etc/sysconfig/syslog
    #
    log         127.0.0.1 local3
    #      local2.*                 /var/log/haproxy.log

    chroot      /var/lib/haproxy
    pidfile     /var/run/haproxy.pid
    maxconn     4000
    user        haproxy
    group       haproxy
    daemon

    # turn on stats unix socket
    stats socket /var/lib/haproxy/stats

#---------------------------------------------------------------------
# common defaults that all the 'listen' and 'backend' sections will
# use if not designated in their block
#---------------------------------------------------------------------
defaults
    mode                    http
    log                     global
    option                  httplog
    option                  dontlognull
    option http-server-close
    option forwardfor       except 127.0.0.0/8
    option                  redispatch
    retries                 3
    timeout http-request    10s
    timeout queue           1m
    timeout connect         10s
    timeout client          1m
    timeout server          1m
    timeout http-keep-alive 10s
    timeout check           10s
    maxconn                 3000

#---------------------------------------------------------------------
# main frontend which proxys to the backends
#---------------------------------------------------------------------
frontend  main *:5002
    stats uri /haproxy
    acl url_static       path_beg       -i /static /images /javascript /stylesheets
    acl url_static       path_end       -i .jpg .gif .png .css .js

    use_backend static          if url_static
    default_backend             app

#---------------------------------------------------------------------
# static backend for serving up images, stylesheets and such
#---------------------------------------------------------------------
backend static
    balance     roundrobin
    server      static 127.0.0.1:4331 check

#---------------------------------------------------------------------
# round robin balancing between the various backends
#---------------------------------------------------------------------
backend app
    mode http
    balance     roundrobin
    server  app1 192.168.1.61:8080 check
    server  app2 192.168.1.62:8080 check
    server  app3 192.168.1.63:8080 check

In the server lines, just fill in your own three apiserver addresses.
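Before (re)starting the service, the file can be validated with HAProxy's built-in configuration check:

# Parse the configuration without starting the proxy; prints errors if any.
haproxy -c -f /etc/haproxy/haproxy.cfg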

3. Configure rsyslog to collect the HAProxy logs:

[root@node4 ~]# echo -e '$ModLoad imudp \n $UDPServerRun 514 \n local3.* /var/log/haproxy.log' >> /etc/rsyslog.conf
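rsyslog only picks up these lines after a restart. A quick check (my own addition; logger ships with util-linux) that the local3 facility really ends up in the file:

systemctl restart rsyslog.service
# Send a test message to local3 and confirm it lands in the HAProxy log file.
logger -p local3.info "haproxy log test"
tail -n 1 /var/log/haproxy.log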

4. Configure the nodes

In the config file, set KUBE_MASTER="--master=http://127.0.0.1:5002" so that it points at HAProxy's port 5002.

[root@node4 kubernetes]# pwd
/etc/kubernetes
[root@node4 kubernetes]# ls
config  kubelet  proxy

[root@node4 kubernetes]# cat config
###
# kubernetes system config
#
# The following values are used to configure various aspects of all
# kubernetes services, including
#
#   kube-apiserver.service
#   kube-controller-manager.service
#   kube-scheduler.service
#   kubelet.service
#   kube-proxy.service

# logging to stderr means we get it in the systemd journal
KUBE_LOGTOSTDERR="--logtostderr=true"

# journal message level, 0 is debug
KUBE_LOG_LEVEL="--v=0"

# Should this cluster be allowed to run privileged docker containers
KUBE_ALLOW_PRIV="--allow-privileged=true"

# How the controller-manager, scheduler, and proxy find the apiserver
KUBE_MASTER="--master=http://127.0.0.1:5002"

Configure the kubelet: KUBELET_API_SERVER="--api-servers=http://127.0.0.1:5002"

[root@node4 kubernetes]# cat kubelet
###
# kubernetes kubelet (minion) config

# The address for the info server to serve on (set to 0.0.0.0 or "" for all interfaces)
KUBELET_ADDRESS="--address=0.0.0.0"

# The port for the info server to serve on
KUBELET_PORT="--port=10250"

# You may leave this blank to use the actual hostname
KUBELET_HOSTNAME="--hostname-override=192.168.2.69"

# location of the api-server
KUBELET_API_SERVER="--api-servers=http://127.0.0.1:5002"

# pod infrastructure container
KUBELET_POD_INFRA_CONTAINER="--pod-infra-container-image=registry.access.redhat.com/rhel7/pod-infrastructure:latest"

# Add your own!
KUBELET_ARGS="--cluster_dns=172.1.0.2 --cluster_domain=cluster.local"

Configure all the other nodes the same way.
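Before restarting the Kubernetes services, it can be worth confirming that the local HAProxy frontend actually reaches an apiserver. A minimal check (my own addition; assumes haproxy is already running, otherwise start it first with systemctl start haproxy):

# Ask an apiserver through the local HAProxy frontend on port 5002.
curl -s http://127.0.0.1:5002/version
# Repeated requests rotate across app1/app2/app3 (round robin), which is
# visible in /var/log/haproxy.log once rsyslog is set up.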

5. Restart the services on all nodes. I use Ansible for this; look up Ansible yourself if needed. I recommend Ansible for batch operations like this, it is much faster.

If you don't have Ansible installed, restart the services manually.

[root@master1 ~]# ansible -m shell -a ' systemctl restart rsyslog.service ;service haproxy restart ;systemctl restart kubelet.service;systemctl restart kube-proxy.service' 'nodes'
node3.txg.com | SUCCESS | rc=0 >>
Redirecting to /bin/systemctl restart haproxy.service

node4.txg.com | SUCCESS | rc=0 >>
Redirecting to /bin/systemctl restart haproxy.service

node2.txg.com | SUCCESS | rc=0 >>
Redirecting to /bin/systemctl restart haproxy.service

node1.txg.com | SUCCESS | rc=0 >>
Redirecting to /bin/systemctl restart haproxy.service

Check the HAProxy log on all nodes; status 200 means everything is normal.

[root@node3 kubernetes]# tail -f /var/log/haproxy.log
2017-05-09T11:23:12+08:00 localhost haproxy[18278]: 127.0.0.1:42970 [09/May/2017:11:23:11.992] main app/app1 52/0/0/186/238 200 2507 - - ---- 6/6/5/2/0 0/0 "PUT /api/v1/nodes/192.168.2.69/status HTTP/1.1"
2017-05-09T11:23:22+08:00 localhost haproxy[18278]: 127.0.0.1:42970 [09/May/2017:11:23:12.229] main app/app2 10000/0/1/1/10002 200 2519 - - ---- 6/6/5/1/0 0/0 "GET /api/v1/nodes?fieldSelector=metadata.name%3D192.168.2.69&resourceVersion=0 HTTP/1.1"
2017-05-09T11:23:22+08:00 localhost haproxy[18278]: 127.0.0.1:42970 [09/May/2017:11:23:22.232] main app/app3 60/0/0/123/183 200 2507 - - ---- 6/6/5/2/0 0/0 "PUT /api/v1/nodes/192.168.2.69/status HTTP/1.1"
2017-05-09T11:23:28+08:00 localhost haproxy[18278]: 127.0.0.1:42722 [09/May/2017:11:22:21.385] main app/app1 7384/0/1/0/67387 200 167 - - sD-- 5/5/4/1/0 0/0 "GET /api/v1/watch/pods?fieldSelector=spec.nodeName%3D192.168.2.69&resourceVersion=2348326&timeoutSeconds=424 HTTP/1.1"
2017-05-09T11:23:32+08:00 localhost haproxy[18278]: 127.0.0.1:43096 [09/May/2017:11:23:32.416] main app/app2 0/0/0/1/1 200 2519 - - ---- 6/6/5/1/0 0/0 "GET /api/v1/nodes?fieldSelector=metadata.name%3D192.168.2.69&resourceVersion=0 HTTP/1.1"
2017-05-09T11:23:32+08:00 localhost haproxy[18278]: 127.0.0.1:43096 [09/May/2017:11:23:32.418] main app/app3 53/0/0/92/145 200 2507 - - ---- 6/6/5/2/0 0/0 "PUT /api/v1/nodes/192.168.2.69/status HTTP/1.1"
2017-05-09T11:23:35+08:00 localhost haproxy[18278]: 127.0.0.1:43096 [09/May/2017:11:23:32.564] main app/app1 2459/0/1/1/2461 200 2507 - - ---- 6/6/5/3/0 0/0 "GET /api/v1/namespaces/kube-system/secrets/default-token-p5l8p HTTP/1.1"
2017-05-09T11:23:42+08:00 localhost haproxy[18278]: 127.0.0.1:38410 [09/May/2017:11:14:38.515] main app/app3 0/0/1/1/544002 200 254800 - - ---- 6/6/4/1/0 0/0 "GET /api/v1/watch/endpoints?resourceVersion=2347840&timeoutSeconds=544 HTTP/1.1"
2017-05-09T11:23:42+08:00 localhost haproxy[18278]: 127.0.0.1:43096 [09/May/2017:11:23:35.024] main app/app3 7540/0/0/1/7541 200 2519 - - ---- 6/6/5/1/0 0/0 "GET /api/v1/nodes?fieldSelector=metadata.name%3D192.168.2.69&resourceVersion=0 HTTP/1.1"
2017-05-09T11:23:42+08:00 localhost haproxy[18278]: 127.0.0.1:43096 [09/May/2017:11:23:42.566] main app/app1 51/0/1/111/163 200 2507 - - ---- 6/6/5/2/0 0/0 "PUT /api/v1/nodes/192.168.2.69/status HTTP/1.1"

Restart the services on all master nodes:

ansible -m shell -a 'systemctl restart kube-apiserver.service;systemctl restart kube-controller-manager.service ;systemctl restart kube-scheduler.service ' 'masters'
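After the restart, a quick sanity check from any master could look like this (a sketch; these kubectl subcommands exist in 1.5):

# All components should report Healthy and all nodes Ready.
kubectl get componentstatuses
kubectl get nodes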

6. Check which node holds each leader

[root@master3 ~]# tail -f /var/log/messages
May 9 11:09:43 master1 kube-scheduler: I0509 11:09:43.354272 4636 leaderelection.go:247] lock is held by master3.txg.com and has not yet expired
May 9 11:09:43 master1 kube-controller-manager: I0509 11:09:43.887592 4532 leaderelection.go:247] lock is held by master2.txg.com and has not yet expired

At this point, the kube-scheduler leader is on master3 and the kube-controller-manager leader is on master2.

[root@master3 ~]# kubectl -n kube-system get ep kube-controller-manager -o yaml
apiVersion: v1
kind: Endpoints
metadata:
  annotations:
    control-plane.alpha.kubernetes.io/leader: '{"holderIdentity":"master2.txg.com","leaseDurationSeconds":15,"acquireTime":"2017-05-08T10:41:07Z","renewTime":"2017-05-09T03:14:02Z","leaderTransitions":0}'
  creationTimestamp: 2017-05-08T10:41:07Z
  name: kube-controller-manager
  namespace: kube-system
  resourceVersion: "2347791"
  selfLink: /api/v1/namespaces/kube-system/endpoints/kube-controller-manager
  uid: d7dae24f-33da-11e7-9a51-525400c2bc59
subsets: []

[root@master1 ~]# kubectl -n kube-system get ep kube-scheduler -o yaml
apiVersion: v1
kind: Endpoints
metadata:
  annotations:
    control-plane.alpha.kubernetes.io/leader: '{"holderIdentity":"master3.txg.com","leaseDurationSeconds":15,"acquireTime":"2017-05-08T10:41:08Z","renewTime":"2017-05-09T03:14:27Z","leaderTransitions":0}'
  creationTimestamp: 2017-05-08T10:41:08Z
  name: kube-scheduler
  namespace: kube-system
  resourceVersion: "2347830"
  selfLink: /api/v1/namespaces/kube-system/endpoints/kube-scheduler
  uid: d87a235a-33da-11e7-9eb5-52540081c06a
subsets: []
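As an optional extra (my own sketch, not part of the original steps), failover can be verified by stopping the current kube-controller-manager leader and watching the lock move to another master:

# On master2, the current kube-controller-manager leader:
systemctl stop kube-controller-manager.service

# On any other master: after the 15-second lease expires, holderIdentity
# should switch to master1 or master3.
kubectl -n kube-system get ep kube-controller-manager -o yaml | grep holderIdentity

# Start the service again afterwards.
systemctl start kube-controller-manager.service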

With that, the high-availability cluster configuration is complete.