Open nejinn opened 1 month ago
After some thought, I plan to add the etcd config file myself. I looked at the template KubeKey uses to generate the etcd config file. It is as follows:
/*
Copyright 2021 The KubeSphere Authors.
Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License.
You may obtain a copy of the License at
http://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License.
*/
package templates

import (
	"text/template"

	"github.com/lithammer/dedent"
)

// EtcdEnv defines the template of etcd's env.
var EtcdEnv = template.Must(template.New("etcd.env").Parse(
dedent.Dedent(`# Environment file for etcd {{ .Tag }}
{{- if .DataDir }}
ETCD_DATA_DIR={{ .DataDir }}
{{- else }}
ETCD_DATA_DIR=/var/lib/etcd
{{- end }}
ETCD_ADVERTISE_CLIENT_URLS=https://{{ .Ip }}:2379
ETCD_INITIAL_ADVERTISE_PEER_URLS=https://{{ .Ip }}:2380
ETCD_INITIAL_CLUSTER_STATE={{ .State }}
ETCD_METRICS=basic
ETCD_LISTEN_CLIENT_URLS=https://{{ .Ip }}:2379,https://127.0.0.1:2379
ETCD_INITIAL_CLUSTER_TOKEN=k8s_etcd
ETCD_LISTEN_PEER_URLS=https://{{ .Ip }}:2380
ETCD_NAME={{ .Name }}
ETCD_PROXY=off
ETCD_ENABLE_V2=true
ETCD_INITIAL_CLUSTER={{ .PeerAddresses }}
{{- if .ElectionTimeout }}
ETCD_ELECTION_TIMEOUT={{ .ElectionTimeout }}
{{- else }}
ETCD_ELECTION_TIMEOUT=5000
{{- end }}
{{- if .HeartbeatInterval }}
ETCD_HEARTBEAT_INTERVAL={{ .HeartbeatInterval }}
{{- else }}
ETCD_HEARTBEAT_INTERVAL=250
{{- end }}
{{- if .CompactionRetention }}
ETCD_AUTO_COMPACTION_RETENTION={{ .CompactionRetention }}
{{- else }}
ETCD_AUTO_COMPACTION_RETENTION=8
{{- end }}
{{- if .SnapshotCount }}
ETCD_SNAPSHOT_COUNT={{ .SnapshotCount }}
{{- else }}
ETCD_SNAPSHOT_COUNT=10000
{{- end }}
{{- if .Metrics }}
ETCD_METRICS={{ .Metrics }}
{{- end }}
{{- if .QuotaBackendBytes }}
ETCD_QUOTA_BACKEND_BYTES={{ .QuotaBackendBytes }}
{{- end }}
{{- if .MaxRequestBytes }}
ETCD_MAX_REQUEST_BYTES={{ .MaxRequestBytes }}
{{- end }}
{{- if .MaxSnapshots }}
ETCD_MAX_SNAPSHOTS={{ .MaxSnapshots }}
{{- end }}
{{- if .MaxWals }}
ETCD_MAX_WALS={{ .MaxWals }}
{{- end }}
{{- if .LogLevel }}
ETCD_LOG_LEVEL={{ .LogLevel }}
{{- end }}
{{- if .UnsupportedArch }}
ETCD_UNSUPPORTED_ARCH={{ .Arch }}
{{ end }}
# TLS settings
ETCD_TRUSTED_CA_FILE=/etc/ssl/etcd/ssl/ca.pem
ETCD_CERT_FILE=/etc/ssl/etcd/ssl/member-{{ .Hostname }}.pem
ETCD_KEY_FILE=/etc/ssl/etcd/ssl/member-{{ .Hostname }}-key.pem
ETCD_CLIENT_CERT_AUTH=true
ETCD_PEER_TRUSTED_CA_FILE=/etc/ssl/etcd/ssl/ca.pem
ETCD_PEER_CERT_FILE=/etc/ssl/etcd/ssl/member-{{ .Hostname }}.pem
ETCD_PEER_KEY_FILE=/etc/ssl/etcd/ssl/member-{{ .Hostname }}-key.pem
ETCD_PEER_CLIENT_CERT_AUTH=true
# CLI settings
ETCDCTL_ENDPOINTS=https://127.0.0.1:2379
ETCDCTL_CACERT=/etc/ssl/etcd/ssl/ca.pem
ETCDCTL_KEY=/etc/ssl/etcd/ssl/admin-{{ .Hostname }}-key.pem
ETCDCTL_CERT=/etc/ssl/etcd/ssl/admin-{{ .Hostname }}.pem
`)))
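Some of the variable names here still need careful thought; I am worried about filling them in incorrectly.
Could anyone experienced help fill in these variables?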
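I have solved this problem. Create an otherwise empty config file on every etcd node and put the etcd name in it, typically ETCD_NAME=etcd-master1.
Then run the install command. It will still fail at this point. Open the etcd config file on the other nodes and you will see that a complete config file has now been generated there. Copy it to master1 and adjust it. Start etcd on every node, then run the install command again. Solved.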
When you say copy it to master and adjust it, what exactly needs to be adjusted?
The config file does not exist, which means etcd cannot start after InstallETCDBinaryModule installs it, so existETCDHealthCheck in the later ETCDConfigureModule fails because etcd is not running. Here is the etcd status on the current node:
● etcd.service - etcd
   Loaded: loaded (/etc/systemd/system/etcd.service; disabled; vendor preset: disabled)
   Active: inactive (dead)

Jul 11 23:06:30 master1 systemd[1]: Unit etcd.service entered failed state.
Jul 11 23:06:30 master1 systemd[1]: etcd.service failed.
Jul 11 23:06:40 master1 systemd[1]: etcd.service holdoff time over, scheduling restart.
Jul 11 23:06:40 master1 systemd[1]: Stopped etcd.
Jul 11 23:06:40 master1 systemd[1]: Failed to load environment files: No such file or directory
Jul 11 23:06:40 master1 systemd[1]: etcd.service failed to run 'start' task: No such file or directory
Jul 11 23:06:40 master1 systemd[1]: Failed to start etcd.
Jul 11 23:06:40 master1 systemd[1]: Unit etcd.service entered failed state.
Jul 11 23:06:40 master1 systemd[1]: etcd.service failed.
Jul 11 23:06:41 master1 systemd[1]: Stopped etcd.
Just change the etcd name and the corresponding IP.
Thanks a lot. I installed with the latest pre-release, 3.1.2, and had no problems.
Which version of KubeKey has the issue?
v3.0.13
What is your os environment?
centos7
KubeKey config file
A clear and concise description of what happened.
[ETCDConfigureModule] Health check on exist etcd
Relevant log output
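I read through the source code and traced the failure to ETCDConfigureModule. It should be the part inside the red box in the image above.
Because this is a new cluster, we can assume it goes through handleNewCluster.
The source of handleNewCluster is as follows: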
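By rights, existETCDHealthCheck should only run after the restart step in handleNewCluster. For a new cluster, InstallETCDBinaryModule runs before ETCDConfigureModule, so etcd is installed, but InstallETCDBinaryModule does not generate etcd.env. So the execution order in handleNewCluster, that is, in ConfigureModule, should be:
In fact, is existETCDHealthCheck even necessary for a new cluster?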
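Then I looked at etcd.service.
The config file is at /etc/etcd.env
I checked, and this file does not exist. Without the config file, etcd cannot start after InstallETCDBinaryModule installs it, so existETCDHealthCheck in the subsequent ETCDConfigureModule fails because etcd is not running. Checking the etcd status on the current node shows that it is not running.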
So the cause is clear: etcd's health is checked before etcd.env has been generated.
Could an urgent release be published to fix this?