Closed cyl-007 closed 2 years ago
@Ficus-f please fix this issue: find where the error is reported, and return at the point of failure instead of continuing execution. You can test this with an invalid image address.
[root@ip-172-31-39-15 ec2-user]# sealos run labring/kubernetes:v1.24.0 labring/calico:v3.22.1 --masters 172.31.39.15
2022-06-27 14:18:19 [INFO] start to install app in this cluster
2022-06-27 14:18:19 [INFO] succeeded install app in this cluster: no change apps
2022-06-27 14:18:19 [INFO] start to scale this cluster
2022-06-27 14:18:19 [INFO] succeeded in scaling this cluster: no change nodes
2022-06-27 14:18:19 [INFO]
___ ___ ___ ___ ___ ___
/\ \ /\ \ /\ \ /\__\ /\ \ /\ \
/::\ \ /::\ \ /::\ \ /:/ / /::\ \ /::\ \
/:/\ \ \ /:/\:\ \ /:/\:\ \ /:/ / /:/\:\ \ /:/\ \ \
_\:\~\ \ \ /::\~\:\ \ /::\~\:\ \ /:/ / /:/ \:\ \ _\:\~\ \ \
/\ \:\ \ \__\ /:/\:\ \:\__\ /:/\:\ \:\__\ /:/__/ /:/__/ \:\__\ /\ \:\ \ \__\
\:\ \:\ \/__/ \:\~\:\ \/__/ \/__\:\/:/ / \:\ \ \:\ \ /:/ / \:\ \:\ \/__/
\:\ \:\__\ \:\ \:\__\ \::/ / \:\ \ \:\ /:/ / \:\ \:\__\
\:\/:/ / \:\ \/__/ /:/ / \:\ \ \:\/:/ / \:\/:/ /
\::/ / \:\__\ /:/ / \:\__\ \::/ / \::/ /
\/__/ \/__/ \/__/ \/__/ \/__/ \/__/
Website :https://www.sealos.io/
Address :github.com/labring/sealos
This is because the first run failed but .sealos/default/Clusterfile was still generated; on the next run no diff is detected, so success is reported directly. It is likely related to an earlier optimization, not simply a missing error return.
The logic in run is wrong: it should compare the command-line arguments against the real cluster, not against .sealos/default/Clusterfile.
Whether the Clusterfile should be generated at all when execution fails is still open for discussion; alternatively, a field could mark it as failed.
The current optimization does not block execution; the file is saved regardless of success or failure. I think we could inspect the final error and surface it to the user.
type ClusterStatus struct {
	Phase      ClusterPhase       `json:"phase,omitempty"`
	Mounts     []MountImage       `json:"mounts,omitempty"`
	Conditions []ClusterCondition `json:"conditions,omitempty"`
}
Phase is the status of the cluster as a whole.
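As a minimal sketch of the idea, a failed run could set the phase explicitly so a later `sealos run` can distinguish a failed attempt from a successful one instead of trusting the mere existence of the Clusterfile. The phase values and the `markResult` helper below are illustrative assumptions, not the actual sealos API:

```go
package main

// Assumed types mirroring the ClusterStatus struct above; the phase
// values and helper are illustrative, not the real sealos code.
type ClusterPhase string

const (
	ClusterInProcess ClusterPhase = "ClusterInProcess"
	ClusterSuccess   ClusterPhase = "ClusterSuccess"
	ClusterFailed    ClusterPhase = "ClusterFailed"
)

type ClusterStatus struct {
	Phase ClusterPhase
}

// markResult records the outcome of a run, so a subsequent run that
// sees Phase == ClusterFailed knows it must not report "no change".
func markResult(s *ClusterStatus, err error) {
	if err != nil {
		s.Phase = ClusterFailed
		return
	}
	s.Phase = ClusterSuccess
}
```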
I don't think this is the optimal solution yet. The optimal approach would follow controller coding conventions: compare each field -> execute -> persist status. That way the pipeline updates the corresponding field after each stage, which would be more reasonable.
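The compare -> execute -> persist idea above can be sketched as a pipeline that records a condition after every stage and stops at the first failure, rather than silently continuing (the bug described in this issue). All names here are illustrative assumptions, not the actual sealos code:

```go
package main

// Condition is a per-stage record, loosely modeled on the Conditions
// field of the ClusterStatus struct shown earlier.
type Condition struct {
	Type    string
	Status  string // "True" on success, "False" on failure
	Message string
}

type Stage struct {
	Name string
	Run  func() error
}

// runPipeline executes stages in order, appending a Condition after
// each one; on the first failure it stops and returns the error, so
// later stages (e.g. joining nodes to ipvs) are never falsely skipped
// while still reporting overall success.
func runPipeline(stages []Stage) ([]Condition, error) {
	var conds []Condition
	for _, st := range stages {
		if err := st.Run(); err != nil {
			conds = append(conds, Condition{st.Name, "False", err.Error()})
			return conds, err
		}
		conds = append(conds, Condition{st.Name, "True", "ok"})
	}
	return conds, nil
}
```

A re-run can then inspect the saved conditions and resume from, or re-verify, the first stage whose status is "False".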
The status of each package should be defined inside the package itself, shouldn't it? Otherwise it ends up like 3.0, where a lot of Kubernetes-specific setup code was baked into the binary and could not be changed.
I ran into this too.
We are working on a fix for this issue.
2022-06-30 15:17:25 [EROR] Applied to cluster error: failed to join node 121.32.254.132:33568 failed to execute command(kubeadm join --config=/var/lib/sealos/data/default/etc/kubeadm-join-node.yaml -v 0) on host(121.32.254.132:33568): output([preflight] Running pre-flight checks
[WARNING FileExisting-socat]: socat not found in system path
[preflight] Reading configuration from the cluster...
[preflight] FYI: You can look at this config file with 'kubectl -n kube-system get cm kubeadm-config -o yaml'
W0630 15:15:30.070122 609824 utils.go:69] The recommended value for "resolvConf" in "KubeletConfiguration" is: /run/systemd/resolve/resolv.conf; the provided value is: /run/systemd/resolve/resolv.conf
[kubelet-start] Writing kubelet configuration to file "/var/lib/kubelet/config.yaml"
[kubelet-start] Writing kubelet environment file with flags to file "/var/lib/kubelet/kubeadm-flags.env"
[kubelet-start] Starting the kubelet
[kubelet-start] Waiting for the kubelet to perform the TLS Bootstrap...
[kubelet-check] Initial timeout of 40s passed.
[kubelet-check] It seems like the kubelet isn't running or healthy.
[kubelet-check] The HTTP call equal to 'curl -sSL http://localhost:10248/healthz' failed with error: Get "http://localhost:10248/healthz": dial tcp 127.0.0.1:10248: connect: connection refused.
[kubelet-check] It seems like the kubelet isn't running or healthy.
[kubelet-check] The HTTP call equal to 'curl -sSL http://localhost:10248/healthz' failed with error: Get "http://localhost:10248/healthz": dial tcp 127.0.0.1:10248: connect: connection refused.
[kubelet-check] It seems like the kubelet isn't running or healthy.
[kubelet-check] The HTTP call equal to 'curl -sSL http://localhost:10248/healthz' failed with error: Get "http://localhost:10248/healthz": dial tcp 127.0.0.1:10248: connect: connection refused.
[kubelet-check] It seems like the kubelet isn't running or healthy.
[kubelet-check] The HTTP call equal to 'curl -sSL http://localhost:10248/healthz' failed with error: Get "http://localhost:10248/healthz": dial tcp 127.0.0.1:10248: connect: connection refused.
[kubelet-check] It seems like the kubelet isn't running or healthy.
[kubelet-check] The HTTP call equal to 'curl -sSL http://localhost:10248/healthz' failed with error: Get "http://localhost:10248/healthz": dial tcp 127.0.0.1:10248: connect: connection refused.
Unfortunately, an error has occurred:
timed out waiting for the condition
This error is likely caused by:
- The kubelet is not running
- The kubelet is unhealthy due to a misconfiguration of the node in some way (required cgroups disabled)
If you are on a systemd-powered system, you can try to troubleshoot the error with the following commands:
- 'systemctl status kubelet'
- 'journalctl -xeu kubelet'
error execution phase kubelet-start: timed out waiting for the condition
To see the stack trace of this error execute with --v=5 or higher), error(Process exited with status 1)
2022-06-30 15:17:25 [INFO]
root@RK05-FRP-A001:~#
root@RK05-FRP-A001:~#
root@RK05-FRP-A001:~#
root@RK05-FRP-A001:~# ./sealos delete --nodes 121.32.254.131 kube^C
root@RK05-FRP-A001:~# kubectl get pod -n kube-system
NAME READY STATUS RESTARTS AGE
coredns-64897985d-8rbqw 0/1 Pending 0 15m
coredns-64897985d-cvjdp 0/1 Pending 0 15m
etcd-rk05-frp-a001 1/1 Running 0 15m
kube-apiserver-rk05-frp-a001 1/1 Running 0 15m
kube-controller-manager-rk05-frp-a001 1/1 Running 3 15m
kube-proxy-4tdst 1/1 Running 0 15m
kube-proxy-bgwh4 1/1 Running 0 15m
kube-proxy-rvvvw 1/1 Running 0 15m
kube-scheduler-rk05-frp-a001 1/1 Running 3 15m
If a node fails to join, the subsequent steps are not performed either; the node never gets added to ipvs.
Get the current cluster:
type xxx interface {
	GetCurrentCluster() (*v2.Cluster, error)
}
This can be used to merge the logic for operating on nodes.
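With such an interface, scaling decisions can be driven by diffing the command-line arguments against the live cluster instead of against the cached .sealos/default/Clusterfile. The `Cluster` type and `nodesToAdd` helper below are simplified assumptions for illustration:

```go
package main

// Cluster is a simplified stand-in for the real v2.Cluster type,
// carrying only the node addresses we need for the diff.
type Cluster struct {
	Nodes []string
}

// nodesToAdd returns the desired nodes that the live cluster does not
// yet contain; "no change nodes" should only be reported when this
// diff is empty, not when the cached Clusterfile happens to match.
func nodesToAdd(current *Cluster, desired []string) []string {
	have := make(map[string]bool, len(current.Nodes))
	for _, n := range current.Nodes {
		have[n] = true
	}
	var add []string
	for _, n := range desired {
		if !have[n] {
			add = append(add, n)
		}
	}
	return add
}
```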
@cuisongliu has this been fixed? Version 4.1.0-rc2 seems to still have this issue.
Docker was preinstalled on my machine, and sealos prompted me to uninstall it:
2022-08-22T11:13:18 info Executing pipeline RunConfig in CreateProcessor.
2022-08-22T11:13:18 info Executing pipeline MountRootfs in CreateProcessor.
/usr/bin/docker
ERROR [2022-08-22 11:13:33] >> The machine docker is not clean. Please clean docker the system.
2022-08-22T11:13:33 error Applied to cluster error: exit status 1
2022-08-22T11:13:33 info
After uninstalling docker I ran the install again, but it seems no installation was actually performed:
root@VM-16-26-debian:~# sealos run labring/kubernetes:v1.24.0 labring/calico:v3.22.1 --masters 10.10.16.26
2022-08-22T11:16:29 info sync new version copy pki config: /var/lib/sealos/data/default/pki /root/.sealos/default/pki
2022-08-22T11:16:29 info sync new version copy etc config: /var/lib/sealos/data/default/etc /root/.sealos/default/etc
2022-08-22T11:16:29 info start to install app in this cluster
2022-08-22T11:16:29 info succeeded install app in this cluster: no change apps
2022-08-22T11:16:29 info start to scale this cluster
2022-08-22T11:16:29 info succeeded in scaling this cluster: no change nodes
2022-08-22T11:16:29 info
Currently the only workaround is to delete /root/.sealos.
Just use the run --force flag.
It looks like this bug is still present: whenever the first install fails with any error, the second install reports success by default even though nothing was actually installed.
All kinds of problems. It just doesn't work anyway.
sealos version: 4.0.0-rc1. First install: failed due to network issues, as follows.
After resolving the network issue, I installed again, as follows.
Verified that the installation did not actually succeed.