yunionio / cloudpods

A cloud-native open-source unified multi-cloud and hybrid-cloud platform.
https://www.cloudpods.org
Apache License 2.0
2.55k stars 520 forks

[Help] High-availability deployment problem #21068

Open tomorrrow666 opened 3 weeks ago

tomorrrow666 commented 3 weeks ago

TASK [primary-master-node/setup_k8s : Use ocadm init first master node] **** fatal: [192.168.231.132]: FAILED! => {"changed": true, "cmd": "/opt/yunion/bin/ocadm init --control-plane-endpoint 192.168.230.1:6443 --mysql-host 192.168.231.136 --mysql-user root --mysql-password 0neC1oudDb# --mysql-port 3306 --image-repository registry.cn-beijing.aliyuncs.com/yunion --apiserver-advertise-address 192.168.231.132 --node-ip 192.168.231.132 --host-networks ens33/br0/192.168.231.132 --enable-hugepage --onecloud-version v3.11.6 --operator-version v3.11.6 --pod-network-cidr 10.40.0.0/16 --service-cidr 10.96.0.0/12 --service-dns-domain cluster.local --addon-calico-ip-autodetection-method can-reach=192.168.231.132 --high-availability-vip 192.168.230.1 --keepalived-version-tag v2.0.25 --enable-host-agent\n", "delta": "0:00:00.055953", "end": "2024-08-22 14:59:14.427135", "msg": "non-zero return code", "rc": 1, "start": "2024-08-22 14:59:14.371182", "stderr": "error execution phase preflight: [preflight] Some fatal errors occurred:\n\t[ERROR Mysql]: show grants for root@%: Error 1130: Host '192.168.231.132' is not allowed to connect to this MariaDB server\n[preflight] If you know what you are doing, you can make a check non-fatal with --ignore-preflight-errors=...", "stderr_lines": ["error execution phase preflight: [preflight] Some fatal errors occurred:", "\t[ERROR Mysql]: show grants for root@%: Error 1130: Host '192.168.231.132' is not allowed to connect to this MariaDB server", "[preflight] If you know what you are doing, you can make a check non-fatal with --ignore-preflight-errors=..."], "stdout": "[init] Using Kubernetes and Onecloud version: v1.15.8 & v3.11.6\n[preflight] Running pre-flight checks", "stdout_lines": ["[init] Using Kubernetes and Onecloud version: v1.15.8 & v3.11.6", "[preflight] Running pre-flight checks"]}

swordqiu commented 3 weeks ago

@tomorrrow666 Could you upload your config-k8s-ha.yml with the sensitive information removed?

swordqiu commented 3 weeks ago

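@tomorrrow666 The documentation may not have made this clear: everything in the yaml that starts with $ is a variable. Treat the yaml in the documentation as a template only; before using it, replace these variables with the actual values for your environment. For example, PRIMARY_IP is the IP address of the primary node of the high-availability cluster.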

swordqiu commented 3 weeks ago

@tomorrrow666 Replace the variables in the yaml file with the values above, then run ocboot again.

tomorrrow666 commented 3 weeks ago

Should I change the hostnames to these: k8s primary, k8s master 1, k8s master 2, DB? Do they contain spaces?

But other yml files use $ variables too.

zexi commented 3 weeks ago

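Should I change the hostnames to these: k8s primary, k8s master 1, k8s master 2, DB? Do they contain spaces?

But other yml files use $ variables too.

Please read this document carefully: https://www.cloudpods.org/docs/getting-started/onpremise/ha-ce . First define these variables in bash:

[image]

Then run the command that follows to generate the configuration file: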

 cat > config-k8s-ha.yml <<EOF
....
EOF

Do not copy the template's contents into the file directly.
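A minimal sketch of the step described above, using made-up addresses. PRIMARY_IP is the variable named earlier in this thread; DB_IP, the keys primary_ip and db_ip, and the temporary output file are placeholders for illustration only and are not the real contents of config-k8s-ha.yml. Because the EOF delimiter is unquoted, bash substitutes the variables while writing the file, so no literal $ should remain in the result.

```shell
# Hypothetical values; substitute the addresses of your own nodes.
PRIMARY_IP="10.0.0.11"
DB_IP="10.0.0.20"

# Write to a temporary file so the sketch does not touch a real config.
out="$(mktemp)"

# Unquoted EOF: the shell expands $PRIMARY_IP and $DB_IP inside the heredoc.
cat > "$out" <<EOF
primary_ip: $PRIMARY_IP
db_ip: $DB_IP
EOF

# Show the generated file; it should contain the addresses, not the variable names.
cat "$out"
```

Quoting the delimiter (<<'EOF') would disable the substitution and leave $PRIMARY_IP in the file verbatim, which is effectively what happens when the template is pasted into the file by hand.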

zexi commented 3 weeks ago

@tomorrrow666 Also, please add a title when you file an issue.

zexi commented 3 weeks ago

Same problem: https://github.com/yunionio/cloudpods/issues/21064

tomorrrow666 commented 3 weeks ago

TASK [primary-master-node/setup_k8s : Use ocadm init first master node] **** fatal: [192.168.231.132]: FAILED! => {"changed": true, "cmd": "/opt/yunion/bin/ocadm init --control-plane-endpoint 192.168.230.1:6443 --mysql-host 192.168.231.136 --mysql-user root --mysql-password 0neC1oudDb# --mysql-port 3306 --image-repository registry.cn-beijing.aliyuncs.com/yunion --apiserver-advertise-address 192.168.231.132 --node-ip 192.168.231.132 --host-networks ens33/br0/192.168.231.132 --enable-hugepage --onecloud-version v3.11.6 --operator-version v3.11.6 --pod-network-cidr 10.40.0.0/16 --service-cidr 10.96.0.0/12 --service-dns-domain cluster.local --addon-calico-ip-autodetection-method can-reach=192.168.231.132 --high-availability-vip 192.168.230.1 --keepalived-version-tag v2.0.25 --enable-host-agent\n", "delta": "0:00:00.055953", "end": "2024-08-22 14:59:14.427135", "msg": "non-zero return code", "rc": 1, "start": "2024-08-22 14:59:14.371182", "stderr": "error execution phase preflight: [preflight] Some fatal errors occurred:\n\t[ERROR Mysql]: show grants for root@%: Error 1130: Host '192.168.231.132' is not allowed to connect to this MariaDB server\n[preflight] If you know what you are doing, you can make a check non-fatal with --ignore-preflight-errors=...", "stderr_lines": ["error execution phase preflight: [preflight] Some fatal errors occurred:", "\t[ERROR Mysql]: show grants for root@%: Error 1130: Host '192.168.231.132' is not allowed to connect to this MariaDB server", "[preflight] If you know what you are doing, you can make a check non-fatal with --ignore-preflight-errors=..."], "stdout": "[init] Using Kubernetes and Onecloud version: v1.15.8 & v3.11.6\n[preflight] Running pre-flight checks", "stdout_lines": ["[init] Using Kubernetes and Onecloud version: v1.15.8 & v3.11.6", "[preflight] Running pre-flight checks"]}

zexi commented 3 weeks ago

[ERROR Mysql]: show grants for root@%: Error 1130: Host '192.168.231.132' is not allowed to connect to this MariaDB server

You need to check whether the primary node can actually connect to MySQL.
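One way to run that check, sketched under assumptions. Error 1130 is what MySQL/MariaDB returns when its grant tables have no account entry matching the connecting host, so the thing to look at is the host part of the root account. The helper name check_db_login is hypothetical; the address, port and user are taken from the ocadm command in the log above, and the mysql client is assumed to be installed on the primary node (192.168.231.132).

```shell
# Connection parameters copied from the ocadm command line in the log.
DB_HOST="192.168.231.136"
DB_PORT=3306
DB_USER="root"

# Hypothetical helper: try to log in from this node and print the matched account.
check_db_login() {
  if ! command -v mysql >/dev/null 2>&1; then
    echo "mysql client not found; install it first" >&2
    return 2
  fi
  # -p without a value prompts for the password instead of exposing it in argv.
  mysql -h "$DB_HOST" -P "$DB_PORT" -u "$DB_USER" -p \
    --connect-timeout=5 -e 'SELECT CURRENT_USER();'
}

# If the login is rejected with Error 1130, inspect the accounts on the DB host itself:
#   SELECT user, host FROM mysql.user WHERE user = 'root';
# One possible fix (an assumption, not confirmed in this thread) is to allow
# root from other hosts, using MariaDB syntax:
#   GRANT ALL PRIVILEGES ON *.* TO 'root'@'%' IDENTIFIED BY '<password>' WITH GRANT OPTION;
```

Run check_db_login on the primary node; a successful login prints the account the server matched, for example root@%.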

tomorrrow666 commented 3 weeks ago

TASK [common : init apt cache for Ubuntu] **
fatal: [192.168.231.132]: FAILED! => {"changed": false, "dest": "/tmp/yunion.gpg-key.asc", "elapsed": 20, "msg": "Request failed: <urlopen error [Errno -3] 域名解析出现暂时性错误>", "url": "https://iso.yunion.cn/ubuntu/22/base/x86_64/yunion.gpg-key.asc"}
fatal: [192.168.231.134]: FAILED! => {"changed": false, "dest": "/tmp/yunion.gpg-key.asc", "elapsed": 20, "msg": "Request failed: <urlopen error [Errno -3] 域名解析出现暂时性错误>", "url": "https://iso.yunion.cn/ubuntu/22/base/x86_64/yunion.gpg-key.asc"}
fatal: [192.168.231.133]: FAILED! => {"changed": false, "dest": "/tmp/yunion.gpg-key.asc", "elapsed": 20, "msg": "Request failed: <urlopen error [Errno -3] 域名解析出现暂时性错误>", "url": "https://iso.yunion.cn/ubuntu/22/base/x86_64/yunion.gpg-key.asc"}

NO MORE HOSTS LEFT *****
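The message in the log above, [Errno -3] 域名解析出现暂时性错误, is the localized form of "Temporary failure in name resolution": all three nodes failed to resolve iso.yunion.cn before any download was attempted. A small sketch for checking this on each node follows; the helper name resolves is hypothetical, and getent is assumed to be available (it ships with glibc on Ubuntu).

```shell
# Hypothetical helper: succeeds when the system resolver can resolve the name.
resolves() {
  getent hosts "$1" >/dev/null 2>&1
}

# Host name taken from the failing URL in the log.
if resolves iso.yunion.cn; then
  echo "iso.yunion.cn resolves"
else
  echo "iso.yunion.cn does not resolve; check the nameserver entries in /etc/resolv.conf"
fi
```

If the name fails on every node, as in the log, the cause is more likely the DNS configuration or the network the nodes share than anything in ocboot.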

tomorrrow666 commented 2 weeks ago

TASK [primary-master-node/setup_k8s : Use ocadm init first master node] **** fatal: [192.168.43.138]: FAILED! => {"changed": true, "cmd": "/opt/yunion/bin/ocadm init --control-plane-endpoint 192.168.43.1:6443 --mysql-host 192.168.43.141 --mysql-user root --mysql-password 0neC1oudDB# --mysql-port 3306 --image-repository registry.cn-beijing.aliyuncs.com/yunion --apiserver-advertise-address 192.168.43.138 --node-ip 192.168.43.138 --host-networks ens33/br0/192.168.43.138 --enable-hugepage --onecloud-version v3.11.6 --operator-version v3.11.6 --pod-network-cidr 10.40.0.0/16 --service-cidr 10.96.0.0/12 --service-dns-domain cluster.local --addon-calico-ip-autodetection-method can-reach=192.168.43.138 --high-availability-vip 192.168.43.1 --keepalived-version-tag v2.0.25 --enable-host-agent\n", "delta": "0:00:00.315037", "end": "2024-08-25 21:27:33.214740", "msg": "non-zero return code", "rc": 1, "start": "2024-08-25 21:27:32.899703", "stderr": "\t[WARNING SystemVerification]: this Docker version is not on the list of validated versions: 20.10.24. Latest validated version: 18.09\nerror execution phase preflight: k8s init node checks: [preflight] Some fatal errors occurred:\n\t[ERROR SystemVerification]: unsupported kernel release: 6.8.0-40-generic\n[preflight] If you know what you are doing, you can make a check non-fatal with --ignore-preflight-errors=...", "stderr_lines": ["\t[WARNING SystemVerification]: this Docker version is not on the list of validated versions: 20.10.24. Latest validated version: 18.09", "error execution phase preflight: k8s init node checks: [preflight] Some fatal errors occurred:", "\t[ERROR SystemVerification]: unsupported kernel release: 6.8.0-40-generic", "[preflight] If you know what you are doing, you can make a check non-fatal with --ignore-preflight-errors=..."], "stdout": "[init] Using Kubernetes and Onecloud version: v1.15.8 & v3.11.6\n[preflight] Running pre-flight checks\n[preflight] The system verification failed. 
Printing the output from the verification:\n\u001b[0;37mKERNEL_VERSION\u001b[0m: \u001b[0;31m6.8.0-40-generic\u001b[0m\n\u001b[0;37mCONFIG_NAMESPACES\u001b[0m: \u001b[0;32menabled\u001b[0m\n\u001b[0;37mCONFIG_NET_NS\u001b[0m: \u001b[0;32menabled\u001b[0m\n\u001b[0;37mCONFIG_PID_NS\u001b[0m: \u001b[0;32menabled\u001b[0m\n\u001b[0;37mCONFIG_IPC_NS\u001b[0m: \u001b[0;32menabled\u001b[0m\n\u001b[0;37mCONFIG_UTS_NS\u001b[0m: \u001b[0;32menabled\u001b[0m\n\u001b[0;37mCONFIG_CGROUPS\u001b[0m: \u001b[0;32menabled\u001b[0m\n\u001b[0;37mCONFIG_CGROUP_CPUACCT\u001b[0m: \u001b[0;32menabled\u001b[0m\n\u001b[0;37mCONFIG_CGROUP_DEVICE\u001b[0m: \u001b[0;32menabled\u001b[0m\n\u001b[0;37mCONFIG_CGROUP_FREEZER\u001b[0m: \u001b[0;32menabled\u001b[0m\n\u001b[0;37mCONFIG_CGROUP_SCHED\u001b[0m: \u001b[0;32menabled\u001b[0m\n\u001b[0;37mCONFIG_CPUSETS\u001b[0m: \u001b[0;32menabled\u001b[0m\n\u001b[0;37mCONFIG_MEMCG\u001b[0m: \u001b[0;32menabled\u001b[0m\n\u001b[0;37mCONFIG_INET\u001b[0m: \u001b[0;32menabled\u001b[0m\n\u001b[0;37mCONFIG_EXT4_FS\u001b[0m: \u001b[0;32menabled\u001b[0m\n\u001b[0;37mCONFIG_PROC_FS\u001b[0m: \u001b[0;32menabled\u001b[0m\n\u001b[0;37mCONFIG_NETFILTER_XT_TARGET_REDIRECT\u001b[0m: \u001b[0;32menabled (as module)\u001b[0m\n\u001b[0;37mCONFIG_NETFILTER_XT_MATCH_COMMENT\u001b[0m: \u001b[0;32menabled (as module)\u001b[0m\n\u001b[0;37mCONFIG_OVERLAY_FS\u001b[0m: \u001b[0;32menabled (as module)\u001b[0m\n\u001b[0;37mCONFIG_AUFS_FS\u001b[0m: \u001b[0;33mnot set - Required for aufs.\u001b[0m\n\u001b[0;37mCONFIG_BLK_DEV_DM\u001b[0m: \u001b[0;32menabled\u001b[0m\n\u001b[0;37mDOCKER_VERSION\u001b[0m: \u001b[0;32m20.10.24\u001b[0m\n\u001b[0;37mOS\u001b[0m: \u001b[0;32mLinux\u001b[0m\n\u001b[0;37mCGROUPS_CPU\u001b[0m: \u001b[0;32menabled\u001b[0m\n\u001b[0;37mCGROUPS_CPUACCT\u001b[0m: \u001b[0;32menabled\u001b[0m\n\u001b[0;37mCGROUPS_CPUSET\u001b[0m: \u001b[0;32menabled\u001b[0m\n\u001b[0;37mCGROUPS_DEVICES\u001b[0m: 
\u001b[0;32menabled\u001b[0m\n\u001b[0;37mCGROUPS_FREEZER\u001b[0m: \u001b[0;32menabled\u001b[0m\n\u001b[0;37mCGROUPS_MEMORY\u001b[0m: \u001b[0;32menabled\u001b[0m", "stdout_lines": ["[init] Using Kubernetes and Onecloud version: v1.15.8 & v3.11.6", "[preflight] Running pre-flight checks", "[preflight] The system verification failed. Printing the output from the verification:", "\u001b[0;37mKERNEL_VERSION\u001b[0m: \u001b[0;31m6.8.0-40-generic\u001b[0m", "\u001b[0;37mCONFIG_NAMESPACES\u001b[0m: \u001b[0;32menabled\u001b[0m", "\u001b[0;37mCONFIG_NET_NS\u001b[0m: \u001b[0;32menabled\u001b[0m", "\u001b[0;37mCONFIG_PID_NS\u001b[0m: \u001b[0;32menabled\u001b[0m", "\u001b[0;37mCONFIG_IPC_NS\u001b[0m: \u001b[0;32menabled\u001b[0m", "\u001b[0;37mCONFIG_UTS_NS\u001b[0m: \u001b[0;32menabled\u001b[0m", "\u001b[0;37mCONFIG_CGROUPS\u001b[0m: \u001b[0;32menabled\u001b[0m", "\u001b[0;37mCONFIG_CGROUP_CPUACCT\u001b[0m: \u001b[0;32menabled\u001b[0m", "\u001b[0;37mCONFIG_CGROUP_DEVICE\u001b[0m: \u001b[0;32menabled\u001b[0m", "\u001b[0;37mCONFIG_CGROUP_FREEZER\u001b[0m: \u001b[0;32menabled\u001b[0m", "\u001b[0;37mCONFIG_CGROUP_SCHED\u001b[0m: \u001b[0;32menabled\u001b[0m", "\u001b[0;37mCONFIG_CPUSETS\u001b[0m: \u001b[0;32menabled\u001b[0m", "\u001b[0;37mCONFIG_MEMCG\u001b[0m: \u001b[0;32menabled\u001b[0m", "\u001b[0;37mCONFIG_INET\u001b[0m: \u001b[0;32menabled\u001b[0m", "\u001b[0;37mCONFIG_EXT4_FS\u001b[0m: \u001b[0;32menabled\u001b[0m", "\u001b[0;37mCONFIG_PROC_FS\u001b[0m: \u001b[0;32menabled\u001b[0m", "\u001b[0;37mCONFIG_NETFILTER_XT_TARGET_REDIRECT\u001b[0m: \u001b[0;32menabled (as module)\u001b[0m", "\u001b[0;37mCONFIG_NETFILTER_XT_MATCH_COMMENT\u001b[0m: \u001b[0;32menabled (as module)\u001b[0m", "\u001b[0;37mCONFIG_OVERLAY_FS\u001b[0m: \u001b[0;32menabled (as module)\u001b[0m", "\u001b[0;37mCONFIG_AUFS_FS\u001b[0m: \u001b[0;33mnot set - Required for aufs.\u001b[0m", "\u001b[0;37mCONFIG_BLK_DEV_DM\u001b[0m: \u001b[0;32menabled\u001b[0m", "\u001b[0;37mDOCKER_VERSION\u001b[0m: 
\u001b[0;32m20.10.24\u001b[0m", "\u001b[0;37mOS\u001b[0m: \u001b[0;32mLinux\u001b[0m", "\u001b[0;37mCGROUPS_CPU\u001b[0m: \u001b[0;32menabled\u001b[0m", "\u001b[0;37mCGROUPS_CPUACCT\u001b[0m: \u001b[0;32menabled\u001b[0m", "\u001b[0;37mCGROUPS_CPUSET\u001b[0m: \u001b[0;32menabled\u001b[0m", "\u001b[0;37mCGROUPS_DEVICES\u001b[0m: \u001b[0;32menabled\u001b[0m", "\u001b[0;37mCGROUPS_FREEZER\u001b[0m: \u001b[0;32menabled\u001b[0m", "\u001b[0;37mCGROUPS_MEMORY\u001b[0m: \u001b[0;32menabled\u001b[0m"]}
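In the long output above the only fatal item is "[ERROR SystemVerification]: unsupported kernel release: 6.8.0-40-generic"; the Docker 20.10.24 line is a warning. The sketch below only shows how to read the running kernel release on a node. Which kernel versions this ocadm build accepts is not stated in the thread, so no threshold is assumed, and the helper name kernel_major is hypothetical.

```shell
# Hypothetical helper: print the major version of a kernel release string.
kernel_major() {
  # Strip everything from the first dot onward, e.g. 6.8.0-40-generic -> 6.
  printf '%s\n' "${1%%.*}"
}

# Report the kernel of the node this runs on.
release="$(uname -r)"
echo "running kernel: $release (major $(kernel_major "$release"))"
```

The log itself mentions --ignore-preflight-errors=... as a way to make a check non-fatal; whether that is safe for this kernel is a question for the maintainers.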

tomorrrow666 commented 2 weeks ago

ok: [192.168.125.132] => (item=[02/37] yunion-climc=3.11)
failed: [192.168.125.129] (item=[02/37] yunion-climc=3.11) => {"ansible_index_var": "item_index", "ansible_loop_var": "package_item", "cache_update_time": 1724640218, "cache_updated": false, "changed": false, "item_index": 1, "msg": "'/usr/bin/apt-get -y -o \"Dpkg::Options::=--force-confdef\" -o \"Dpkg::Options::=--force-confold\" install 'yunion-climc=3.11.6-24081720'' failed: E: 降级软件包同时使用了 -y 选项,但是没有用 --allow-downgrades.\n", "package_item": "yunion-climc=3.11*", "rc": 100, "stderr": "E: 降级软件包同时使用了 -y 选项,但是没有用 --allow-downgrades.\n", "stderr_lines": ["E: 降级软件包同时使用了 -y 选项,但是没有用 --allow-downgrades."], "stdout": "正在读取软件包列表...\n正在分析软件包的依赖关系树...\n正在读取状态信息...\n下列软件包是自动安装的并且现在不需要了:\n ieee-data python3-argcomplete python3-dnspython python3-libcloud\n python3-netaddr python3-pycryptodome python3-requests-toolbelt\n python3-simplejson\n使用'apt autoremove'来卸载它(它们)。\n下列软件包将被【降级】:\n yunion-climc\n升级了 0 个软件包,新安装了 0 个软件包,降级了 1 个软件包,要卸载 0 个软件包,有 13 个软件包未被升级。\n", "stdout_lines": ["正在读取软件包列表...", "正在分析软件包的依赖关系树...", "正在读取状态信息...", "下列软件包是自动安装的并且现在不需要了:", " ieee-data python3-argcomplete python3-dnspython python3-libcloud", " python3-netaddr python3-pycryptodome python3-requests-toolbelt", " python3-simplejson", "使用'apt autoremove'来卸载它(它们)。", "下列软件包将被【降级】:", " yunion-climc", "升级了 0 个软件包,新安装了 0 个软件包,降级了 1 个软件包,要卸载 0 个软件包,有 13 个软件包未被升级。"]}
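The apt error above says that a package would be downgraded while -y is in use without --allow-downgrades: on 192.168.125.129 the installed yunion-climc appears to be newer than the requested 3.11.6-24081720. A rough sketch of that comparison follows. The helper name is_downgrade and the installed version 3.11.7 are hypothetical, and sort -V only approximates Debian version ordering (dpkg --compare-versions is the authoritative tool on Ubuntu).

```shell
# Hypothetical helper: succeeds when installing $2 over installed version $1
# would be a downgrade, using sort -V as an approximation of Debian ordering.
is_downgrade() {
  [ "$1" != "$2" ] &&
    [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | tail -n 1)" = "$1" ]
}

# 3.11.7 is a made-up installed version; 3.11.6-24081720 comes from the log.
if is_downgrade "3.11.7" "3.11.6-24081720"; then
  echo "installing 3.11.6-24081720 would downgrade the package"
fi

# On the failing node, the real versions can be read with:
#   apt-cache policy yunion-climc
# and the downgrade can be allowed explicitly with the flag apt names:
#   apt-get install -y --allow-downgrades 'yunion-climc=3.11.6-24081720'
```

Whether to downgrade by hand or to remove the newer package before rerunning ocboot is not answered in this thread.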