yunionio / cloudpods

A cloud-native open-source unified multi-cloud and hybrid-cloud platform.
https://www.cloudpods.org
Apache License 2.0
2.55k stars, 520 forks

[Help] systemd vs. cgroupfs problem on Ubuntu 24 during installation #21052

Open rekazer0 opened 1 month ago

rekazer0 commented 1 month ago

I ran into the following error while running the installation:

TASK [primary-master-node/setup_k8s : Use ocadm init first master node] ********
fatal: [192.168.1.11]: FAILED! => {"changed": true, "cmd": "/opt/yunion/bin/ocadm init --control-plane-endpoint 192.168.1.11:6443 --mysql-host 192.168.1.11 --mysql-user root --mysql-password Bndwecu3dsdQw --mysql-port 3306 --image-repository registry.cn-beijing.aliyuncs.com/yunion --apiserver-advertise-address 192.168.1.11  --node-ip 192.168.1.11 --host-networks eno1/br0/192.168.1.11  --enable-hugepage --onecloud-version v3.11.6 --operator-version v3.11.6 --pod-network-cidr 10.40.0.0/16 --service-cidr 10.96.0.0/12 --service-dns-domain cluster.local --addon-calico-ip-autodetection-method 'can-reach=192.168.1.11' --enable-host-agent\n", "delta": "0:00:00.378000", "end": "2024-08-17 02:31:49.373774", "msg": "non-zero return code", "rc": 1, "start": "2024-08-17 02:31:48.995774", "stderr": "\t[WARNING SystemVerification]: this Docker version is not on the list of validated versions: 20.10.24. Latest validated version: 18.09\nerror execution phase preflight: k8s init node checks: [preflight] Some fatal errors occurred:\n\t[ERROR SystemVerification]: unsupported kernel release: 6.8.0-40-generic\n[preflight] If you know what you are doing, you can make a check non-fatal with `--ignore-preflight-errors=...`", "stderr_lines": ["\t[WARNING SystemVerification]: this Docker version is not on the list of validated versions: 20.10.24. Latest validated version: 18.09", "error execution phase preflight: k8s init node checks: [preflight] Some fatal errors occurred:", "\t[ERROR SystemVerification]: unsupported kernel release: 6.8.0-40-generic", "[preflight] If you know what you are doing, you can make a check non-fatal with `--ignore-preflight-errors=...`"], "stdout": "[init] Using Kubernetes and Onecloud version: v1.15.8 & v3.11.6\n[preflight] Running pre-flight checks\n[preflight] The system verification failed. 
Printing the output from the verification:\n\u001b[0;37mKERNEL_VERSION\u001b[0m: \u001b[0;31m6.8.0-40-generic\u001b[0m\n\u001b[0;37mCONFIG_NAMESPACES\u001b[0m: \u001b[0;32menabled\u001b[0m\n\u001b[0;37mCONFIG_NET_NS\u001b[0m: \u001b[0;32menabled\u001b[0m\n\u001b[0;37mCONFIG_PID_NS\u001b[0m: \u001b[0;32menabled\u001b[0m\n\u001b[0;37mCONFIG_IPC_NS\u001b[0m: \u001b[0;32menabled\u001b[0m\n\u001b[0;37mCONFIG_UTS_NS\u001b[0m: \u001b[0;32menabled\u001b[0m\n\u001b[0;37mCONFIG_CGROUPS\u001b[0m: \u001b[0;32menabled\u001b[0m\n\u001b[0;37mCONFIG_CGROUP_CPUACCT\u001b[0m: \u001b[0;32menabled\u001b[0m\n\u001b[0;37mCONFIG_CGROUP_DEVICE\u001b[0m: \u001b[0;32menabled\u001b[0m\n\u001b[0;37mCONFIG_CGROUP_FREEZER\u001b[0m: \u001b[0;32menabled\u001b[0m\n\u001b[0;37mCONFIG_CGROUP_SCHED\u001b[0m: \u001b[0;32menabled\u001b[0m\n\u001b[0;37mCONFIG_CPUSETS\u001b[0m: \u001b[0;32menabled\u001b[0m\n\u001b[0;37mCONFIG_MEMCG\u001b[0m: \u001b[0;32menabled\u001b[0m\n\u001b[0;37mCONFIG_INET\u001b[0m: \u001b[0;32menabled\u001b[0m\n\u001b[0;37mCONFIG_EXT4_FS\u001b[0m: \u001b[0;32menabled\u001b[0m\n\u001b[0;37mCONFIG_PROC_FS\u001b[0m: \u001b[0;32menabled\u001b[0m\n\u001b[0;37mCONFIG_NETFILTER_XT_TARGET_REDIRECT\u001b[0m: \u001b[0;32menabled (as module)\u001b[0m\n\u001b[0;37mCONFIG_NETFILTER_XT_MATCH_COMMENT\u001b[0m: \u001b[0;32menabled (as module)\u001b[0m\n\u001b[0;37mCONFIG_OVERLAY_FS\u001b[0m: \u001b[0;32menabled (as module)\u001b[0m\n\u001b[0;37mCONFIG_AUFS_FS\u001b[0m: \u001b[0;33mnot set - Required for aufs.\u001b[0m\n\u001b[0;37mCONFIG_BLK_DEV_DM\u001b[0m: \u001b[0;32menabled\u001b[0m\n\u001b[0;37mDOCKER_VERSION\u001b[0m: \u001b[0;32m20.10.24\u001b[0m\n\u001b[0;37mOS\u001b[0m: \u001b[0;32mLinux\u001b[0m\n\u001b[0;37mCGROUPS_CPU\u001b[0m: \u001b[0;32menabled\u001b[0m\n\u001b[0;37mCGROUPS_CPUACCT\u001b[0m: \u001b[0;32menabled\u001b[0m\n\u001b[0;37mCGROUPS_CPUSET\u001b[0m: \u001b[0;32menabled\u001b[0m\n\u001b[0;37mCGROUPS_DEVICES\u001b[0m: 
\u001b[0;32menabled\u001b[0m\n\u001b[0;37mCGROUPS_FREEZER\u001b[0m: \u001b[0;32menabled\u001b[0m\n\u001b[0;37mCGROUPS_MEMORY\u001b[0m: \u001b[0;32menabled\u001b[0m", "stdout_lines": ["[init] Using Kubernetes and Onecloud version: v1.15.8 & v3.11.6", "[preflight] Running pre-flight checks", "[preflight] The system verification failed. Printing the output from the verification:", "\u001b[0;37mKERNEL_VERSION\u001b[0m: \u001b[0;31m6.8.0-40-generic\u001b[0m", "\u001b[0;37mCONFIG_NAMESPACES\u001b[0m: \u001b[0;32menabled\u001b[0m", "\u001b[0;37mCONFIG_NET_NS\u001b[0m: \u001b[0;32menabled\u001b[0m", "\u001b[0;37mCONFIG_PID_NS\u001b[0m: \u001b[0;32menabled\u001b[0m", "\u001b[0;37mCONFIG_IPC_NS\u001b[0m: \u001b[0;32menabled\u001b[0m", "\u001b[0;37mCONFIG_UTS_NS\u001b[0m: \u001b[0;32menabled\u001b[0m", "\u001b[0;37mCONFIG_CGROUPS\u001b[0m: \u001b[0;32menabled\u001b[0m", "\u001b[0;37mCONFIG_CGROUP_CPUACCT\u001b[0m: \u001b[0;32menabled\u001b[0m", "\u001b[0;37mCONFIG_CGROUP_DEVICE\u001b[0m: \u001b[0;32menabled\u001b[0m", "\u001b[0;37mCONFIG_CGROUP_FREEZER\u001b[0m: \u001b[0;32menabled\u001b[0m", "\u001b[0;37mCONFIG_CGROUP_SCHED\u001b[0m: \u001b[0;32menabled\u001b[0m", "\u001b[0;37mCONFIG_CPUSETS\u001b[0m: \u001b[0;32menabled\u001b[0m", "\u001b[0;37mCONFIG_MEMCG\u001b[0m: \u001b[0;32menabled\u001b[0m", "\u001b[0;37mCONFIG_INET\u001b[0m: \u001b[0;32menabled\u001b[0m", "\u001b[0;37mCONFIG_EXT4_FS\u001b[0m: \u001b[0;32menabled\u001b[0m", "\u001b[0;37mCONFIG_PROC_FS\u001b[0m: \u001b[0;32menabled\u001b[0m", "\u001b[0;37mCONFIG_NETFILTER_XT_TARGET_REDIRECT\u001b[0m: \u001b[0;32menabled (as module)\u001b[0m", "\u001b[0;37mCONFIG_NETFILTER_XT_MATCH_COMMENT\u001b[0m: \u001b[0;32menabled (as module)\u001b[0m", "\u001b[0;37mCONFIG_OVERLAY_FS\u001b[0m: \u001b[0;32menabled (as module)\u001b[0m", "\u001b[0;37mCONFIG_AUFS_FS\u001b[0m: \u001b[0;33mnot set - Required for aufs.\u001b[0m", "\u001b[0;37mCONFIG_BLK_DEV_DM\u001b[0m: \u001b[0;32menabled\u001b[0m", "\u001b[0;37mDOCKER_VERSION\u001b[0m: 
\u001b[0;32m20.10.24\u001b[0m", "\u001b[0;37mOS\u001b[0m: \u001b[0;32mLinux\u001b[0m", "\u001b[0;37mCGROUPS_CPU\u001b[0m: \u001b[0;32menabled\u001b[0m", "\u001b[0;37mCGROUPS_CPUACCT\u001b[0m: \u001b[0;32menabled\u001b[0m", "\u001b[0;37mCGROUPS_CPUSET\u001b[0m: \u001b[0;32menabled\u001b[0m", "\u001b[0;37mCGROUPS_DEVICES\u001b[0m: \u001b[0;32menabled\u001b[0m", "\u001b[0;37mCGROUPS_FREEZER\u001b[0m: \u001b[0;32menabled\u001b[0m", "\u001b[0;37mCGROUPS_MEMORY\u001b[0m: \u001b[0;32menabled\u001b[0m"]}

PLAY RECAP *********************************************************************
192.168.1.11               : ok=119  changed=18   unreachable=0    failed=1    skipped=45   rescued=0    ignored=0
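The failed task's stderr_lines mix one warning and one fatal preflight error, and the fatal one is the kernel release check rather than the Docker version. Below is a minimal sketch for separating the two levels. The function name split_preflight and the regular expression are illustrative assumptions, not part of ocadm or Ansible; the sample lines are copied from the output above.

```python
import re

# Matches preflight lines shaped like "[ERROR SystemVerification]: message".
# The pattern is an assumption derived from the lines shown in this issue.
PREFLIGHT_RE = re.compile(r"^\s*\[(ERROR|WARNING) ([^\]]+)\]:\s*(.*)$")


def split_preflight(stderr_lines):
    """Return (errors, warnings) as lists of (check_name, message) tuples."""
    errors, warnings = [], []
    for line in stderr_lines:
        match = PREFLIGHT_RE.match(line)
        if not match:
            continue  # skip narrative lines such as "[preflight] ..."
        level, check, message = match.groups()
        (errors if level == "ERROR" else warnings).append((check, message))
    return errors, warnings


stderr_lines = [
    "\t[WARNING SystemVerification]: this Docker version is not on the list of validated versions: 20.10.24. Latest validated version: 18.09",
    "error execution phase preflight: k8s init node checks: [preflight] Some fatal errors occurred:",
    "\t[ERROR SystemVerification]: unsupported kernel release: 6.8.0-40-generic",
    "[preflight] If you know what you are doing, you can make a check non-fatal with `--ignore-preflight-errors=...`",
]

errors, warnings = split_preflight(stderr_lines)
for check, message in errors:
    print(f"fatal: {check}: {message}")
```

Run against the lines above, only the kernel release entry lands in errors; the Docker version entry is reported as a warning.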

The kubelet reports the following error:

I0817 02:48:03.032828   20762 docker_service.go:258] Docker Info: &{ID:XYMC:V3Q7:G2VP:P4XC:E6YF:XAAF:2RT2:Q35Y:NTR3:7WXL:QQ4B:GH3D Containers:0 ContainersRunning:0 ContainersPaused:0 ContainersStopped:0 Images:13 Driver:overlay2 DriverStatus:[[Backing Filesystem extfs] [Supports d_type true] [Native Overlay Diff true] [userxattr false]] SystemStatus:[] Plugins:{Volume:[local] Network:[bridge host ipvlan macvlan null overlay] Authorization:[] Log:[awslogs fluentd gcplogs gelf journald json-file local logentries splunk syslog]} MemoryLimit:true SwapLimit:true KernelMemory:true KernelMemoryTCP:true CPUCfsPeriod:true CPUCfsQuota:true CPUShares:true CPUSet:true PidsLimit:true IPv4Forwarding:true BridgeNfIptables:true BridgeNfIP6tables:true Debug:false NFd:22 OomKillDisable:true NGoroutines:35 SystemTime:2024-08-17T02:48:03.024057643+08:00 LoggingDriver:json-file CgroupDriver:systemd NEventsListener:0 KernelVersion:6.8.0-40-generic OperatingSystem:Ubuntu 22.04.4 LTS OSType:linux Architecture:x86_64 IndexServerAddress:https://index.docker.io/v1/ RegistryConfig:0xc0007cbd50 NCPU:36 MemTotal:134953635840 GenericResources:[] DockerRootDir:/opt/docker HTTPProxy: HTTPSProxy: NoProxy: Name:kazeserver Labels:[] ExperimentalBuild:true ServerVersion:20.10.24 ClusterStore: ClusterAdvertise: Runtimes:map[io.containerd.runc.v2:{Path:runc Args:[]} io.containerd.runtime.v1.linux:{Path:runc Args:[]} runc:{Path:runc Args:[]}] DefaultRuntime:runc Swarm:{NodeID: NodeAddr: LocalNodeState:inactive ControlAvailable:false Error: RemoteManagers:[] Nodes:0 Managers:0 Cluster:<nil> Warnings:[]} LiveRestoreEnabled:true Isolation: InitBinary:docker-init ContainerdCommit:{ID:2a1d4dbdb2a1030dc5b01e96fb110a9d9f150ecc Expected:2a1d4dbdb2a1030dc5b01e96fb110a9d9f150ecc} RuncCommit:{ID:v1.0.3-0-gf46b6ba Expected:v1.0.3-0-gf46b6ba} InitCommit:{ID:de40ad0 Expected:de40ad0} SecurityOptions:[name=apparmor name=seccomp,profile=default] ProductLicense: Warnings:[]}
F0817 02:48:03.033610   20762 server.go:273] failed to run Kubelet: failed to create kubelet: misconfiguration: kubelet cgroup driver: "cgroupfs" is different from docker cgroup driver: "systemd"

Output of docker info | grep systemd: Cgroup Driver: systemd

/etc/systemd/system/kubelet.service.d/10-kubeadm.conf: Environment="KUBELET_KUBECONFIG_ARGS=--bootstrap-kubeconfig=/etc/kubernetes/bootstrap-kubelet.conf --kubeconfig=/etc/kubernetes/kubelet.conf --cgroup-driver=cgroupfs"
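The two settings quoted above can be compared mechanically. The following is a rough sketch that extracts the driver from each piece of text and flags a mismatch. The helper names kubelet_cgroup_driver and docker_cgroup_driver are hypothetical, and the sketch only looks at the text passed in, so it says nothing about other places the kubelet may read its configuration from.

```python
import re


def kubelet_cgroup_driver(dropin_text):
    """Return the last --cgroup-driver value in a kubelet drop-in, or None."""
    found = re.findall(r"--cgroup-driver[= ]([A-Za-z]+)", dropin_text)
    return found[-1] if found else None


def docker_cgroup_driver(docker_info_text):
    """Return the value of the 'Cgroup Driver:' line from docker info output."""
    match = re.search(r"Cgroup Driver:\s*(\S+)", docker_info_text)
    return match.group(1) if match else None


# Both samples are copied from this issue.
dropin = (
    'Environment="KUBELET_KUBECONFIG_ARGS='
    "--bootstrap-kubeconfig=/etc/kubernetes/bootstrap-kubelet.conf "
    '--kubeconfig=/etc/kubernetes/kubelet.conf --cgroup-driver=cgroupfs"'
)
docker_info = "Cgroup Driver: systemd"

kubelet_driver = kubelet_cgroup_driver(dropin)
docker_driver = docker_cgroup_driver(docker_info)
mismatch = kubelet_driver != docker_driver
print(f"kubelet={kubelet_driver} docker={docker_driver} mismatch={mismatch}")
```

For the values in this issue the sketch reports a mismatch, which is the same condition the kubelet rejects at startup.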

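I tried changing native.cgroupdriver=systemd to native.cgroupdriver=cgroupfs in /etc/docker/daemon.json, after which the kubelet returned to normal. However, when I re-ran the installation script, the same error appeared again. Checking /etc/docker/daemon.json at that point showed that my native.cgroupdriver=cgroupfs change had been reverted to native.cgroupdriver=systemd.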
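For reference, the manual daemon.json edit described above can be sketched as a small text transformation. This assumes the driver is set through the exec-opts list, which is where Docker reads native.cgroupdriver. The function name set_cgroup_driver and the sample file contents are hypothetical, since the real daemon.json is not shown in this issue. As noted above, the installer rewrote the file on the next run, so this edit alone did not persist.

```python
import json


def set_cgroup_driver(daemon_json_text, driver):
    """Return daemon.json text with native.cgroupdriver set to `driver`."""
    config = json.loads(daemon_json_text) if daemon_json_text.strip() else {}
    # Drop any existing native.cgroupdriver entry, keep other exec-opts.
    opts = [
        opt
        for opt in config.get("exec-opts", [])
        if not opt.startswith("native.cgroupdriver=")
    ]
    opts.append(f"native.cgroupdriver={driver}")
    config["exec-opts"] = opts
    return json.dumps(config, indent=2)


# Hypothetical file contents, for illustration only.
sample = '{"exec-opts": ["native.cgroupdriver=systemd"], "data-root": "/opt/docker"}'
updated = set_cgroup_driver(sample, "cgroupfs")
print(updated)
```

The function only rewrites text; writing the result back and restarting Docker are left out on purpose.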

I also tried editing /etc/systemd/system/kubelet.service.d/10-kubeadm.conf, changing Environment="KUBELET_KUBECONFIG_ARGS=--bootstrap-kubeconfig=/etc/kubernetes/bootstrap-kubelet.conf --kubeconfig=/etc/kubernetes/kubelet.conf --cgroup-driver=cgroupfs" to Environment="KUBELET_KUBECONFIG_ARGS=--bootstrap-kubeconfig=/etc/kubernetes/bootstrap-kubelet.conf --kubeconfig=/etc/kubernetes/kubelet.conf --cgroup-driver=systemd"

But after running daemon-reload and restarting the kubelet, it still fails with: failed to run Kubelet: failed to create kubelet: misconfiguration: kubelet cgroup driver: "cgroupfs" is different from docker cgroup driver: "systemd"

zexi commented 4 weeks ago

@rekazer0 https://www.cloudpods.org/docs/getting-started/onpremise/buildah-k3s Please deploy using this method for now; it does not depend on docker.