zhaileilei123 closed this issue 8 months ago
After I changed the number of CoreDNS replicas to 2 and a Pod was scheduled on the node, cni0 came up normally and the routing table loaded normally. Why is that?
[root@node1 ~]# k3s crictl ps
CONTAINER IMAGE CREATED STATE NAME ATTEMPT POD ID POD
5d885106163be 97e04611ad434 43 minutes ago Running coredns 0 c240196d50525 coredns-74448699cf-h4spv
[root@node1 ~]# ip a
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN group default qlen 1000
link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
inet 127.0.0.1/8 scope host lo
valid_lft forever preferred_lft forever
2: enp4s0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc fq_codel state UP group default qlen 1000
link/ether 28:6e:d4:89:fd:e3 brd ff:ff:ff:ff:ff:ff
inet 10.106.112.153/24 brd 10.106.112.255 scope global dynamic noprefixroute enp4s0
valid_lft 7770926sec preferred_lft 7770926sec
3: cni0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP group default qlen 1000
link/ether 9a:db:12:07:66:f9 brd ff:ff:ff:ff:ff:ff
inet 10.42.1.1/24 brd 10.42.1.255 scope global cni0
valid_lft forever preferred_lft forever
4: veth8673923a@if2: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue master cni0 state UP group default
link/ether 4e:a4:5b:25:76:5a brd ff:ff:ff:ff:ff:ff link-netns cni-b78d77f9-2647-860d-a6cd-532ee63a2755
[root@node1 ~]# ip route
default via 10.106.112.1 dev enp4s0 proto dhcp metric 100
10.42.0.0/24 via 10.106.112.154 dev enp4s0
10.42.1.0/24 dev cni0 proto kernel scope link src 10.42.1.1
10.106.112.0/24 dev enp4s0 proto kernel scope link src 10.106.112.153 metric 100
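The route table above is what flannel's host-gw backend produces: the local pod subnet (10.42.1.0/24) goes straight out the cni0 bridge, while the remote pod subnet (10.42.0.0/24) is forwarded via the peer node's host IP. A minimal sketch of the kernel's longest-prefix route selection, with the entries transcribed from the output above (the helper name is mine):

```python
import ipaddress

# (destination, next_hop, device) -- next_hop is None for directly
# connected subnets; entries transcribed from the `ip route` output.
routes = [
    (ipaddress.ip_network("10.42.0.0/24"), "10.106.112.154", "enp4s0"),  # remote pods, host-gw
    (ipaddress.ip_network("10.42.1.0/24"), None, "cni0"),                # local pods
    (ipaddress.ip_network("10.106.112.0/24"), None, "enp4s0"),           # host network
]

def lookup(ip):
    """Longest-prefix match, the same selection the kernel performs."""
    addr = ipaddress.ip_address(ip)
    matches = [r for r in routes if addr in r[0]]
    if not matches:
        return (None, "enp4s0")  # falls through to the default route
    best = max(matches, key=lambda r: r[0].prefixlen)
    return (best[1], best[2])

# A pod on the remote node is reached via the peer's host IP (host-gw):
print(lookup("10.42.0.5"))   # -> ('10.106.112.154', 'enp4s0')
# A pod on this node goes straight out the cni0 bridge:
print(lookup("10.42.1.7"))   # -> (None, 'cni0')
```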
Jan 25 11:26:52 node1 k3s[28253]: I0125 11:26:52.889783 28253 shared_informer.go:259] Caches are synced for node config
Jan 25 11:26:53 node1 k3s[28253]: time="2024-01-25T11:26:53+08:00" level=info msg="Tunnel authorizer set Kubelet Port 10250"
Jan 25 11:26:53 node1 k3s[28253]: I0125 11:26:53.483277 28253 apiserver.go:52] "Watching apiserver"
Jan 25 11:26:53 node1 k3s[28253]: I0125 11:26:53.494590 28253 kube.go:133] Node controller sync successful
Jan 25 11:26:53 node1 k3s[28253]: time="2024-01-25T11:26:53+08:00" level=info msg="Wrote flannel subnet file to /run/flannel/subnet.env"
Jan 25 11:26:53 node1 k3s[28253]: time="2024-01-25T11:26:53+08:00" level=info msg="Running flannel backend."
Jan 25 11:26:53 node1 k3s[28253]: I0125 11:26:53.499467 28253 route_network.go:55] Watching for new subnet leases
Jan 25 11:26:53 node1 k3s[28253]: I0125 11:26:53.499593 28253 watch.go:51] Batch elem [0] is { subnet.Event{Type:0, Lease:subnet.Lease{EnableIPv4:true, En>
Jan 25 11:26:53 node1 k3s[28253]: I0125 11:26:53.499675 28253 route_network.go:92] Subnet added: 10.42.0.0/24 via 10.106.112.154
Jan 25 11:26:53 node1 k3s[28253]: I0125 11:26:53.499827 28253 route_network.go:165] Route to {Ifindex: 2 Dst: 10.42.0.0/24 Src: <nil> Gw: 10.106.112.154 F>
Jan 25 11:26:53 node1 k3s[28253]: I0125 11:26:53.504808 28253 iptables.go:270] bootstrap done
Jan 25 11:26:53 node1 k3s[28253]: I0125 11:26:53.510287 28253 iptables.go:270] bootstrap done
Jan 25 11:26:53 node1 k3s[28253]: I0125 11:26:53.589743 28253 reconciler.go:169] "Reconciler: start to sync state"
Jan 25 11:31:52 node1 k3s[28253]: W0125 11:31:52.563102 28253 machine.go:65] Cannot read vendor id correctly, set empty.
Jan 25 11:32:26 node1 k3s[28253]: I0125 11:32:26.437660 28253 topology_manager.go:200] "Topology Admit Handler"
Jan 25 11:32:26 node1 k3s[28253]: I0125 11:32:26.591913 28253 reconciler.go:352] "operationExecutor.VerifyControllerAttachedVolume started for volume \"co>
Jan 25 11:32:26 node1 k3s[28253]: I0125 11:32:26.591963 28253 reconciler.go:352] "operationExecutor.VerifyControllerAttachedVolume started for volume \"cu>
Jan 25 11:32:26 node1 k3s[28253]: I0125 11:32:26.591993 28253 reconciler.go:352] "operationExecutor.VerifyControllerAttachedVolume started for volume \"ku
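The "Wrote flannel subnet file" log line refers to /run/flannel/subnet.env, a simple KEY=VALUE file describing this node's lease. A small parser sketch; the sample content is illustrative, with values assumed to match this node's 10.42.1.0/24 lease:

```python
# Illustrative contents of /run/flannel/subnet.env (values assumed).
SAMPLE = """\
FLANNEL_NETWORK=10.42.0.0/16
FLANNEL_SUBNET=10.42.1.1/24
FLANNEL_MTU=1500
FLANNEL_IPMASQ=true
"""

def parse_subnet_env(text):
    """Parse the KEY=VALUE lines of a flannel subnet.env file into a dict."""
    env = {}
    for line in text.splitlines():
        line = line.strip()
        if line and "=" in line:
            key, _, value = line.partition("=")
            env[key] = value
    return env

env = parse_subnet_env(SAMPLE)
print(env["FLANNEL_SUBNET"])  # -> 10.42.1.1/24
```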
I'm confused, is it working or not? Your log lines are all truncated to terminal width, so it's hard to tell much about what's going on, but I don't see any notable errors.
Also:
hi~ We used the --data-dir /data/k3s parameter to specify the location of the data directory, but we don't know what the effects of mixing this with the defaults will be. Our initial deployment of k3s used the k3s-ansible project. To distinguish the configurations of master and worker nodes, the worker node's service is named k3s-node.service.
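For reference, an agent unit using that naming and the --data-dir override would look roughly like this. This is an illustrative sketch, not the exact file the playbook generates; the --server address is taken from the logs above, and the token file path is an assumption:

```ini
; /etc/systemd/system/k3s-node.service (agent unit name as described above)
[Unit]
Description=Lightweight Kubernetes (agent)
After=network-online.target

[Service]
Type=exec
; --data-dir relocates everything k3s writes (containerd root, kubelet state, ...)
ExecStart=/usr/local/bin/k3s agent \
    --data-dir /data/k3s \
    --server https://10.106.112.154:6443 \
    --token-file /etc/rancher/k3s/token
Restart=always

[Install]
WantedBy=multi-user.target
```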
I'm confused too. I'll provide you with a full log:
Jan 25 10:58:42 node1 k3s[22990]: time="2024-01-25T10:58:42+08:00" level=info msg="Acquiring lock file /data/k3s/data/.lock"
Jan 25 10:58:42 node1 k3s[22990]: time="2024-01-25T10:58:42+08:00" level=info msg="Preparing data dir /data/k3s/data/8fff5401dcc2bc383e288fabfd45333c7f1ef8a6d00388f9e3cb57e5cd7e6a4a"
Jan 25 10:58:44 node1 k3s[22990]: time="2024-01-25T10:58:44+08:00" level=info msg="Starting k3s agent v1.24.17+k3s1 (026bb0ec)"
Jan 25 10:58:44 node1 k3s[22990]: time="2024-01-25T10:58:44+08:00" level=info msg="Adding server to load balancer k3s-agent-load-balancer: 10.106.112.154:6443"
Jan 25 10:58:44 node1 k3s[22990]: time="2024-01-25T10:58:44+08:00" level=info msg="Running load balancer k3s-agent-load-balancer 127.0.0.1:6444 -> [10.106.112.154:6443] [default: 10.106.112.154:6443]"
Jan 25 10:58:45 node1 k3s[22990]: time="2024-01-25T10:58:45+08:00" level=info msg="Module overlay was already loaded"
Jan 25 10:58:45 node1 k3s[22990]: time="2024-01-25T10:58:45+08:00" level=info msg="Module nf_conntrack was already loaded"
Jan 25 10:58:45 node1 k3s[22990]: time="2024-01-25T10:58:45+08:00" level=info msg="Module br_netfilter was already loaded"
Jan 25 10:58:45 node1 k3s[22990]: time="2024-01-25T10:58:45+08:00" level=info msg="Module iptable_filter was already loaded"
Jan 25 10:58:45 node1 k3s[22990]: time="2024-01-25T10:58:45+08:00" level=info msg="Set sysctl 'net/netfilter/nf_conntrack_max' to 262144"
Jan 25 10:58:45 node1 k3s[22990]: time="2024-01-25T10:58:45+08:00" level=info msg="Set sysctl 'net/netfilter/nf_conntrack_tcp_timeout_established' to 86400"
Jan 25 10:58:45 node1 k3s[22990]: time="2024-01-25T10:58:45+08:00" level=info msg="Set sysctl 'net/netfilter/nf_conntrack_tcp_timeout_close_wait' to 3600"
Jan 25 10:58:45 node1 k3s[22990]: time="2024-01-25T10:58:45+08:00" level=info msg="Using private registry config file at /etc/rancher/k3s/registries.yaml"
Jan 25 10:58:45 node1 k3s[22990]: time="2024-01-25T10:58:45+08:00" level=info msg="Logging containerd to /data/k3s/agent/containerd/containerd.log"
Jan 25 10:58:45 node1 k3s[22990]: time="2024-01-25T10:58:45+08:00" level=info msg="Running containerd -c /data/k3s/agent/etc/containerd/config.toml -a /run/k3s/containerd/containerd.sock --state /run/k3s/containerd --root /data/k3s/agent/containerd"
Jan 25 10:58:46 node1 k3s[22990]: time="2024-01-25T10:58:46+08:00" level=info msg="containerd is now running"
Jan 25 10:58:46 node1 k3s[22990]: time="2024-01-25T10:58:46+08:00" level=info msg="Getting list of apiserver endpoints from server"
Jan 25 10:58:47 node1 k3s[22990]: time="2024-01-25T10:58:47+08:00" level=info msg="Updated load balancer k3s-agent-load-balancer default server address -> 10.106.112.154:6443"
Jan 25 10:58:47 node1 k3s[22990]: time="2024-01-25T10:58:47+08:00" level=info msg="Connecting to proxy" url="wss://10.106.112.154:6443/v1-k3s/connect"
Jan 25 10:58:47 node1 k3s[22990]: time="2024-01-25T10:58:47+08:00" level=info msg="Running kubelet --address=0.0.0.0 --allowed-unsafe-sysctls=net.ipv4.ip_forward,net.ipv6.conf.all.forwarding --anonymous-auth=false --authentication-token-webhook=true --authorization-mode=Webhook --cgroup-driver=systemd --client-ca-file=/data/k3s/agent/client-ca.crt --cluster-dns=10.43.0.10 --cluster-domain=cluster.local --container-runtime-endpoint=unix:///run/k3s/containerd/containerd.sock --containerd=/run/k3s/containerd/containerd.sock --eviction-hard=imagefs.available<5%,nodefs.available<5% --eviction-minimum-reclaim=imagefs.available=10%,nodefs.available=10% --fail-swap-on=false --healthz-bind-address=127.0.0.1 --hostname-override=node1 --kubeconfig=/data/k3s/agent/kubelet.kubeconfig --node-labels= --pod-infra-container-image=rancher/mirrored-pause:3.6 --pod-manifest-path=/data/k3s/agent/pod-manifests --read-only-port=0 --resolv-conf=/etc/resolv.conf --serialize-image-pulls=false --tls-cert-file=/data/k3s/agent/serving-kubelet.crt --tls-private-key-file=/data/k3s/agent/serving-kubelet.key"
Jan 25 10:58:47 node1 k3s[22990]: Flag --containerd has been deprecated, This is a cadvisor flag that was mistakenly registered with the Kubelet. Due to legacy concerns, it will follow the standard CLI deprecation timeline before being removed.
Jan 25 10:58:47 node1 k3s[22990]: Flag --pod-infra-container-image has been deprecated, will be removed in 1.27. Image garbage collector will get sandbox image information from CRI.
Jan 25 10:58:47 node1 k3s[22990]: I0125 10:58:47.083897 22990 server.go:192] "--pod-infra-container-image will not be pruned by the image garbage collector in kubelet and should also be set in the remote runtime"
Jan 25 10:58:47 node1 k3s[22990]: I0125 10:58:47.094861 22990 server.go:395] "Kubelet version" kubeletVersion="v1.24.17+k3s1"
Jan 25 10:58:47 node1 k3s[22990]: I0125 10:58:47.094893 22990 server.go:397] "Golang settings" GOGC="" GOMAXPROCS="" GOTRACEBACK=""
Jan 25 10:58:47 node1 k3s[22990]: I0125 10:58:47.096593 22990 dynamic_cafile_content.go:157] "Starting controller" name="client-ca-bundle::/data/k3s/agent/client-ca.crt"
Jan 25 10:58:47 node1 k3s[22990]: W0125 10:58:47.099012 22990 machine.go:65] Cannot read vendor id correctly, set empty.
Jan 25 10:58:47 node1 k3s[22990]: I0125 10:58:47.101039 22990 server.go:644] "--cgroups-per-qos enabled, but --cgroup-root was not specified. defaulting to /"
Jan 25 10:58:47 node1 k3s[22990]: I0125 10:58:47.101464 22990 container_manager_linux.go:262] "Container manager verified user specified cgroup-root exists" cgroupRoot=[]
Jan 25 10:58:47 node1 k3s[22990]: I0125 10:58:47.101574 22990 container_manager_linux.go:267] "Creating Container Manager object based on Node Config" nodeConfig={RuntimeCgroupsName: SystemCgroupsName: KubeletCgroupsName: KubeletOOMScoreAdj:-999 ContainerRuntime: CgroupsPerQOS:true CgroupRoot:/ CgroupDriver:systemd KubeletRootDir:/var/lib/kubelet ProtectKernelDefaults:false NodeAllocatableConfig:{KubeReservedCgroupName: SystemReservedCgroupName: ReservedSystemCPUs: EnforceNodeAllocatable:map[pods:{}] KubeReserved:map[] SystemReserved:map[] HardEvictionThresholds:[{Signal:imagefs.available Operator:LessThan Value:{Quantity:
Below is the full log after I changed the number of CoreDNS replicas to 2:
Jan 25 11:26:52 node1 k3s[28253]: I0125 11:26:52.889748 28253 shared_informer.go:259] Caches are synced for endpoint slice config
Jan 25 11:26:52 node1 k3s[28253]: I0125 11:26:52.889770 28253 shared_informer.go:259] Caches are synced for service config
Jan 25 11:26:52 node1 k3s[28253]: I0125 11:26:52.889783 28253 shared_informer.go:259] Caches are synced for node config
Jan 25 11:26:53 node1 k3s[28253]: time="2024-01-25T11:26:53+08:00" level=info msg="Tunnel authorizer set Kubelet Port 10250"
Jan 25 11:26:53 node1 k3s[28253]: I0125 11:26:53.483277 28253 apiserver.go:52] "Watching apiserver"
Jan 25 11:26:53 node1 k3s[28253]: I0125 11:26:53.494590 28253 kube.go:133] Node controller sync successful
Jan 25 11:26:53 node1 k3s[28253]: time="2024-01-25T11:26:53+08:00" level=info msg="Wrote flannel subnet file to /run/flannel/subnet.env"
Jan 25 11:26:53 node1 k3s[28253]: time="2024-01-25T11:26:53+08:00" level=info msg="Running flannel backend."
Jan 25 11:26:53 node1 k3s[28253]: I0125 11:26:53.499467 28253 route_network.go:55] Watching for new subnet leases
Jan 25 11:26:53 node1 k3s[28253]: I0125 11:26:53.499593 28253 watch.go:51] Batch elem [0] is { subnet.Event{Type:0, Lease:subnet.Lease{EnableIPv4:true, EnableIPv6:false, Subnet:ip.IP4Net{IP:0xa2a0000, PrefixLen:0x18}, IPv6Subnet:ip.IP6Net{IP:(ip.IP6)(nil), PrefixLen:0x0}, Attrs:subnet.LeaseAttrs{PublicIP:0xa6a709a, PublicIPv6:(ip.IP6)(nil), BackendType:"host-gw", BackendData:json.RawMessage{0x6e, 0x75, 0x6c, 0x6c}, BackendV6Data:json.RawMessage(nil)}, Expiration:time.Date(1, time.January, 1, 0, 0, 0, 0, time.UTC), Asof:0}} }
Jan 25 11:26:53 node1 k3s[28253]: I0125 11:26:53.499675 28253 route_network.go:92] Subnet added: 10.42.0.0/24 via 10.106.112.154
Jan 25 11:26:53 node1 k3s[28253]: I0125 11:26:53.499827 28253 route_network.go:165] Route to {Ifindex: 2 Dst: 10.42.0.0/24 Src:
What is the problem that you're attempting to solve here? I don't see anything wrong.
Note that k3s doesn't have masters and workers, it has servers and agents. If you try to use different names for the two node types, that may cause additional confusion.
These are mostly intermittent occurrences. With flannel in vxlan or host-gw mode, the cni0 virtual NIC is sometimes not brought up on the node. After deploying the k3s cluster, I ping the cni0 IP address to check inter-host connectivity in the host-gw scenario; if the address has not been assigned, the ping fails.
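One way to automate that check is to look for cni0 and its address in the one-line output of `ip -o -4 addr show`. A small sketch; the helper name and sample strings are mine:

```python
import re

def cni0_ipv4(ip_addr_output):
    """Return the IPv4 CIDR on cni0 from `ip -o -4 addr show` output,
    or None if the interface is missing or has no address assigned."""
    for line in ip_addr_output.splitlines():
        m = re.match(r"\s*\d+:\s+cni0\s+inet\s+(\S+)", line)
        if m:
            return m.group(1)
    return None

# Healthy node: cni0 present with an address (sample `-o` format line):
healthy = "3: cni0    inet 10.42.1.1/24 brd 10.42.1.255 scope global cni0\\       valid_lft forever preferred_lft forever"
print(cni0_ipv4(healthy))  # -> 10.42.1.1/24

# Broken node: cni0 absent, so there is nothing to ping.
print(cni0_ipv4(""))       # -> None
```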
I don't believe any of these are issues with k3s though; this is just how flannel works. If you would like to see this behavior changed somehow, I would suggest opening an issue with the flannel project.
Environmental Info: K3s Version: v1.24.17+k3s1
Node(s) CPU architecture, OS, and Version: aarch64
Cluster Configuration: 1 master, 1 node
Describe the bug: The k3s service is running, but the k3s-node service is not found.
The flannel mode I chose was host-gw; vxlan had the same problem before. Why?
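For context, the flannel backend is selected on the k3s server, either with the `--flannel-backend` flag or in the config file. A minimal config fragment (the file path is the k3s default; treat the value shown as an example):

```yaml
# /etc/rancher/k3s/config.yaml on the server node
# host-gw installs per-node routes (as seen in `ip route` output);
# vxlan (the default) encapsulates pod traffic instead.
flannel-backend: host-gw
```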
Steps To Reproduce:
Expected behavior:
Actual behavior:
Additional context / logs: