k3s-io / k3s

Lightweight Kubernetes
https://k3s.io
Apache License 2.0

cni0 is not loaded properly #9294

Closed zhaileilei123 closed 8 months ago

zhaileilei123 commented 8 months ago

Environmental Info: K3s Version: v1.24.17+k3s1

# k3s -v
k3s version v1.24.17+k3s1 (026bb0ec)
go version go1.20.7

Node(s) CPU architecture, OS, and Version: aarch64

Cluster Configuration: 1 master, 1 node

Describe the bug: The k3s service is running, but cni0 is missing:

ip a
[root@node1 aux]# ip a
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN group default qlen 1000
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
    inet 127.0.0.1/8 scope host lo
       valid_lft forever preferred_lft forever
2: enp4s0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc fq_codel state UP group default qlen 1000
    link/ether 28:6e:d4:89:fd:e3 brd ff:ff:ff:ff:ff:ff
    inet 10.106.xx.xx/24 brd 10.106.112.255 scope global dynamic noprefixroute enp4s0
       valid_lft 7773865sec preferred_lft 7773865sec

cni0 is not found, even though the k3s-node service is running:

[root@node1 ~]# systemctl status k3s-node
● k3s-node.service - Lightweight Kubernetes
   Loaded: loaded (/etc/systemd/system/k3s-node.service; enabled; vendor preset: disabled)
   Active: active (running) since Thu 2024-01-25 11:26:52 CST; 37min ago
     Docs: https://k3s.io
  Process: 28249 ExecStartPre=/sbin/modprobe br_netfilter (code=exited, status=0/SUCCESS)
  Process: 28251 ExecStartPre=/sbin/modprobe overlay (code=exited, status=0/SUCCESS)
 Main PID: 28253 (k3s-agent)
    Tasks: 44
   Memory: 91.8M
   CGroup: /system.slice/k3s-node.service
           ├─28253 /usr/local/bin/k3s agent
           ├─28289 containerd -c /data/k3s/agent/etc/containerd/config.toml -a /run/k3s/containerd/containerd.sock --state /run/k3s/containerd --root /data/>
           └─32569 /var/lib/rancher/k3s/data/8fff5401dcc2bc383e288fabfd45333c7f1ef8a6d00388f9e3cb57e5cd7e6a4a/bin/containerd-shim-runc-v2 -namespace k8s.io >

1月 25 11:32:26 node1 k3s[28253]: I0125 11:32:26.437660   28253 topology_manager.go:200] "Topology Admit Handler"
1月 25 11:32:26 node1 k3s[28253]: I0125 11:32:26.591913   28253 reconciler.go:352] "operationExecutor.VerifyControllerAttachedVolume started for volume \"co>
1月 25 11:32:26 node1 k3s[28253]: I0125 11:32:26.591963   28253 reconciler.go:352] "operationExecutor.VerifyControllerAttachedVolume started for volume \"cu>
1月 25 11:32:26 node1 k3s[28253]: I0125 11:32:26.591993   28253 reconciler.go:352] "operationExecutor.VerifyControllerAttachedVolume started for volume \"ku>
1月 25 11:36:52 node1 k3s[28253]: W0125 11:36:52.570832   28253 machine.go:65] Cannot read vendor id correctly, set empty.
1月 25 11:41:52 node1 k3s[28253]: W0125 11:41:52.562355   28253 machine.go:65] Cannot read vendor id correctly, set empty.
1月 25 11:46:52 node1 k3s[28253]: W0125 11:46:52.563485   28253 machine.go:65] Cannot read vendor id correctly, set empty.
1月 25 11:51:52 node1 k3s[28253]: W0125 11:51:52.562643   28253 machine.go:65] Cannot read vendor id correctly, set empty.
1月 25 11:56:52 node1 k3s[28253]: W0125 11:56:52.562238   28253 machine.go:65] Cannot read vendor id correctly, set empty.
1月 25 12:01:52 node1 k3s[28253]: W0125 12:01:52.567483   28253 machine.go:65] Cannot read vendor id correctly, set empty.

ip route 

default via 10.106.112.1 dev enp4s0 proto dhcp metric 100
10.42.0.0/24 via 10.106.112.154 dev enp4s0
10.106.xxx.0/24 dev enp4s0 proto kernel scope link src 10.106.xx.xx metric 100
[root@node1 aux]# ping 10.42.0.1
PING 10.42.0.1 (10.42.0.1) 56(84) bytes of data.
64 bytes from 10.42.0.1: icmp_seq=1 ttl=64 time=0.199 ms
^C
--- 10.42.0.1 ping statistics ---
1 packets transmitted, 1 received, 0% packet loss, time 0ms
rtt min/avg/max/mdev = 0.199/0.199/0.199/0.000 ms
[root@node1 aux]# ping 10.42.1.1
PING 10.42.1.1 (10.42.1.1) 56(84) bytes of data.
^C
--- 10.42.1.1 ping statistics ---
2 packets transmitted, 0 received, 100% packet loss, time 1015ms

For flannel, I chose the host-gw backend; vxlan had the same problem before.

Why does this happen?
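
For reference, the flannel backend is selected on the k3s server; below is a minimal sketch of how host-gw is typically enabled, assuming the default config file path (the flannel-backend option is documented, but the exact file contents and unit name here are assumptions, not taken from this issue):

# On the server node: select the host-gw flannel backend
# (equivalent to passing --flannel-backend=host-gw to `k3s server`)
cat <<'EOF' | sudo tee -a /etc/rancher/k3s/config.yaml
flannel-backend: host-gw
EOF
sudo systemctl restart k3s   # server unit name may differ in this setup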

Steps To Reproduce:

Expected behavior:

Actual behavior:

Additional context / logs:

k3s startup log:
1月 25 10:58:42 node1 systemd[1]: Starting Lightweight Kubernetes...
1月 25 10:58:42 node1 k3s[22990]: time="2024-01-25T10:58:42+08:00" level=info msg="Acquiring lock file /data/k3s/data/.lock"
1月 25 10:58:42 node1 k3s[22990]: time="2024-01-25T10:58:42+08:00" level=info msg="Preparing data dir /data/k3s/data/8fff5401dcc2bc383e288fabfd45333c7f1ef8a>
1月 25 10:58:44 node1 k3s[22990]: time="2024-01-25T10:58:44+08:00" level=info msg="Starting k3s agent v1.24.17+k3s1 (026bb0ec)"
1月 25 10:58:44 node1 k3s[22990]: time="2024-01-25T10:58:44+08:00" level=info msg="Adding server to load balancer k3s-agent-load-balancer: 10.106.112.154:64>
1月 25 10:58:44 node1 k3s[22990]: time="2024-01-25T10:58:44+08:00" level=info msg="Running load balancer k3s-agent-load-balancer 127.0.0.1:6444 -> [10.106.1>
1月 25 10:58:45 node1 k3s[22990]: time="2024-01-25T10:58:45+08:00" level=info msg="Module overlay was already loaded"
1月 25 10:58:45 node1 k3s[22990]: time="2024-01-25T10:58:45+08:00" level=info msg="Module nf_conntrack was already loaded"
1月 25 10:58:45 node1 k3s[22990]: time="2024-01-25T10:58:45+08:00" level=info msg="Module br_netfilter was already loaded"
1月 25 10:58:45 node1 k3s[22990]: time="2024-01-25T10:58:45+08:00" level=info msg="Module iptable_filter was already loaded"
1月 25 10:58:45 node1 k3s[22990]: time="2024-01-25T10:58:45+08:00" level=info msg="Set sysctl 'net/netfilter/nf_conntrack_max' to 262144"
1月 25 10:58:45 node1 k3s[22990]: time="2024-01-25T10:58:45+08:00" level=info msg="Set sysctl 'net/netfilter/nf_conntrack_tcp_timeout_established' to 86400"
1月 25 10:58:45 node1 k3s[22990]: time="2024-01-25T10:58:45+08:00" level=info msg="Set sysctl 'net/netfilter/nf_conntrack_tcp_timeout_close_wait' to 3600"
1月 25 10:58:45 node1 k3s[22990]: time="2024-01-25T10:58:45+08:00" level=info msg="Using private registry config file at /etc/rancher/k3s/registries.yaml"
1月 25 10:58:45 node1 k3s[22990]: time="2024-01-25T10:58:45+08:00" level=info msg="Logging containerd to /data/k3s/agent/containerd/containerd.log"
1月 25 10:58:45 node1 k3s[22990]: time="2024-01-25T10:58:45+08:00" level=info msg="Running containerd -c /data/k3s/agent/etc/containerd/config.toml -a /run/>
1月 25 10:58:46 node1 k3s[22990]: time="2024-01-25T10:58:46+08:00" level=info msg="containerd is now running"
1月 25 10:58:46 node1 k3s[22990]: time="2024-01-25T10:58:46+08:00" level=info msg="Getting list of apiserver endpoints from server"
1月 25 10:58:47 node1 k3s[22990]: time="2024-01-25T10:58:47+08:00" level=info msg="Updated load balancer k3s-agent-load-balancer default server address -> 1>
1月 25 10:58:47 node1 k3s[22990]: time="2024-01-25T10:58:47+08:00" level=info msg="Connecting to proxy" url="wss://10.106.112.154:6443/v1-k3s/connect"
1月 25 10:58:47 node1 k3s[22990]: time="2024-01-25T10:58:47+08:00" level=info msg="Running kubelet --address=0.0.0.0 --allowed-unsafe-sysctls=net.ipv4.ip_fo>
1月 25 10:58:47 node1 k3s[22990]: Flag --containerd has been deprecated, This is a cadvisor flag that was mistakenly registered with the Kubelet. Due to leg>
1月 25 10:58:47 node1 k3s[22990]: Flag --pod-infra-container-image has been deprecated, will be removed in 1.27. Image garbage collector will get sandbox im>
1月 25 10:58:47 node1 k3s[22990]: I0125 10:58:47.083897   22990 server.go:192] "--pod-infra-container-image will not be pruned by the image garbage collecto>
1月 25 10:58:47 node1 k3s[22990]: I0125 10:58:47.094861   22990 server.go:395] "Kubelet version" kubeletVersion="v1.24.17+k3s1"
1月 25 10:58:47 node1 k3s[22990]: I0125 10:58:47.094893   22990 server.go:397] "Golang settings" GOGC="" GOMAXPROCS="" GOTRACEBACK=""
1月 25 10:58:47 node1 k3s[22990]: I0125 10:58:47.096593   22990 dynamic_cafile_content.go:157] "Starting controller" name="client-ca-bundle::/data/k3s/agent>
1月 25 10:58:47 node1 k3s[22990]: W0125 10:58:47.099012   22990 machine.go:65] Cannot read vendor id correctly, set empty.
1月 25 10:58:47 node1 k3s[22990]: I0125 10:58:47.101039   22990 server.go:644] "--cgroups-per-qos enabled, but --cgroup-root was not specified.  defaulting >
1月 25 10:58:47 node1 k3s[22990]: I0125 10:58:47.101464   22990 container_manager_linux.go:262] "Container manager verified user specified cgroup-root exist>
1月 25 10:58:47 node1 k3s[22990]: I0125 10:58:47.101574   22990 container_manager_linux.go:267] "Creating Container Manager object based on Node Config" nod>
1月 25 10:58:47 node1 k3s[22990]: I0125 10:58:47.101632   22990 topology_manager.go:133] "Creating topology manager with policy per scope" topologyPolicyNam>
1月 25 10:58:47 node1 k3s[22990]: I0125 10:58:47.101649   22990 container_manager_linux.go:302] "Creating device plugin manager" devicePluginEnabled=true
1月 25 10:58:47 node1 k3s[22990]: I0125 10:58:47.101736   22990 state_mem.go:36] "Initialized new in-memory state store"
1月 25 10:58:47 node1 k3s[22990]: I0125 10:58:47.104310   22990 kubelet.go:376] "Attempting to sync node with API server"
1月 25 10:58:47 node1 k3s[22990]: I0125 10:58:47.104347   22990 kubelet.go:267] "Adding static pod path" path="/data/k3s/agent/pod-manifests"
1月 25 10:58:47 node1 k3s[22990]: I0125 10:58:47.104409   22990 kubelet.go:278] "Adding apiserver pod source"
1月 25 10:58:47 node1 k3s[22990]: I0125 10:58:47.104445   22990 apiserver.go:42] "Waiting for node sync before watching apiserver pods"
1月 25 10:58:47 node1 k3s[22990]: I0125 10:58:47.105120   22990 kuberuntime_manager.go:239] "Container runtime initialized" containerRuntime="containerd" ve>
1月 25 10:58:47 node1 k3s[22990]: W0125 10:58:47.105310   22990 probe.go:268] Flexvolume plugin directory at /usr/libexec/kubernetes/kubelet-plugins/volume/>
1月 25 10:58:47 node1 k3s[22990]: I0125 10:58:47.105761   22990 server.go:1179] "Started kubelet"
1月 25 10:58:47 node1 k3s[22990]: I0125 10:58:47.105972   22990 server.go:150] "Starting to listen" address="0.0.0.0" port=10250
1月 25 10:58:47 node1 k3s[22990]: E0125 10:58:47.106481   22990 cri_stats_provider.go:455] "Failed to get the info of the filesystem with mountpoint" err="u>
1月 25 10:58:47 node1 k3s[22990]: E0125 10:58:47.106509   22990 kubelet.go:1298] "Image garbage collection failed once. Stats initialization may not have co>
1月 25 10:58:47 node1 k3s[22990]: I0125 10:58:47.106926   22990 server.go:410] "Adding debug handlers to kubelet server"
1月 25 10:58:47 node1 k3s[22990]: I0125 10:58:47.108569   22990 fs_resource_analyzer.go:67] "Starting FS ResourceAnalyzer"
1月 25 10:58:47 node1 k3s[22990]: I0125 10:58:47.108680   22990 volume_manager.go:292] "Starting Kubelet Volume Manager"
1月 25 10:58:47 node1 k3s[22990]: I0125 10:58:47.109110   22990 desired_state_of_world_populator.go:149] "Desired state populator starts to run"
1月 25 10:58:47 node1 k3s[22990]: E0125 10:58:47.117315   22990 nodelease.go:49] "Failed to get node when trying to set owner ref to the node lease" err="no>
1月 25 10:58:47 node1 k3s[22990]: I0125 10:58:47.147616   22990 kubelet_network_linux.go:76] "Initialized protocol iptables rules." protocol=IPv4
1月 25 10:58:47 node1 k3s[22990]: I0125 10:58:47.197650   22990 kubelet_network_linux.go:76] "Initialized protocol iptables rules." protocol=IPv6
1月 25 10:58:47 node1 k3s[22990]: I0125 10:58:47.197688   22990 status_manager.go:161] "Starting to sync pod status with apiserver"
1月 25 10:58:47 node1 k3s[22990]: I0125 10:58:47.197710   22990 kubelet.go:1989] "Starting kubelet main sync loop"
1月 25 10:58:47 node1 k3s[22990]: E0125 10:58:47.197762   22990 kubelet.go:2013] "Skipping pod synchronization" err="[container runtime status check may not>
1月 25 10:58:47 node1 k3s[22990]: I0125 10:58:47.207766   22990 cpu_manager.go:213] "Starting CPU manager" policy="none"
1月 25 10:58:47 node1 k3s[22990]: I0125 10:58:47.207793   22990 cpu_manager.go:214] "Reconciling" reconcilePeriod="10s"
1月 25 10:58:47 node1 k3s[22990]: I0125 10:58:47.207817   22990 state_mem.go:36] "Initialized new in-memory state store"
1月 25 10:58:47 node1 k3s[22990]: I0125 10:58:47.208963   22990 policy_none.go:49] "None policy: Start"
1月 25 10:58:47 node1 k3s[22990]: E0125 10:58:47.209086   22990 kubelet.go:2427] "Error getting node" err="node \"node1\" not found"
1月 25 10:58:47 node1 k3s[22990]: I0125 10:58:47.209779   22990 memory_manager.go:168] "Starting memorymanager" policy="None"
1月 25 10:58:47 node1 k3s[22990]: I0125 10:58:47.209814   22990 state_mem.go:35] "Initializing new in-memory state store"
1月 25 10:58:47 node1 k3s[22990]: I0125 10:58:47.210427   22990 kubelet_node_status.go:70] "Attempting to register node" node="node1"
1月 25 10:58:47 node1 k3s[22990]: I0125 10:58:47.224732   22990 manager.go:611] "Failed to read data from checkpoint" checkpoint="kubelet_internal_checkpoin>
1月 25 10:58:47 node1 k3s[22990]: I0125 10:58:47.224939   22990 plugin_manager.go:114] "Starting Kubelet Plugin Manager"
1月 25 10:58:47 node1 k3s[22990]: E0125 10:58:47.226402   22990 eviction_manager.go:254] "Eviction manager: failed to get summary stats" err="failed to get >
1月 25 10:58:47 node1 k3s[22990]: E0125 10:58:47.309392   22990 kubelet.go:2427] "Error getting node" err="node \"node1\" not found"
1月 25 10:58:47 node1 k3s[22990]: time="2024-01-25T10:58:47+08:00" level=info msg="Running kube-proxy --cluster-cidr=10.42.0.0/16 --conntrack-max-per-core=0>
1月 25 10:58:47 node1 k3s[22990]: I0125 10:58:47.373222   22990 server.go:230] "Warning, all flags other than --config, --write-config-to, and --cleanup are>
1月 25 10:58:47 node1 k3s[22990]: E0125 10:58:47.391487   22990 node.go:152] Failed to retrieve node info: nodes "node1" not found
1月 25 10:58:47 node1 k3s[22990]: E0125 10:58:47.409494   22990 kubelet.go:2427] "Error getting node" err="node \"node1\" not found"
1月 25 10:58:47 node1 k3s[22990]: E0125 10:58:47.510450   22990 kubelet.go:2427] "Error getting node" err="node \"node1\" not found"
1月 25 10:58:47 node1 k3s[22990]: I0125 10:58:47.512501   22990 kubelet_node_status.go:73] "Successfully registered node" node="node1"
1月 25 10:58:47 node1 k3s[22990]: time="2024-01-25T10:58:47+08:00" level=info msg="Failed to set annotations and labels on node node1: Operation cannot be f>
1月 25 10:58:47 node1 k3s[22990]: time="2024-01-25T10:58:47+08:00" level=info msg="Failed to set annotations and labels on node node1: Operation cannot be f>
1月 25 10:58:47 node1 k3s[22990]: time="2024-01-25T10:58:47+08:00" level=info msg="Failed to set annotations and labels on node node1: Operation cannot be f>
1月 25 10:58:47 node1 k3s[22990]: time="2024-01-25T10:58:47+08:00" level=info msg="Failed to set annotations and labels on node node1: Operation cannot be f>
1月 25 10:58:47 node1 k3s[22990]: time="2024-01-25T10:58:47+08:00" level=info msg="Annotations and labels have been set successfully on node: node1"
1月 25 10:58:47 node1 k3s[22990]: time="2024-01-25T10:58:47+08:00" level=info msg="Starting flannel with backend host-gw"
1月 25 10:58:47 node1 k3s[22990]: time="2024-01-25T10:58:47+08:00" level=info msg="Flannel found PodCIDR assigned for node node1"
1月 25 10:58:47 node1 k3s[22990]: time="2024-01-25T10:58:47+08:00" level=info msg="The interface enp4s0 with ipv4 address 10.106.112.153 will be used by fla>
1月 25 10:58:47 node1 k3s[22990]: I0125 10:58:47.567994   22990 kube.go:126] Waiting 10m0s for node controller to sync
1月 25 10:58:47 node1 k3s[22990]: I0125 10:58:47.568046   22990 kube.go:434] Starting kube subnet manager
1月 25 10:58:47 node1 k3s[22990]: I0125 10:58:47.571616   22990 kube.go:455] Creating the node lease for IPv4. This is the n.Spec.PodCIDRs: [10.42.0.0/24]
1月 25 10:58:47 node1 k3s[22990]: I0125 10:58:47.611342   22990 kuberuntime_manager.go:1095] "Updating runtime config through cri with podcidr" CIDR="10.42.>
1月 25 10:58:47 node1 k3s[22990]: I0125 10:58:47.612435   22990 kubelet_network.go:60] "Updating Pod CIDR" originalPodCIDR="" newPodCIDR="10.42.1.0/24"
1月 25 10:58:47 node1 k3s[22990]: time="2024-01-25T10:58:47+08:00" level=info msg="Starting the netpol controller version , built on , go1.20.7"
1月 25 10:58:47 node1 k3s[22990]: time="2024-01-25T10:58:47+08:00" level=info msg="k3s agent is up and running"
1月 25 10:58:47 node1 k3s[22990]: I0125 10:58:47.674433   22990 network_policy_controller.go:163] Starting network policy controller
1月 25 10:58:47 node1 systemd[1]: Started Lightweight Kubernetes.
1月 25 10:58:47 node1 k3s[22990]: I0125 10:58:47.694341   22990 network_policy_controller.go:175] Starting network policy controller full sync goroutine
1月 25 10:58:48 node1 k3s[22990]: I0125 10:58:48.104841   22990 apiserver.go:52] "Watching apiserver"
1月 25 10:58:48 node1 k3s[22990]: I0125 10:58:48.515323   22990 reconciler.go:169] "Reconciler: start to sync state"
1月 25 10:58:48 node1 k3s[22990]: I0125 10:58:48.559577   22990 node.go:163] Successfully retrieved node IP: 10.106.112.153
1月 25 10:58:48 node1 k3s[22990]: I0125 10:58:48.559616   22990 server_others.go:138] "Detected node IP" address="10.106.112.153"
1月 25 10:58:48 node1 k3s[22990]: I0125 10:58:48.562188   22990 server_others.go:208] "Using iptables Proxier"
1月 25 10:58:48 node1 k3s[22990]: I0125 10:58:48.562216   22990 server_others.go:215] "kube-proxy running in dual-stack mode" ipFamily=IPv4
1月 25 10:58:48 node1 k3s[22990]: I0125 10:58:48.562227   22990 server_others.go:216] "Creating dualStackProxier for iptables"
1月 25 10:58:48 node1 k3s[22990]: I0125 10:58:48.562240   22990 server_others.go:515] "Detect-local-mode set to ClusterCIDR, but no IPv6 cluster CIDR define>
1月 25 10:58:48 node1 k3s[22990]: I0125 10:58:48.562263   22990 proxier.go:259] "Setting route_localnet=1, use nodePortAddresses to filter loopback addresse>
1月 25 10:58:48 node1 k3s[22990]: I0125 10:58:48.562385   22990 proxier.go:259] "Setting route_localnet=1, use nodePortAddresses to filter loopback addresse>
1月 25 10:58:48 node1 k3s[22990]: I0125 10:58:48.562608   22990 server.go:662] "Version info" version="v1.24.17+k3s1"
1月 25 10:58:48 node1 k3s[22990]: I0125 10:58:48.562626   22990 server.go:664] "Golang settings" GOGC="" GOMAXPROCS="" GOTRACEBACK=""
1月 25 10:58:48 node1 k3s[22990]: I0125 10:58:48.563651   22990 config.go:226] "Starting endpoint slice config controller"
1月 25 10:58:48 node1 k3s[22990]: I0125 10:58:48.563668   22990 config.go:317] "Starting service config controller"
1月 25 10:58:48 node1 k3s[22990]: I0125 10:58:48.563692   22990 shared_informer.go:252] Waiting for caches to sync for service config
1月 25 10:58:48 node1 k3s[22990]: I0125 10:58:48.563677   22990 config.go:444] "Starting node config controller"
1月 25 10:58:48 node1 k3s[22990]: I0125 10:58:48.563740   22990 shared_informer.go:252] Waiting for caches to sync for node config
1月 25 10:58:48 node1 k3s[22990]: I0125 10:58:48.563677   22990 shared_informer.go:252] Waiting for caches to sync for endpoint slice config
1月 25 10:58:48 node1 k3s[22990]: I0125 10:58:48.568544   22990 kube.go:133] Node controller sync successful
1月 25 10:58:48 node1 k3s[22990]: I0125 10:58:48.579184   22990 kube.go:455] Creating the node lease for IPv4. This is the n.Spec.PodCIDRs: [10.42.1.0/24]
1月 25 10:58:48 node1 k3s[22990]: time="2024-01-25T10:58:48+08:00" level=info msg="Wrote flannel subnet file to /run/flannel/subnet.env"
1月 25 10:58:48 node1 k3s[22990]: time="2024-01-25T10:58:48+08:00" level=info msg="Running flannel backend."
1月 25 10:58:48 node1 k3s[22990]: I0125 10:58:48.580868   22990 route_network.go:55] Watching for new subnet leases
1月 25 10:58:48 node1 k3s[22990]: I0125 10:58:48.580892   22990 watch.go:51] Batch elem [0] is { subnet.Event{Type:0, Lease:subnet.Lease{EnableIPv4:true, En>
1月 25 10:58:48 node1 k3s[22990]: I0125 10:58:48.580949   22990 route_network.go:92] Subnet added: 10.42.0.0/24 via 10.106.112.154
1月 25 10:58:48 node1 k3s[22990]: I0125 10:58:48.585903   22990 iptables.go:270] bootstrap done
1月 25 10:58:48 node1 k3s[22990]: I0125 10:58:48.589035   22990 iptables.go:270] bootstrap done
1月 25 10:58:48 node1 k3s[22990]: I0125 10:58:48.664561   22990 shared_informer.go:259] Caches are synced for endpoint slice config
1月 25 10:58:48 node1 k3s[22990]: I0125 10:58:48.664567   22990 shared_informer.go:259] Caches are synced for node config
1月 25 10:58:48 node1 k3s[22990]: I0125 10:58:48.664777   22990 shared_informer.go:259] Caches are synced for service config
1月 25 10:58:57 node1 k3s[22990]: time="2024-01-25T10:58:57+08:00" level=info msg="Tunnel authorizer set Kubelet Port 10250"
1月 25 11:03:47 node1 k3s[22990]: W0125 11:03:47.206735   22990 machine.go:65] Cannot read vendor id correctly, set empty.
1月 25 11:08:47 node1 k3s[22990]: W0125 11:08:47.208553   22990 machine.go:65] Cannot read vendor id correctly, set empty.
1月 25 11:13:47 node1 k3s[22990]: W0125 11:13:47.209555   22990 machine.go:65] Cannot read vendor id correctly, set empty.
1月 25 11:18:47 node1 k3s[22990]: W0125 11:18:47.204418   22990 machine.go:65] Cannot read vendor id correctly, set empty.
1月 25 11:23:47 node1 k3s[22990]: W0125 11:23:47.210052   22990 machine.go:65] Cannot read vendor id correctly, set empty.
zhaileilei123 commented 8 months ago

After I changed the number of coredns replicas to 2, a Pod ran on this node, cni0 was created normally, and the routing table was populated correctly. Why is that?
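
For context, the replica change is a plain scale of the default CoreDNS deployment; a sketch of the command, assuming kubectl is pointed at this cluster (the deployment name and namespace are the k3s defaults):

# Scale CoreDNS so that one replica lands on the agent node
kubectl -n kube-system scale deployment coredns --replicas=2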

[root@node1 ~]# k3s crictl ps
CONTAINER           IMAGE               CREATED             STATE               NAME                ATTEMPT             POD ID              POD
5d885106163be       97e04611ad434       43 minutes ago      Running             coredns             0                   c240196d50525       coredns-74448699cf-h4spv
[root@node1 ~]# ip a
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN group default qlen 1000
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
    inet 127.0.0.1/8 scope host lo
       valid_lft forever preferred_lft forever
2: enp4s0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc fq_codel state UP group default qlen 1000
    link/ether 28:6e:d4:89:fd:e3 brd ff:ff:ff:ff:ff:ff
    inet 10.106.112.153/24 brd 10.106.112.255 scope global dynamic noprefixroute enp4s0
       valid_lft 7770926sec preferred_lft 7770926sec
3: cni0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP group default qlen 1000
    link/ether 9a:db:12:07:66:f9 brd ff:ff:ff:ff:ff:ff
    inet 10.42.1.1/24 brd 10.42.1.255 scope global cni0
       valid_lft forever preferred_lft forever
4: veth8673923a@if2: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue master cni0 state UP group default
    link/ether 4e:a4:5b:25:76:5a brd ff:ff:ff:ff:ff:ff link-netns cni-b78d77f9-2647-860d-a6cd-532ee63a2755
[root@node1 ~]# ip route
default via 10.106.112.1 dev enp4s0 proto dhcp metric 100
10.42.0.0/24 via 10.106.112.154 dev enp4s0
10.42.1.0/24 dev cni0 proto kernel scope link src 10.42.1.1
10.106.112.0/24 dev enp4s0 proto kernel scope link src 10.106.112.153 metric 100
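
Once flannel is running on the node, the addresses above can be cross-checked against the subnet file flannel writes (the path comes from the startup log; the expected value in the comment is an assumption based on this node's PodCIDR):

# Verify the node's pod subnet and the host-gw routes
cat /run/flannel/subnet.env      # expect FLANNEL_SUBNET=10.42.1.1/24 on this node
ip addr show cni0                # bridge address should match FLANNEL_SUBNET
ip route show | grep '10\.42\.'  # host-gw routes to other nodes' pod CIDRs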

1月 25 11:26:52 node1 k3s[28253]: I0125 11:26:52.889783   28253 shared_informer.go:259] Caches are synced for node config
1月 25 11:26:53 node1 k3s[28253]: time="2024-01-25T11:26:53+08:00" level=info msg="Tunnel authorizer set Kubelet Port 10250"
1月 25 11:26:53 node1 k3s[28253]: I0125 11:26:53.483277   28253 apiserver.go:52] "Watching apiserver"
1月 25 11:26:53 node1 k3s[28253]: I0125 11:26:53.494590   28253 kube.go:133] Node controller sync successful
1月 25 11:26:53 node1 k3s[28253]: time="2024-01-25T11:26:53+08:00" level=info msg="Wrote flannel subnet file to /run/flannel/subnet.env"
1月 25 11:26:53 node1 k3s[28253]: time="2024-01-25T11:26:53+08:00" level=info msg="Running flannel backend."
1月 25 11:26:53 node1 k3s[28253]: I0125 11:26:53.499467   28253 route_network.go:55] Watching for new subnet leases
1月 25 11:26:53 node1 k3s[28253]: I0125 11:26:53.499593   28253 watch.go:51] Batch elem [0] is { subnet.Event{Type:0, Lease:subnet.Lease{EnableIPv4:true, En>
1月 25 11:26:53 node1 k3s[28253]: I0125 11:26:53.499675   28253 route_network.go:92] Subnet added: 10.42.0.0/24 via 10.106.112.154
1月 25 11:26:53 node1 k3s[28253]: I0125 11:26:53.499827   28253 route_network.go:165] Route to {Ifindex: 2 Dst: 10.42.0.0/24 Src: <nil> Gw: 10.106.112.154 F>
1月 25 11:26:53 node1 k3s[28253]: I0125 11:26:53.504808   28253 iptables.go:270] bootstrap done
1月 25 11:26:53 node1 k3s[28253]: I0125 11:26:53.510287   28253 iptables.go:270] bootstrap done
1月 25 11:26:53 node1 k3s[28253]: I0125 11:26:53.589743   28253 reconciler.go:169] "Reconciler: start to sync state"
1月 25 11:31:52 node1 k3s[28253]: W0125 11:31:52.563102   28253 machine.go:65] Cannot read vendor id correctly, set empty.
1月 25 11:32:26 node1 k3s[28253]: I0125 11:32:26.437660   28253 topology_manager.go:200] "Topology Admit Handler"
1月 25 11:32:26 node1 k3s[28253]: I0125 11:32:26.591913   28253 reconciler.go:352] "operationExecutor.VerifyControllerAttachedVolume started for volume \"co>
1月 25 11:32:26 node1 k3s[28253]: I0125 11:32:26.591963   28253 reconciler.go:352] "operationExecutor.VerifyControllerAttachedVolume started for volume \"cu>
1月 25 11:32:26 node1 k3s[28253]: I0125 11:32:26.591993   28253 reconciler.go:352] "operationExecutor.VerifyControllerAttachedVolume started for volume \"ku
brandond commented 8 months ago

I'm confused: is it working or not? Your log lines are all truncated to terminal width, so it's hard to tell much about what's going on, but I don't see any notable errors.

Also:

zhaileilei123 commented 8 months ago

Hi. We used the --data-dir /data/k3s parameter to specify the location of the data directory, but we don't know what the effects of this mixed use will be. Our initial deployment of k3s used the playbooks from the k3s-ansible project; to distinguish the master and worker node configurations, the worker node's unit is named k3s-node.service.
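
For reference, a minimal sketch of the equivalent agent configuration expressed as a config file (the config path is the k3s default; the server URL comes from the logs above, and the token value is a placeholder, not taken from this issue):

# Agent settings via config file instead of CLI flags
cat <<'EOF' | sudo tee /etc/rancher/k3s/config.yaml
server: https://10.106.112.154:6443
token: <node-token>   # placeholder
data-dir: /data/k3s
EOF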

I'm confused, too. I'll provide you with a full log Jan 25 10:58:42 node1 k3s[22990]: time="2024-01-25T10:58:42+08:00" level=info msg="Acquiring lock file /data/k3s/data/.lock" Jan 25 10:58:42 node1 k3s[22990]: time="2024-01-25T10:58:42+08:00" level=info msg="Preparing data dir /data/k3s/data/8fff5401dcc2bc383e288fabfd45333c7f1ef8a6d00388f9e3cb57e5cd7e6a4a" Jan 25 10:58:44 node1 k3s[22990]: time="2024-01-25T10:58:44+08:00" level=info msg="Starting k3s agent v1.24.17+k3s1 (026bb0ec)" Jan 25 10:58:44 node1 k3s[22990]: time="2024-01-25T10:58:44+08:00" level=info msg="Adding server to load balancer k3s-agent-load-balancer: 10.106.112.154:6443" Jan 25 10:58:44 node1 k3s[22990]: time="2024-01-25T10:58:44+08:00" level=info msg="Running load balancer k3s-agent-load-balancer 127.0.0.1:6444 -> [10.106.112.154:6443] [default: 10.106.112.154:6443]" Jan 25 10:58:45 node1 k3s[22990]: time="2024-01-25T10:58:45+08:00" level=info msg="Module overlay was already loaded" Jan 25 10:58:45 node1 k3s[22990]: time="2024-01-25T10:58:45+08:00" level=info msg="Module nf_conntrack was already loaded" Jan 25 10:58:45 node1 k3s[22990]: time="2024-01-25T10:58:45+08:00" level=info msg="Module br_netfilter was already loaded" Jan 25 10:58:45 node1 k3s[22990]: time="2024-01-25T10:58:45+08:00" level=info msg="Module iptable_filter was already loaded" Jan 25 10:58:45 node1 k3s[22990]: time="2024-01-25T10:58:45+08:00" level=info msg="Set sysctl 'net/netfilter/nf_conntrack_max' to 262144" Jan 25 10:58:45 node1 k3s[22990]: time="2024-01-25T10:58:45+08:00" level=info msg="Set sysctl 'net/netfilter/nf_conntrack_tcp_timeout_established' to 86400" Jan 25 10:58:45 node1 k3s[22990]: time="2024-01-25T10:58:45+08:00" level=info msg="Set sysctl 'net/netfilter/nf_conntrack_tcp_timeout_close_wait' to 3600" Jan 25 10:58:45 node1 k3s[22990]: time="2024-01-25T10:58:45+08:00" level=info msg="Using private registry config file at /etc/rancher/k3s/registries.yaml" Jan 25 10:58:45 node1 k3s[22990]: time="2024-01-25T10:58:45+08:00" level=info msg="Logging containerd to /data/k3s/agent/containerd/containerd.log" Jan 25 10:58:45 node1 k3s[22990]: time="2024-01-25T10:58:45+08:00" level=info msg="Running containerd -c /data/k3s/agent/etc/containerd/config.toml -a /run/k3s/containerd/containerd.sock --state /run/k3s/containerd --root /data/k3s/agent/containerd" Jan 25 10:58:46 node1 k3s[22990]: time="2024-01-25T10:58:46+08:00" level=info msg="containerd is now running" Jan 25 10:58:46 node1 k3s[22990]: time="2024-01-25T10:58:46+08:00" level=info msg="Getting list of apiserver endpoints from server" Jan 25 10:58:47 node1 k3s[22990]: time="2024-01-25T10:58:47+08:00" level=info msg="Updated load balancer k3s-agent-load-balancer default server address -> 10.106.112.154:6443" Jan 25 10:58:47 node1 k3s[22990]: time="2024-01-25T10:58:47+08:00" level=info msg="Connecting to proxy" url="wss://10.106.112.154:6443/v1-k3s/connect" Jan 25 10:58:47 node1 k3s[22990]: time="2024-01-25T10:58:47+08:00" level=info msg="Running kubelet --address=0.0.0.0 --allowed-unsafe-sysctls=net.ipv4.ip_forward,net.ipv6.conf.all.forwarding --anonymous-auth=false --authentication-token-webhook=true --authorization-mode=Webhook --cgroup-driver=systemd --client-ca-file=/data/k3s/agent/client-ca.crt --cluster-dns=10.43.0.10 --cluster-domain=cluster.local --container-runtime-endpoint=unix:///run/k3s/containerd/containerd.sock --containerd=/run/k3s/containerd/containerd.sock --eviction-hard=imagefs.available<5%,nodefs.available<5% 
--eviction-minimum-reclaim=imagefs.available=10%,nodefs.available=10% --fail-swap-on=false --healthz-bind-address=127.0.0.1 --hostname-override=node1 --kubeconfig=/data/k3s/agent/kubelet.kubeconfig --node-labels= --pod-infra-container-image=rancher/mirrored-pause:3.6 --pod-manifest-path=/data/k3s/agent/pod-manifests --read-only-port=0 --resolv-conf=/etc/resolv.conf --serialize-image-pulls=false --tls-cert-file=/data/k3s/agent/serving-kubelet.crt --tls-private-key-file=/data/k3s/agent/serving-kubelet.key" Jan 25 10:58:47 node1 k3s[22990]: Flag --containerd has been deprecated, This is a cadvisor flag that was mistakenly registered with the Kubelet. Due to legacy concerns, it will follow the standard CLI deprecation timeline before being removed. Jan 25 10:58:47 node1 k3s[22990]: Flag --pod-infra-container-image has been deprecated, will be removed in 1.27. Image garbage collector will get sandbox image information from CRI. Jan 25 10:58:47 node1 k3s[22990]: I0125 10:58:47.083897 22990 server.go:192] "--pod-infra-container-image will not be pruned by the image garbage collector in kubelet and should also be set in the remote runtime" Jan 25 10:58:47 node1 k3s[22990]: I0125 10:58:47.094861 22990 server.go:395] "Kubelet version" kubeletVersion="v1.24.17+k3s1" Jan 25 10:58:47 node1 k3s[22990]: I0125 10:58:47.094893 22990 server.go:397] "Golang settings" GOGC="" GOMAXPROCS="" GOTRACEBACK="" Jan 25 10:58:47 node1 k3s[22990]: I0125 10:58:47.096593 22990 dynamic_cafile_content.go:157] "Starting controller" name="client-ca-bundle::/data/k3s/agent/ client-ca.crt" Jan 25 10:58:47 node1 k3s[22990]: W0125 10:58:47.099012 22990 machine.go:65] Cannot read vendor id correctly, set empty. Jan 25 10:58:47 node1 k3s[22990]: I0125 10:58:47.101039 22990 server.go:644] "--cgroups-per-qos enabled, but --cgroup-root was not specified. 
defaulting t o /" Jan 25 10:58:47 node1 k3s[22990]: I0125 10:58:47.101464 22990 container_manager_linux.go:262] "Container manager verified user specified cgroup-root exists " cgroupRoot=[] Jan 25 10:58:47 node1 k3s[22990]: I0125 10:58:47.101574 22990 container_manager_linux.go:267] "Creating Container Manager object based on Node Config" node Config={RuntimeCgroupsName: SystemCgroupsName: KubeletCgroupsName: KubeletOOMScoreAdj:-999 ContainerRuntime: CgroupsPerQOS:true CgroupRoot:/ CgroupDriver:sys temd KubeletRootDir:/var/lib/kubelet ProtectKernelDefaults:false NodeAllocatableConfig:{KubeReservedCgroupName: SystemReservedCgroupName: ReservedSystemCPUs: EnforceNodeAllocatable:map[pods:{}] KubeReserved:map[] SystemReserved:map[] HardEvictionThresholds:[{Signal:imagefs.available Operator:LessThan Value:{Quant ity: Percentage:0.05} GracePeriod:0s MinReclaim:} {Signal:nodefs.available Operator:LessThan Value:{Quantity: Percentage:0.05} GracePeriod:0s MinReclaim:}]} QOSReserved:map[] ExperimentalCPUManagerPolicy:none ExperimentalCPUManagerPolicyOptions:map[] ExperimentalTopologyManagerScope:container ExperimentalCPUManagerReconcilePeriod:10s ExperimentalMemoryManagerPolicy:None ExperimentalMemoryManagerReservedMemory:[] ExperimentalPodPidsLimit:-1 Enforce CPULimits:true CPUCFSQuotaPeriod:100ms ExperimentalTopologyManagerPolicy:none} Jan 25 10:58:47 node1 k3s[22990]: I0125 10:58:47.101632 22990 topology_manager.go:133] "Creating topology manager with policy per scope" topologyPolicyName ="none" topologyScopeName="container" Jan 25 10:58:47 node1 k3s[22990]: I0125 10:58:47.101649 22990 container_manager_linux.go:302] "Creating device plugin manager" devicePluginEnabled=true Jan 25 10:58:47 node1 k3s[22990]: I0125 10:58:47.101736 22990 state_mem.go:36] "Initialized new in-memory state store" Jan 25 10:58:47 node1 k3s[22990]: I0125 10:58:47.104310 22990 kubelet.go:376] "Attempting to sync node with API server" Jan 25 10:58:47 node1 k3s[22990]: I0125 10:58:47.104347 22990 kubelet.go:267] "Adding static pod path" path="/data/k3s/agent/pod-manifests" Jan 25 10:58:47 node1 k3s[22990]: I0125 10:58:47.104409 22990 kubelet.go:278] "Adding apiserver pod source" Jan 25 10:58:47 node1 k3s[22990]: I0125 10:58:47.104445 22990 apiserver.go:42] "Waiting for node sync before watching apiserver pods" Jan 25 10:58:47 node1 k3s[22990]: I0125 10:58:47.105120 22990 kuberuntime_manager.go:239] "Container runtime initialized" containerRuntime="containerd" ver sion="v1.7.3-k3s1" apiVersion="v1" Jan 25 10:58:47 node1 k3s[22990]: W0125 10:58:47.105310 22990 probe.go:268] Flexvolume plugin directory at /usr/libexec/kubernetes/kubelet-plugins/volume/e xec/ does not exist. Recreating. Jan 25 10:58:47 node1 k3s[22990]: I0125 10:58:47.105761 22990 server.go:1179] "Started kubelet" Jan 25 10:58:47 node1 k3s[22990]: I0125 10:58:47.105972 22990 server.go:150] "Starting to listen" address="0.0.0.0" port=10250 Jan 25 10:58:47 node1 k3s[22990]: E0125 10:58:47.106481 22990 cri_stats_provider.go:455] "Failed to get the info of the filesystem with mountpoint" err="un able to find data in memory cache" mountpoint="/data/k3s/agent/containerd/io.containerd.snapshotter.v1.overlayfs" Jan 25 10:58:47 node1 k3s[22990]: E0125 10:58:47.106509 22990 kubelet.go:1298] "Image garbage collection failed once. 
Stats initialization may not have com pleted yet" err="invalid capacity 0 on image filesystem" Jan 25 10:58:47 node1 k3s[22990]: I0125 10:58:47.106926 22990 server.go:410] "Adding debug handlers to kubelet server" Jan 25 10:58:47 node1 k3s[22990]: I0125 10:58:47.108569 22990 fs_resource_analyzer.go:67] "Starting FS ResourceAnalyzer" Jan 25 10:58:47 node1 k3s[22990]: I0125 10:58:47.108680 22990 volume_manager.go:292] "Starting Kubelet Volume Manager" Jan 25 10:58:47 node1 k3s[22990]: I0125 10:58:47.109110 22990 desired_state_of_world_populator.go:149] "Desired state populator starts to run" Jan 25 10:58:47 node1 k3s[22990]: E0125 10:58:47.117315 22990 nodelease.go:49] "Failed to get node when trying to set owner ref to the node lease" err="nod es \"node1\" not found" node="node1" Jan 25 10:58:47 node1 k3s[22990]: I0125 10:58:47.147616 22990 kubelet_network_linux.go:76] "Initialized protocol iptables rules." protocol=IPv4 Jan 25 10:58:47 node1 k3s[22990]: I0125 10:58:47.197650 22990 kubelet_network_linux.go:76] "Initialized protocol iptables rules." protocol=IPv6 Jan 25 10:58:47 node1 k3s[22990]: I0125 10:58:47.197688 22990 status_manager.go:161] "Starting to sync pod status with apiserver" Jan 25 10:58:47 node1 k3s[22990]: I0125 10:58:47.197710 22990 kubelet.go:1989] "Starting kubelet main sync loop" Jan 25 10:58:47 node1 k3s[22990]: E0125 10:58:47.197762 22990 kubelet.go:2013] "Skipping pod synchronization" err="[container runtime status check may not have completed yet, PLEG is not healthy: pleg has yet to be successful]" Jan 25 10:58:47 node1 k3s[22990]: I0125 10:58:47.207766 22990 cpu_manager.go:213] "Starting CPU manager" policy="none" Jan 25 10:58:47 node1 k3s[22990]: I0125 10:58:47.207793 22990 cpu_manager.go:214] "Reconciling" reconcilePeriod="10s" Jan 25 10:58:47 node1 k3s[22990]: I0125 10:58:47.207817 22990 state_mem.go:36] "Initialized new in-memory state store" Jan 25 10:58:47 node1 k3s[22990]: I0125 10:58:47.208963 22990 policy_none.go:49] "None policy: Start" Jan 25 10:58:47 node1 k3s[22990]: E0125 10:58:47.209086 22990 kubelet.go:2427] "Error getting node" err="node \"node1\" not found" Jan 25 10:58:47 node1 k3s[22990]: I0125 10:58:47.209779 22990 memory_manager.go:168] "Starting memorymanager" policy="None" Jan 25 10:58:47 node1 k3s[22990]: I0125 10:58:47.209814 22990 state_mem.go:35] "Initializing new in-memory state store" Jan 25 10:58:47 node1 k3s[22990]: I0125 10:58:47.210427 22990 kubelet_node_status.go:70] "Attempting to register node" node="node1" Jan 25 10:58:47 node1 k3s[22990]: I0125 10:58:47.224732 22990 manager.go:611] "Failed to read data from checkpoint" checkpoint="kubelet_internal_checkpoint " err="checkpoint is not found" Jan 25 10:58:47 node1 k3s[22990]: I0125 10:58:47.224939 22990 plugin_manager.go:114] "Starting Kubelet Plugin Manager" Jan 25 10:58:47 node1 k3s[22990]: E0125 10:58:47.226402 22990 eviction_manager.go:254] "Eviction manager: failed to get summary stats" err="failed to get n ode info: node \"node1\" not found" Jan 25 10:58:47 node1 k3s[22990]: E0125 10:58:47.309392 22990 kubelet.go:2427] "Error getting node" err="node \"node1\" not found" Jan 25 10:58:47 node1 k3s[22990]: time="2024-01-25T10:58:47+08:00" level=info msg="Running kube-proxy --cluster-cidr=10.42.0.0/16 --conntrack-max-per-core=0 --conntrack-tcp-timeout-close-wait=0s --conntrack-tcp-timeout-established=0s --healthz-bind-address=127.0.0.1 --hostname-override=node1 --kubeconfig=/data/k3 s/agent/kubeproxy.kubeconfig --proxy-mode=iptables" Jan 25 10:58:47 node1 k3s[22990]: I0125 
10:58:47.373222 22990 server.go:230] "Warning, all flags other than --config, --write-config-to, and --cleanup are deprecated, please begin using a config file ASAP" Jan 25 10:58:47 node1 k3s[22990]: E0125 10:58:47.391487 22990 node.go:152] Failed to retrieve node info: nodes "node1" not found Jan 25 10:58:47 node1 k3s[22990]: E0125 10:58:47.409494 22990 kubelet.go:2427] "Error getting node" err="node \"node1\" not found" Jan 25 10:58:47 node1 k3s[22990]: E0125 10:58:47.510450 22990 kubelet.go:2427] "Error getting node" err="node \"node1\" not found" Jan 25 10:58:47 node1 k3s[22990]: I0125 10:58:47.512501 22990 kubelet_node_status.go:73] "Successfully registered node" node="node1" Jan 25 10:58:47 node1 k3s[22990]: time="2024-01-25T10:58:47+08:00" level=info msg="Failed to set annotations and labels on node node1: Operation cannot be fu lfilled on nodes \"node1\": the object has been modified; please apply your changes to the latest version and try again" Jan 25 10:58:47 node1 k3s[22990]: time="2024-01-25T10:58:47+08:00" level=info msg="Failed to set annotations and labels on node node1: Operation cannot be fu lfilled on nodes \"node1\": the object has been modified; please apply your changes to the latest version and try again" Jan 25 10:58:47 node1 k3s[22990]: time="2024-01-25T10:58:47+08:00" level=info msg="Failed to set annotations and labels on node node1: Operation cannot be fu lfilled on nodes \"node1\": the object has been modified; please apply your changes to the latest version and try again" Jan 25 10:58:47 node1 k3s[22990]: time="2024-01-25T10:58:47+08:00" level=info msg="Failed to set annotations and labels on node node1: Operation cannot be fu lfilled on nodes \"node1\": the object has been modified; please apply your changes to the latest version and try again" Jan 25 10:58:47 node1 k3s[22990]: time="2024-01-25T10:58:47+08:00" level=info msg="Annotations and labels have been set successfully on node: node1" Jan 25 10:58:47 node1 k3s[22990]: time="2024-01-25T10:58:47+08:00" level=info msg="Starting flannel with backend host-gw" Jan 25 10:58:47 node1 k3s[22990]: time="2024-01-25T10:58:47+08:00" level=info msg="Flannel found PodCIDR assigned for node node1" Jan 25 10:58:47 node1 k3s[22990]: time="2024-01-25T10:58:47+08:00" level=info msg="The interface enp4s0 with ipv4 address 10.106.112.153 will be used by flan nel" Jan 25 10:58:47 node1 k3s[22990]: I0125 10:58:47.567994 22990 kube.go:126] Waiting 10m0s for node controller to sync Jan 25 10:58:47 node1 k3s[22990]: I0125 10:58:47.568046 22990 kube.go:434] Starting kube subnet manager Jan 25 10:58:47 node1 k3s[22990]: I0125 10:58:47.571616 22990 kube.go:455] Creating the node lease for IPv4. 
This is the n.Spec.PodCIDRs: [10.42.0.0/24] Jan 25 10:58:47 node1 k3s[22990]: I0125 10:58:47.611342 22990 kuberuntime_manager.go:1095] "Updating runtime config through cri with podcidr" CIDR="10.42.1 .0/24" Jan 25 10:58:47 node1 k3s[22990]: I0125 10:58:47.612435 22990 kubelet_network.go:60] "Updating Pod CIDR" originalPodCIDR="" newPodCIDR="10.42.1.0/24" Jan 25 10:58:47 node1 k3s[22990]: time="2024-01-25T10:58:47+08:00" level=info msg="Starting the netpol controller version , built on , go1.20.7" Jan 25 10:58:47 node1 k3s[22990]: time="2024-01-25T10:58:47+08:00" level=info msg="k3s agent is up and running" Jan 25 10:58:47 node1 k3s[22990]: I0125 10:58:47.674433 22990 network_policy_controller.go:163] Starting network policy controller Jan 25 10:58:47 node1 k3s[22990]: I0125 10:58:47.694341 22990 network_policy_controller.go:175] Starting network policy controller full sync goroutine Jan 25 10:58:48 node1 k3s[22990]: I0125 10:58:48.104841 22990 apiserver.go:52] "Watching apiserver" Jan 25 10:58:48 node1 k3s[22990]: I0125 10:58:48.515323 22990 reconciler.go:169] "Reconciler: start to sync state" Jan 25 10:58:48 node1 k3s[22990]: I0125 10:58:48.559577 22990 node.go:163] Successfully retrieved node IP: 10.106.112.153 Jan 25 10:58:48 node1 k3s[22990]: I0125 10:58:48.559616 22990 server_others.go:138] "Detected node IP" address="10.106.112.153" Jan 25 10:58:48 node1 k3s[22990]: I0125 10:58:48.562188 22990 server_others.go:208] "Using iptables Proxier" Jan 25 10:58:48 node1 k3s[22990]: I0125 10:58:48.562216 22990 server_others.go:215] "kube-proxy running in dual-stack mode" ipFamily=IPv4 Jan 25 10:58:48 node1 k3s[22990]: I0125 10:58:48.562227 22990 server_others.go:216] "Creating dualStackProxier for iptables" Jan 25 10:58:48 node1 k3s[22990]: I0125 10:58:48.562240 22990 server_others.go:515] "Detect-local-mode set to ClusterCIDR, but no IPv6 cluster CIDR defined , , defaulting to no-op detect-local for IPv6" Jan 25 10:58:48 node1 k3s[22990]: I0125 10:58:48.562263 22990 proxier.go:259] "Setting route_localnet=1, use nodePortAddresses to filter loopback addresses for NodePorts to skip it https://issues.k8s.io/90259" Jan 25 10:58:48 node1 k3s[22990]: I0125 10:58:48.562385 22990 proxier.go:259] "Setting route_localnet=1, use nodePortAddresses to filter loopback addresses for NodePorts to skip it https://issues.k8s.io/90259" Jan 25 10:58:48 node1 k3s[22990]: I0125 10:58:48.562608 22990 server.go:662] "Version info" version="v1.24.17+k3s1" Jan 25 10:58:48 node1 k3s[22990]: I0125 10:58:48.562626 22990 server.go:664] "Golang settings" GOGC="" GOMAXPROCS="" GOTRACEBACK="" Jan 25 10:58:48 node1 k3s[22990]: I0125 10:58:48.563651 22990 config.go:226] "Starting endpoint slice config controller" Jan 25 10:58:48 node1 k3s[22990]: I0125 10:58:48.563668 22990 config.go:317] "Starting service config controller" Jan 25 10:58:48 node1 k3s[22990]: I0125 10:58:48.563692 22990 shared_informer.go:252] Waiting for caches to sync for service config Jan 25 10:58:48 node1 k3s[22990]: I0125 10:58:48.563677 22990 config.go:444] "Starting node config controller" Jan 25 10:58:48 node1 k3s[22990]: I0125 10:58:48.563740 22990 shared_informer.go:252] Waiting for caches to sync for node config Jan 25 10:58:48 node1 k3s[22990]: I0125 10:58:48.563677 22990 shared_informer.go:252] Waiting for caches to sync for endpoint slice config Jan 25 10:58:48 node1 k3s[22990]: I0125 10:58:48.568544 22990 kube.go:133] Node controller sync successful Jan 25 10:58:48 node1 k3s[22990]: I0125 10:58:48.579184 22990 kube.go:455] Creating the node 
lease for IPv4. This is the n.Spec.PodCIDRs: [10.42.1.0/24] Jan 25 10:58:48 node1 k3s[22990]: time="2024-01-25T10:58:48+08:00" level=info msg="Wrote flannel subnet file to /run/flannel/subnet.env" Jan 25 10:58:48 node1 k3s[22990]: time="2024-01-25T10:58:48+08:00" level=info msg="Running flannel backend." Jan 25 10:58:48 node1 k3s[22990]: I0125 10:58:48.580868 22990 route_network.go:55] Watching for new subnet leases Jan 25 10:58:48 node1 k3s[22990]: I0125 10:58:48.580892 22990 watch.go:51] Batch elem [0] is { subnet.Event{Type:0, Lease:subnet.Lease{EnableIPv4:true, Ena bleIPv6:false, Subnet:ip.IP4Net{IP:0xa2a0000, PrefixLen:0x18}, IPv6Subnet:ip.IP6Net{IP:(ip.IP6)(nil), PrefixLen:0x0}, Attrs:subnet.LeaseAttrs{PublicIP:0xa6a 709a, PublicIPv6:(ip.IP6)(nil), BackendType:"host-gw", BackendData:json.RawMessage{0x6e, 0x75, 0x6c, 0x6c}, BackendV6Data:json.RawMessage(nil)}, Expiration: time.Date(1, time.January, 1, 0, 0, 0, 0, time.UTC), Asof:0}} } Jan 25 10:58:48 node1 k3s[22990]: I0125 10:58:48.580949 22990 route_network.go:92] Subnet added: 10.42.0.0/24 via 10.106.112.154 Jan 25 10:58:48 node1 k3s[22990]: I0125 10:58:48.585903 22990 iptables.go:270] bootstrap done Jan 25 10:58:48 node1 k3s[22990]: I0125 10:58:48.589035 22990 iptables.go:270] bootstrap done Jan 25 10:58:48 node1 k3s[22990]: I0125 10:58:48.664561 22990 shared_informer.go:259] Caches are synced for endpoint slice config Jan 25 10:58:48 node1 k3s[22990]: I0125 10:58:48.664567 22990 shared_informer.go:259] Caches are synced for node config Jan 25 10:58:48 node1 k3s[22990]: I0125 10:58:48.664777 22990 shared_informer.go:259] Caches are synced for service config Jan 25 10:58:57 node1 k3s[22990]: time="2024-01-25T10:58:57+08:00" level=info msg="Tunnel authorizer set Kubelet Port 10250" Jan 25 11:03:47 node1 k3s[22990]: W0125 11:03:47.206735 22990 machine.go:65] Cannot read vendor id correctly, set empty. Jan 25 11:08:47 node1 k3s[22990]: W0125 11:08:47.208553 22990 machine.go:65] Cannot read vendor id correctly, set empty. Jan 25 11:13:47 node1 k3s[22990]: W0125 11:13:47.209555 22990 machine.go:65] Cannot read vendor id correctly, set empty. Jan 25 11:18:47 node1 k3s[22990]: W0125 11:18:47.204418 22990 machine.go:65] Cannot read vendor id correctly, set empty. Jan 25 11:23:47 node1 k3s[22990]: W0125 11:23:47.210052 22990 machine.go:65] Cannot read vendor id correctly, set empty.

Below is the full log after I adjusted the number of coredns replicas to 2:

Jan 25 11:26:52 node1 k3s[28253]: I0125 11:26:52.889748 28253 shared_informer.go:259] Caches are synced for endpoint slice config Jan 25 11:26:52 node1 k3s[28253]: I0125 11:26:52.889770 28253 shared_informer.go:259] Caches are synced for service config Jan 25 11:26:52 node1 k3s[28253]: I0125 11:26:52.889783 28253 shared_informer.go:259] Caches are synced for node config Jan 25 11:26:53 node1 k3s[28253]: time="2024-01-25T11:26:53+08:00" level=info msg="Tunnel authorizer set Kubelet Port 10250" Jan 25 11:26:53 node1 k3s[28253]: I0125 11:26:53.483277 28253 apiserver.go:52] "Watching apiserver" Jan 25 11:26:53 node1 k3s[28253]: I0125 11:26:53.494590 28253 kube.go:133] Node controller sync successful Jan 25 11:26:53 node1 k3s[28253]: time="2024-01-25T11:26:53+08:00" level=info msg="Wrote flannel subnet file to /run/flannel/subnet.env" Jan 25 11:26:53 node1 k3s[28253]: time="2024-01-25T11:26:53+08:00" level=info msg="Running flannel backend." Jan 25 11:26:53 node1 k3s[28253]: I0125 11:26:53.499467 28253 route_network.go:55] Watching for new subnet leases Jan 25 11:26:53 node1 k3s[28253]: I0125 11:26:53.499593 28253 watch.go:51] Batch elem [0] is { subnet.Event{Type:0, Lease:subnet.Lease{EnableIPv4:true, EnableIPv6:false, Subnet:ip.IP4Net{IP:0xa2a0000, PrefixLen:0x18}, IPv6Subnet:ip.IP6Net{IP:(ip.IP6)(nil), PrefixLen:0x0}, Attrs:subnet.LeaseAttrs{PublicIP:0xa6a709a, PublicIPv6:(ip.IP6)(nil), BackendType:"host-gw", BackendData:json.RawMessage{0x6e, 0x75, 0x6c, 0x6c}, BackendV6Data:json.RawMessage(nil)}, Expiration:time.Date(1, time.January, 1, 0, 0, 0, 0, time.UTC), Asof:0}} } Jan 25 11:26:53 node1 k3s[28253]: I0125 11:26:53.499675 28253 route_network.go:92] Subnet added: 10.42.0.0/24 via 10.106.112.154 Jan 25 11:26:53 node1 k3s[28253]: I0125 11:26:53.499827 28253 route_network.go:165] Route to {Ifindex: 2 Dst: 10.42.0.0/24 Src: Gw: 10.106.112.154 Flags: [] Table: 0 Realm: 0} already exists, skipping. Jan 25 11:26:53 node1 k3s[28253]: I0125 11:26:53.504808 28253 iptables.go:270] bootstrap done Jan 25 11:26:53 node1 k3s[28253]: I0125 11:26:53.510287 28253 iptables.go:270] bootstrap done Jan 25 11:26:53 node1 k3s[28253]: I0125 11:26:53.589743 28253 reconciler.go:169] "Reconciler: start to sync state" Jan 25 11:31:52 node1 k3s[28253]: W0125 11:31:52.563102 28253 machine.go:65] Cannot read vendor id correctly, set empty. 
Jan 25 11:32:26 node1 k3s[28253]: I0125 11:32:26.437660 28253 topology_manager.go:200] "Topology Admit Handler" Jan 25 11:32:26 node1 k3s[28253]: I0125 11:32:26.591913 28253 reconciler.go:352] "operationExecutor.VerifyControllerAttachedVolume started for volume \"config-volume\" (UniqueName: \"kubernetes.io/configmap/4a2ad145-074a-489a-9f61-644e3357a80a-config-volume\") pod \"coredns-74448699cf-h4spv\" (UID: \"4a2ad145-074a-489a-9f61-644e3357a80a\") " pod="kube-system/coredns-74448699cf-h4spv" Jan 25 11:32:26 node1 k3s[28253]: I0125 11:32:26.591963 28253 reconciler.go:352] "operationExecutor.VerifyControllerAttachedVolume started for volume \"custom-config-volume\" (UniqueName: \"kubernetes.io/configmap/4a2ad145-074a-489a-9f61-644e3357a80a-custom-config-volume\") pod \"coredns-74448699cf-h4spv\" (UID: \"4a2ad145-074a-489a-9f61-644e3357a80a\") " pod="kube-system/coredns-74448699cf-h4spv" Jan 25 11:32:26 node1 k3s[28253]: I0125 11:32:26.591993 28253 reconciler.go:352] "operationExecutor.VerifyControllerAttachedVolume started for volume \"kube-api-access-vqxzz\" (UniqueName: \"kubernetes.io/projected/4a2ad145-074a-489a-9f61-644e3357a80a-kube-api-access-vqxzz\") pod \"coredns-74448699cf-h4spv\" (UID: \"4a2ad145-074a-489a-9f61-644e3357a80a\") " pod="kube-system/coredns-74448699cf-h4spv" Jan 25 11:36:52 node1 k3s[28253]: W0125 11:36:52.570832 28253 machine.go:65] Cannot read vendor id correctly, set empty. Jan 25 11:41:52 node1 k3s[28253]: W0125 11:41:52.562355 28253 machine.go:65] Cannot read vendor id correctly, set empty. Jan 25 11:46:52 node1 k3s[28253]: W0125 11:46:52.563485 28253 machine.go:65] Cannot read vendor id correctly, set empty. Jan 25 11:51:52 node1 k3s[28253]: W0125 11:51:52.562643 28253 machine.go:65] Cannot read vendor id correctly, set empty.

brandond commented 8 months ago

What is the problem that you're attempting to solve here? I don't see anything wrong.

Note that k3s doesn't have masters and workers, it has servers and agents. If you try to use different names for the two node types, that may cause additional confusion.

zhaileilei123 commented 8 months ago

This happens only intermittently. With the flannel vxlan or host-gw backend, the cni0 virtual NIC is sometimes not created on the worker node. After deploying the k3s cluster, we ping the cni0 IP address to check inter-host connectivity in the host-gw scenario; if that IP address is not present, the ping fails.
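
A check that does not depend on cni0 existing on an idle node is to ping between pods directly; a rough sketch, assuming kubectl access, the node name node1 from this issue, and a hypothetical busybox test pod:

# Pin a throwaway pod to node1 (this also causes cni0 to be created there)
kubectl run nettest --image=busybox:1.36 --restart=Never \
  --overrides='{"apiVersion":"v1","spec":{"nodeName":"node1"}}' -- sleep 3600
# Find the IP of a pod running on the other node, then ping it from nettest
kubectl get pods -A -o wide
kubectl exec nettest -- ping -c 3 <pod IP on the other node>
kubectl delete pod nettest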

brandond commented 8 months ago

I don't believe any of these are issues with k3s though; this is just how flannel works. If you would like to see this behavior changed somehow, I would suggest opening an issue with the flannel project.
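
In other words, the cni0 bridge is created by the CNI bridge plugin only when the first pod is sandboxed on the node, which matches what was observed above after scaling CoreDNS. A quick way to confirm this on an agent node (a sketch using only standard tooling):

# Before any pod has been scheduled on this node:
ip link show cni0 || echo "cni0 not created yet"
# Schedule any pod onto this node (e.g. scale a deployment or pin a pod
# with nodeName as shown earlier), then re-check:
ip link show cni0
ip route show | grep '10\.42\.'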