k0sproject / k0s

k0s - The Zero Friction Kubernetes
https://docs.k0sproject.io

Default config for single node fails waiting for etcd #3202

Closed pablochacin closed 1 year ago

pablochacin commented 1 year ago


Platform

Ubuntu 22.04.2 LTS

Version

v1.27.1+k0s.0

Sysinfo

`k0s sysinfo`
Machine ID: "985b8f0b6c9a51bbdbf9c431e82ec80744ed320e7e60ef3dd9d1e47ecd3cb9b4" (from machine) (pass)
Total memory: 62.5 GiB (pass)
Disk space available for /var/lib/k0s: 605.5 GiB (pass)
Operating system: Linux (pass)
  Linux kernel release: 5.19.0-43-generic (pass)
  Max. file descriptors per process: current: 1048576 / max: 1048576 (pass)
  Executable in path: modprobe: /usr/sbin/modprobe (pass)
  /proc file system: mounted (0x9fa0) (pass)
  Control Groups: version 2 (pass)
    cgroup controller "cpu": available (pass)
    cgroup controller "cpuacct": available (via cpu in version 2) (pass)
    cgroup controller "cpuset": available (pass)
    cgroup controller "memory": available (pass)
    cgroup controller "devices": available (assumed) (pass)
    cgroup controller "freezer": available (assumed) (pass)
    cgroup controller "pids": available (pass)
    cgroup controller "hugetlb": available (pass)
    cgroup controller "blkio": available (via io in version 2) (pass)
  CONFIG_CGROUPS: Control Group support: built-in (pass)
    CONFIG_CGROUP_FREEZER: Freezer cgroup subsystem: built-in (pass)
    CONFIG_CGROUP_PIDS: PIDs cgroup subsystem: built-in (pass)
    CONFIG_CGROUP_DEVICE: Device controller for cgroups: built-in (pass)
    CONFIG_CPUSETS: Cpuset support: built-in (pass)
    CONFIG_CGROUP_CPUACCT: Simple CPU accounting cgroup subsystem: built-in (pass)
    CONFIG_MEMCG: Memory Resource Controller for Control Groups: built-in (pass)
    CONFIG_CGROUP_HUGETLB: HugeTLB Resource Controller for Control Groups: built-in (pass)
    CONFIG_CGROUP_SCHED: Group CPU scheduler: built-in (pass)
      CONFIG_FAIR_GROUP_SCHED: Group scheduling for SCHED_OTHER: built-in (pass)
        CONFIG_CFS_BANDWIDTH: CPU bandwidth provisioning for FAIR_GROUP_SCHED: built-in (pass)
    CONFIG_BLK_CGROUP: Block IO controller: built-in (pass)
  CONFIG_NAMESPACES: Namespaces support: built-in (pass)
    CONFIG_UTS_NS: UTS namespace: built-in (pass)
    CONFIG_IPC_NS: IPC namespace: built-in (pass)
    CONFIG_PID_NS: PID namespace: built-in (pass)
    CONFIG_NET_NS: Network namespace: built-in (pass)
  CONFIG_NET: Networking support: built-in (pass)
    CONFIG_INET: TCP/IP networking: built-in (pass)
      CONFIG_IPV6: The IPv6 protocol: built-in (pass)
    CONFIG_NETFILTER: Network packet filtering framework (Netfilter): built-in (pass)
      CONFIG_NETFILTER_ADVANCED: Advanced netfilter configuration: built-in (pass)
      CONFIG_NF_CONNTRACK: Netfilter connection tracking support: module (pass)
      CONFIG_NETFILTER_XTABLES: Netfilter Xtables support: module (pass)
        CONFIG_NETFILTER_XT_TARGET_REDIRECT: REDIRECT target support: module (pass)
        CONFIG_NETFILTER_XT_MATCH_COMMENT: "comment" match support: module (pass)
        CONFIG_NETFILTER_XT_MARK: nfmark target and match support: module (pass)
        CONFIG_NETFILTER_XT_SET: set target and match support: module (pass)
        CONFIG_NETFILTER_XT_TARGET_MASQUERADE: MASQUERADE target support: module (pass)
        CONFIG_NETFILTER_XT_NAT: "SNAT and DNAT" targets support: module (pass)
        CONFIG_NETFILTER_XT_MATCH_ADDRTYPE: "addrtype" address type match support: module (pass)
        CONFIG_NETFILTER_XT_MATCH_CONNTRACK: "conntrack" connection tracking match support: module (pass)
        CONFIG_NETFILTER_XT_MATCH_MULTIPORT: "multiport" Multiple port match support: module (pass)
        CONFIG_NETFILTER_XT_MATCH_RECENT: "recent" match support: module (pass)
        CONFIG_NETFILTER_XT_MATCH_STATISTIC: "statistic" match support: module (pass)
      CONFIG_NETFILTER_NETLINK: module (pass)
      CONFIG_NF_NAT: module (pass)
      CONFIG_IP_SET: IP set support: module (pass)
        CONFIG_IP_SET_HASH_IP: hash:ip set support: module (pass)
        CONFIG_IP_SET_HASH_NET: hash:net set support: module (pass)
      CONFIG_IP_VS: IP virtual server support: module (pass)
        CONFIG_IP_VS_NFCT: Netfilter connection tracking: built-in (pass)
      CONFIG_NF_CONNTRACK_IPV4: IPv4 connetion tracking support (required for NAT): unknown (warning)
      CONFIG_NF_REJECT_IPV4: IPv4 packet rejection: module (pass)
      CONFIG_NF_NAT_IPV4: IPv4 NAT: unknown (warning)
      CONFIG_IP_NF_IPTABLES: IP tables support: module (pass)
        CONFIG_IP_NF_FILTER: Packet filtering: module (pass)
          CONFIG_IP_NF_TARGET_REJECT: REJECT target support: module (pass)
        CONFIG_IP_NF_NAT: iptables NAT support: module (pass)
        CONFIG_IP_NF_MANGLE: Packet mangling: module (pass)
      CONFIG_NF_DEFRAG_IPV4: module (pass)
      CONFIG_NF_CONNTRACK_IPV6: IPv6 connetion tracking support (required for NAT): unknown (warning)
      CONFIG_NF_NAT_IPV6: IPv6 NAT: unknown (warning)
      CONFIG_IP6_NF_IPTABLES: IP6 tables support: module (pass)
        CONFIG_IP6_NF_FILTER: Packet filtering: module (pass)
        CONFIG_IP6_NF_MANGLE: Packet mangling: module (pass)
        CONFIG_IP6_NF_NAT: ip6tables NAT support: module (pass)
      CONFIG_NF_DEFRAG_IPV6: module (pass)
    CONFIG_BRIDGE: 802.1d Ethernet Bridging: module (pass)
      CONFIG_LLC: module (pass)
      CONFIG_STP: module (pass)
  CONFIG_EXT4_FS: The Extended 4 (ext4) filesystem: built-in (pass)
  CONFIG_PROC_FS: /proc file system support: built-in (pass)

What happened?

I'm trying to start a single-node cluster and it fails with messages pointing to errors while waiting for etcd. According to the documentation, the default storage backend when --single is specified is kine. Moreover, editing the default config to set the storage type to kine does not work either.

Steps to reproduce

  1. Install k0s
  2. Create the default config: `k0s config create | tee ~/.k0s/k0s.yaml`
  3. Edit the default config and change the storage type to `kine`
  4. Run the controller:
    `sudo k0s controller -c ~/.k0s/k0s.yaml --single`
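The steps above can be collapsed into a single sequence; the sed one-liner is just one illustrative way to flip the storage type (check the file afterwards before starting the controller):

```shell
mkdir -p ~/.k0s
# Generate the default configuration for this machine
k0s config create | tee ~/.k0s/k0s.yaml
# Switch the storage backend from the default etcd to kine
sed -i 's/type: etcd/type: kine/' ~/.k0s/k0s.yaml
# Start a single-node controller with the edited config
sudo k0s controller -c ~/.k0s/k0s.yaml --single
```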

Expected behavior

The single-node cluster is created using SQLite (via kine) as the storage backend.

Actual behavior

The process fails while waiting for etcd.

Screenshots and logs

sudo k0s controller -c ~/.k0s/k0s.yaml --single
Output:

```
INFO[2023-06-08 15:44:40] using api address: 192.168.1.37 INFO[2023-06-08 15:44:40] using listen port: 6443 INFO[2023-06-08 15:44:40] using sans: [192.168.1.37 192.168.122.1 172.19.0.1 172.17.0.1 fe80::82e7:3b12:3e0e:bb1c fc00:f853:ccd:e793::1 fe80::42:45ff:fee7:fab7 fe80::1 fe80::42:60ff:fee9:fb9c] INFO[2023-06-08 15:44:40] DNS address: 10.96.0.10 INFO[2023-06-08 15:44:40] using storage backend etcd INFO[2023-06-08 15:44:40] generate received request component=cfssl INFO[2023-06-08 15:44:40] received CSR component=cfssl INFO[2023-06-08 15:44:40] generating key: rsa-2048 component=cfssl INFO[2023-06-08 15:44:40] generate received request component=cfssl INFO[2023-06-08 15:44:40] received CSR component=cfssl INFO[2023-06-08 15:44:40] generating key: rsa-2048 component=cfssl INFO[2023-06-08 15:44:40] generate received request component=cfssl INFO[2023-06-08 15:44:40] received CSR component=cfssl INFO[2023-06-08 15:44:40] generating key: rsa-2048 component=cfssl INFO[2023-06-08 15:44:40] generate received request component=cfssl INFO[2023-06-08 15:44:40] received CSR component=cfssl INFO[2023-06-08 15:44:40] generating key: rsa-2048 component=cfssl INFO[2023-06-08 15:44:40] generate received request component=cfssl INFO[2023-06-08 15:44:40] received CSR component=cfssl INFO[2023-06-08 15:44:40] generating key: rsa-2048 component=cfssl INFO[2023-06-08 15:44:40] generate received request component=cfssl INFO[2023-06-08 15:44:40] received CSR component=cfssl INFO[2023-06-08 15:44:40] generating key: rsa-2048 component=cfssl INFO[2023-06-08 15:44:40] generate received request component=cfssl INFO[2023-06-08 15:44:40] received CSR component=cfssl INFO[2023-06-08 15:44:40] generating key: rsa-2048 component=cfssl INFO[2023-06-08 15:44:40] generate received request component=cfssl INFO[2023-06-08 15:44:40] received CSR component=cfssl INFO[2023-06-08 15:44:40] generating key: rsa-2048 component=cfssl INFO[2023-06-08 15:44:40] encoded CSR 
component=cfssl INFO[2023-06-08 15:44:40] encoded CSR component=cfssl INFO[2023-06-08 15:44:40] signed certificate with serial number 717946544846142165583761192787134983096761700129 component=cfssl INFO[2023-06-08 15:44:40] signed certificate with serial number 191388176099851630787622396274070074274833231393 component=cfssl INFO[2023-06-08 15:44:40] encoded CSR component=cfssl INFO[2023-06-08 15:44:40] signed certificate with serial number 450204921561782781693068045759263645312954618054 component=cfssl INFO[2023-06-08 15:44:41] encoded CSR component=cfssl INFO[2023-06-08 15:44:41] signed certificate with serial number 274850914201115490785527714494679113811192121125 component=cfssl INFO[2023-06-08 15:44:41] encoded CSR component=cfssl INFO[2023-06-08 15:44:41] signed certificate with serial number 520103952883657066566361410784513241304578992092 component=cfssl INFO[2023-06-08 15:44:41] encoded CSR component=cfssl INFO[2023-06-08 15:44:41] encoded CSR component=cfssl INFO[2023-06-08 15:44:41] signed certificate with serial number 165785568648753999910068495472965019849539836382 component=cfssl INFO[2023-06-08 15:44:41] signed certificate with serial number 374578295309788689053648431302195192115040758751 component=cfssl INFO[2023-06-08 15:44:41] encoded CSR component=cfssl INFO[2023-06-08 15:44:41] signed certificate with serial number 163862011943054246001236846751341515244425174466 component=cfssl INFO[2023-06-08 15:44:41] initializing Etcd INFO[2023-06-08 15:44:41] initializing APIServer INFO[2023-06-08 15:44:41] initializing Dummy INFO[2023-06-08 15:44:41] initializing Manager INFO[2023-06-08 15:44:41] initializing CSRApprover INFO[2023-06-08 15:44:41] initializing Status INFO[2023-06-08 15:44:41] Staging '/var/lib/k0s/bin/etcd' INFO[2023-06-08 15:44:41] Listening address /run/k0s/status.sock component=status WARN[2023-06-08 15:44:41] running kube-apiserver as root: user: unknown user kube-apiserver INFO[2023-06-08 15:44:41] Staging 
'/var/lib/k0s/bin/kube-apiserver' INFO[2023-06-08 15:44:41] starting Etcd INFO[2023-06-08 15:44:41] Starting etcd INFO[2023-06-08 15:44:41] generate received request component=cfssl INFO[2023-06-08 15:44:41] received CSR component=cfssl INFO[2023-06-08 15:44:41] generating key: rsa-2048 component=cfssl INFO[2023-06-08 15:44:41] generate received request component=cfssl INFO[2023-06-08 15:44:41] received CSR component=cfssl INFO[2023-06-08 15:44:41] generating key: rsa-2048 component=cfssl INFO[2023-06-08 15:44:41] generate received request component=cfssl INFO[2023-06-08 15:44:41] received CSR component=cfssl INFO[2023-06-08 15:44:41] generating key: rsa-2048 component=cfssl INFO[2023-06-08 15:44:41] encoded CSR component=cfssl INFO[2023-06-08 15:44:41] signed certificate with serial number 395098553301122446482214039083503478668047005337 component=cfssl INFO[2023-06-08 15:44:41] encoded CSR component=cfssl INFO[2023-06-08 15:44:41] signed certificate with serial number 47608107570929136268696255026150320502242378345 component=cfssl INFO[2023-06-08 15:44:41] encoded CSR component=cfssl INFO[2023-06-08 15:44:41] signed certificate with serial number 700947118901960083004565954003349837800270700416 component=cfssl INFO[2023-06-08 15:44:41] Starting to supervise component=etcd INFO[2023-06-08 15:44:41] Started successfully, go nuts pid 295268 component=etcd INFO[2023-06-08 15:44:41] {"level":"warn","ts":"2023-06-08T15:44:41.681711+0200","caller":"embed/config.go:673","msg":"Running http and grpc server on single port. 
This is not recommended for production."} component=etcd stream=stderr INFO[2023-06-08 15:44:41] {"level":"info","ts":"2023-06-08T15:44:41.68189+0200","caller":"etcdmain/etcd.go:73","msg":"Running: ","args":["/var/lib/k0s/bin/etcd","--key-file=/var/lib/k0s/pki/etcd/server.key","--peer-trusted-ca-file=/var/lib/k0s/pki/etcd/ca.crt","--auth-token=jwt,pub-key=/var/lib/k0s/pki/etcd/jwt.pub,priv-key=/var/lib/k0s/pki/etcd/jwt.key,sign-method=RS512,ttl=10m","--listen-client-urls=https://127.0.0.1:2379","--tls-min-version=TLS1.2","--listen-peer-urls=https://192.168.1.37:2380","--name=pablo-XPS-15-9520","--peer-client-cert-auth=true","--enable-pprof=false","--data-dir=/var/lib/k0s/etcd","--trusted-ca-file=/var/lib/k0s/pki/etcd/ca.crt","--cipher-suites=TLS_ECDHE_ECDSA_WITH_AES_128_GCM_SHA256,TLS_ECDHE_ECDSA_WITH_AES_256_GCM_SHA384,TLS_ECDHE_ECDSA_WITH_CHACHA20_POLY1305_SHA256,TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256,TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384,TLS_ECDHE_RSA_WITH_CHACHA20_POLY1305_SHA256","--client-cert-auth=true","--initial-advertise-peer-urls=https://192.168.1.37:2380","--cert-file=/var/lib/k0s/pki/etcd/server.crt","--log-level=info","--advertise-client-urls=https://127.0.0.1:2379","--peer-key-file=/var/lib/k0s/pki/etcd/peer.key","--peer-cert-file=/var/lib/k0s/pki/etcd/peer.crt"]} component=etcd stream=stderr INFO[2023-06-08 15:44:41] {"level":"warn","ts":"2023-06-08T15:44:41.682111+0200","caller":"embed/config.go:673","msg":"Running http and grpc server on single port. 
This is not recommended for production."} component=etcd stream=stderr INFO[2023-06-08 15:44:41] {"level":"info","ts":"2023-06-08T15:44:41.682137+0200","caller":"embed/etcd.go:127","msg":"configuring peer listeners","listen-peer-urls":["https://192.168.1.37:2380"]} component=etcd stream=stderr INFO[2023-06-08 15:44:41] {"level":"info","ts":"2023-06-08T15:44:41.682301+0200","caller":"embed/etcd.go:495","msg":"starting with peer TLS","tls-info":"cert = /var/lib/k0s/pki/etcd/peer.crt, key = /var/lib/k0s/pki/etcd/peer.key, client-cert=, client-key=, trusted-ca = /var/lib/k0s/pki/etcd/ca.crt, client-cert-auth = true, crl-file = ","cipher-suites":["TLS_ECDHE_ECDSA_WITH_AES_128_GCM_SHA256","TLS_ECDHE_ECDSA_WITH_AES_256_GCM_SHA384","TLS_ECDHE_ECDSA_WITH_CHACHA20_POLY1305_SHA256","TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256","TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384","TLS_ECDHE_RSA_WITH_CHACHA20_POLY1305_SHA256"]} component=etcd stream=stderr INFO[2023-06-08 15:44:41] {"level":"info","ts":"2023-06-08T15:44:41.682683+0200","caller":"embed/etcd.go:376","msg":"closing etcd server","name":"pablo-XPS-15-9520","data-dir":"/var/lib/k0s/etcd","advertise-peer-urls":["https://192.168.1.37:2380"],"advertise-client-urls":["https://127.0.0.1:2379"]} component=etcd stream=stderr INFO[2023-06-08 15:44:41] {"level":"info","ts":"2023-06-08T15:44:41.682699+0200","caller":"embed/etcd.go:378","msg":"closed etcd server","name":"pablo-XPS-15-9520","data-dir":"/var/lib/k0s/etcd","advertise-peer-urls":["https://192.168.1.37:2380"],"advertise-client-urls":["https://127.0.0.1:2379"]} component=etcd stream=stderr INFO[2023-06-08 15:44:41] {"level":"warn","ts":"2023-06-08T15:44:41.682716+0200","caller":"etcdmain/etcd.go:146","msg":"failed to start etcd","error":"listen tcp 192.168.1.37:2380: bind: cannot assign requested address"} component=etcd stream=stderr INFO[2023-06-08 15:44:41] {"level":"fatal","ts":"2023-06-08T15:44:41.682746+0200","caller":"etcdmain/etcd.go:204","msg":"discovery 
failed","error":"listen tcp 192.168.1.37:2380: bind: cannot assign requested address","stacktrace":"go.etcd.io/etcd/server/v3/etcdmain.startEtcdOrProxyV2\n\t/etcd/server/etcdmain/etcd.go:204\ngo.etcd.io/etcd/server/v3/etcdmain.Main\n\t/etcd/server/etcdmain/main.go:40\nmain.main\n\t/etcd/server/main.go:31\nruntime.main\n\t/usr/local/go/src/runtime/proc.go:250"} component=etcd stream=stderr WARN[2023-06-08 15:44:41] Failed to wait for process component=etcd error="exit status 1" INFO[2023-06-08 15:44:41] respawning in 5s component=etcd {"level":"warn","ts":"2023-06-08T15:44:42.653362+0200","logger":"etcd-client","caller":"v3@v3.5.8/retry_interceptor.go:62","msg":"retrying of unary invoker failed","target":"etcd-endpoints://0xc000f93180/127.0.0.1:2379","attempt":0,"error":"rpc error: code = DeadlineExceeded desc = latest balancer error: last connection error: connection error: desc = \"transport: authentication handshake failed: EOF\""} {"level":"warn","ts":"2023-06-08T15:44:43.653897+0200","logger":"etcd-client","caller":"v3@v3.5.8/retry_interceptor.go:62","msg":"retrying of unary invoker failed","target":"etcd-endpoints://0xc00099a1c0/127.0.0.1:2379","attempt":0,"error":"rpc error: code = DeadlineExceeded desc = latest balancer error: last connection error: connection error: desc = \"transport: authentication handshake failed: EOF\""} ^C{"level":"warn","ts":"2023-06-08T15:44:44.101007+0200","logger":"etcd-client","caller":"v3@v3.5.8/retry_interceptor.go:62","msg":"retrying of unary invoker failed","target":"etcd-endpoints://0xc000f93500/127.0.0.1:2379","attempt":0,"error":"rpc error: code = Canceled desc = latest balancer error: last connection error: connection error: desc = \"transport: authentication handshake failed: EOF\""} {"level":"warn","ts":"2023-06-08T15:44:44.102871+0200","logger":"etcd-client","caller":"v3@v3.5.8/retry_interceptor.go:62","msg":"retrying of unary invoker 
failed","target":"etcd-endpoints://0xc000f93880/127.0.0.1:2379","attempt":0,"error":"rpc error: code = Canceled desc = context canceled"} INFO[2023-06-08 15:44:44] stopped component Etcd Error: failed to start controller node components: Etcd health-check timed out
```
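The decisive entry buried in the log above is etcd's bind failure: `listen tcp 192.168.1.37:2380: bind: cannot assign requested address`, i.e. the address etcd is told to bind is not assigned to any interface on the machine, which is typical of stale state from an earlier provisioning. A quick check (the address is taken from the log; `ip` is the iproute2 tool shipped with Ubuntu):

```shell
# Show brief address info for all interfaces and look for the address
# etcd tries to bind; an empty match means it is not assigned anywhere
ip -br addr show | grep -F '192.168.1.37' || echo "address not present on this host"
```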

Additional context

Configuration file:

```
apiVersion: k0s.k0sproject.io/v1beta1
kind: ClusterConfig
metadata:
  creationTimestamp: null
  name: k0s
spec:
  api:
    address: 192.168.1.41
    k0sApiPort: 9443
    port: 6443
    sans:
    - 192.168.1.41
    - 192.168.122.1
    - 172.19.0.1
    - 172.17.0.1
    - fe80::82e7:3b12:3e0e:bb1c
    - fc00:f853:ccd:e793::1
    - fe80::42:2eff:fef8:7579
    - fe80::1
    - fe80::42:daff:fe62:2649
    - fe80::70ec:dff:fe6b:a762
    tunneledNetworkingMode: false
  controllerManager: {}
  extensions:
    helm:
      charts: null
      concurrencyLevel: 5
      repositories: null
    storage:
      create_default_storage_class: false
      type: external_storage
  installConfig:
    users:
      etcdUser: etcd
      kineUser: kube-apiserver
      konnectivityUser: konnectivity-server
      kubeAPIserverUser: kube-apiserver
      kubeSchedulerUser: kube-scheduler
  konnectivity:
    adminPort: 8133
    agentPort: 8132
  network:
    calico: null
    clusterDomain: cluster.local
    dualStack: {}
    kubeProxy:
      iptables:
        minSyncPeriod: 0s
        syncPeriod: 0s
      ipvs:
        minSyncPeriod: 0s
        syncPeriod: 0s
        tcpFinTimeout: 0s
        tcpTimeout: 0s
        udpTimeout: 0s
      metricsBindAddress: 0.0.0.0:10249
      mode: iptables
    kuberouter:
      autoMTU: true
      hairpin: Enabled
      ipMasq: false
      metricsPort: 8080
      mtu: 0
      peerRouterASNs: ""
      peerRouterIPs: ""
    nodeLocalLoadBalancing:
      envoyProxy:
        apiServerBindPort: 7443
        image:
          image: quay.io/k0sproject/envoy-distroless
          version: v1.24.1
        konnectivityServerBindPort: 7132
      type: EnvoyProxy
    podCIDR: 10.244.0.0/16
    provider: kuberouter
    serviceCIDR: 10.96.0.0/12
  scheduler: {}
  storage:
    type: kine
  telemetry:
    enabled: true
status: {}
```
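For reference, the storage section can also pin kine's datastore explicitly. The `dataSource` value below is an assumption based on k0s's documented SQLite default, not something taken from this thread; when it is left empty, k0s falls back to a SQLite file under `/var/lib/k0s`:

```
spec:
  storage:
    type: kine
    kine:
      # Assumed default SQLite location (hypothetical example value)
      dataSource: sqlite:///var/lib/k0s/db/state.db?mode=rwc&_journal=WAL
```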
mikhail-sakhnov commented 1 year ago

@pablochacin I wasn't able to reproduce it with the latest release v1.27.2+k0s.0

It doesn't work for me with the config you provided; I'm fairly sure that's because of the different network settings (the default config is tailored to the particular machine it was generated on, since it calculates addresses).

If you run without providing any config, what messages do you see in the log?

Are you running the binary for the first time on this machine, or was the machine previously provisioned with etcd, by any chance? If so, you need to reset your cluster using `k0s reset`.
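A sketch of the reset-and-retry sequence (assuming k0s was installed as a service; if it was only run in the foreground, stopping it just means terminating that process):

```shell
# Stop the running k0s service, if any (k0s reset refuses to run
# while k0s is still active); ignore the error if none is installed
sudo k0s stop || true
# Wipe the previous cluster state, including the old etcd data
# under /var/lib/k0s
sudo k0s reset
# Start over; with --single and no config, kine (SQLite) is the
# default storage backend
sudo k0s controller --single
```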

May I ask you to try running with a bare-minimum config like:

```
apiVersion: k0s.k0sproject.io/v1beta1
kind: ClusterConfig
metadata:
  creationTimestamp: null
  name: k0s
spec:
  storage:
    type: kine
```

Plus don't forget about `k0s reset`, mentioned above.

pablochacin commented 1 year ago

> Are you running the binary for the first time on this machine, or was the machine previously provisioned with etcd, by any chance? If so, you need to reset your cluster using `k0s reset`.

That fixed the issue. Thanks.