k3d-io / k3d

Little helper to run CNCF's k3s in Docker
https://k3d.io/
MIT License

[QUESTION/HELP]: Unable to Create Kubernetes Cluster using k3d on NixOS #1421

Closed: byteshiva closed this issue 7 months ago

byteshiva commented 7 months ago

Description: Creating a Kubernetes cluster with k3d on NixOS fails during the server node startup, leaving the cluster creation incomplete.

Steps to Reproduce:

  1. Navigate to the directory containing sample.sh.
  2. Execute sample.sh on a NixOS environment.
  3. Observe the cluster creation process stall at starting the server node.

Step 1 - Details

**sample.sh**
```
export NIXPKGS_ALLOW_UNFREE=1
nix-shell -E '
let
  nixpkgs = import (fetchTarball "https://github.com/NixOS/nixpkgs/archive/nixos-unstable.tar.gz") {};
in nixpkgs.mkShell {
  buildInputs = with nixpkgs; [ k3d kubectl kubernetes-helm docker ];
  shellHook = "export KUBECONFIG=kubeconfig";
}'
```

Step 3 - Details

**Error**

**k3d cluster create --api-port 6550 -p "8081:80@loadbalancer" --agents 2**
```
INFO[0000] portmapping '8081:80' targets the loadbalancer: defaulting to [servers:*:proxy agents:*:proxy]
INFO[0000] Prep: Network
INFO[0000] Created network 'k3d-k3s-default'
INFO[0000] Created image volume k3d-k3s-default-images
INFO[0000] Starting new tools node...
INFO[0000] Starting Node 'k3d-k3s-default-tools'
INFO[0001] Creating node 'k3d-k3s-default-server-0'
INFO[0004] Pulling image 'docker.io/rancher/k3s:v1.21.7-k3s1'
INFO[0019] Creating node 'k3d-k3s-default-agent-0'
INFO[0019] Creating node 'k3d-k3s-default-agent-1'
INFO[0019] Creating LoadBalancer 'k3d-k3s-default-serverlb'
INFO[0019] Using the k3d-tools node to gather environment information
INFO[0019] HostIP: using network gateway 172.26.0.1 address
INFO[0019] Starting cluster 'k3s-default'
INFO[0019] Starting servers...
INFO[0019] Starting Node 'k3d-k3s-default-server-0'
```

**Extra:**

**k3d cluster list --verbose --trace**
```
DEBU[0000] DOCKER_SOCK=/var/run/docker.sock
DEBU[0000] Runtime Info: &{Name:docker Endpoint:/var/run/docker.sock Version:20.10.25 OSType:linux OS:NixOS 23.05 (Stoat) Arch:x86_64 CgroupVersion:2 CgroupDriver:systemd Filesystem:extfs InfoName:nixos}
TRAC[0000] Listing Clusters...
TRAC[0000] TranslateContainerDetailsToNode: Checking for default object label app=k3d on container /k3d-k3s-default-serverlb
TRAC[0000] TranslateContainerDetailsToNode: Checking for default object label app=k3d on container /k3d-k3s-default-agent-1
TRAC[0000] TranslateContainerDetailsToNode: Checking for default object label app=k3d on container /k3d-k3s-default-agent-0
TRAC[0000] TranslateContainerDetailsToNode: Checking for default object label app=k3d on container /k3d-k3s-default-server-0
DEBU[0000] Found 4 nodes
TRAC[0000] Found node k3d-k3s-default-serverlb of role loadbalancer
TRAC[0000] Found node k3d-k3s-default-agent-1 of role agent
TRAC[0000] Found node k3d-k3s-default-agent-0 of role agent
TRAC[0000] Found node k3d-k3s-default-server-0 of role server
TRAC[0000] Filteres 4 nodes by roles (in: [server agent loadbalancer] | ex: [registry]), got 4 left
TRAC[0000] Found 4 cluster-internal nodes
TRAC[0000] Found cluster-internal node k3d-k3s-default-serverlb of role loadbalancer belonging to cluster k3s-default
TRAC[0000] Found cluster-internal node k3d-k3s-default-agent-1 of role agent belonging to cluster k3s-default
TRAC[0000] Found cluster-internal node k3d-k3s-default-agent-0 of role agent belonging to cluster k3s-default
TRAC[0000] Found cluster-internal node k3d-k3s-default-server-0 of role server belonging to cluster k3s-default
DEBU[0000] Found 1 clusters
NAME          SERVERS   AGENTS   LOADBALANCER
k3s-default   1/1       0/2      true
```

Environment (from the logs below):

  * OS: NixOS 23.05 (Stoat), x86_64
  * Docker: 20.10.25, cgroup v2, systemd cgroup driver
  * k3d: v5.6.0
  * k3s image: docker.io/rancher/k3s:v1.21.7-k3s1

Error Message (last log line before the process hangs):

```
INFO[0019] Starting Node 'k3d-k3s-default-server-0'
```

Workaround: No workaround is currently known. Users cannot create Kubernetes clusters with k3d on NixOS until this issue is resolved.
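The docker logs further down show the actual failure: k3s aborts with `failed to find cpuset cgroup (v2)`, i.e. the `cpuset` controller is not enabled in the cgroup v2 hierarchy the node container sees. A minimal diagnostic sketch of that check follows; the `controllers` variable here is a hypothetical stand-in for the real contents of `/sys/fs/cgroup/cgroup.controllers`, which you would read on the affected host:

```shell
# Sketch: check whether the cgroup v2 'cpuset' controller is enabled.
# On a real host, read the controller list from the unified hierarchy:
#   controllers=$(cat /sys/fs/cgroup/cgroup.controllers)
controllers="cpuset cpu io memory hugetlb pids"   # example value (assumption)

case " $controllers " in
  *" cpuset "*) echo "cpuset controller available" ;;
  *)            echo "cpuset controller missing (k3s would fail to start)" ;;
esac
```

If `cpuset` is absent from that list (or from the delegated scope Docker places containers in), k3s fails exactly as in the `docker logs` output below, despite k3d's own cgroup v2 entrypoint fix being active.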

Logs - Creating Sample Cluster on NixOS

```
k3d cluster create sample
INFO[0000] Prep: Network
INFO[0000] Created network 'k3d-sample'
INFO[0000] Created image volume k3d-sample-images
INFO[0000] Starting new tools node...
INFO[0000] Starting Node 'k3d-sample-tools'
INFO[0001] Creating node 'k3d-sample-server-0'
INFO[0001] Creating LoadBalancer 'k3d-sample-serverlb'
INFO[0001] Using the k3d-tools node to gather environment information
INFO[0001] HostIP: using network gateway 172.28.0.1 address
INFO[0001] Starting cluster 'sample'
INFO[0001] Starting servers...
INFO[0001] Starting Node 'k3d-sample-server-0'
^C
```

```
[nix-shell:~/app]$ k3d cluster delete sample --verbose --trace
DEBU[0000] DOCKER_SOCK=/var/run/docker.sock
DEBU[0000] Runtime Info: &{Name:docker Endpoint:/var/run/docker.sock Version:20.10.25 OSType:linux OS:NixOS 23.05 (Stoat) Arch:x86_64 CgroupVersion:2 CgroupDriver:systemd Filesystem:extfs InfoName:nixos}
DEBU[0000] Configuration: {}
TRAC[0000] TranslateContainerDetailsToNode: Checking for default object label app=k3d on container /k3d-sample-serverlb
TRAC[0000] TranslateContainerDetailsToNode: Checking for default object label app=k3d on container /k3d-sample-server-0
TRAC[0000] Reading path /etc/confd/values.yaml from node k3d-sample-serverlb...
ERRO[0000] error getting loadbalancer config from k3d-sample-serverlb: runtime failed to read loadbalancer config '/etc/confd/values.yaml' from node 'k3d-sample-serverlb': Error response from daemon: Could not find the file /etc/confd/values.yaml in container 54f40f1d37d4f8e818979ab075f9ffe0007abc955019a8e157a73a2ba3aeba85: file not found
INFO[0000] Deleting cluster 'sample'
TRAC[0000] TranslateContainerDetailsToNode: Checking for default object label app=k3d on container /k3d-sample-serverlb
TRAC[0000] TranslateContainerDetailsToNode: Checking for default object label app=k3d on container /k3d-sample-server-0
DEBU[0000] Cluster Details: &{Name:sample Network:{Name:k3d-sample ID: External:false IPAM:{IPPrefix:invalid Prefix IPsUsed:[] Managed:false} Members:[]} Token:DXLYiwAZRCPUYDfwtOdy Nodes:[0xc00051fba0 0xc000017040] InitNode: ExternalDatastore: KubeAPI: ServerLoadBalancer:0xc000241920 ImageVolume:k3d-sample-images Volumes:[k3d-sample-images]}
DEBU[0000] Deleting node k3d-sample-serverlb ...
TRAC[0000] [Docker] Deleted Container k3d-sample-serverlb
DEBU[0000] Deleting node k3d-sample-server-0 ...
TRAC[0000] [Docker] Deleted Container k3d-sample-server-0
INFO[0000] Deleting cluster network 'k3d-sample'
INFO[0000] Deleting 1 attached volumes...
DEBU[0000] Deleting volume k3d-sample-images...
INFO[0000] Removing cluster details from default kubeconfig...
DEBU[0000] Using default kubeconfig 'kubeconfig'
DEBU[0000] Wrote kubeconfig to 'kubeconfig'
INFO[0000] Removing standalone kubeconfig file (if there is one)...
INFO[0000] Successfully deleted cluster sample!
```

iwilltry42 commented 7 months ago

Hi @byteshiva thanks for opening this issue! Can you please provide some logs of the node containers? Specifically of the server container and the first agent container?
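For reference, those logs can be pulled straight from Docker using the node container names k3d creates (the names below are taken from the cluster in this report; the loop itself is a small convenience sketch, not a k3d feature):

```shell
# Fetch the recent log lines of each k3d node container via the Docker CLI.
# Container names follow k3d's <cluster>-<role>-<index> convention.
for node in k3d-k3s-default-server-0 k3d-k3s-default-agent-0; do
  echo "=== $node ==="
  if command -v docker >/dev/null 2>&1; then
    # '|| true' keeps the loop going if a container no longer exists.
    docker logs --tail 50 "$node" 2>&1 || true
  else
    echo "docker not available; would run: docker logs --tail 50 $node"
  fi
done
```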

byteshiva commented 7 months ago

> Hi @byteshiva thanks for opening this issue! Can you please provide some logs of the node containers? Specifically of the server container and the first agent container?

Logs Details

```
k3d cluster create sample --trace --verbose
DEBU[0000] DOCKER_SOCK=/var/run/docker.sock
DEBU[0000] Runtime Info: &{Name:docker Endpoint:/var/run/docker.sock Version:20.10.25 OSType:linux OS:NixOS 23.05 (Stoat) Arch:x86_64 CgroupVersion:2 CgroupDriver:systemd Filesystem:extfs InfoName:nixos}
DEBU[0000] Additional CLI Configuration: cli: api-port: "" env: [] k3s-node-labels: [] k3sargs: [] ports: [] registries: create: "" runtime-labels: [] runtime-ulimits: [] volumes: [] hostaliases: []
DEBU[0000] Configuration: agents: 0 image: docker.io/rancher/k3s:v1.21.7-k3s1 network: "" options: k3d: disableimagevolume: false disableloadbalancer: false disablerollback: false loadbalancer: configoverrides: [] timeout: 0s wait: true kubeconfig: switchcurrentcontext: true updatedefaultkubeconfig: true runtime: agentsmemory: "" gpurequest: "" hostpidmode: false serversmemory: "" registries: config: "" use: [] servers: 1 subnet: "" token: ""
TRAC[0000] Trying to read config apiVersion='k3d.io/v1alpha5', kind='simple'
DEBU[0000] ========== Simple Config ========== {TypeMeta:{Kind:Simple APIVersion:k3d.io/v1alpha5} ObjectMeta:{Name:} Servers:1 Agents:0 ExposeAPI:{Host: HostIP: HostPort:} Image:docker.io/rancher/k3s:v1.21.7-k3s1 Network: Subnet: ClusterToken: Volumes:[] Ports:[] Options:{K3dOptions:{Wait:true Timeout:0s DisableLoadbalancer:false DisableImageVolume:false NoRollback:false NodeHookActions:[] Loadbalancer:{ConfigOverrides:[]}} K3sOptions:{ExtraArgs:[] NodeLabels:[]} KubeconfigOptions:{UpdateDefaultKubeconfig:true SwitchCurrentContext:true} Runtime:{GPURequest: ServersMemory: AgentsMemory: HostPidMode:false Labels:[] Ulimits:[]}} Env:[] Registries:{Use:[] Create: Config:} HostAliases:[]} ==========================
TRAC[0000] VolumeFilterMap: map[]
TRAC[0000] PortFilterMap: map[]
TRAC[0000] K3sNodeLabelFilterMap: map[]
TRAC[0000] RuntimeLabelFilterMap: map[]
TRAC[0000] EnvFilterMap: map[]
DEBU[0000] ========== Merged Simple Config ========== {TypeMeta:{Kind:Simple APIVersion:k3d.io/v1alpha5} ObjectMeta:{Name:} Servers:1 Agents:0 ExposeAPI:{Host: HostIP: HostPort:45731} Image:docker.io/rancher/k3s:v1.21.7-k3s1 Network: Subnet: ClusterToken: Volumes:[] Ports:[] Options:{K3dOptions:{Wait:true Timeout:0s DisableLoadbalancer:false DisableImageVolume:false NoRollback:false NodeHookActions:[] Loadbalancer:{ConfigOverrides:[]}} K3sOptions:{ExtraArgs:[] NodeLabels:[]} KubeconfigOptions:{UpdateDefaultKubeconfig:true SwitchCurrentContext:true} Runtime:{GPURequest: ServersMemory: AgentsMemory: HostPidMode:false Labels:[] Ulimits:[]}} Env:[] Registries:{Use:[] Create: Config:} HostAliases:[]} ==========================
DEBU[0000] generated loadbalancer config: ports: 6443.tcp: - k3d-sample-server-0 settings: workerConnections: 1024
DEBU[0000] ===== Merged Cluster Config ===== &{TypeMeta:{Kind: APIVersion:} Cluster:{Name:sample Network:{Name:k3d-sample ID: External:false IPAM:{IPPrefix:invalid Prefix IPsUsed:[] Managed:false} Members:[]} Token: Nodes:[0xc000561380 0xc000561520] InitNode: ExternalDatastore: KubeAPI:0xc00036a2c0 ServerLoadBalancer:0xc0001c97c0 ImageVolume: Volumes:[]} ClusterCreateOpts:{DisableImageVolume:false WaitForServer:true Timeout:0s DisableLoadBalancer:false GPURequest: ServersMemory: AgentsMemory: NodeHooks:[] GlobalLabels:map[app:k3d] GlobalEnv:[] HostAliases:[] Registries:{Create: Use:[] Config:}} KubeconfigOpts:{UpdateDefaultKubeconfig:true SwitchCurrentContext:true}} ===== ===== =====
DEBU[0000] '--kubeconfig-update-default set: enabling wait-for-server
INFO[0000] Prep: Network
INFO[0000] Created network 'k3d-sample'
INFO[0000] Created image volume k3d-sample-images
TRAC[0000] Using Registries: []
TRAC[0000] ===== Creating Cluster ===== Runtime: {} Cluster: &{Name:sample Network:{Name:k3d-sample ID:bbff5dbea3d6de3d7e5f51efa4237fa43f2e12bbb3dfe4c8e4253ff36921cbc1 External:false IPAM:{IPPrefix:172.23.0.0/16 IPsUsed:[] Managed:false} Members:[]} Token: Nodes:[0xc000561380 0xc000561520] InitNode: ExternalDatastore: KubeAPI:0xc00036a2c0 ServerLoadBalancer:0xc0001c97c0 ImageVolume:k3d-sample-images Volumes:[k3d-sample-images]} ClusterCreatOpts: &{DisableImageVolume:false WaitForServer:true Timeout:0s DisableLoadBalancer:false GPURequest: ServersMemory: AgentsMemory: NodeHooks:[] GlobalLabels:map[app:k3d k3d.cluster.imageVolume:k3d-sample-images k3d.cluster.network:k3d-sample k3d.cluster.network.external:false k3d.cluster.network.id:bbff5dbea3d6de3d7e5f51efa4237fa43f2e12bbb3dfe4c8e4253ff36921cbc1 k3d.cluster.network.iprange:172.23.0.0/16] GlobalEnv:[] HostAliases:[] Registries:{Create: Use:[] Config:}} ============================
TRAC[0000] Docker Machine not specified via DOCKER_MACHINE_NAME env var
TRAC[0000] [Docker] Not using docker-machine
DEBU[0000] [Docker] DockerHost: '' (unix:///run/user/1000/docker.sock)
INFO[0000] Starting new tools node...
DEBU[0000] DOCKER_SOCK=/var/run/docker.sock
DEBU[0000] DOCKER_SOCK=/var/run/docker.sock
DEBU[0000] DOCKER_SOCK=/var/run/docker.sock
DEBU[0000] Detected CgroupV2, enabling custom entrypoint (disable by setting K3D_FIX_CGROUPV2=false)
TRAC[0000] Creating node from spec &{Name:k3d-sample-tools Role:noRole Image:ghcr.io/k3d-io/k3d-tools:5.6.0 Volumes:[k3d-sample-images:/k3d/images /var/run/docker.sock:/var/run/docker.sock] Env:[] Cmd:[] Args:[noop] Ports:map[] Restart:false Created: HostPidMode:false RuntimeLabels:map[app:k3d k3d.cluster:sample k3d.version:v5.6.0] RuntimeUlimits:[] K3sNodeLabels:map[] Networks:[k3d-sample] ExtraHosts:[host.k3d.internal:host-gateway] ServerOpts:{IsInit:false KubeAPI:} AgentOpts:{} GPURequest: Memory: State:{Running:false Status: Started:} IP:{IP:invalid IP Static:false} HookActions:[]}
TRAC[0000] Creating docker container with translated config &{ContainerConfig:{Hostname:k3d-sample-tools Domainname: User: AttachStdin:false AttachStdout:false AttachStderr:false ExposedPorts:map[] Tty:false OpenStdin:false StdinOnce:false Env:[K3S_KUBECONFIG_OUTPUT=/output/kubeconfig.yaml] Cmd:[noop] Healthcheck: ArgsEscaped:false Image:ghcr.io/k3d-io/k3d-tools:5.6.0 Volumes:map[] WorkingDir: Entrypoint:[] NetworkDisabled:false MacAddress: OnBuild:[] Labels:map[app:k3d k3d.cluster:sample k3d.role:noRole k3d.version:v5.6.0] StopSignal: StopTimeout: Shell:[]} HostConfig:{Binds:[k3d-sample-images:/k3d/images /var/run/docker.sock:/var/run/docker.sock] ContainerIDFile: LogConfig:{Type: Config:map[]} NetworkMode:bridge PortBindings:map[] RestartPolicy:{Name: MaximumRetryCount:0} AutoRemove:false VolumeDriver: VolumesFrom:[] ConsoleSize:[0 0] Annotations:map[] CapAdd:[] CapDrop:[] CgroupnsMode: DNS:[] DNSOptions:[] DNSSearch:[] ExtraHosts:[host.k3d.internal:host-gateway] GroupAdd:[] IpcMode: Cgroup: Links:[] OomScoreAdj:0 PidMode: Privileged:true PublishAllPorts:false ReadonlyRootfs:false SecurityOpt:[] StorageOpt:map[] Tmpfs:map[/run: /var/run:] UTSMode: UsernsMode: ShmSize:0 Sysctls:map[] Runtime: Isolation: Resources:{CPUShares:0 Memory:0 NanoCPUs:0 CgroupParent: BlkioWeight:0 BlkioWeightDevice:[] BlkioDeviceReadBps:[] BlkioDeviceWriteBps:[] BlkioDeviceReadIOps:[] BlkioDeviceWriteIOps:[] CPUPeriod:0 CPUQuota:0 CPURealtimePeriod:0 CPURealtimeRuntime:0 CpusetCpus: CpusetMems: Devices:[] DeviceCgroupRules:[] DeviceRequests:[] KernelMemory:0 KernelMemoryTCP:0 MemoryReservation:0 MemorySwap:0 MemorySwappiness: OomKillDisable: PidsLimit: Ulimits:[] CPUCount:0 CPUPercent:0 IOMaximumIOps:0 IOMaximumBandwidth:0} Mounts:[] MaskedPaths:[] ReadonlyPaths:[] Init:0xc0005a7cfa} NetworkingConfig:{EndpointsConfig:map[k3d-sample:0xc000174000]}}
DEBU[0000] Created container k3d-sample-tools (ID: a6fe16a3358b1a8cd217f589f5ffbe8c92dadb68fde0bf5dddddbc9b3ada9262)
DEBU[0000] Node k3d-sample-tools Start Time: 2024-04-02 17:16:51.683801745 +0530 IST m=+0.100806097
TRAC[0000] Starting node 'k3d-sample-tools'
INFO[0000] Starting Node 'k3d-sample-tools'
DEBU[0000] Truncated 2024-04-02 11:46:51.845559278 +0000 UTC to 2024-04-02 11:46:51 +0000 UTC
INFO[0001] Creating node 'k3d-sample-server-0'
TRAC[0001] Creating node from spec &{Name:k3d-sample-server-0 Role:server Image:docker.io/rancher/k3s:v1.21.7-k3s1 Volumes:[k3d-sample-images:/k3d/images] Env:[K3S_TOKEN=YGDndCmCsuedoObOOEtU] Cmd:[] Args:[] Ports:map[] Restart:true Created: HostPidMode:false RuntimeLabels:map[app:k3d k3d.cluster:sample k3d.cluster.imageVolume:k3d-sample-images k3d.cluster.network:k3d-sample k3d.cluster.network.external:false k3d.cluster.network.id:bbff5dbea3d6de3d7e5f51efa4237fa43f2e12bbb3dfe4c8e4253ff36921cbc1 k3d.cluster.network.iprange:172.23.0.0/16 k3d.cluster.token:YGDndCmCsuedoObOOEtU k3d.cluster.url:https://k3d-sample-server-0:6443 k3d.server.loadbalancer:k3d-sample-serverlb] RuntimeUlimits:[] K3sNodeLabels:map[] Networks:[k3d-sample] ExtraHosts:[] ServerOpts:{IsInit:false KubeAPI:0xc00036a2c0} AgentOpts:{} GPURequest: Memory: State:{Running:false Status: Started:} IP:{IP:invalid IP Static:false} HookActions:[]}
TRAC[0001] Creating docker container with translated config &{ContainerConfig:{Hostname:k3d-sample-server-0 Domainname: User: AttachStdin:false AttachStdout:false AttachStderr:false ExposedPorts:map[] Tty:false OpenStdin:false StdinOnce:false Env:[K3S_TOKEN=YGDndCmCsuedoObOOEtU K3S_KUBECONFIG_OUTPUT=/output/kubeconfig.yaml] Cmd:[server --tls-san 0.0.0.0 --tls-san k3d-sample-serverlb] Healthcheck: ArgsEscaped:false Image:docker.io/rancher/k3s:v1.21.7-k3s1 Volumes:map[] WorkingDir: Entrypoint:[/bin/k3d-entrypoint.sh] NetworkDisabled:false MacAddress: OnBuild:[] Labels:map[app:k3d k3d.cluster:sample k3d.cluster.imageVolume:k3d-sample-images k3d.cluster.network:k3d-sample k3d.cluster.network.external:false k3d.cluster.network.id:bbff5dbea3d6de3d7e5f51efa4237fa43f2e12bbb3dfe4c8e4253ff36921cbc1 k3d.cluster.network.iprange:172.23.0.0/16 k3d.cluster.token:YGDndCmCsuedoObOOEtU k3d.cluster.url:https://k3d-sample-server-0:6443 k3d.role:server k3d.server.api.host:0.0.0.0 k3d.server.api.hostIP:0.0.0.0 k3d.server.api.port:45731 k3d.server.loadbalancer:k3d-sample-serverlb k3d.version:v5.6.0] StopSignal: StopTimeout: Shell:[]} HostConfig:{Binds:[k3d-sample-images:/k3d/images] ContainerIDFile: LogConfig:{Type: Config:map[]} NetworkMode:bridge PortBindings:map[] RestartPolicy:{Name:unless-stopped MaximumRetryCount:0} AutoRemove:false VolumeDriver: VolumesFrom:[] ConsoleSize:[0 0] Annotations:map[] CapAdd:[] CapDrop:[] CgroupnsMode: DNS:[] DNSOptions:[] DNSSearch:[] ExtraHosts:[] GroupAdd:[] IpcMode: Cgroup: Links:[] OomScoreAdj:0 PidMode: Privileged:true PublishAllPorts:false ReadonlyRootfs:false SecurityOpt:[] StorageOpt:map[] Tmpfs:map[/run: /var/run:] UTSMode: UsernsMode: ShmSize:0 Sysctls:map[] Runtime: Isolation: Resources:{CPUShares:0 Memory:0 NanoCPUs:0 CgroupParent: BlkioWeight:0 BlkioWeightDevice:[] BlkioDeviceReadBps:[] BlkioDeviceWriteBps:[] BlkioDeviceReadIOps:[] BlkioDeviceWriteIOps:[] CPUPeriod:0 CPUQuota:0 CPURealtimePeriod:0 CPURealtimeRuntime:0 CpusetCpus: CpusetMems: Devices:[] DeviceCgroupRules:[] DeviceRequests:[] KernelMemory:0 KernelMemoryTCP:0 MemoryReservation:0 MemorySwap:0 MemorySwappiness: OomKillDisable: PidsLimit: Ulimits:[] CPUCount:0 CPUPercent:0 IOMaximumIOps:0 IOMaximumBandwidth:0} Mounts:[] MaskedPaths:[] ReadonlyPaths:[] Init:0xc00031febf} NetworkingConfig:{EndpointsConfig:map[k3d-sample:0xc000312180]}}
DEBU[0001] Created container k3d-sample-server-0 (ID: 3cd62d8068f89ae6ca5e27875a53c5b72b8aa7958a67f32f69655091075e5c24)
DEBU[0001] Created node 'k3d-sample-server-0'
INFO[0001] Creating LoadBalancer 'k3d-sample-serverlb'
TRAC[0001] Creating node from spec &{Name:k3d-sample-serverlb Role:loadbalancer Image:ghcr.io/k3d-io/k3d-proxy:5.6.0 Volumes:[k3d-sample-images:/k3d/images] Env:[] Cmd:[] Args:[] Ports:map[6443:[{HostIP:0.0.0.0 HostPort:45731}]] Restart:true Created: HostPidMode:false RuntimeLabels:map[app:k3d k3d.cluster:sample k3d.cluster.imageVolume:k3d-sample-images k3d.cluster.network:k3d-sample k3d.cluster.network.external:false k3d.cluster.network.id:bbff5dbea3d6de3d7e5f51efa4237fa43f2e12bbb3dfe4c8e4253ff36921cbc1 k3d.cluster.network.iprange:172.23.0.0/16 k3d.cluster.token:YGDndCmCsuedoObOOEtU k3d.cluster.url:https://k3d-sample-server-0:6443 k3d.role:loadbalancer k3d.server.loadbalancer:k3d-sample-serverlb k3d.version:v5.6.0] RuntimeUlimits:[] K3sNodeLabels:map[] Networks:[k3d-sample] ExtraHosts:[] ServerOpts:{IsInit:false KubeAPI:} AgentOpts:{} GPURequest: Memory: State:{Running:false Status: Started:} IP:{IP:invalid IP Static:false} HookActions:[{Stage:preStart Action:{Runtime:{} Content:[112 111 114 116 115 58 10 32 32 54 52 52 51 46 116 99 112 58 10 32 32 45 32 107 51 100 45 115 97 109 112 108 101 45 115 101 114 118 101 114 45 48 10 115 101 116 116 105 110 103 115 58 10 32 32 119 111 114 107 101 114 67 111 110 110 101 99 116 105 111 110 115 58 32 49 48 50 52 10] Dest:/etc/confd/values.yaml Mode:-rwxr--r-- Description:Write Loadbalancer Configuration}}]}
TRAC[0001] Creating docker container with translated config &{ContainerConfig:{Hostname:k3d-sample-serverlb Domainname: User: AttachStdin:false AttachStdout:false AttachStderr:false ExposedPorts:map[6443:{}] Tty:false OpenStdin:false StdinOnce:false Env:[K3S_KUBECONFIG_OUTPUT=/output/kubeconfig.yaml] Cmd:[] Healthcheck: ArgsEscaped:false Image:ghcr.io/k3d-io/k3d-proxy:5.6.0 Volumes:map[] WorkingDir: Entrypoint:[] NetworkDisabled:false MacAddress: OnBuild:[] Labels:map[app:k3d k3d.cluster:sample k3d.cluster.imageVolume:k3d-sample-images k3d.cluster.network:k3d-sample k3d.cluster.network.external:false k3d.cluster.network.id:bbff5dbea3d6de3d7e5f51efa4237fa43f2e12bbb3dfe4c8e4253ff36921cbc1 k3d.cluster.network.iprange:172.23.0.0/16 k3d.cluster.token:YGDndCmCsuedoObOOEtU k3d.cluster.url:https://k3d-sample-server-0:6443 k3d.role:loadbalancer k3d.server.loadbalancer:k3d-sample-serverlb k3d.version:v5.6.0] StopSignal: StopTimeout: Shell:[]} HostConfig:{Binds:[k3d-sample-images:/k3d/images] ContainerIDFile: LogConfig:{Type: Config:map[]} NetworkMode:bridge PortBindings:map[6443:[{HostIP:0.0.0.0 HostPort:45731}]] RestartPolicy:{Name:unless-stopped MaximumRetryCount:0} AutoRemove:false VolumeDriver: VolumesFrom:[] ConsoleSize:[0 0] Annotations:map[] CapAdd:[] CapDrop:[] CgroupnsMode: DNS:[] DNSOptions:[] DNSSearch:[] ExtraHosts:[] GroupAdd:[] IpcMode: Cgroup: Links:[] OomScoreAdj:0 PidMode: Privileged:true PublishAllPorts:false ReadonlyRootfs:false SecurityOpt:[] StorageOpt:map[] Tmpfs:map[/run: /var/run:] UTSMode: UsernsMode: ShmSize:0 Sysctls:map[] Runtime: Isolation: Resources:{CPUShares:0 Memory:0 NanoCPUs:0 CgroupParent: BlkioWeight:0 BlkioWeightDevice:[] BlkioDeviceReadBps:[] BlkioDeviceWriteBps:[] BlkioDeviceReadIOps:[] BlkioDeviceWriteIOps:[] CPUPeriod:0 CPUQuota:0 CPURealtimePeriod:0 CPURealtimeRuntime:0 CpusetCpus: CpusetMems: Devices:[] DeviceCgroupRules:[] DeviceRequests:[] KernelMemory:0 KernelMemoryTCP:0 MemoryReservation:0 MemorySwap:0 MemorySwappiness: OomKillDisable: PidsLimit: Ulimits:[] CPUCount:0 CPUPercent:0 IOMaximumIOps:0 IOMaximumBandwidth:0} Mounts:[] MaskedPaths:[] ReadonlyPaths:[] Init:0xc00022b44a} NetworkingConfig:{EndpointsConfig:map[k3d-sample:0xc0001f00c0]}}
DEBU[0001] Created container k3d-sample-serverlb (ID: 85c710f6ac4b0ff7282a1aba32a8d786173914cb66f5ec5853e971bdf8f91e74)
DEBU[0001] Created loadbalancer 'k3d-sample-serverlb'
DEBU[0001] DOCKER_SOCK=/var/run/docker.sock
INFO[0001] Using the k3d-tools node to gather environment information
TRAC[0001] TranslateContainerDetailsToNode: Checking for default object label app=k3d on container /k3d-sample-tools
DEBU[0001] no netlabel present on container /k3d-sample-tools
DEBU[0001] failed to get IP for container /k3d-sample-tools as we couldn't find the cluster network
DEBU[0001] Deleting node k3d-sample-tools ...
TRAC[0001] [Docker] Deleted Container k3d-sample-tools
DEBU[0001] DOCKER_SOCK=/var/run/docker.sock
TRAC[0001] GOOS: linux / Runtime OS: linux (NixOS 23.05 (Stoat))
INFO[0001] HostIP: using network gateway 172.23.0.1 address
INFO[0001] Starting cluster 'sample'
INFO[0001] Starting servers...
DEBU[0001] >>> enabling cgroupsv2 magic
DEBU[0001] Node k3d-sample-server-0 Start Time: 2024-04-02 17:16:52.813886855 +0530 IST m=+1.230891207
TRAC[0001] Node k3d-sample-server-0: Executing preStartAction 'WriteFileAction': [WriteFileAction] Writing 904 bytes to /bin/k3d-entrypoint.sh (mode -rwxr--r--): Write custom k3d entrypoint script (that powers the magic fixes)
TRAC[0001] Node k3d-sample-server-0: Executing preStartAction 'WriteFileAction': [WriteFileAction] Writing 1325 bytes to /bin/k3d-entrypoint-cgroupv2.sh (mode -rwxr--r--): Write entrypoint script for CGroupV2 fix
TRAC[0001] Starting node 'k3d-sample-server-0'
INFO[0001] Starting Node 'k3d-sample-server-0'
DEBU[0001] Truncated 2024-04-02 11:46:53.001387914 +0000 UTC to 2024-04-02 11:46:53 +0000 UTC
DEBU[0001] Waiting for node k3d-sample-server-0 to get ready (Log: 'k3s is up and running')
TRAC[0001] NodeWaitForLogMessage: Node 'k3d-sample-server-0' waiting for log message 'k3s is up and running' since '2024-04-02 11:46:53 +0000 UTC'
```

**docker ps**
```
CONTAINER ID   IMAGE                      COMMAND                  CREATED              STATUS              PORTS     NAMES
3cd62d8068f8   rancher/k3s:v1.21.7-k3s1   "/bin/k3d-entrypoint…"   About a minute ago   Up About a minute             k3d-sample-server-0
```
Details - docker logs 3cd62d8068f8

```
docker logs 3cd62d8068f8
time="2024-04-02T11:46:53.250125259Z" level=info msg="Starting k3s v1.21.7+k3s1 (ac705709)"
time="2024-04-02T11:46:53.253900045Z" level=info msg="Configuring sqlite3 database connection pooling: maxIdleConns=2, maxOpenConns=0, connMaxLifetime=0s"
time="2024-04-02T11:46:53.253924145Z" level=info msg="Configuring database table schema and indexes, this may take a moment..."
time="2024-04-02T11:46:53.255558150Z" level=info msg="Database tables and indexes are up to date"
time="2024-04-02T11:46:53.256688498Z" level=info msg="Kine listening on unix://kine.sock"
The connection to the server localhost:8080 was refused - did you specify the right host or port?
time="2024-04-02T11:46:53.264174013Z" level=info msg="certificate CN=system:admin,O=system:masters signed by CN=k3s-client-ca@1712058413: notBefore=2024-04-02 11:46:53 +0000 UTC notAfter=2025-04-02 11:46:53 +0000 UTC"
time="2024-04-02T11:46:53.264479924Z" level=info msg="certificate CN=system:kube-controller-manager signed by CN=k3s-client-ca@1712058413: notBefore=2024-04-02 11:46:53 +0000 UTC notAfter=2025-04-02 11:46:53 +0000 UTC"
time="2024-04-02T11:46:53.264770166Z" level=info msg="certificate CN=system:kube-scheduler signed by CN=k3s-client-ca@1712058413: notBefore=2024-04-02 11:46:53 +0000 UTC notAfter=2025-04-02 11:46:53 +0000 UTC"
time="2024-04-02T11:46:53.265063088Z" level=info msg="certificate CN=kube-apiserver signed by CN=k3s-client-ca@1712058413: notBefore=2024-04-02 11:46:53 +0000 UTC notAfter=2025-04-02 11:46:53 +0000 UTC"
time="2024-04-02T11:46:53.265352520Z" level=info msg="certificate CN=system:kube-proxy signed by CN=k3s-client-ca@1712058413: notBefore=2024-04-02 11:46:53 +0000 UTC notAfter=2025-04-02 11:46:53 +0000 UTC"
time="2024-04-02T11:46:53.265613003Z" level=info msg="certificate CN=system:k3s-controller signed by CN=k3s-client-ca@1712058413: notBefore=2024-04-02 11:46:53 +0000 UTC notAfter=2025-04-02 11:46:53 +0000 UTC"
time="2024-04-02T11:46:53.265907285Z" level=info msg="certificate CN=k3s-cloud-controller-manager signed by CN=k3s-client-ca@1712058413: notBefore=2024-04-02 11:46:53 +0000 UTC notAfter=2025-04-02 11:46:53 +0000 UTC"
time="2024-04-02T11:46:53.266424011Z" level=info msg="certificate CN=kube-apiserver signed by CN=k3s-server-ca@1712058413: notBefore=2024-04-02 11:46:53 +0000 UTC notAfter=2025-04-02 11:46:53 +0000 UTC"
time="2024-04-02T11:46:53.266917067Z" level=info msg="certificate CN=system:auth-proxy signed by CN=k3s-request-header-ca@1712058413: notBefore=2024-04-02 11:46:53 +0000 UTC notAfter=2025-04-02 11:46:53 +0000 UTC"
time="2024-04-02T11:46:53.267510871Z" level=info msg="certificate CN=etcd-server signed by CN=etcd-server-ca@1712058413: notBefore=2024-04-02 11:46:53 +0000 UTC notAfter=2025-04-02 11:46:53 +0000 UTC"
time="2024-04-02T11:46:53.267778203Z" level=info msg="certificate CN=etcd-client signed by CN=etcd-server-ca@1712058413: notBefore=2024-04-02 11:46:53 +0000 UTC notAfter=2025-04-02 11:46:53 +0000 UTC"
time="2024-04-02T11:46:53.268244810Z" level=info msg="certificate CN=etcd-peer signed by CN=etcd-peer-ca@1712058413: notBefore=2024-04-02 11:46:53 +0000 UTC notAfter=2025-04-02 11:46:53 +0000 UTC"
time="2024-04-02T11:46:53.323266356Z" level=info msg="certificate CN=k3s,O=k3s signed by CN=k3s-server-ca@1712058413: notBefore=2024-04-02 11:46:53 +0000 UTC notAfter=2025-04-02 11:46:53 +0000 UTC"
time="2024-04-02T11:46:53.323463981Z" level=info msg="Active TLS secret (ver=) (count 11): map[listener.cattle.io/cn-0.0.0.0:0.0.0.0 listener.cattle.io/cn-10.43.0.1:10.43.0.1 listener.cattle.io/cn-127.0.0.1:127.0.0.1 listener.cattle.io/cn-172.23.0.2:172.23.0.2 listener.cattle.io/cn-k3d-sample-server-0:k3d-sample-server-0 listener.cattle.io/cn-k3d-sample-serverlb:k3d-sample-serverlb listener.cattle.io/cn-kubernetes:kubernetes listener.cattle.io/cn-kubernetes.default:kubernetes.default listener.cattle.io/cn-kubernetes.default.svc:kubernetes.default.svc listener.cattle.io/cn-kubernetes.default.svc.cluster.local:kubernetes.default.svc.cluster.local listener.cattle.io/cn-localhost:localhost listener.cattle.io/fingerprint:SHA1=2EB54B637735D75364D47C54A1661D45ECC9DD8D]"
time="2024-04-02T11:46:53.326516247Z" level=info msg="Running kube-apiserver --advertise-port=6443 --allow-privileged=true --anonymous-auth=false --api-audiences=https://kubernetes.default.svc.cluster.local,k3s --authorization-mode=Node,RBAC --bind-address=127.0.0.1 --cert-dir=/var/lib/rancher/k3s/server/tls/temporary-certs --client-ca-file=/var/lib/rancher/k3s/server/tls/client-ca.crt --enable-admission-plugins=NodeRestriction --etcd-servers=unix://kine.sock --insecure-port=0 --kubelet-certificate-authority=/var/lib/rancher/k3s/server/tls/server-ca.crt --kubelet-client-certificate=/var/lib/rancher/k3s/server/tls/client-kube-apiserver.crt --kubelet-client-key=/var/lib/rancher/k3s/server/tls/client-kube-apiserver.key --profiling=false --proxy-client-cert-file=/var/lib/rancher/k3s/server/tls/client-auth-proxy.crt --proxy-client-key-file=/var/lib/rancher/k3s/server/tls/client-auth-proxy.key --requestheader-allowed-names=system:auth-proxy --requestheader-client-ca-file=/var/lib/rancher/k3s/server/tls/request-header-ca.crt --requestheader-extra-headers-prefix=X-Remote-Extra- --requestheader-group-headers=X-Remote-Group --requestheader-username-headers=X-Remote-User --secure-port=6444 --service-account-issuer=https://kubernetes.default.svc.cluster.local --service-account-key-file=/var/lib/rancher/k3s/server/tls/service.key --service-account-signing-key-file=/var/lib/rancher/k3s/server/tls/service.key --service-cluster-ip-range=10.43.0.0/16 --service-node-port-range=30000-32767 --storage-backend=etcd3 --tls-cert-file=/var/lib/rancher/k3s/server/tls/serving-kube-apiserver.crt --tls-private-key-file=/var/lib/rancher/k3s/server/tls/serving-kube-apiserver.key"
Flag --insecure-port has been deprecated, This flag has no effect now and will be removed in v1.24.
I0402 11:46:53.327562      23 server.go:656] external host was not specified, using 172.23.0.2
I0402 11:46:53.327676      23 server.go:195] Version: v1.21.7+k3s1
time="2024-04-02T11:46:53.328635248Z" level=info msg="Running kube-scheduler --address=127.0.0.1 --bind-address=127.0.0.1 --kubeconfig=/var/lib/rancher/k3s/server/cred/scheduler.kubeconfig --leader-elect=false --port=10251 --profiling=false --secure-port=0"
time="2024-04-02T11:46:53.328699297Z" level=info msg="Waiting for API server to become available"
time="2024-04-02T11:46:53.328894831Z" level=info msg="Running kube-controller-manager --address=127.0.0.1 --allocate-node-cidrs=true --bind-address=127.0.0.1 --cluster-cidr=10.42.0.0/16 --cluster-signing-kube-apiserver-client-cert-file=/var/lib/rancher/k3s/server/tls/client-ca.crt --cluster-signing-kube-apiserver-client-key-file=/var/lib/rancher/k3s/server/tls/client-ca.key --cluster-signing-kubelet-client-cert-file=/var/lib/rancher/k3s/server/tls/client-ca.crt --cluster-signing-kubelet-client-key-file=/var/lib/rancher/k3s/server/tls/client-ca.key --cluster-signing-kubelet-serving-cert-file=/var/lib/rancher/k3s/server/tls/server-ca.crt --cluster-signing-kubelet-serving-key-file=/var/lib/rancher/k3s/server/tls/server-ca.key --cluster-signing-legacy-unknown-cert-file=/var/lib/rancher/k3s/server/tls/client-ca.crt --cluster-signing-legacy-unknown-key-file=/var/lib/rancher/k3s/server/tls/client-ca.key --configure-cloud-routes=false --controllers=*,-service,-route,-cloud-node-lifecycle --kubeconfig=/var/lib/rancher/k3s/server/cred/controller.kubeconfig --leader-elect=false --port=10252 --profiling=false --root-ca-file=/var/lib/rancher/k3s/server/tls/server-ca.crt --secure-port=0 --service-account-private-key-file=/var/lib/rancher/k3s/server/tls/service.key --use-service-account-credentials=true"
time="2024-04-02T11:46:53.329155104Z" level=info msg="Running cloud-controller-manager --allocate-node-cidrs=true --bind-address=127.0.0.1 --cloud-provider=k3s --cluster-cidr=10.42.0.0/16 --configure-cloud-routes=false --kubeconfig=/var/lib/rancher/k3s/server/cred/cloud-controller.kubeconfig --leader-elect=false --node-status-update-frequency=1m0s --port=0 --profiling=false"
time="2024-04-02T11:46:53.329498625Z" level=info msg="Node token is available at /var/lib/rancher/k3s/server/token"
time="2024-04-02T11:46:53.329532594Z" level=info msg="To join node to cluster: k3s agent -s https://172.23.0.2:6443 -t ${NODE_TOKEN}"
time="2024-04-02T11:46:53.330009301Z" level=info msg="Wrote kubeconfig /output/kubeconfig.yaml"
time="2024-04-02T11:46:53.330103808Z" level=info msg="Run: k3s kubectl"
time="2024-04-02T11:46:53.330391070Z" level=fatal msg="failed to find cpuset cgroup (v2)"
The connection to the server 127.0.0.1:6443 was refused - did you specify the right host or port?
The connection to the server 127.0.0.1:6443 was refused - did you specify the right host or port?
The connection to the server 127.0.0.1:6443 was refused - did you specify the right host or port?
The connection to the server 127.0.0.1:6443 was refused - did you specify the right host or port?
The connection to the server 127.0.0.1:6443 was refused - did you specify the right host or port?
The connection to the server 127.0.0.1:6443 was refused - did you specify the right host or port?
The connection to the server 127.0.0.1:6443 was refused - did you specify the right host or port?
The connection to the server 127.0.0.1:6443 was refused - did you specify the right host or port?
The connection to the server 127.0.0.1:6443 was refused - did you specify the right host or port?
The connection to the server 127.0.0.1:6443 was refused - did you specify the right host or port?
The connection to the server 127.0.0.1:6443 was refused - did you specify the right host or port?
The connection to the server 127.0.0.1:6443 was refused - did you specify the right host or port?
The connection to the server 127.0.0.1:6443 was refused - did you specify the right host or port?
```
The connection to the server 127.0.0.1:6443 was refused - did you specify the right host or port? The connection to the server 127.0.0.1:6443 was refused - did you specify the right host or port? The connection to the server 127.0.0.1:6443 was refused - did you specify the right host or port? The connection to the server 127.0.0.1:6443 was refused - did you specify the right host or port? The connection to the server 127.0.0.1:6443 was refused - did you specify the right host or port? The connection to the server 127.0.0.1:6443 was refused - did you specify the right host or port? The connection to the server 127.0.0.1:6443 was refused - did you specify the right host or port? The connection to the server 127.0.0.1:6443 was refused - did you specify the right host or port? The connection to the server 127.0.0.1:6443 was refused - did you specify the right host or port? The connection to the server 127.0.0.1:6443 was refused - did you specify the right host or port? The connection to the server 127.0.0.1:6443 was refused - did you specify the right host or port? The connection to the server 127.0.0.1:6443 was refused - did you specify the right host or port? ```

Reference:

  1. https://nixos.wiki/wiki/K3s
iwilltry42 commented 7 months ago

> Reference:
>
> 1. [nixos.wiki/wiki/K3s](https://nixos.wiki/wiki/K3s)

Since you reference the NixOS wiki: did you try https://nixos.wiki/wiki/K3s#Raspberry_Pi_not_working, which corresponds to the fatal log line of the server container?

byteshiva commented 7 months ago

> Reference:
>
> 1. [nixos.wiki/wiki/K3s](https://nixos.wiki/wiki/K3s)
>
> Since you reference the nixos Wiki - did you try https://nixos.wiki/wiki/K3s#Raspberry_Pi_not_working which corresponds to the fatal log of the server container?

Despite configuring the system with the suggested kernel parameters and setting up the Nix shell as per the provided script, I consistently get connection-refused errors when attempting to reach the API server.

Steps to Reproduce:

Details

1. Created a new Nix shell using the provided script `run.sh`:

   ```bash
   cat run.sh
   export NIXPKGS_ALLOW_UNFREE=1
   nix-shell -E '
   let
     nixpkgs = import (fetchTarball "https://github.com/NixOS/nixpkgs/archive/nixos-unstable.tar.gz") {};
   in
   nixpkgs.mkShell {
     buildInputs = with nixpkgs; [
       k3d
       k3s
       docker
       containerd
       runc
     ];
     shellHook = "export KUBECONFIG=kubeconfig";
   }'
   ```

2. Applied the necessary kernel parameters in the NixOS configuration:

   ```nix
   boot.kernelParams = [ "cgroup_enable=cpuset" "cgroup_memory=1" "cgroup_enable=memory" ];
   ```

3. Despite the above configuration, attempting to connect to the K3s server still results in the following error:

   ```
   time="2024-04-02T12:11:02.735710245Z" level=info msg="Node token is available at /var/lib/rancher/k3s/server/token"
   time="2024-04-02T12:11:02.735733706Z" level=info msg="To join node to cluster: k3s agent -s https://172.23.0.2:6443 -t ${NODE_TOKEN}"
   time="2024-04-02T12:11:02.736267310Z" level=info msg="Wrote kubeconfig /output/kubeconfig.yaml"
   time="2024-04-02T12:11:02.736346042Z" level=info msg="Run: k3s kubectl"
   time="2024-04-02T12:11:02.736397044Z" level=fatal msg="failed to find cpuset cgroup (v2)"
   The connection to the server 127.0.0.1:6443 was refused - did you specify the right host or port?
   ```

   Relevant configuration:

   ```
   cat /etc/nixos/configuration.nix
   networking.firewall.allowedTCPPorts = [6443];
   ```

   ```
   cat /etc/nixos/hardware-configuration.nix
   boot.kernelParams = [ "cgroup_enable=cpuset" "cgroup_memory=1" "cgroup_enable=memory" ];
   ```
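One detail that may matter here: the `cgroup_enable=cpuset`/`cgroup_memory=1` kernel parameters come from the cgroup **v1** era (the Raspberry Pi fix), but the runtime info above reports `CgroupVersion:2`. On a cgroup v2 host, the available controllers are listed in `/sys/fs/cgroup/cgroup.controllers` instead. A minimal diagnostic sketch (assuming the standard cgroup v2 mount at `/sys/fs/cgroup`; not an official k3s check):

```shell
#!/bin/sh
# Check whether the controllers k3s complains about ("failed to find
# cpuset cgroup (v2)") are actually enabled on this cgroup v2 host.
CGROUP_ROOT=/sys/fs/cgroup

if [ -f "$CGROUP_ROOT/cgroup.controllers" ]; then
    echo "cgroup v2 controllers: $(cat "$CGROUP_ROOT/cgroup.controllers")"
    for c in cpuset memory; do
        if grep -qw "$c" "$CGROUP_ROOT/cgroup.controllers"; then
            echo "  $c: available"
        else
            echo "  $c: MISSING"
        fi
    done
else
    echo "no $CGROUP_ROOT/cgroup.controllers file; host looks like cgroup v1"
fi
```

If `cpuset` does show up there, the failure may instead be the k3s image itself: the cluster above pulls `rancher/k3s:v1.21.7-k3s1`, and early k3s releases had incomplete cgroup v2 support, so it could also be worth retrying with a newer image via `k3d cluster create --image`.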