[BUG] k3d 5.4.7 error overwriting contents of /etc/hosts

wenerme commented 1 year ago

What did you do

k3d cluster create dev \
  -v /data/kube/data:/data \
  -p 18080:80 \
  -p 18443:443 \
  -v /data/kube/storage:/var/lib/rancher/k3s/storage@all \
  --api-port 6443 \
  --registry-create dev-registry --trace

What did you expect to happen

The cluster is created without error.

Screenshots or terminal output

TRAC[0019] Rewritten:
::1 ip6-localhost ip6-loopback localhost
127.0.0.1 localhost
172.18.0.1 host.k3d.internal
172.18.0.2 k3d-dev-server-0
ff02::1 ip6-allnodes
ff02::2 ip6-allrouters
DEBU[0020] Executing command '[sh -c cat /tmp/-etc-hosts-cUAunADhzSQlEbdflLOb > /etc/hosts]' in node 'k3d-dev-server-0'
TRAC[0020] Exec process '[sh -c cat /tmp/-etc-hosts-cUAunADhzSQlEbdflLOb > /etc/hosts]' still running in node 'k3d-dev-server-0'.. sleeping for 1 second...
ERRO[0021] Failed Cluster Start: error during post-start cluster preparation: error overwriting contents of /etc/hosts: Exec process in node 'k3d-dev-server-0' failed with exit code '139': Logs from failed access process:
ERRO[0021] Failed to create cluster >>> Rolling Back

Which OS & Architecture

arch: aarch64
cgroupdriver: cgroupfs
cgroupversion: "1"
endpoint: /var/run/docker.sock
filesystem: extfs
name: docker
os: CentOS Linux 7 (AltArch)
ostype: linux
version: 20.10.18

Which version of `k3d`

k3d version v5.4.7
k3s version v1.25.6-k3s1 (default)

Which version of docker

Client:
 Context:    default
 Debug Mode: false
 Plugins:
  app: Docker App (Docker Inc., v0.9.1-beta3)
  buildx: Docker Buildx (Docker Inc., v0.9.1-docker)

Server:
 Containers: 5
  Running: 0
  Paused: 0
  Stopped: 5
 Images: 6
 Server Version: 20.10.18
 Storage Driver: overlay2
  Backing Filesystem: extfs
  Supports d_type: true
  Native Overlay Diff: true
  userxattr: false
 Logging Driver: json-file
 Cgroup Driver: cgroupfs
 Cgroup Version: 1
 Plugins:
  Volume: local
  Network: bridge host ipvlan macvlan null overlay
  Log: awslogs fluentd gcplogs gelf journald json-file local logentries splunk syslog
 Swarm: inactive
 Runtimes: io.containerd.runc.v2 io.containerd.runtime.v1.linux runc
 Default Runtime: runc
 Init Binary: docker-init
 containerd version: 9cd3357b7fd7218e4aec3eae239db1f68a5a6ec6
 runc version: v1.1.4-0-g5fd4c4d
 init version: de40ad0
 Security Options:
  seccomp
   Profile: default
 Kernel Version: 4.18.0-348.20.1.el7.aarch64
 Operating System: CentOS Linux 7 (AltArch)
 OSType: linux
 Architecture: aarch64
 CPUs: 16
 Total Memory: 31.18GiB
 Docker Root Dir: /data/docker
 Debug Mode: false
 Registry: https://index.docker.io/v1/
 Labels:
 Experimental: false
 Insecure Registries:
  127.0.0.0/8
 Live Restore Enabled: false

Client: Docker Engine - Community
 Version:           20.10.18
 API version:       1.41
 Go version:        go1.18.6
 Git commit:        b40c2f6
 Built:             Thu Sep  8 23:11:43 2022
 OS/Arch:           linux/arm64
 Context:           default
 Experimental:      true

Server: Docker Engine - Community
 Engine:
  Version:          20.10.18
  API version:      1.41 (minimum version 1.12)
  Go version:       go1.18.6
  Git commit:       e42327a
  Built:            Thu Sep  8 23:10:24 2022
  OS/Arch:          linux/arm64
  Experimental:     false
 containerd:
  Version:          1.6.8
  GitCommit:        9cd3357b7fd7218e4aec3eae239db1f68a5a6ec6
 runc:
  Version:          1.1.4
  GitCommit:        v1.1.4-0-g5fd4c4d
 docker-init:
  Version:          0.19.0
  GitCommit:        de40ad0

wenerme commented 1 year ago

Tried k3d-5.4.6, works.

TRAC[0052] Rewritten:
::1 ip6-localhost ip6-loopback localhost
127.0.0.1 localhost
172.18.0.1 host.k3d.internal
172.18.0.2 k3d-dev-server-0
ff02::1 ip6-allnodes
ff02::2 ip6-allrouters
DEBU[0054] Executing command '[sh -c cat /tmp/-etc-hosts-kBbRYSVBRwHBUrnoCdEe > /etc/hosts]' in node 'k3d-dev-server-0'
TRAC[0054] Exec process '[sh -c cat /tmp/-etc-hosts-kBbRYSVBRwHBUrnoCdEe > /etc/hosts]' still running in node 'k3d-dev-server-0'.. sleeping for 1 second...
DEBU[0055] Exec process in node 'k3d-dev-server-0' exited with '0'
DEBU[0055] Executing command '[sh -c kubectl apply -f /tmp/localRegistryHostingCM.yaml]' in node 'k3d-dev-server-0'
TRAC[0055] Exec process '[sh -c kubectl apply -f /tmp/localRegistryHostingCM.yaml]' still running in node 'k3d-dev-server-0'.. sleeping for 1 second...
DEBU[0056] Exec process in node 'k3d-dev-server-0' exited with '0'
INFO[0056] Cluster 'dev' created successfully!

audacioustux commented 1 year ago

This persists in v5.4.9.

sheremet commented 1 year ago

Reproduced in v5.6.0. Because of this error, I can't run a custom image for NVIDIA GPU support. The error looks like this:

ERRO[0018] Failed Cluster Start: error during post-start cluster preparation: error overwriting contents of /etc/hosts: Exec process in node 'k3d-dev-cluster-agent-1' failed with exit code '126': Logs from failed access process:
OCI runtime exec failed: exec failed: unable to start container process: exec /usr/bin/sh: no such file or directory: unknown
ERRO[0018] Failed to create cluster >>> Rolling Back
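
Note that the two failures in this thread report different exit codes (139 in the original report, 126 here), which suggests they may not share a cause. The following is only a minimal sketch for reading such codes, assuming the usual shell and Docker conventions (126 when the command cannot be invoked, 127 when it is not found, values above 128 meaning termination by signal number code minus 128). `describe_exit_code` is a hypothetical helper, not part of k3d.

```shell
#!/bin/sh
# Hypothetical helper (not part of k3d): describe an exec exit code
# using common shell/Docker conventions.
describe_exit_code() {
  code="$1"
  if [ "$code" -eq 0 ]; then
    echo "success"
  elif [ "$code" -eq 126 ]; then
    echo "command could not be invoked"
  elif [ "$code" -eq 127 ]; then
    echo "command not found"
  elif [ "$code" -gt 128 ]; then
    # Codes above 128 conventionally encode 128 + signal number.
    echo "terminated by signal $((code - 128))"
  else
    echo "command failed with status $code"
  fi
}

describe_exit_code 139   # signal 11 is SIGSEGV on Linux
describe_exit_code 126   # matches the missing /usr/bin/sh case
```

Read this way, 139 points to a segmentation fault inside the node container, while 126 matches the "no such file or directory" message for /usr/bin/sh.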

haiminh2001 commented 8 months ago

Hi all, I'm encountering this issue with the k3s image built following this guide: https://k3d.io/v5.3.0/usage/advanced/cuda/#build-the-k3s-image. Does anyone have a solution or workaround yet?

haiminh2001 commented 8 months ago

Update: I fixed it myself by adding sh to /usr/bin in the custom GPU image.

marekberith commented 6 months ago

Hi @haiminh2001 how did you do that? Thanks :)

haiminh2001 commented 6 months ago

Hi @marekberith, to clarify: I now think it was my own mistake :)). If I remember correctly, the k3s image is based on Ubuntu 18.04, but I was building my image on CUDA 12.1.0 with Ubuntu 20.04. There, the /bin folder is not the default path that gets searched. You may refer to this link. I fixed it by copying everything from the /bin folder into the /usr/bin folder. If you do not need Ubuntu 20.04, you can simply downgrade to Ubuntu 18.04 (I assume you are using Ubuntu 20.04 too, since you are encountering this error).
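
For illustration only, a narrower variant of the fix described above could look like the sketch below. It assumes that the only missing file is /usr/bin/sh and that /bin/sh exists in the image; it does not copy the whole /bin folder. `ensure_usr_bin_sh` and its root-directory argument are hypothetical names, and the demo runs against a temporary directory rather than a real image.

```shell
#!/bin/sh
# Hypothetical helper: make sure <root>/usr/bin/sh exists by adding a
# relative symlink to <root>/bin/sh. With an empty <root> it acts on "/".
ensure_usr_bin_sh() {
  root="${1:-}"
  if [ -e "$root/usr/bin/sh" ] || [ -L "$root/usr/bin/sh" ]; then
    echo "present"
    return 0
  fi
  if [ ! -e "$root/bin/sh" ]; then
    echo "no /bin/sh to link to" >&2
    return 1
  fi
  mkdir -p "$root/usr/bin"
  # Relative target so the link resolves inside <root>, not on the host.
  ln -s ../../bin/sh "$root/usr/bin/sh"
  echo "linked"
}

# Demo against a throwaway directory that mimics an image root.
demo_root="$(mktemp -d)"
mkdir -p "$demo_root/bin"
: > "$demo_root/bin/sh"
ensure_usr_bin_sh "$demo_root"   # first call creates the link
ensure_usr_bin_sh "$demo_root"   # second call finds it already present
rm -rf "$demo_root"
```

In a custom image build, the same idea would be a single step that links /bin/sh to /usr/bin/sh; whether that alone is enough for a given base image is an assumption to verify.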
