k3d-io / k3d

Little helper to run CNCF's k3s in Docker
https://k3d.io/
MIT License

[BUG] k3d create SIGSEGV fail using `--agents=2` with Calico #384

Closed blaisep-vgs closed 4 years ago

blaisep-vgs commented 4 years ago

## What did you do

    goroutine 1 [running]:
    github.com/rancher/k3d/v3/pkg/runtimes/docker.Docker.ExecInNodeGetLogs(0x4d155e0, 0xc00003c190, 0xc0003d5e00, 0xc000ea34a0, 0x3, 0x3, 0x4a64240, 0xc000c15c01, 0xc000ea34a0)
        github.com/rancher/k3d/v3/pkg/runtimes/docker/node.go:311 +0x79
    github.com/rancher/k3d/v3/pkg/cluster.resolveHostnameFromInside(0x4d155e0, 0xc00003c190, 0x4d25380, 0x52ff700, 0xc0003d5e00, 0x4bde2f7, 0x14, 0x41f9305, 0xc0000bc040, 0x4bc1040, ...)
        github.com/rancher/k3d/v3/pkg/cluster/host.go:79 +0x17a
    github.com/rancher/k3d/v3/pkg/cluster.GetHostIP(0x4d155e0, 0xc00003c190, 0x4d25380, 0x52ff700, 0xc000428900, 0x0, 0x0, 0x0, 0x0, 0x0)
        github.com/rancher/k3d/v3/pkg/cluster/host.go:60 +0x17e
    github.com/rancher/k3d/v3/pkg/cluster.ClusterCreate(0x4d155e0, 0xc00003c190, 0x4d25380, 0x52ff700, 0xc000428900, 0x0, 0x0)
        github.com/rancher/k3d/v3/pkg/cluster/cluster.go:360 +0x1935
    github.com/rancher/k3d/v3/cmd/cluster.NewCmdClusterCreate.func1(0xc000414b00, 0xc0003d8b00, 0x0, 0x8)
        github.com/rancher/k3d/v3/cmd/cluster/clusterCreate.go:83 +0x14c
    github.com/spf13/cobra.(*Command).execute(0xc000414b00, 0xc0003d8a80, 0x8, 0x8, 0xc000414b00, 0xc0003d8a80)
        github.com/spf13/cobra@v1.0.1-0.20200629195214-2c5a0d300f8b/command.go:846 +0x2c2
    github.com/spf13/cobra.(*Command).ExecuteC(0x52bfd40, 0xc0000320c0, 0xa, 0xa)
        github.com/spf13/cobra@v1.0.1-0.20200629195214-2c5a0d300f8b/command.go:950 +0x375
    github.com/spf13/cobra.(*Command).Execute(...)
        github.com/spf13/cobra@v1.0.1-0.20200629195214-2c5a0d300f8b/command.go:887
    github.com/rancher/k3d/v3/cmd.Execute()
        github.com/rancher/k3d/v3/cmd/root.go:91 +0x5a
    main.main()
        github.com/rancher/k3d/v3/main.go:27 +0x25

- What did you do afterwards?

I confirmed that cluster creation works when no agents are created. I then discussed this with @iwilltry42, who suggested that I open this issue.

## What did you expect to happen

I expected:

This works (builds server-0 and serverlb containers, with Calico and Traefik 1.8)

 $ k3d cluster create --api-port 6550 -p "8081:80@loadbalancer" --k3s-server-arg '--flannel-backend=none' --volume "$(pwd):/var/lib/rancher/k3s/server/manifests"

WARN[0000] No node filter specified
INFO[0000] Created network 'k3d-k3s-default'
INFO[0000] Created volume 'k3d-k3s-default-images'
INFO[0001] Creating node 'k3d-k3s-default-server-0'
INFO[0012] Creating LoadBalancer 'k3d-k3s-default-serverlb'
INFO[0013] (Optional) Trying to get IP of the docker host and inject it into the cluster as 'host.k3d.internal' for easy access
INFO[0017] Successfully added host record to /etc/hosts in 2/2 nodes and to the CoreDNS ConfigMap
INFO[0017] Cluster 'k3s-default' created successfully!



## Screenshots or terminal output

see above...

## Which OS & Architecture

Darwin bpabon 19.6.0 Darwin Kernel Version 19.6.0: Mon Aug 31 22:12:52 PDT 2020; root:xnu-6153.141.2~1/RELEASE_X86_64 x86_64

## Which version of `k3d`

k3d version v3.1.3
k3s version latest (default)

## Which version of docker

Client:
 Debug Mode: false

Server:
 Containers: 7
  Running: 2
  Paused: 0
  Stopped: 5
 Images: 384
 Server Version: 19.03.13
 Storage Driver: overlay2
  Backing Filesystem: extfs
  Supports d_type: true
  Native Overlay Diff: true
 Logging Driver: json-file
 Cgroup Driver: cgroupfs
 Plugins:
  Volume: local
  Network: bridge host ipvlan macvlan null overlay
  Log: awslogs fluentd gcplogs gelf journald json-file local logentries splunk syslog
 Swarm: inactive
 Runtimes: runc
 Default Runtime: runc
 Init Binary: docker-init
 containerd version: 8fba4e9a7d01810a393d5d25a3621dc101981175
 runc version: dc9208a3303feef5b3839f4323d9beb36df0a9dd
 init version: fec3683
 Security Options:
  seccomp
   Profile: default
 Kernel Version: 4.19.76-linuxkit
 Operating System: Docker Desktop
 OSType: linux
 Architecture: x86_64
 CPUs: 4
 Total Memory: 5.811GiB
 Name: docker-desktop
 ID: P6EF:ED7L:DZQH:2D33:KVU7:VV5B:ZZI2:HEAV:FES3:YLKQ:7H3A:CZJJ
 Docker Root Dir: /var/lib/docker
 Debug Mode: true
  File Descriptors: 56
  Goroutines: 59
  System Time: 2020-10-24T18:25:24.675013Z
  EventsListeners: 3
 HTTP Proxy: gateway.docker.internal:3128
 HTTPS Proxy: gateway.docker.internal:3129
 Registry: https://index.docker.io/v1/
 Labels:
 Experimental: true
 Insecure Registries:
  127.0.0.0/8
 Live Restore Enabled: false
 Product License: Community Engine

iwilltry42 commented 4 years ago

Hi @blaisep-vgs, thanks for opening this issue! :) Follow-up question: what do you have at `$(pwd)` that you mount to the manifests folder? Since you're mounting a whole folder, you're overwriting everything that k3s usually deploys (like CoreDNS, etc.). You may want to mount to a subfolder instead, e.g. `--volume "$(pwd):/var/lib/rancher/k3s/server/manifests/mystuff"`.
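To make the difference between the two mounts concrete, here is a minimal sketch that prints both `--volume` values. The helper name `manifest_mount` is hypothetical (it is not part of k3d); the container path and the `mystuff` subfolder are the ones mentioned in this comment.

```shell
#!/usr/bin/env bash
# Sketch only: manifest_mount is a hypothetical helper, not part of k3d.
# It prints a value for --volume that mounts a host path into the k3s
# manifests directory, optionally into a subfolder of it.
MANIFESTS_DIR="/var/lib/rancher/k3s/server/manifests"

manifest_mount() {
  local host_path="$1"
  local subpath="${2:-}"
  if [ -n "$subpath" ]; then
    # Subfolder mount: files k3s puts into the manifests directory stay visible.
    printf '%s:%s/%s\n' "$host_path" "$MANIFESTS_DIR" "$subpath"
  else
    # Whole-folder mount: the host directory replaces the manifests
    # directory, so what k3s normally deploys from there (CoreDNS etc.) is lost.
    printf '%s:%s\n' "$host_path" "$MANIFESTS_DIR"
  fi
}

manifest_mount "$PWD"            # the mount used in the failing command
manifest_mount "$PWD" mystuff    # the subfolder variant suggested here
```

Either value can be passed as `--volume "$(manifest_mount "$PWD" mystuff)"` to `k3d cluster create`.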

iwilltry42 commented 4 years ago

Also, for debugging purposes, can you please run `docker ps -a` and `docker logs k3d-k3s-default-server-0` and post the output here?
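For reference, the two commands could be wrapped roughly like this. It is only a sketch: `server_node_name` and `collect_debug` are hypothetical helper names, and the container naming pattern is inferred from the logs in this thread (cluster `k3s-default` gives `k3d-k3s-default-server-0`).

```shell
#!/usr/bin/env bash
# Sketch only: both functions are hypothetical helpers, not k3d commands.

# Derive the server container name from the cluster name and server index,
# following the pattern seen in this thread's logs.
server_node_name() {
  local cluster="${1:-k3s-default}"
  local index="${2:-0}"
  printf 'k3d-%s-server-%s\n' "$cluster" "$index"
}

# Print the container list and the k3s server logs for one cluster.
collect_debug() {
  local node
  node="$(server_node_name "$@")"
  docker ps -a                 # all containers, including stopped ones
  docker logs "$node" 2>&1     # merge the container's stdout and stderr
}

# Usage (needs a running Docker daemon):
#   collect_debug k3s-default > k3d-debug.txt
```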

iwilltry42 commented 4 years ago

The nil pointer exception is probably already fixed by b5eeda7. Can you build from the main branch and try again? If you want, I can also build it for you and link you to the executable for download.

blaisep-vgs commented 4 years ago

Hi @iwilltry42, thanks for the suggestions. I will build from main and follow the new instructions regarding logs and volume mounts.

blaisep-vgs commented 4 years ago

@iwilltry42 I think you can close this issue now. I built from main (b5eeda74d61788328e7ac513078d2ada4644f6d0) and fully qualified the volume mount. It works now, using the instructions in the docs:

 $ k3d cluster create --agents 2 --api-port 6550 -p "8081:80@loadbalancer" --k3s-server-arg '--flannel-backend=none' --volume "$(pwd)/docs/usage/guides/calico.yaml:/var/lib/rancher/k3s/server/manifests/calico.yaml"

WARN[0000] No node filter specified
INFO[0000] Created network 'k3d-k3s-default'
INFO[0000] Created volume 'k3d-k3s-default-images'
INFO[0001] Creating node 'k3d-k3s-default-server-0'
INFO[0002] Pulling image 'docker.io/rancher/k3s:v1.18.9-k3s1'
INFO[0022] Creating node 'k3d-k3s-default-agent-0'
INFO[0022] Creating node 'k3d-k3s-default-agent-1'
INFO[0022] Creating LoadBalancer 'k3d-k3s-default-serverlb'
INFO[0024] Pulling image 'docker.io/rancher/k3d-proxy:v3.1.5'
INFO[0027] (Optional) Trying to get IP of the docker host and inject it into the cluster as 'host.k3d.internal' for easy access
INFO[0034] Successfully added host record to /etc/hosts in 4/4 nodes and to the CoreDNS ConfigMap
INFO[0034] Cluster 'k3s-default' created successfully!
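
A quick way to confirm that the host record reached every node is to compare the two numbers in the `N/N nodes` line of the output above. The sketch below does that for k3d output piped in on stdin. `host_record_complete` is a hypothetical helper name, and the pattern is taken from the log lines in this thread, so it may not match the output of other k3d versions.

```shell
#!/usr/bin/env bash
# Sketch only: host_record_complete is a hypothetical helper, not part of k3d.
# It reads k3d output on stdin and succeeds only if the host record line
# reports the same count on both sides of the slash (e.g. "4/4 nodes").
host_record_complete() {
  local counts ok total
  counts="$(sed -n 's|.*added host record to /etc/hosts in \([0-9][0-9]*\)/\([0-9][0-9]*\) nodes.*|\1 \2|p' | head -n 1)"
  [ -n "$counts" ] || return 1   # no matching line in the input
  ok="${counts% *}"              # number of nodes that were updated
  total="${counts#* }"           # number of nodes k3d tried to update
  [ "$ok" -eq "$total" ]
}

# Usage:
#   k3d cluster create --agents 2 2>&1 | host_record_complete && echo "all nodes updated"
```
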
iwilltry42 commented 4 years ago

That's great 🙂 I guess it also works without the custom build? Still nice that you found that edge case we could fix now 👍

blaisep-vgs commented 4 years ago

@iwilltry42, yes, it also works with the standard build installed via install.sh. I tried both methods.