k3d-io / k3d

Little helper to run CNCF's k3s in Docker
https://k3d.io/
MIT License

[BUG] host.k3d.internal not injected #926

Open jeusdi opened 2 years ago

jeusdi commented 2 years ago

What did you do

I've set a cluster up using this config file:

command: k3d cluster create --config k3d-config.yaml

apiVersion: k3d.io/v1alpha3
kind: Simple
name: salut
servers: 1
agents: 2
ports:
  - port: 8000:32080
    nodeFilters:
      - server:0:direct
  - port: 8443:32443
    nodeFilters:
      - server:0:direct
  - port: 9000:32090
    nodeFilters:
      - server:0:direct
  - port: 20017:30017
    nodeFilters:
      - server:0:direct
  - port: 20018:30018
    nodeFilters:
      - server:0:direct
registries:
  config: |
    mirrors:
      "registry.localhost:5000":
        endpoint:
          - http://host.k3d.internal:5000
options:
  k3d:
    wait: true
    timeout: "60s"
    disableLoadbalancer: true
    disableImageVolume: false
    disableRollback: true
  k3s:
    extraArgs:
      - arg: '--disable=traefik,servicelb'
        nodeFilters:
          - server:*
  kubeconfig:
    updateDefaultKubeconfig: true
    switchCurrentContext: true

I'm getting these events when I try to deploy my application:

Events:
  Type     Reason     Age               From               Message
  ----     ------     ----              ----               -------
  Normal   Scheduled  15s               default-scheduler  Successfully assigned salut/mongo-fhir-5657bbcccd-v785d to k3d-salut-agent-0
  Normal   BackOff    14s               kubelet            Back-off pulling image "registry.localhost:5000/fhir-mongo:463d12e8-dirty@sha256:b90c9c5802013765f9718f728dc0ba7f218564c1bf97a74e86c3b07fa6478166"
  Warning  Failed     14s               kubelet            Error: ImagePullBackOff
  Normal   Pulling    3s (x2 over 14s)  kubelet            Pulling image "registry.localhost:5000/fhir-mongo:463d12e8-dirty@sha256:b90c9c5802013765f9718f728dc0ba7f218564c1bf97a74e86c3b07fa6478166"
  Warning  Failed     3s (x2 over 14s)  kubelet            Failed to pull image "registry.localhost:5000/fhir-mongo:463d12e8-dirty@sha256:b90c9c5802013765f9718f728dc0ba7f218564c1bf97a74e86c3b07fa6478166": rpc error: code = Unknown desc = failed to pull and unpack image "registry.localhost:5000/fhir-mongo@sha256:b90c9c5802013765f9718f728dc0ba7f218564c1bf97a74e86c3b07fa6478166": failed to resolve reference "registry.localhost:5000/fhir-mongo@sha256:b90c9c5802013765f9718f728dc0ba7f218564c1bf97a74e86c3b07fa6478166": failed to do request: Head "http://host.k3d.internal:5000/v2/fhir-mongo/manifests/sha256:b90c9c5802013765f9718f728dc0ba7f218564c1bf97a74e86c3b07fa6478166?ns=registry.localhost%3A5000": dial tcp: lookup host.k3d.internal: no such host
  Warning  Failed     3s (x2 over 14s)  kubelet            Error: ErrImagePull

The Warning/Failed event, reformatted for readability:

Failed to pull image "registry.localhost:5000/fhir-mongo:463d12e8-dirty@sha256:b90c9c5802013765f9718f728dc0ba7f218564c1bf97a74e86c3b07fa6478166"
: rpc error: code = Unknown desc = failed to pull and unpack image "registry.localhost:5000/fhir-mongo@sha256:b90c9c5802013765f9718f728dc0ba7f218564c1bf97a74e86c3b07fa6478166"
: failed to resolve reference "registry.localhost:5000/fhir-mongo@sha256:b90c9c5802013765f9718f728dc0ba7f218564c1bf97a74e86c3b07fa6478166"
: failed to do request: Head "http://host.k3d.internal:5000/v2/fhir-mongo/manifests/sha256:b90c9c5802013765f9718f728dc0ba7f218564c1bf97a74e86c3b07fa6478166?ns=registry.localhost%3A5000"
: dial tcp: lookup host.k3d.internal: no such host

The last line says:

: dial tcp: lookup host.k3d.internal: no such host
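
To double-check that this is DNS resolution failing inside the cluster (and not the registry itself), a throwaway pod can be used (a sketch; the pod name and busybox image are arbitrary):

$ kubectl run dns-test --rm -it --restart=Never --image=busybox -- nslookup host.k3d.internal

On a healthy cluster this should resolve to the Docker network gateway; if it doesn't, the problem is the missing CoreDNS entry rather than the registry.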

After that, I took a look at the coredns ConfigMap:

$ kubectl describe configmap coredns
Name:         coredns
Namespace:    kube-system
Labels:       objectset.rio.cattle.io/hash=bce283298811743a0386ab510f2f67ef74240c57
Annotations:  objectset.rio.cattle.io/applied:
                {"apiVersion":"v1","data":{"Corefile":".:53 {\n    errors\n    health\n    ready\n    kubernetes cluster.local in-addr.arpa ip6.arpa {\n  ...
              objectset.rio.cattle.io/id: 
              objectset.rio.cattle.io/owner-gvk: k3s.cattle.io/v1, Kind=Addon
              objectset.rio.cattle.io/owner-name: coredns
              objectset.rio.cattle.io/owner-namespace: kube-system

Data
====
Corefile:
----
.:53 {
    errors
    health
    ready
    kubernetes cluster.local in-addr.arpa ip6.arpa {
      pods insecure
      fallthrough in-addr.arpa ip6.arpa
    }
    hosts /etc/coredns/NodeHosts {
      ttl 60
      reload 15s
      fallthrough
    }
    prometheus :9153
    forward . /etc/resolv.conf
    cache 30
    loop
    reload
    loadbalance
}

NodeHosts:
----
172.18.0.3 k3d-salut-agent-0
172.18.0.4 k3d-salut-server-0
172.18.0.2 k3d-salut-agent-1

BinaryData
====

Events:  <none>

As you can see, there is no entry for host.k3d.internal.
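
For reference, the entry k3d is supposed to inject can be added back by hand while the cluster is running (a sketch; that the gateway is 172.18.0.1 is an assumption, verify it first):

$ docker network inspect k3d-salut | grep Gateway
$ kubectl -n kube-system edit configmap coredns
# append a line like the following to the NodeHosts data:
#   172.18.0.1 host.k3d.internal

CoreDNS picks up changes to NodeHosts via the reload 15s option shown in the Corefile above, plus the time it takes for the ConfigMap change to propagate into the pod.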

What did you expect to happen

I expected host.k3d.internal to be resolvable from inside the cluster.

Which OS & Architecture

Which version of k3d

k3d version v5.2.2 k3s version v1.21.7-k3s1 (default)

Which version of docker

docker version:

Client: Docker Engine - Community
 Version:           20.10.12
 API version:       1.41
 Go version:        go1.16.12
 Git commit:        e91ed57
 Built:             Mon Dec 13 11:45:27 2021
 OS/Arch:           linux/amd64
 Context:           default
 Experimental:      true

Server: Docker Engine - Community
 Engine:
  Version:          20.10.12
  API version:      1.41 (minimum version 1.12)
  Go version:       go1.16.12
  Git commit:       459d0df
  Built:            Mon Dec 13 11:43:36 2021
  OS/Arch:          linux/amd64
  Experimental:     false
 containerd:
  Version:          1.4.12
  GitCommit:        7b11cfaabd73bb80907dd23182b9347b4245eb5d
 runc:
  Version:          1.0.2
  GitCommit:        v1.0.2-0-g52b36a2
 docker-init:
  Version:          0.19.0
  GitCommit:        de40ad0

docker info:

Client:
 Context:    default
 Debug Mode: false
 Plugins:
  app: Docker App (Docker Inc., v0.9.1-beta3)
  buildx: Docker Buildx (Docker Inc., v0.7.1-docker)
  scan: Docker Scan (Docker Inc., v0.9.0)

Server:
 Containers: 5
  Running: 4
  Paused: 0
  Stopped: 1
 Images: 21
 Server Version: 20.10.12
 Storage Driver: overlay2
  Backing Filesystem: extfs
  Supports d_type: true
  Native Overlay Diff: true
  userxattr: false
 Logging Driver: json-file
 Cgroup Driver: cgroupfs
 Cgroup Version: 1
 Plugins:
  Volume: local
  Network: bridge host ipvlan macvlan null overlay
  Log: awslogs fluentd gcplogs gelf journald json-file local logentries splunk syslog
 Swarm: inactive
 Runtimes: io.containerd.runc.v2 io.containerd.runtime.v1.linux runc
 Default Runtime: runc
 Init Binary: docker-init
 containerd version: 7b11cfaabd73bb80907dd23182b9347b4245eb5d
 runc version: v1.0.2-0-g52b36a2
 init version: de40ad0
 Security Options:
  apparmor
  seccomp
   Profile: default
 Kernel Version: 5.4.0-94-generic
 Operating System: Ubuntu 18.04.6 LTS
 OSType: linux
 Architecture: x86_64
 CPUs: 8
 Total Memory: 23.4GiB
 Name: euterpe
 ID: DLU7:DDK5:CEBP:25LX:37WC:DOZ2:YWZ2:4YFP:SYXH:ITX4:AKHR:HCQ5
 Docker Root Dir: /var/lib/docker
 Debug Mode: true
  File Descriptors: 54
  Goroutines: 56
  System Time: 2022-01-14T22:19:27.850434788+01:00
  EventsListeners: 0
 Registry: https://index.docker.io/v1/
 Labels:
 Experimental: false
 Insecure Registries:
  registry.localhost:5000
  127.0.0.0/8
 Live Restore Enabled: false
iwilltry42 commented 2 years ago

Hi @jeusdi , thanks for opening this issue! Do you have the log output of your k3d cluster create command? The cluster creation should fail if k3d fails to inject that hosts entry. Did you restart the cluster or the docker service?
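
In case it helps, the full creation output can be captured with debug logging enabled (the log file name is just an example):

$ k3d cluster create --config k3d-config.yaml --trace 2>&1 | tee k3d-create.log

--verbose also works if --trace is too noisy.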

deyanp commented 1 year ago

@iwilltry42 I have the same problem: in my case the ConfigMap also does not have host.k3d.internal injected. I only have

172.20.0.2 k3d-c2d-dev-k8s-server-0

My cluster is running (re-created yesterday), and I did restart the laptop since then. What can I do to troubleshoot this? Should I recreate the cluster (creation is always successful), and how can I get the creation logs?

deyanp commented 1 year ago

I recreated the cluster, and then the coredns host entry is there. It seems that the host entry "disappears" upon reboot of the host machine - can it be re-injected upon reboot as well?

Creating the k3d cluster k3d-c2d-dev-k8s ...
INFO[0000] Prep: Network                                
INFO[0000] Created network 'k3d-c2d-dev-k8s'            
INFO[0000] Created image volume k3d-c2d-dev-k8s-images  
INFO[0000] Starting new tools node...                   
INFO[0000] Starting Node 'k3d-c2d-dev-k8s-tools'        
INFO[0001] Creating node 'k3d-c2d-dev-k8s-server-0'     
INFO[0001] Creating LoadBalancer 'k3d-c2d-dev-k8s-serverlb' 
INFO[0001] Using the k3d-tools node to gather environment information 
INFO[0001] HostIP: using network gateway 172.18.0.1 address 
INFO[0001] Starting cluster 'c2d-dev-k8s'               
INFO[0001] Starting servers...                          
INFO[0001] Starting Node 'k3d-c2d-dev-k8s-server-0'     
INFO[0005] All agents already running.                  
INFO[0005] Starting helpers...                          
INFO[0005] Starting Node 'k3d-c2d-dev-k8s-serverlb'     
INFO[0011] Injecting records for hostAliases (incl. host.k3d.internal) and for 3 network members into CoreDNS configmap... 
INFO[0014] Cluster 'c2d-dev-k8s' created successfully!  
INFO[0014] You can now use it like this:                
kubectl cluster-info
deyanp commented 1 year ago

This is definitely an issue - upon restart of the machine the host.k3d.internal entry is gone.

The workaround is to pin the subnet at cluster creation, e.g. k3d cluster create ... --subnet '172.18.0.0/16', and then use the network gateway 172.18.0.1 directly instead of host.k3d.internal, as sketched below.
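
A minimal sketch of that workaround, reusing the config file from the original report (the registry host and port are taken from there; that the gateway ends up as 172.18.0.1 is an assumption, check docker network inspect):

k3d cluster create --config k3d-config.yaml --subnet '172.18.0.0/16'

and in the registries section of the config, point the mirror at the gateway instead of host.k3d.internal:

registries:
  config: |
    mirrors:
      "registry.localhost:5000":
        endpoint:
          - http://172.18.0.1:5000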

jeremyj563 commented 4 months ago

AFAICT - reproducing this is as easy as k3d cluster stop $cluster_name && k3d cluster start $cluster_name. What's (arguably) worse is that this bug also affects any defined hostAliases!

In other words, after stopping/starting the k3d cluster, none of the host aliases defined in the original k3d config get injected. They only get injected when the cluster is first started, during k3d cluster create (see the check below).
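
A quick way to see it (the cluster name is a placeholder):

$ k3d cluster stop mycluster && k3d cluster start mycluster
$ kubectl -n kube-system get configmap coredns -o jsonpath='{.data.NodeHosts}'

After the restart, the NodeHosts data only lists the node containers; the host.k3d.internal line and any hostAliases entries are gone.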

Edit: Just found https://github.com/k3d-io/k3d/issues/1112 and https://github.com/k3d-io/k3d/issues/1221

iwilltry42 commented 4 months ago

Host entries to the CoreDNS config are now managed via the coredns-custom configmap as per https://github.com/k3d-io/k3d/pull/1453 so they survive restarts of the cluster and host system.

This is released in https://github.com/k3d-io/k3d/releases/tag/v5.7.0
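
On v5.7.0 the injected entries can be inspected there instead of in NodeHosts (a sketch; the kube-system namespace is an assumption):

$ kubectl -n kube-system get configmap coredns-custom -o yaml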

mcollins123 commented 3 months ago

In case anyone comes across this thread thinking the problem has been resolved: the merge above was reverted in v5.7.1, so this issue remains. It is also being tracked in #1221.