canonical / microk8s

MicroK8s is a small, fast, single-package Kubernetes for datacenters and the edge.
https://microk8s.io

coredns restarts when LXD is initialized #1823

Closed: davigar15 closed this issue 2 years ago

davigar15 commented 3 years ago

Environment

Steps to reproduce:

ubuntu@rare-avocet:~$ sudo snap install microk8s --classic
microk8s (1.19/stable) v1.19.5 from Canonical✓ installed
ubuntu@rare-avocet:~$ microk8s.status --wait-ready
microk8s is running
high-availability: no
  datastore master nodes: 127.0.0.1:19001
  datastore standby nodes: none
addons:
  enabled:
    ha-cluster           # Configure high availability on the current node
  disabled:
    ambassador           # Ambassador API Gateway and Ingress
    cilium               # SDN, fast with full network policy
    dashboard            # The Kubernetes dashboard
    dns                  # CoreDNS
    fluentd              # Elasticsearch-Fluentd-Kibana logging and monitoring
    gpu                  # Automatic enablement of Nvidia CUDA
    helm                 # Helm 2 - the package manager for Kubernetes
    helm3                # Helm 3 - Kubernetes package manager
    host-access          # Allow Pods connecting to Host services smoothly
    ingress              # Ingress controller for external access
    istio                # Core Istio service mesh services
    jaeger               # Kubernetes Jaeger operator with its simple config
    knative              # The Knative framework on Kubernetes.
    kubeflow             # Kubeflow for easy ML deployments
    linkerd              # Linkerd is a service mesh for Kubernetes and other frameworks
    metallb              # Loadbalancer for your Kubernetes cluster
    metrics-server       # K8s Metrics Server for API access to service metrics
    multus               # Multus CNI enables attaching multiple network interfaces to pods
    prometheus           # Prometheus operator for monitoring and logging
    rbac                 # Role-Based Access Control for authorisation
    registry             # Private image registry exposed on localhost:32000
    storage              # Storage class; allocates storage from host directory
ubuntu@rare-avocet:~$ microk8s.enable storage dns
Enabling default storage class
deployment.apps/hostpath-provisioner created
storageclass.storage.k8s.io/microk8s-hostpath created
serviceaccount/microk8s-hostpath created
clusterrole.rbac.authorization.k8s.io/microk8s-hostpath created
clusterrolebinding.rbac.authorization.k8s.io/microk8s-hostpath created
Storage will be available soon
Enabling DNS
Applying manifest
serviceaccount/coredns created
configmap/coredns created
deployment.apps/coredns created
service/kube-dns created
clusterrole.rbac.authorization.k8s.io/coredns created
clusterrolebinding.rbac.authorization.k8s.io/coredns created
Restarting kubelet
DNS is enabled
ubuntu@rare-avocet:~$ sleep 30 && microk8s.kubectl get all -A
NAMESPACE     NAME                                          READY   STATUS    RESTARTS   AGE
kube-system   pod/coredns-86f78bb79c-vsqhd                  1/1     Running   0          3m18s
kube-system   pod/hostpath-provisioner-5c65fbdb4f-ql8vm     1/1     Running   0          3m17s
kube-system   pod/calico-kube-controllers-847c8c99d-6ktqq   1/1     Running   0          3m55s
kube-system   pod/calico-node-gtkc7                         1/1     Running   0          3m52s

NAMESPACE     NAME                 TYPE        CLUSTER-IP      EXTERNAL-IP   PORT(S)                  AGE
default       service/kubernetes   ClusterIP   10.152.183.1    <none>        443/TCP                  4m2s
kube-system   service/kube-dns     ClusterIP   10.152.183.10   <none>        53/UDP,53/TCP,9153/TCP   3m18s

NAMESPACE     NAME                         DESIRED   CURRENT   READY   UP-TO-DATE   AVAILABLE   NODE SELECTOR            AGE
kube-system   daemonset.apps/calico-node   1         1         1       1            1           kubernetes.io/os=linux   4m3s

NAMESPACE     NAME                                      READY   UP-TO-DATE   AVAILABLE   AGE
kube-system   deployment.apps/coredns                   1/1     1            1           3m18s
kube-system   deployment.apps/hostpath-provisioner      1/1     1            1           3m19s
kube-system   deployment.apps/calico-kube-controllers   1/1     1            1           4m3s

NAMESPACE     NAME                                                DESIRED   CURRENT   READY   AGE
kube-system   replicaset.apps/coredns-86f78bb79c                  1         1         1       3m18s
kube-system   replicaset.apps/hostpath-provisioner-5c65fbdb4f     1         1         1       3m19s
kube-system   replicaset.apps/calico-kube-controllers-847c8c99d   1         1         1       3m55s

ubuntu@rare-avocet:~$ lxd init
Would you like to use LXD clustering? (yes/no) [default=no]: no
Do you want to configure a new storage pool? (yes/no) [default=yes]: 
Name of the new storage pool [default=default]: 
Name of the storage backend to use (btrfs, dir, lvm, ceph) [default=btrfs]: 
Create a new BTRFS pool? (yes/no) [default=yes]: 
Would you like to use an existing empty block device (e.g. a disk or partition)? (yes/no) [default=no]: 
Size in GB of the new loop device (1GB minimum) [default=9GB]: 
Would you like to connect to a MAAS server? (yes/no) [default=no]: 
Would you like to create a new local network bridge? (yes/no) [default=yes]: 
What should the new bridge be called? [default=lxdbr0]: 
What IPv4 address should be used? (CIDR subnet notation, “auto” or “none”) [default=auto]: 
What IPv6 address should be used? (CIDR subnet notation, “auto” or “none”) [default=auto]: none
Would you like LXD to be available over the network? (yes/no) [default=no]: 
Would you like stale cached images to be updated automatically? (yes/no) [default=yes] 
Would you like a YAML "lxd init" preseed to be printed? (yes/no) [default=no]: yes
config: {}
networks:
- config:
    ipv4.address: auto
    ipv6.address: none
  description: ""
  name: lxdbr0
  type: ""
storage_pools:
- config:
    size: 9GB
  description: ""
  name: default
  driver: btrfs
profiles:
- config: {}
  description: ""
  devices:
    eth0:
      name: eth0
      network: lxdbr0
      type: nic
    root:
      path: /
      pool: default
      type: disk
  name: default
cluster: null
ubuntu@rare-avocet:~$ microk8s.kubectl get all -A
NAMESPACE     NAME                                          READY   STATUS    RESTARTS   AGE
kube-system   pod/hostpath-provisioner-5c65fbdb4f-ql8vm     1/1     Running   0          3m19s
kube-system   pod/calico-kube-controllers-847c8c99d-6ktqq   1/1     Running   0          3m57s
kube-system   pod/calico-node-gtkc7                         1/1     Running   0          3m54s
kube-system   pod/coredns-86f78bb79c-vsqhd                  0/1     Running   0          3m20s

NAMESPACE     NAME                 TYPE        CLUSTER-IP      EXTERNAL-IP   PORT(S)                  AGE
default       service/kubernetes   ClusterIP   10.152.183.1    <none>        443/TCP                  4m4s
kube-system   service/kube-dns     ClusterIP   10.152.183.10   <none>        53/UDP,53/TCP,9153/TCP   3m20s

NAMESPACE     NAME                         DESIRED   CURRENT   READY   UP-TO-DATE   AVAILABLE   NODE SELECTOR            AGE
kube-system   daemonset.apps/calico-node   1         1         1       1            1           kubernetes.io/os=linux   4m5s

NAMESPACE     NAME                                      READY   UP-TO-DATE   AVAILABLE   AGE
kube-system   deployment.apps/hostpath-provisioner      1/1     1            1           3m21s
kube-system   deployment.apps/calico-kube-controllers   1/1     1            1           4m5s
kube-system   deployment.apps/coredns                   0/1     1            0           3m20s

NAMESPACE     NAME                                                DESIRED   CURRENT   READY   AGE
kube-system   replicaset.apps/hostpath-provisioner-5c65fbdb4f     1         1         1       3m21s
kube-system   replicaset.apps/calico-kube-controllers-847c8c99d   1         1         1       3m57s
kube-system   replicaset.apps/coredns-86f78bb79c                  1         1         0       3m20s
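
For reference, the same lxd init answers can be replayed non-interactively by feeding the preseed YAML printed above back to LXD (a sketch, assuming it was saved as preseed.yaml):

cat preseed.yaml | lxd init --preseed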

As the output above shows, the coredns pod goes down (0/1 Ready) right after lxd init is executed.
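
To see why the pod stops being Ready, the usual checks can be run (a sketch; the pod name is taken from the output above):

microk8s.kubectl -n kube-system describe pod coredns-86f78bb79c-vsqhd
microk8s.kubectl -n kube-system logs coredns-86f78bb79c-vsqhd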

This happens on a fresh MicroK8s install with only those two addons enabled, without any other workload deployed.

In addition, when a Juju controller (and some Kubernetes charms) is deployed, this issue causes all pods to go into an Unknown state, and we then have to wait several minutes for everything to recover.
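
When that happens, the recovery can be followed with a plain watch (a sketch):

microk8s.kubectl get pods -A --watch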

The same behavior occurs when we initialize MicroStack instead of LXD. It looks as if something goes wrong whenever another component touches the host's networking.
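
In this reproduction, lxd init creates a new lxdbr0 bridge with an auto-assigned IPv4 subnet, i.e. a host network change; the new interface can be confirmed with:

ip addr show lxdbr0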

I hope this helps.

inspection-report-20201215_130842.tar.gz

balchua commented 3 years ago

@davigar15 I see that the apiserver kicker triggered a restart of the apiserver multiple times. This happens when the kicker detects a network change. Can you check whether this works for you? https://github.com/ubuntu/microk8s/issues/1822#issuecomment-745335208
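
The kicker's activity can be followed while reproducing the issue (a sketch, assuming the default snap service name for the apiserver kicker):

sudo journalctl -f -u snap.microk8s.daemon-apiserver-kicker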

stale[bot] commented 2 years ago

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.