loft-sh / vcluster

vCluster - Create fully functional virtual Kubernetes clusters - Each vcluster runs inside a namespace of the underlying k8s cluster. It's cheaper than creating separate full-blown clusters and it offers better multi-tenancy and isolation than regular namespaces.
https://www.vcluster.com
Apache License 2.0
6.65k stars, 421 forks

coredns CrashLoopBackOff - Syntax error: Unexpected token 'stacktrace', expecting 'consolidate' #1151

Open everesio opened 1 year ago

everesio commented 1 year ago

What happened?

The coredns pod is in CrashLoopBackOff due to:

plugin/errors: /etc/coredns/Corefile:3 - Syntax error: Unexpected token 'stacktrace', expecting 'consolidate'

What did you expect to happen?

coredns is running

How can we reproduce it (as minimally and precisely as possible)?

coredns/coredns:1.8.7 - plugin mis-configuration

https://coredns.io/plugins/errors/

errors {
    stacktrace
    consolidate DURATION REGEXP [LEVEL]
}

Anything else we need to know?

coredns configmap

apiVersion: v1
data:
  Corefile: |-
    .:1053 {
        errors {
            stacktrace
        }
        health
        ready
        rewrite name regex .*\.nodes\.vcluster\.com kubernetes.default.svc.cluster.local
        kubernetes cluster.local in-addr.arpa ip6.arpa {
            pods insecure
            fallthrough in-addr.arpa ip6.arpa
        }
        hosts /etc/NodeHosts {
            ttl 60
            reload 15s
            fallthrough
        }
        prometheus :9153
        forward . /etc/resolv.conf
        cache 30
        loop
        loadbalance
    }

    import /etc/coredns/custom/*.server
  NodeHosts: 10.240.25.196 weird-wallaby-757dd587fb-9lp27.nodes.vcluster.com
kind: ConfigMap

Host cluster Kubernetes version

```console
$ kubectl version
Client Version: version.Info{Major:"1", Minor:"22", GitVersion:"v1.22.3", GitCommit:"c92036820499fedefec0f847e2054d824aea6cd1", GitTreeState:"clean", BuildDate:"2021-10-27T18:41:28Z", GoVersion:"go1.16.9", Compiler:"gc", Platform:"linux/amd64"}
Server Version: version.Info{Major:"1", Minor:"24", GitVersion:"v1.24.8", GitCommit:"fdc77503e954d1ee641c0e350481f7528e8d068b", GitTreeState:"clean", BuildDate:"2022-11-09T13:31:40Z", GoVersion:"go1.18.8", Compiler:"gc", Platform:"linux/amd64"}
```

Host cluster Kubernetes distribution

```
k8s on openstack, cilium CNI v1.12, OS Flatcar
```

vcluster version

```console
$ vcluster --version
vcluster version 0.15.5
```

vcluster Kubernetes distribution (k3s (default), k8s, k0s)

```
k8s
```

OS and Arch

```
OS: flatcar
Arch: amd64
```

ishankhare07 commented 1 year ago

Hi @everesio, have you checked the values files used for this vcluster creation? By default, vcluster 0.15.5 is supposed to install coredns/coredns:1.10.1, which I just verified.

everesio commented 1 year ago
➜  vcluster git:(main) ✗ vcluster create my-vcluster --distro k8s --expose        
info   failed to find IPv6 service CIDR: couldn't find host cluster Service CIDR ("Service "" is invalid: spec.clusterIPs[0]: Invalid value: []string{"2001:DB8::1"}: IPv6 is not configured on this cluster")
info   Create vcluster my-vcluster...
info   execute command: helm upgrade my-vcluster /tmp/vcluster-k8s-0.15.5.tgz-2716274335 --kubeconfig /tmp/3719326450 --namespace vcluster-my-vcluster --install --repository-config='' --values /tmp/4184690661
done √ Successfully created virtual cluster my-vcluster in namespace vcluster-my-vcluster
info   Waiting for vcluster to come up...
info   Using vcluster my-vcluster load balancer endpoint: 45.94.111.198
done √ Switched active kube context to vcluster_my-vcluster_vcluster-my-vcluster_default
- Use `vcluster disconnect` to return to your previous kube context
- Use `kubectl get namespaces` to access the vcluster
➜  vcluster git:(main) ✗ kubectl get pods -n vcluster-my-vcluster
NAME                                                   READY   STATUS             RESTARTS        AGE
coredns-6dbc949d9c-7mxm6-x-kube-system-x-my-vcluster   0/1     CrashLoopBackOff   4 (59s ago)     2m36s
my-vcluster-85cd96b958-wnnd9                           1/1     Running            1 (2m37s ago)   3m57s
my-vcluster-api-7774cd45c4-qcczs                       1/1     Running            2 (3m13s ago)   3m57s
my-vcluster-controller-6ddf8c9f47-csrc8                1/1     Running            1 (2m23s ago)   3m57s
my-vcluster-etcd-0                                     1/1     Running            0               3m57s

➜  vcluster git:(main) ✗ kubectl get pods -n vcluster-my-vcluster coredns-6dbc949d9c-7mxm6-x-kube-system-x-my-vcluster -o yaml | grep image
    image: coredns/coredns:1.8.7
    imagePullPolicy: IfNotPresent
    image: docker.io/coredns/coredns:1.8.7
    imageID: docker.io/coredns/coredns@sha256:58508c172b14716350dc5185baefd78265a703514281d309d1d54aa1b721ad68
everesio commented 1 year ago

The default image is "1.10.1", but the one actually used is "1.8.7":

    for _, image := range constants.CoreDNSVersionMap {
        if contains(images, image) {
            continue
        }

        images = append(images, image)
    }

    images = append(images, coredns.DefaultImage)

var CoreDNSVersionMap = map[string]string{
    "1.25": "coredns/coredns:1.9.3",
    "1.24": "coredns/coredns:1.8.7",
    "1.23": "coredns/coredns:1.8.6",
    "1.22": "coredns/coredns:1.8.4",
    "1.21": "coredns/coredns:1.8.3",
    "1.20": "coredns/coredns:1.8.0",
    "1.19": "coredns/coredns:1.6.9",
    "1.18": "coredns/coredns:1.6.9",
    "1.17": "coredns/coredns:1.6.9",
    "1.16": "coredns/coredns:1.6.3",
}

const (
    DefaultImage          = "coredns/coredns:1.10.1"
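
The snippets above suggest why a 1.24 host ends up with 1.8.7 rather than the default. The following is only a minimal sketch of that behaviour, assuming the image is looked up by the host cluster's major.minor version with a fallback to the default image. The actual lookup code is not shown in this thread, so `imageForServerVersion` is a hypothetical helper, and the map is shortened to three entries.

```go
package main

import (
	"fmt"
	"strings"
)

// Entries copied from the CoreDNSVersionMap quoted above (shortened).
var coreDNSVersionMap = map[string]string{
	"1.25": "coredns/coredns:1.9.3",
	"1.24": "coredns/coredns:1.8.7",
	"1.23": "coredns/coredns:1.8.6",
}

// Value copied from the DefaultImage constant quoted above.
const defaultImage = "coredns/coredns:1.10.1"

// imageForServerVersion is a hypothetical helper, not vcluster's actual code.
// It assumes the image is chosen by the host's "major.minor" version and that
// unknown versions fall back to the default image.
func imageForServerVersion(gitVersion string) string {
	v := strings.TrimPrefix(gitVersion, "v")
	parts := strings.SplitN(v, ".", 3)
	if len(parts) < 2 {
		return defaultImage
	}
	if image, ok := coreDNSVersionMap[parts[0]+"."+parts[1]]; ok {
		return image
	}
	return defaultImage
}

func main() {
	// Server version from the report above.
	fmt.Println(imageForServerVersion("v1.24.8")) // coredns/coredns:1.8.7
	// A version that is not in the map falls back to the default.
	fmt.Println(imageForServerVersion("v1.27.3")) // coredns/coredns:1.10.1
}
```

Under that assumption, the default image would only be used when the host version has no entry in the map, which matches the 1.8.7 image seen on the 1.24 host here.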
ishankhare07 commented 1 year ago

@everesio I see, this seems to be because the host Kubernetes version is 1.24. Let me try it on that version and get back to you. Meanwhile, you can always override the vcluster CoreDNS values to use a newer image:

coredns:
  image: coredns/coredns:1.9.3 # or some other image that works for you

Meanwhile, I also checked the current tests in the CoreDNS repo, and it seems that `stacktrace` alone, without the `consolidate` part, is also a valid configuration. Still, I'll test this and get back to you soon.

dilshad18 commented 1 year ago

This also works:

coredns:
  image: coredns/coredns:1.10.1
ishankhare07 commented 1 year ago

@everesio Hi, I have an update on this. I can reproduce the issue with a kind cluster on Kubernetes 1.24. It seems that the older 1.8.7 image does not treat the `consolidate` part as optional. I can confirm this because the 1.9.4 image works fine with just the `stacktrace` option. I will open a PR soon to fix this in the configmap with the new version of CoreDNS. Meanwhile, can you confirm whether overriding coredns.image works for you?

Update: I have tested the following working versions for now

Please note that the latest CoreDNS release, currently 1.11.0, is NOT working; support for it is being worked on in #1152.

everesio commented 1 year ago

The image override works, but it would be beneficial if the CLI could operate with default settings.

omichels commented 1 year ago

Just a short note from me about reproduction of this bug:

Openshift / Kubernetes v1.23.5+3afdacb. Tried all coredns image versions from 1.6.x through 1.11.x, to no avail.

FabianKramm commented 1 year ago

@omichels 1.23 is not officially supported anymore (we can try to make a fix, but I'm not sure when we will have time for this). Is this also an issue for >1.24 clusters?