kubernetes-sigs/kind

Kubernetes IN Docker - local clusters for testing Kubernetes
https://kind.sigs.k8s.io/
Apache License 2.0

etcd timeout errors on DinD setup using Prow #1922

Closed: ormergi closed this issue 3 years ago

ormergi commented 3 years ago

What happened: We use KinD in a DinD setup with Prow on our CI environment for end-to-end tests with Kubevirt, and occasionally encounter `etcdserver: request timed out` errors when we try to create cluster objects (e.g. a CSR, a Secret, or a Kubevirt VM).

We tried what was suggested in https://github.com/kubernetes-sigs/kind/issues/717, which among other things recommends increasing fs.inotify.max_user_watches, but it didn't work for us and we still see those errors.
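
For reference, bumping those limits on the host looks like this (the values are the commonly suggested ones, not tuned to our setup):

```sh
# Raise inotify limits on the CI host (values illustrative):
sudo sysctl fs.inotify.max_user_watches=524288
sudo sysctl fs.inotify.max_user_instances=512

# Persist across reboots:
echo 'fs.inotify.max_user_watches = 524288' | sudo tee /etc/sysctl.d/99-inotify.conf
sudo sysctl --system
```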

I understand this is probably not a very actionable bug report; we are trying to understand the root cause of this.

What you expected to happen: No etcd timeout errors.

How to reproduce it (as minimally and precisely as possible): It's pretty hard to reproduce manually, but we do see it happen in about 50% of the Prow jobs.

Anything else we need to know?:

Environment:

- `WARNING: bridge-nf-call-iptables is disabled`
- OS (e.g. from `/etc/os-release`): Host: CentOS 7, kernel 3.10.0-1127.19.1.el7.x86_64
oshoval commented 3 years ago

Please see https://github.com/kubernetes/kubernetes/issues/70082#issuecomment-433604939 and https://github.com/kubevirt/kubevirt/issues/4519#issuecomment-724913645; the latter is the corresponding issue in our repo.

I suspect our HDD is too slow for etcd.

#717 also suggests checking etcd metrics in some of the comments.

BenTheElder commented 3 years ago

you can try the hacks outlined in https://github.com/kubernetes-sigs/kind/issues/845
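
For reference, one of the hacks discussed there is backing the control plane's etcd data directory with a tmpfs path via kind's `extraMounts`. A minimal sketch, assuming `/tmp/etcd` is tmpfs-backed (for example via a `medium: Memory` emptyDir, as discussed below):

```yaml
# kind-config.yaml: mount a tmpfs-backed host path over etcd's data dir
# (/var/lib/etcd, the kubeadm default). Cluster state is lost with the mount.
kind: Cluster
apiVersion: kind.x-k8s.io/v1alpha4
nodes:
- role: control-plane
  extraMounts:
  - hostPath: /tmp/etcd          # assumed to be tmpfs-backed
    containerPath: /var/lib/etcd
```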

you should also check #303 as a general rule when trying to do kind in kubernetes, but disk speed is probably just your host. etcd is pretty I/O bound.

BenTheElder commented 3 years ago

Note that kind's own CI runs in DIND on Prow. We run on fast GCE PD SSD nodes though (because all of CI does, for better build performance etc.).

Based on related issues: https://github.com/etcd-io/etcd/blob/master/Documentation/faq.md#what-does-the-etcd-warning-apply-entries-took-too-long-mean

Even if your disk is itself fast enough for etcd, it may not be fast enough if you run N binpacked kind clusters. There's not a lot actionable for us here; we have an existing issue about providing in-memory etcd as an option (with the associated tradeoffs, see the links in the previous comment).

Feel free to continue discussing, but we'll use the other issue to continue tracking "faster but less safe etcd for CI".

ormergi commented 3 years ago

Hi Ben, thanks so much for responding :)

The disks on our CI nodes are healthy, though they are not SSDs. Is it possible to fetch etcd metrics directly from the etcd pods so we could get more details?

> you can try the hacks outlined in #845

This is great! We are definitely going to try that. So if I understand correctly, we need to patch kind-config.yaml and mount an in-memory emptyDir in the Prow job pod YAML? Is there something we need to configure on the host side?
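
For illustration, the pod-spec half of that might look like the fragment below; the container name is hypothetical, and the mountPath has to match the hostPath used by the kind config's `extraMounts` entry for /var/lib/etcd:

```yaml
# Fragment of the Prow job's pod spec: an in-memory emptyDir mounted where
# the kind config expects the etcd host path.
spec:
  containers:
  - name: test                 # hypothetical container name
    volumeMounts:
    - name: etcd-data
      mountPath: /tmp/etcd     # must match the kind config's hostPath
  volumes:
  - name: etcd-data
    emptyDir:
      medium: Memory           # tmpfs-backed
```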

> you should also check #303 as a general rule when trying to do kind in kubernetes, but disk speed is probably just your host. etcd is pretty I/O bound.

Yep, we don't nest KinD clusters; everything runs on top of an OpenShift cluster using Prow.

> Note that kind's own CI runs in DIND on Prow. We run on fast GCE PD SSD nodes though (because all of CI does, for better build performance etc.).

Is there a job in KIND e2e that runs with etcd in-memory?

> Based on related issues: https://github.com/etcd-io/etcd/blob/master/Documentation/faq.md#what-does-the-etcd-warning-apply-entries-took-too-long-mean

> Even if your disk is itself fast enough for etcd, it may not be fast enough if you run N binpacked kind clusters.

Does this mean that even if we use SSDs on our CI nodes we could still encounter these errors, because we run the DinD pod inside a K8s cluster (OpenShift in our case)?

> There's not a lot actionable for us here; we have an existing issue about providing in-memory etcd as an option (with the associated tradeoffs, see the links in the previous comment).

> Feel free to continue discussing, but we'll use the other issue to continue tracking "faster but less safe etcd for CI".

BenTheElder commented 3 years ago

> The disks on our CI nodes are healthy, though they are not SSDs.

Healthy or not, they may not have enough IOPS/throughput for N clusters/builds/... at once. This is a fairly common problem with Prow: versus a build environment where you have one job to a machine (e.g. one Jenkins runner per VM), when you bin-pack jobs/pods you can allocate for CPU, RAM, and disk space, but not I/O.

> Is it possible to fetch etcd metrics directly from the etcd pods so we could get more details?

You should be able to curl the metrics endpoint, I think.
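
For example, something along these lines against the default control-plane container, assuming curl is available in the node image (client auth uses the kubeadm-generated healthcheck-client cert):

```sh
# Scrape etcd's /metrics from the client port and pull out the disk-latency
# histograms the etcd FAQ points at (wal fsync / backend commit durations).
docker exec kind-control-plane curl -s \
  --cacert /etc/kubernetes/pki/etcd/ca.crt \
  --cert /etc/kubernetes/pki/etcd/healthcheck-client.crt \
  --key /etc/kubernetes/pki/etcd/healthcheck-client.key \
  https://127.0.0.1:2379/metrics \
  | grep -E 'etcd_disk_(wal_fsync|backend_commit)_duration'
```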

> So if I understand correctly, we need to patch kind-config.yaml and mount an in-memory emptyDir in the Prow job pod YAML?

That, or manage a tmpfs in your script (in place of the emptyDir mount). There should be samples discussed in the issue.
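
For example, a minimal sketch of the script variant (path and size are illustrative):

```sh
# Create a tmpfs for etcd data before `kind create cluster`; the path must
# match the hostPath in the kind config's extraMounts entry.
mkdir -p /tmp/etcd
mount -t tmpfs -o size=512m tmpfs /tmp/etcd
```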

> Yep, we don't nest KinD clusters; everything runs on top of an OpenShift cluster using Prow.

You're nesting kind within OpenShift (which is Kubernetes) though, which has the problems in #303.

> Is there a job in KIND e2e that runs with etcd in-memory?

No, not currently. A lot of kind jobs run one to a machine though, because they request > N/2 CPUs (for Kubernetes build purposes, not needed by kind itself).

> Does this mean that even if we use SSDs on our CI nodes we could still encounter these errors, because we run the DinD pod inside a K8s cluster (OpenShift in our case)?

Well, even on the fastest disk in the world, if you try to run :infinity: etcd clusters on one disk, you're going to run out of I/O eventually. Moving to tmpfs/memory shifts the problem around, but bandwidth/I/O isn't unlimited there either.

DinD isn't related; it's just the N-clusters-to-one-disk issue.

BenTheElder commented 3 years ago

Should add: if in-memory etcd is the successful route for you, please let us know. We're considering what a native feature for this and other "no persistence but faster" storage hacks might look like.

ormergi commented 3 years ago

Hi @BenTheElder, after running our KinD setup on CI with etcd in memory, I can confirm that there is a significant improvement in the cluster's overall performance and in the time it takes us to create it. etcd is healthy and we no longer see the `...operation took too long` warnings in the logs. Please let me know if I can help somehow with the native feature you mentioned.

I would like to thank you; I appreciate the support and the quick responses!!! :grin: :rocket: