kubernetes-sigs/kind

Kubernetes IN Docker - local clusters for testing Kubernetes
https://kind.sigs.k8s.io/
Apache License 2.0

etcd timeout errors on DinD setup using Prow #1922

Closed: ormergi closed this issue 3 years ago

ormergi commented 3 years ago

What happened: We use KinD in a DinD setup with Prow on our CI environment for end-to-end tests with Kubevirt, and occasionally encounter `etcdserver: request timed out` errors when we try to create cluster objects (e.g. a CSR, a Secret, or a Kubevirt VM).

We tried what was suggested in https://github.com/kubernetes-sigs/kind/issues/717, which among other things recommends increasing fs.inotify.max_user_watches, but it didn't work for us and we still see those errors.
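
For reference, bumping those limits on the host looks like this (the values are the commonly suggested ones, not tuned to our setup):

```sh
# Raise inotify limits on the CI host (values illustrative):
sudo sysctl fs.inotify.max_user_watches=524288
sudo sysctl fs.inotify.max_user_instances=512

# Persist across reboots:
echo 'fs.inotify.max_user_watches = 524288' | sudo tee /etc/sysctl.d/99-inotify.conf
sudo sysctl --system
```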

I understand this is probably not a very actionable bug report; we are trying to understand the root cause of this.

What you expected to happen: No etcd timeout errors.

How to reproduce it (as minimally and precisely as possible): It's pretty hard to reproduce manually, but we do see it happen in about 50% of the Prow jobs.

Anything else we need to know?:

Environment:

- `WARNING: bridge-nf-call-iptables is disabled`
- OS (e.g. from `/etc/os-release`): Host: CentOS 7, kernel 3.10.0-1127.19.1.el7.x86_64
oshoval commented 3 years ago

Please see https://github.com/kubernetes/kubernetes/issues/70082#issuecomment-433604939 and https://github.com/kubevirt/kubevirt/issues/4519#issuecomment-724913645; the latter is the corresponding issue in our repo.

I suspect our HDD is too slow for etcd.

#717 also suggests checking etcd metrics in some of the comments.

BenTheElder commented 3 years ago

you can try the hacks outlined in https://github.com/kubernetes-sigs/kind/issues/845
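
For reference, one of the hacks discussed there is backing the control plane's etcd data directory with a tmpfs path via kind's `extraMounts`. A minimal sketch, assuming `/tmp/etcd` is tmpfs-backed (for example via a `medium: Memory` emptyDir, as discussed below):

```yaml
# kind-config.yaml: mount a tmpfs-backed host path over etcd's data dir
# (/var/lib/etcd, the kubeadm default). Cluster state is lost with the mount.
kind: Cluster
apiVersion: kind.x-k8s.io/v1alpha4
nodes:
- role: control-plane
  extraMounts:
  - hostPath: /tmp/etcd          # assumed to be tmpfs-backed
    containerPath: /var/lib/etcd
```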

you should also check #303 as a general rule when trying to do kind in kubernetes, but disk speed is probably just your host. etcd is pretty I/O bound.

BenTheElder commented 3 years ago

Note that kind's own CI runs in DIND on Prow. We run on fast GCE PD SSD nodes though (because all of CI does, for better build performance etc.).

Based on related issues: https://github.com/etcd-io/etcd/blob/master/Documentation/faq.md#what-does-the-etcd-warning-apply-entries-took-too-long-mean

Even if your disk is itself fast enough for etcd, it may not be fast enough if you run N binpacked kind clusters. There's not a lot actionable for us here; we have an existing issue about providing in-memory etcd as an option (with the associated tradeoffs, see the links in the previous comment).

Feel free to continue discussing, but we'll use the other issue to continue tracking "faster but less safe etcd for CI".

ormergi commented 3 years ago

Hi Ben, thanks so much for responding :)

The disks on our CI nodes are healthy, though they are not SSDs. Is it possible to fetch etcd metrics directly from the etcd pods so we could get more details?

> you can try the hacks outlined in #845

This is great! We are definitely going to try that. So if I understand correctly, we need to patch kind-config.yaml and mount an in-memory emptyDir in the Prow job pod YAML? Is there something we need to configure on the host side?
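
For illustration, the pod-spec half of that might look like the fragment below; the container name is hypothetical, and the mountPath has to match the hostPath used by the kind config's `extraMounts` entry for /var/lib/etcd:

```yaml
# Fragment of the Prow job's pod spec: an in-memory emptyDir mounted where
# the kind config expects the etcd host path.
spec:
  containers:
  - name: test                 # hypothetical container name
    volumeMounts:
    - name: etcd-data
      mountPath: /tmp/etcd     # must match the kind config's hostPath
  volumes:
  - name: etcd-data
    emptyDir:
      medium: Memory           # tmpfs-backed
```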

> you should also check #303 as a general rule when trying to do kind in kubernetes, but disk speed is probably just your host. etcd is pretty I/O bound.

Yep, we don't nest KinD clusters; everything runs on top of an OpenShift cluster using Prow.

> Note that kind's own CI runs in DIND on Prow. We run on fast GCE PD SSD nodes though (because all of CI does, for better build performance etc.).

Is there a job in KIND e2e that runs with etcd in-memory?

> Based on related issues: https://github.com/etcd-io/etcd/blob/master/Documentation/faq.md#what-does-the-etcd-warning-apply-entries-took-too-long-mean

> Even if your disk is itself fast enough for etcd, it may not be fast enough if you run N binpacked kind clusters.

Does this mean that even if we use SSDs on our CI nodes we could still encounter these errors, because we run the DinD pod inside a K8s cluster (OpenShift in our case)?

> There's not a lot actionable for us here; we have an existing issue about providing in-memory etcd as an option (with the associated tradeoffs, see the links in the previous comment).

> Feel free to continue discussing, but we'll use the other issue to continue tracking "faster but less safe etcd for CI".

BenTheElder commented 3 years ago

> The disks on our CI nodes are healthy, though they are not SSDs.

Healthy or not, they may not have enough IOPS/throughput for N clusters/builds/... at once. This is a fairly common problem with Prow: versus a build environment where you have one job to a machine (e.g. one Jenkins runner per VM), when you bin-pack jobs/pods you can allocate for CPU, RAM, and disk space, but not I/O.

> Is it possible to fetch etcd metrics directly from the etcd pods so we could get more details?

You should be able to curl the metrics endpoint, I think.
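
For example, something along these lines against the default control-plane container, assuming curl is available in the node image (client auth uses the kubeadm-generated healthcheck-client cert):

```sh
# Scrape etcd's /metrics from the client port and pull out the disk-latency
# histograms the etcd FAQ points at (wal fsync / backend commit durations).
docker exec kind-control-plane curl -s \
  --cacert /etc/kubernetes/pki/etcd/ca.crt \
  --cert /etc/kubernetes/pki/etcd/healthcheck-client.crt \
  --key /etc/kubernetes/pki/etcd/healthcheck-client.key \
  https://127.0.0.1:2379/metrics \
  | grep -E 'etcd_disk_(wal_fsync|backend_commit)_duration'
```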

> So if I understand correctly, we need to patch kind-config.yaml and mount an in-memory emptyDir in the Prow job pod YAML?

That, or manage a tmpfs in your script (in place of the emptyDir mount). There should be samples discussed in the issue.
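
For example, a minimal sketch of the script variant (path and size are illustrative):

```sh
# Create a tmpfs for etcd data before `kind create cluster`; the path must
# match the hostPath in the kind config's extraMounts entry.
mkdir -p /tmp/etcd
mount -t tmpfs -o size=512m tmpfs /tmp/etcd
```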

> Yep, we don't nest KinD clusters; everything runs on top of an OpenShift cluster using Prow.

You're nesting kind within OpenShift (which is Kubernetes) though, which has the problems in #303.

> Is there a job in KIND e2e that runs with etcd in-memory?

No, not currently. A lot of kind jobs run one to a machine though, because they request > N/2 CPUs (for Kubernetes build purposes, not needed by kind itself).

> Does this mean that even if we use SSDs on our CI nodes we could still encounter these errors, because we run the DinD pod inside a K8s cluster (OpenShift in our case)?

Well, even on the fastest disk in the world, if you try to run :infinity: etcd clusters on one disk, you're going to run out of I/O eventually. Moving to tmpfs/memory shifts the problem around, but bandwidth/I/O isn't unlimited there either.

DinD isn't related; it's just the N-clusters-to-one-disk issue.

BenTheElder commented 3 years ago

Should add: if in-memory etcd is the successful route for you, please let us know. We're considering what a native feature for this and other "no persistence but faster" storage hacks might look like.

ormergi commented 3 years ago

Hi @BenTheElder, after running our KinD setup on CI with etcd in memory, I can confirm that there is a significant improvement in the cluster's overall performance and in the time it takes us to create it. etcd is healthy and we no longer see the `...operation took too long` warnings in the logs. Please let me know if I can help somehow with the native feature you mentioned.

I would like to thank you; I appreciate the support and the quick responses!!! :grin: :rocket: