kubernetes / kubernetes

Production-Grade Container Scheduling and Management
https://kubernetes.io
Apache License 2.0

IPv6: conntrack failures deleting endpoints seen during E2E tests #63208

Closed. pmichali closed this issue 5 years ago.

pmichali commented 6 years ago

When running E2E tests on an IPv6 cluster, kube-proxy shows errors from conntrack when trying to delete endpoints. For example:

E0425 12:22:48.879901 1 proxier.go:603] Failed to delete e2e-tests-nettest-k6ngh/node-port-service:udp endpoint connections, error: error deleting conntrack entries for UDP peer {fd00:77:30::5328, fd00:77:20:0:3::d}, error: conntrack command returned: "conntrack v1.4.2 (conntrack-tools): mismatched address family\nTry `conntrack -h' or 'conntrack --help' for more information.\n", error message: exit status 2

Is this a BUG REPORT or FEATURE REQUEST?:

/kind bug /area ipv6 /sig network

What happened: During the E2E test runs, errors are seen in kube-proxy when it cleans up endpoints. The concern is that this may be affecting the health of kube-dns (it is seen restarting numerous times).

What you expected to happen:

No errors when deleting endpoints.

How to reproduce it (as minimally and precisely as possible):

Start up an IPv6-based Kubernetes cluster, and then run the E2E tests for the IPv6 cases. For example:

go run hack/e2e.go -- --provider=local --v 4 --test --test_args="--ginkgo.focus=Networking|Services --ginkgo.skip=IPv4|DNS|Networking-Performance|Federation|functioning\sNodePort|preserve\ssource\spod --num-nodes=2"

Anything else we need to know?:

In looking at the conntrack call, it appears that these are the arguments being passed in:

-D --orig-dst fd00:77:30::1374 --dst-nat fd00:77:20:0:2::4 -p udp -f ipv6

From the conntrack examples, it looks like, for IPv6, the --dst-nat option should be the IP and port.

I'll be modifying the kube-proxy code and submitting a PR to attempt to correct this issue.

Environment:

pmichali commented 6 years ago

/assign

pmichali commented 6 years ago

/assign pmichali

pmichali commented 6 years ago

Correction: After more research I found out two things. First, the two args, --orig-dst and --dst-nat, are indeed IPv6 addresses (no port). From what I can tell, the conntrack delete is using some IPv4 default for the (unspecified) --orig-src argument and then rejecting the command. If one specifies an IPv6 address for --orig-src, the command is no longer rejected (but it may not be doing the right thing).
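
For illustration, using the addresses from the example in the issue description and a placeholder for the source, the difference looks roughly like this (the second form is only a sketch of the workaround, not a proposed fix):

conntrack -D --orig-dst fd00:77:30::1374 --dst-nat fd00:77:20:0:2::4 -p udp -f ipv6
(rejected with "mismatched address family", apparently because the unspecified --orig-src defaults to an IPv4 value)

conntrack -D --orig-src <some-ipv6-address> --orig-dst fd00:77:30::1374 --dst-nat fd00:77:20:0:2::4 -p udp -f ipv6
(accepted, though it may not delete the intended entries)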

In addition, the owner of the conntrack code, Pablo from the netfilter team, says that support for IPv6 NAT will be in conntrack 1.4.5. Currently, my Kubernetes repo is using 1.4.2, and on my Ubuntu 16.04 host I can install 1.4.3. There is a patch available with the fix.

I'm asking Pablo whether, in addition to deleting the connection properly, the patch will also accept the --orig-dst and --dst-nat arguments, as currently just specifying those fails. I want to make sure there aren't two problems here (the default value used for --orig-src, and the command being rejected due to the family mismatch).

Overall, this will be an issue: it sounds like connection deletes will keep failing until 1.4.5 is available, and we'll need to figure out the logistics of getting that version into the OSes used for Kubernetes. I'm not sure how much that failure affects the tests, but I'm concerned that it may be causing kube-dns to fail to respond to liveness/readiness probes, causing intermittent test failures.

I'm trying to build conntrack with the latest from master, which has the fix for IPv6 NAT (commit 29b390a2), though I'm having issues getting all the needed libraries installed.
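
For reference, building from the netfilter git tree is the usual autotools flow; the package names below are a guess for Ubuntu 16.04, and the configure step will complain about any libnetfilter-*-dev packages that are still missing (the conntrackd side of the tree wants several more):

apt-get install -y build-essential autoconf automake libtool pkg-config bison flex
apt-get install -y libmnl-dev libnfnetlink-dev libnetfilter-conntrack-dev
git clone git://git.netfilter.org/conntrack-tools
cd conntrack-tools
./autogen.sh && ./configure && make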

MrHohn commented 6 years ago

cc @grayluck

grayluck commented 6 years ago

Ack.

pmichali commented 6 years ago

cc @bowei

pmichali commented 6 years ago

With both kube-dns and coredns, during test teardown, I see the same failure of conntrack to remove the connection. Here is logging with additional debug:

I0507 17:39:07.949529       1 conntrack.go:104] PCM: [-D --orig-dst fd00:77:30::cb30 --dst-nat fd00:77:20:0:3::3e -p udp -f ipv6] origin=fd00:77:30::cb30 dest=fd00:77:20:0:3::3e
E0507 17:39:07.949574       1 proxier.go:604] Failed to delete e2e-tests-nettest-c2f9b/session-affinity-service:udp endpoint connections ({Endpoint:[fd00:77:20:0:3::3e]:8081 ServicePortName:e2e-tests-nettest-c2f9b/session-affinity-service:udp}, fd00:77:30::cb30:90/UDP), error: error deleting conntrack entries for UDP peer {fd00:77:30::cb30, fd00:77:20:0:3::3e} with params [-D --orig-dst fd00:77:30::cb30 --dst-nat fd00:77:20:0:3::3e -p udp -f ipv6], error: conntrack command /usr/sbin/conntrack returned: "conntrack v1.4.2 (conntrack-tools): mismatched address family\nTry `conntrack -h' or 'conntrack --help' for more information.\n", error message: exit status 2

The system is using conntrack 1.4.2.

pmichali commented 6 years ago

I temporarily used conntrack 1.4.5; kube-dns no longer shows conntrack errors when deleting connections, and so far I don't see kube-dns restarts.

Pablo indicated that there is no workaround for 1.4.2, and no binaries for conntrack 1.4.5 are available. We need to figure out how to deal with this.

pmichali commented 6 years ago

Well, I manually built conntrack 1.4.5, installed it into kube-proxy, and then ran the IPv6 E2E tests. I no longer see conntrack throwing an error on connection delete. However, I still see kube-dns restarts during the test runs (most of the time), and some tests failing intermittently (a coworker thinks it is because kube-dns is not responding). I'll try to confirm that the connections are being deleted by running a single test and monitoring.

Regarding this issue, I'm not sure how we proceed with applying the fix. Do we include the conntrack binaries with the Dockerfile for kube-proxy so that they get added? Do we do that for the build/debian-iptables area and/or the build/debian-hyperkube-base area? Or do we wait until binaries are available for conntrack 1.4.5 (and, if so, how do we specify that version)?

Regarding the kube-dns (and coredns) restarts during the tests, I've run out of ideas on how to troubleshoot them. Any advice? Should I create a separate bug for that?

pmichali commented 6 years ago

Conntrack 1.4.5 appears to resolve this issue. Should we close it, noting that the newer version is needed to eliminate the failures on conntrack delete operations, or do we leave it open until conntrack is updated (not sure when that would be, or what's required, since binaries for conntrack 1.4.5 are not available)?

grayluck commented 6 years ago

Sorry for the delay, Paul, and thank you for your hard work. Nice debugging! Please leave the issue open; it's not resolved yet. Could you tell me in which pod (and using which image) you upgraded conntrack? Then maybe we can rebuild an image with the updated conntrack.

pmichali commented 6 years ago

Sure,

I was using Kubernetes 1.10.2, had built conntrack 1.4.5 from source on my host, and copied it and its library dependencies into the kube-proxy pod. I see there are two Dockerfiles for building kube-proxy; I'm not sure which one (or whether both) needs to have conntrack updated.
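
For anyone wanting to try the same, the copy can be done with kubectl cp; the pod name and library names here are just illustrative, and ldd on the built binary shows which shared libraries need to come along (the change only survives until the pod is recreated):

ldd ./conntrack
kubectl cp ./conntrack kube-system/kube-proxy-m6sk2:/usr/sbin/conntrack
kubectl cp ./libnetfilter_conntrack.so.3 kube-system/kube-proxy-m6sk2:/usr/lib/x86_64-linux-gnu/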

Regards,

Paul Michali (pcm)

grayluck commented 6 years ago

Thanks for that. I will look into it.

grayluck commented 6 years ago

Paul, I think I found out why. Since https://github.com/kubernetes/kubernetes/pull/52744, the kube-proxy base image was updated to stretch. Could you please get into your kube-proxy pod and run apt-get update? It should tell you which repo you are using. If it says jessie-updates, then just update your Kubernetes version and we should be fine.

https://github.com/kubernetes/kubernetes/blob/release-1.10/build/debian-base/Makefile

According to https://packages.debian.org/stretch/conntrack, conntrack on debian stretch is at 1.4.4.

pmichali commented 6 years ago

@grayluck Sorry for the delay. I've been trying to track down another issue with kube-dns restarting. I'll try to check this in the next day or so.

pmichali commented 6 years ago

@grayluck I updated my Kubernetes to 1.10.3 and then started up a cluster on bare metal in IPv6-only mode. In the kube-proxy pod, conntrack is at 1.4.4:

kubectl exec -it kube-proxy-m6sk2 -n kube-system sh
# which conntrack
/usr/sbin/conntrack
# conntrack -v
conntrack v1.4.4 (conntrack-tools): unknown option `-v'
Try `conntrack -h' or 'conntrack --help' for more information.

However, during the test run, we still see the failure.

E0523 13:04:17.905817       1 proxier.go:603] Failed to delete kube-system/kube-dns:dns endpoint connections, error: error deleting conntrack entries for UDP peer {fd00:30::a, fd00:40::2:0:0:13d5}, error: conntrack command returned: "conntrack v1.4.4 (conntrack-tools): mismatched address family\nTry `conntrack -h' or 'conntrack --help' for more information.\n", error message: exit status 2

I know that the conntrack maintainer mentioned that 1.4.5 resolves the issue, and I have verified that manually, but it appears that 1.4.4 does not have the fix.

grayluck commented 6 years ago

Debian does update conntrack to 1.4.5 (https://packages.debian.org/buster/conntrack). Unfortunately, the earliest Debian release that provides conntrack 1.4.5 is buster, which is still in testing. That means Kubernetes will not bump to a newer stable Debian for months.

This should be the file that serves as the base image of kube-proxy, if you want to make conntrack an exception and update it from another package repo (though I strongly suggest not doing this): https://github.com/kubernetes/kubernetes/blob/master/build/debian-iptables/Dockerfile
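
For completeness, that exception would boil down to something like the following inside the image (an untested sketch; pulling a single package from buster into a stretch image can drag in newer shared libraries, which is part of why it's a bad idea):

echo "deb http://deb.debian.org/debian buster main" > /etc/apt/sources.list.d/buster.list
apt-get update
apt-get install -y -t buster conntrack
rm /etc/apt/sources.list.d/buster.list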

By the way, is the failing test one of the tests mentioned here? https://github.com/CiscoSystems/kube-v6-test#included-test-cases

If the failing test blocks your work, let's find a way to skip the test for now and mark it for future verification.

pmichali commented 6 years ago

@grayluck I think the conntrack deletion failures may be benign and not contributing to the test failures. So I think we can just keep a note to retest this once conntrack 1.4.5 is available.

I have a separate issue open for the kube-dns restarts (#63922) and am still working on root-causing it. I've been using a manually installed conntrack 1.4.5 to eliminate it as a cause of the restarts.

fejta-bot commented 6 years ago

Issues go stale after 90d of inactivity. Mark the issue as fresh with /remove-lifecycle stale. Stale issues rot after an additional 30d of inactivity and eventually close.

If this issue is safe to close now please do so with /close.

Send feedback to sig-testing, kubernetes/test-infra and/or fejta. /lifecycle stale

fejta-bot commented 6 years ago

Stale issues rot after 30d of inactivity. Mark the issue as fresh with /remove-lifecycle rotten. Rotten issues close after an additional 30d of inactivity.

If this issue is safe to close now please do so with /close.

Send feedback to sig-testing, kubernetes/test-infra and/or fejta. /lifecycle rotten

fejta-bot commented 5 years ago

Rotten issues close after 30d of inactivity. Reopen the issue with /reopen. Mark the issue as fresh with /remove-lifecycle rotten.

Send feedback to sig-testing, kubernetes/test-infra and/or fejta. /close

k8s-ci-robot commented 5 years ago

@fejta-bot: Closing this issue.

In response to [this](https://github.com/kubernetes/kubernetes/issues/63208#issuecomment-431680864):

> Rotten issues close after 30d of inactivity.
> Reopen the issue with `/reopen`.
> Mark the issue as fresh with `/remove-lifecycle rotten`.
>
> Send feedback to sig-testing, kubernetes/test-infra and/or [fejta](https://github.com/fejta).
> /close

Instructions for interacting with me using PR comments are available [here](https://git.k8s.io/community/contributors/guide/pull-requests.md). If you have questions or suggestions related to my behavior, please file an issue against the [kubernetes/test-infra](https://github.com/kubernetes/test-infra/issues/new?title=Prow%20issue:) repository.