Kong / kubernetes-ingress-controller

:gorilla: Kong for Kubernetes: The official Ingress Controller for Kubernetes.
https://docs.konghq.com/kubernetes-ingress-controller/
Apache License 2.0

Integration tests do not work on MacOS #1626

Closed scottnetlab closed 3 years ago

scottnetlab commented 3 years ago

Is there an existing issue for this?

Current Behavior

When I run the integration tests from my MacOS system, I cannot connect to the KIND cluster that the integration tests deploy.

As per Docker's own documentation, this is not possible without a workaround. See https://docs.docker.com/docker-for-mac/networking/.

Expected Behavior

Integration tests should run on MacOS as they do on Linux to provide a seamless developer experience across platforms.

Steps To Reproduce

1. On MacOS
2. Run `make test.integration`
3. Observe the following error(s):

time="2021-07-30T20:00:58-05:00" level=error msg="admission webhook server stopped" error="listen tcp 172.17.0.1:49023: bind: can't assign requested address"
time="2021-07-30T20:00:58-05:00" level=info msg="WARNING: log stifling has been enabled (experimental)"
time="2021-07-30T20:00:58-05:00" level=info msg="diagnostics server is starting to listen" addr=10256
    controller_test.go:22: WARNING: error while waiting for http://localhost:10254/healthz: Get "http://localhost:10254/healthz": context deadline exceeded (Client.Timeout exceeded while awaiting headers)
    controller_test.go:22: WARNING: error while waiting for http://localhost:10254/healthz: Get "http://localhost:10254/healthz": context deadline exceeded (Client.Timeout exceeded while awaiting headers)
    controller_test.go:22: WARNING: error while waiting for http://localhost:10254/healthz: Get "http://localhost:10254/healthz": context deadline exceeded (Client.Timeout exceeded while awaiting headers)
    controller_test.go:22: WARNING: error while waiting for http://localhost:10254/healthz: Get "http://localhost:10254/healthz": context deadline exceeded (Client.Timeout exceeded while awaiting headers)
    controller_test.go:22: WARNING: error while waiting for http://localhost:10254/healthz: Get "http://localhost:10254/healthz": context deadline exceeded (Client.Timeout exceeded while awaiting headers)
    controller_test.go:22: WARNING: error while waiting for http://localhost:10254/healthz: Get "http://localhost:10254/healthz": context deadline exceeded (Client.Timeout exceeded while awaiting headers)
    controller_test.go:22: WARNING: error while waiting for http://localhost:10254/healthz: Get "http://localhost:10254/healthz": context deadline exceeded (Client.Timeout exceeded while awaiting headers)
    controller_test.go:22: WARNING: error while waiting for http://localhost:10254/healthz: Get "http://localhost:10254/healthz": context deadline exceeded (Client.Timeout exceeded while awaiting headers)
    controller_test.go:22: WARNING: error while waiting for http://localhost:10254/healthz: Get "http://localhost:10254/healthz": context deadline exceeded (Client.Timeout exceeded while awaiting headers)
    controller_test.go:22: WARNING: error while waiting for http://localhost:10254/healthz: Get "http://localhost:10254/healthz": context deadline exceeded (Client.Timeout exceeded while awaiting headers)
    controller_test.go:22: WARNING: error while waiting for http://localhost:10254/healthz: Get "http://localhost:10254/healthz": context deadline exceeded (Client.Timeout exceeded while awaiting headers)
panic: controller manager exited with error: unable to start proxy cache server: making HTTP request: Get "http://172.18.0.240:8001/": context deadline exceeded

goroutine 404 [running]:
github.com/kong/kubernetes-ingress-controller/test/integration.deployControllers.func1(0x6b068f0, 0xc000545ec0, 0x0)
        /Users/sdustin/VSCode/scottnetlab-github/kubernetes-ingress-controller/test/integration/suite_test.go:315 +0x13fb
created by github.com/kong/kubernetes-ingress-controller/test/integration.deployControllers
        /Users/sdustin/VSCode/scottnetlab-github/kubernetes-ingress-controller/test/integration/suite_test.go:252 +0x292
FAIL    github.com/kong/kubernetes-ingress-controller/test/integration  116.188s
FAIL

Kong Ingress Controller version

Working from `next` branch.

Kubernetes version

From the integration test:

INFO: container environment ready
INFO: DOCKER=(Docker version 20.10.0, build 7287ab3)
INFO: KIND=(kind v0.11.1 go1.16.4 darwin/amd64)
INFO: KUBECTL=(Client Version: v1.18.2-6+38ac483e736488)
INFO: HELM=(v3.4.1+gc4e7485)

Anything else?

Suggest potentially using `--network host` with the KIND cluster (if possible) when the underlying OS is Mac. We have done that in some past projects I've worked on to get this working on Mac. See https://github.com/vmware-tanzu-labs/reference-platform-for-kubernetes/blob/develop/Makefile#L33 for an example.

shaneutt commented 3 years ago

Thanks for bringing this up :bow:

The problem stems from the Kong Kubernetes Testing Framework using Kubernetes In Docker (KIND) as the default cluster implementation for tests, and expecting that LoadBalancer type services that use Docker network IP addresses (utilized via MetalLB) will be routable from the host. The history behind this is that the CI runs on Linux, and until recently the maintainers all used Linux development systems. On Linux the expectation holds: by default the iptables and routing rules allow the host to access the Docker network when Docker is running natively on the same host. The same is not true on Mac, which can't run Docker natively and often connects to Docker in a VM or on an external Linux system.

There are a few different options on how we can fix this:

  1. drop LoadBalancer services as the mechanism to communicate with cluster services during tests and use the Kubernetes API port-forwarding mechanism instead
  2. develop configurations and rules for our kind based testing deployments that will allow MacOS hosts to connect to the LoadBalancer IPs directly
  3. load and run the integration test suite inside of kubernetes itself and drop all external access

Personally I think option 1 is the quickest and option 2 is the most complete (though I don't personally use macOS, so I'm somewhat limited in my ability to help). Option 3 could stop us from ever needing any special rules to run the tests, but it is probably the longest of the three options to implement and introduces some other considerations.
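To make option 1 a bit more concrete, here is a rough sketch of how a test helper might choose which address to dial. It is not how KTF does it today. It assumes a port-forward to the service (for example via `kubectl port-forward`) has already been established on a local port, and the function name and the local port `18001` are made up for illustration.

```go
package main

import (
	"fmt"
	"net"
	"runtime"
	"strconv"
)

// serviceEndpoint returns the address a test should dial.
// On Linux the LoadBalancer IP is assumed to be routable from the host,
// so it is used directly. On any other OS the function falls back to a
// local port that is assumed to be port-forwarded to the service.
// Hypothetical helper for illustration only.
func serviceEndpoint(goos, lbIP string, lbPort, localPort int) string {
	if goos == "linux" {
		return net.JoinHostPort(lbIP, strconv.Itoa(lbPort))
	}
	return net.JoinHostPort("127.0.0.1", strconv.Itoa(localPort))
}

func main() {
	// 172.18.0.240:8001 comes from the report; 18001 is an arbitrary local port.
	fmt.Println(serviceEndpoint(runtime.GOOS, "172.18.0.240", 8001, 18001))
}
```

Passing the OS in as a parameter rather than reading `runtime.GOOS` inside the function keeps both branches easy to exercise from a single platform.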

We would like to hear your thoughts on this: let us know if you have a preference as a contributor, and whether you are interested in or able to contribute directly to the fix.

As an immediate workaround: `next` currently has some preliminary support for running the integration test suite against a GKE cluster. This functionality is not yet mature, so if you're interested in using it feel free to reach out here for help. It's not ideal because it may end up costing extra money, but it is technically an option.

scottd018 commented 3 years ago

Thanks for the response @shaneutt ! I believe option 2 is what I'm looking for as it provides the most similar environment to a working installation as well as the most complete coverage for developers for the project. My workaround has simply been to use a Linux VM in AWS, which has been sufficient in the short-term.

I may have some time to contribute a fix for MacOS, however it may be a few weeks, as I've got a lot going on both during and after work :)

I believe that the configuration (whatever that may look like in the end) should end up in the testing framework repo at https://github.com/Kong/kubernetes-testing-framework. Please let me know if I should open an issue over there as well, or if this issue will suffice.

shaneutt commented 3 years ago

Sounds good, no rush on anything. This does belong over in the testing framework. I think we should keep this issue open for posterity and in case anyone else comes along with the same problem (because I don't think it's obvious to most people to go check out KTF). Feel free to open a new issue in KTF for the specific action (number 2 above). :+1:

stale[bot] commented 3 years ago

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.
