kubernetes / ingress-gce

Ingress controller for Google Cloud
Apache License 2.0

Add e2e testing #16

Closed. bowei closed this issue 6 years ago.

bowei commented 6 years ago

From @bprashanth on November 10, 2016 18:48

https://github.com/kubernetes/contrib/issues/1441#issuecomment-256981778

e2e testing: If we could figure out a way to set up an e2e builder that runs https://github.com/kubernetes/kubernetes/blob/master/test/e2e/ingress.go#L58 for every commit, just like the cadvisor repo https://github.com/google/cadvisor, that would be great. I'm sure our test-infra people would be more than willing to help with this problem (file an issue like kubernetes/test-infra#939, but more descriptive maybe :)

@porridge fyi

Copied from original issue: kubernetes/ingress-nginx#5

bowei commented 6 years ago

From @aledbf on November 11, 2016 3:23

https://github.com/kubernetes/ingress/pull/12 adds initial e2e structure in directories hack and test.

bowei commented 6 years ago

From @porridge on December 15, 2016 10:32

FWIW, the test-e2e target is broken at the moment. I don't really understand what the message below means by "git root directory".

porridge@beczulka:~/Desktop/coding/go/src/k8s.io/ingress$ time make test-e2e 
go get github.com/onsi/ginkgo/ginkgo
2016/12/15 11:28:10 e2e.go:128: Called from invalid working directory: must run from git root directory: /home/porridge/Desktop/coding/go/src/k8s.io/ingress
exit status 1
make: *** [test-e2e] Error 1

real    0m0.756s
user    0m0.636s
sys 0m0.180s
porridge@beczulka:~/Desktop/coding/go/src/k8s.io/ingress$ 
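
For context, a hypothetical sketch (in Go) of the kind of working-directory guard e2e.go performs; this is a reconstruction, not the actual e2e.go code, and the real check was evidently misfiring here, since the directory it reports is the repo root. The fix landed in the PR below.

package main

import (
	"log"
	"os"
)

// Hypothetical reconstruction: refuse to run unless the current working
// directory contains the .git directory, i.e. is the root of the
// repository checkout.
func main() {
	if _, err := os.Stat(".git"); err != nil {
		wd, _ := os.Getwd()
		log.Fatalf("Called from invalid working directory: must run from git root directory: %s", wd)
	}
	log.Print("working directory looks like a git root, continuing")
}
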
bowei commented 6 years ago

From @porridge on December 15, 2016 12:58

FTR, https://github.com/kubernetes/ingress/pull/62 fixes the breakage.

bowei commented 6 years ago

From @porridge on December 22, 2016 13:19

The next problem is that the test-e2e target tries to push the image, tracked in https://github.com/kubernetes/ingress/issues/79

bowei commented 6 years ago

From @porridge on December 22, 2016 13:45

After that, and deploying the nginx controller, the problem is that the controller cannot talk to the API server:

I1222 13:19:12.959734       6 launch.go:84] &{NGINX 0.8.4 git-f0762ba git@github.com:porridge/ingress.git}
I1222 13:19:12.961047       6 nginx.go:102] starting NGINX process...
F1222 13:19:12.962171       6 launch.go:109] no service with name default/default-http-backend found: Get http://localhost:8080/api/v1/namespaces/default/services/default-http-backend: dial tcp [::1]:8080: getsockopt: connection refused

bowei commented 6 years ago

From @porridge on December 22, 2016 14:11

That's apparently because /var/run/secrets/kubernetes.io/serviceaccount/token does not exist, so restclient.InClusterConfig() fails, and the config returned by clientConfig.ClientConfig() is wrong.
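
A minimal sketch of that detection logic, written against today's client-go names (rest.InClusterConfig and clientcmd) rather than the restclient package the comment mentions; an assumption about the modern equivalent, not the controller's actual code:

package main

import (
	"fmt"

	"k8s.io/client-go/rest"
	"k8s.io/client-go/tools/clientcmd"
)

// buildConfig mirrors the usual pattern: prefer the in-cluster config, which
// requires the token and CA mounted under
// /var/run/secrets/kubernetes.io/serviceaccount, and fall back to a
// kubeconfig when that fails. With an empty secrets dir the fallback wins,
// and with no kubeconfig either, requests end up at the localhost default.
func buildConfig(kubeconfig string) (*rest.Config, error) {
	if cfg, err := rest.InClusterConfig(); err == nil {
		return cfg, nil // pod credentials found; talk to the real apiserver
	}
	return clientcmd.BuildConfigFromFlags("", kubeconfig)
}

func main() {
	cfg, err := buildConfig("")
	if err != nil {
		panic(err)
	}
	fmt.Println("apiserver:", cfg.Host)
}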

bowei commented 6 years ago

From @porridge on December 22, 2016 14:55

I hacked launch.go to list the environment, and hardcoded kubeconfig.Host = "10.0.0.1:443". The environment:

KUBERNETES_SERVICE_PORT_HTTPS=443
KUBERNETES_PORT_443_TCP=tcp://10.0.0.1:443
DEFAULT_HTTP_BACKEND_SERVICE_PORT=80
DEFAULT_HTTP_BACKEND_PORT=tcp://10.0.0.90:80
KUBERNETES_SERVICE_PORT=443
KUBERNETES_PORT_443_TCP_ADDR=10.0.0.1
KUBERNETES_PORT=tcp://10.0.0.1:443
DEFAULT_HTTP_BACKEND_PORT_80_TCP=tcp://10.0.0.90:80
DEFAULT_HTTP_BACKEND_PORT_80_TCP_PORT=80
DEFAULT_HTTP_BACKEND_PORT_80_TCP_ADDR=10.0.0.90
KUBERNETES_PORT_443_TCP_PORT=443
KUBERNETES_PORT_443_TCP_PROTO=tcp
DEFAULT_HTTP_BACKEND_SERVICE_HOST=10.0.0.90
DEFAULT_HTTP_BACKEND_PORT_80_TCP_PROTO=tcp
KUBERNETES_SERVICE_HOST=10.0.0.1

Yet it still does not work:

F1222 14:51:47.434441       7 launch.go:114] no service with name default/default-http-backend found: Get http://10.0.0.1:443/api/v1/namespaces/default/services/default-http-backend: dial tcp 10.0.0.1:443: getsockopt: connection timed out

@bprashanth can you point me to a working setup of this kind somewhere in the k8s ecosystem? https://github.com/kubernetes/community/blob/master/contributors/devel/local-cluster/docker.md starts with a big bold "Stop, use minikube instead".

bowei commented 6 years ago

From @aledbf on December 22, 2016 15:0

@porridge this should work if you set the master address, e.g. export KUBERNETES_MASTER=http://172.17.4.99:8080, before launching the controller binary

bowei commented 6 years ago

From @porridge on December 22, 2016 16:12

To summarize our chat, hostNetwork: true on the pod spec seems to solve it.
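
For reference, a sketch of that workaround expressed with today's k8s.io/api Go types; the pod name and image tag are illustrative assumptions, not the actual manifest from the repo:

package main

import (
	"encoding/json"
	"fmt"

	corev1 "k8s.io/api/core/v1"
	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
)

func main() {
	pod := &corev1.Pod{
		ObjectMeta: metav1.ObjectMeta{Name: "nginx-ingress-controller"},
		Spec: corev1.PodSpec{
			// The workaround: share the node's network namespace so the
			// controller can reach the apiserver even though the
			// service-account secret (and thus in-cluster config) is broken.
			HostNetwork: true,
			Containers: []corev1.Container{{
				Name:  "nginx-ingress-controller",
				Image: "gcr.io/google_containers/nginx-ingress-controller:0.8.4", // illustrative tag
			}},
		},
	}
	out, _ := json.MarshalIndent(pod, "", "  ")
	fmt.Println(string(out))
}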

bowei commented 6 years ago

From @bprashanth on December 22, 2016 17:3

It would be great not to have to run with host networking in the e2e, since most people don't in reality, but I think it's fine for a first cut with a TODO.

So to confirm, hostNetwork is just a workaround for setting KUBERNETES_MASTER, which defaults to localhost instead?

bowei commented 6 years ago

From @bprashanth on December 22, 2016 18:6

@porridge do these docs help https://github.com/kubernetes/ingress/blob/master/docs/dev/setup.md ? (try local-up-cluster, or minikube as described, if they've released a version with the nginx addon)

bowei commented 6 years ago

From @porridge on December 27, 2016 20:4

@bprashanth I think it was a bit more complicated:

  1. First of all, the hyperkube-based cluster brought up by ingress/hack/e2e-internal/e2e-up.sh was broken, in the sense that /var/run/secrets/kubernetes.io/serviceaccount in every pod was an empty directory (no token file, not even the ..* dir/symlink).
  2. This in turn confused the config-loading code into thinking it was not running in a cluster and therefore needed to connect to localhost, which didn't work.
  3. I tried to force it to connect to 10.0.0.1:8080 and :443, but that failed with getsockopt: connection timed out.
  4. Finally, this was worked around with hostNetwork, with aledbf's help.

After your suggestion, I tried with the cluster brought up by kubernetes/hack/local-up-cluster.sh just now, but it failed the same way as (3) above:

F1227 19:51:24.140035       6 launch.go:123] no service with name default/default-http-backend found: Get https://10.0.0.1:443/api/v1/namespaces/default/services/default-http-backend: dial tcp 10.0.0.1:443: getsockopt: connection timed out

It turns out this (as well as (3) above) was caused by my laptop's overzealous firewall.

I haven't tracked down the cause for (1) though.

Perhaps I should try minikube, as I fear trying to teach ufw what to let through will be hard, given that I'm somewhat confused about this networking myself.

bowei commented 6 years ago

@porridge -- the security token issue seems to be related to setting rshared on the mount point for kubelet:

I was able to get the hyperkube containerized kubelet running after doing:

$ mount --bind /var/lib/kubelet /var/lib/kubelet
$ mount --make-rshared /var/lib/kubelet

and passing --volume=/var/lib/kubelet:/var/lib/kubelet:rshared in the docker run command.

bowei commented 6 years ago

From @porridge on February 3, 2017 17:55

I made some baby steps towards using minikube. I'm going to investigate that further, given the problems with the hyperkube-based cluster (the fact that it seems soft-deprecated, the incompatibility with ufw, and the rshared issue on kubelet).

One question is what I should do with hack/e2e*. Since it is currently not much more than a skeleton around the hyperkube setup, should I try extending it to support minikube as well (--deployment)?

bowei commented 6 years ago

@porridge -- if you want an example using hyperkube, you can take a look at this:

https://github.com/kubernetes/dns/tree/master/pkg/e2e

It starts an API server + controller manager in a container. There are some things that need to be resolved with the containerized mounter, but it works...

The kickoff script is here: https://github.com/kubernetes/dns/build/e2e-test.sh

bowei commented 6 years ago

From @porridge on February 4, 2017 12:47

@bowei ingress/hack already contains code to start a hyperkube-based cluster, so I'm not sure we need more examples. However, I'm not sure we should be going that way, given the problems I mentioned.

Of course I don't know yet how much better minikube is going to be, but the fact that it's more hermetic is promising.

bowei commented 6 years ago

hyperkube only depends on Docker, which means we were able to run the e2e tests as part of a Travis CI job, which is helpful. The code above does take care of the mounting issue.

bowei commented 6 years ago

From @aledbf on February 4, 2017 19:20

@porridge I think @bowei is right. We should use the code from the dns repo to bootstrap the e2e test suite, since that repo is already using it (so we don't waste time on the setup).

bowei commented 6 years ago

@aledbf if there is common interest, this may be worth splitting off into a common framework. Then we can mutually benefit from the work. Let me know which direction you guys decide to go...

bowei commented 6 years ago

From @aledbf on February 4, 2017 19:43

if there is common interest, this may be worth splitting off into a common framework

Yes please :) I think this is one of the missing pieces, a common bootstrap for e2e test suites

bowei commented 6 years ago

From @aledbf on February 4, 2017 19:43

@bprashanth please comment about this ^^

bowei commented 6 years ago

From @bprashanth on February 6, 2017 18:25

Yeah, of course. What are we thinking: an incubator project for bootstrapping hyperkube on Travis?

bowei commented 6 years ago

I'm happy to propose the project and drive it

bowei commented 6 years ago

From @bprashanth on February 6, 2017 18:42

Thanks! Suggest an email to kubernetes-dev per https://github.com/kubernetes/community/blob/master/incubator.md#existing-code-in-kubernetes, the test infra team might have some thoughts (or recommend putting it in https://github.com/kubernetes/test-infra or https://github.com/kubernetes/repo-infra)

bowei commented 6 years ago

From @porridge on February 6, 2017 19:39

I'd appreciate it if you could keep me in the loop.

bowei commented 6 years ago

From @porridge on February 24, 2017 10:5

@bowei was there any movement on this?

bowei commented 6 years ago

Sorry about the delay -- will send something post-code freeze...

bowei commented 6 years ago

From @porridge on February 24, 2017 17:7

@beeps sorry it took me this long to test this on the e2e cluster from the main kubernetes repo like you suggested in December, but it does not work either; by the looks of it, for the same reason as with the ingress repo's own local cluster:

$ $GOPATH/k8s.io/kubernetes/cluster/kubectl.sh --namespace=kube-system logs -f nginx-ingress-controller-484927508-skkxg
I0224 17:01:14.408940       7 launch.go:94] &{NGINX 0.9.0-beta.2 git-7013a52 git@github.com:ixdy/kubernetes-ingress.git}
I0224 17:01:14.409029       7 launch.go:97] Watching for ingress class: nginx
I0224 17:01:14.409432       7 nginx.go:112] starting NGINX process...
I0224 17:01:14.410528       7 launch.go:223] Creating API server client for https://10.0.0.1:443
F0224 17:01:29.439727       7 launch.go:111] no service with name kube-system/default-http-backend found: Get https://10.0.0.1:443/api/v1/namespaces/kube-system/services/default-http-backend: dial tcp 10.0.0.1:443: getsockopt: connection timed out

bowei commented 6 years ago

From @porridge on March 30, 2017 11:36

Status update: I (temporarily) ditched the attempts to get this running on my laptop and moved my dev environment onto a workstation, to rule out interference from the local firewall.

There I was able to successfully play with the nginx controller on a local cluster brought up with k8s.io/kubernetes/hack/local-up-cluster.sh

Starting the same controller on the cluster launched with k8s.io/ingress/hack/e2e.go -v --up --test=false --down=false fails with:

F0330 11:25:55.592077       7 launch.go:251] Error while initializing connection to Kubernetes apiserver. This most likely means that the cluster is misconfigured (e.g., it has invalid apiserver certificates or service accounts configuration). Reason: invalid configuration: no configuration has been provided

I think I'm going to start working on some first e2e test cases, using that former cluster for the time being, while waiting for @bowei to start the effort towards common bootstrap (now that 1.6 is released, hint hint).

bowei commented 6 years ago

From @porridge on March 31, 2017 11:6

One interesting problem, worth thinking about when designing this common e2e test infrastructure:

When trying to import "k8s.io/kubernetes/test/e2e/framework" and use ginkgo at the same time, I found out the hard way that the directly imported ginkgo clashed with the vendored copy of ginkgo from the kubernetes repo:

panic: /tmp/ginkgo687931842/test.test flag redefined: ginkgo.seed
goroutine 1 [running]:
flag.(*FlagSet).Var(0xc420018180, 0x3532640, 0x3660aa0, 0xc42027f910, 0xb, 0x23f7307, 0x2a)
        /usr/lib/google-golang/src/flag/flag.go:793 +0x420
flag.(*FlagSet).Int64Var(0xc420018180, 0x3660aa0, 0xc42027f910, 0xb, 0x58de340d, 0x23f7307, 0x2a)
        /usr/lib/google-golang/src/flag/flag.go:618 +0x71
github.com/onsi/ginkgo/config.Flags(0xc420018180, 0xc4207f7e30, 0x7, 0xc420000101)
        /usr/local/google/home/porridge/projects/go/src/github.com/onsi/ginkgo/config/config.go:66 +0xee
github.com/onsi/ginkgo.init.1()
        /usr/local/google/home/porridge/projects/go/src/github.com/onsi/ginkgo/ginkgo_dsl.go:54 +0x59
github.com/onsi/ginkgo.init()
        /usr/local/google/home/porridge/projects/go/src/github.com/onsi/ginkgo/ginkgo_dsl.go:570 +0x9d
k8s.io/ingress/controllers/nginx/test.init()
        /usr/local/google/home/porridge/projects/go/src/k8s.io/ingress/controllers/nginx/test/test_suite_test.go:17 +0x50
main.init()
        k8s.io/ingress/controllers/nginx/test/_test/_testmain.go:46 +0x53

Apparently, this is not because of a version difference, but because flags are global and the two ginkgo packages are treated as unrelated ones, so their (identical) init() functions are both invoked, as explained in https://github.com/onsi/ginkgo/issues/234#issuecomment-196645747.
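
A minimal standalone reproduction of that failure mode (not ginkgo's actual source): Go's flag package keeps one global CommandLine FlagSet, so when two copies of the same package each register "ginkgo.seed" from their init(), the second registration panics.

package main

import "flag"

// registerGinkgoFlags stands in for ginkgo/config.Flags(), which both the
// vendored and the directly imported copy of ginkgo run from init().
func registerGinkgoFlags() {
	flag.Int64("ginkgo.seed", 0, "seed used to randomize spec execution order")
}

func main() {
	registerGinkgoFlags() // first copy registers fine
	registerGinkgoFlags() // second copy panics: flag redefined: ginkgo.seed
}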

For lack of better ideas, I worked around this for now by removing k8s.io/kubernetes/vendor/github.com/onsi.

bowei commented 6 years ago

From @onsi on April 1, 2017 22:17

srsly, the flags package being global is both incredibly convenient and a painful source of these sorts of rough edges. Not sure how to help, @porridge :/ Perhaps there's some way for Ginkgo to detect whether Ginkgo has already parsed flags? If you could try that out locally until you get it to work and submit a PR, that would be great.

bowei commented 6 years ago

From @porridge on April 7, 2017 15:50

@onsi I took a look.

Please take a look at https://github.com/onsi/ginkgo/compare/master...porridge:flag-tolerant, which I think is a reasonable compromise between backwards compatibility and making it possible for vendored ginkgo to work at all. Obviously it's not the full change; it only shows what needs to be done, using one of the flags as an example.
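
My reading of the idea, as a sketch (an assumption about the approach, not the branch's actual code): guard each registration with flag.Lookup so a second, identical copy of the package skips flags that are already defined instead of panicking.

package main

import (
	"flag"
	"fmt"
)

// int64Flag registers the flag only if no earlier copy of the package has
// already done so, trading strictness for tolerance of vendored duplicates.
func int64Flag(name string, value int64, usage string) {
	if flag.Lookup(name) != nil {
		return // already registered by an identical copy; skip it
	}
	flag.Int64(name, value, usage)
}

func main() {
	int64Flag("ginkgo.seed", 0, "seed used to randomize spec execution order")
	int64Flag("ginkgo.seed", 0, "seed used to randomize spec execution order") // no panic this time
	fmt.Println("flags registered once, no redefinition panic")
}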

bowei commented 6 years ago

From @stibi on September 13, 2017 9:43

Hi, is the "help wanted" label still valid here? If yes, I'd like to join the ride.

Is this work https://github.com/kubernetes/ingress/pull/1331 also related to this issue?

bowei commented 6 years ago

From @aledbf on September 13, 2017 12:56

Is this work #1331 also related to this issue?

Yes

fejta-bot commented 6 years ago

Issues go stale after 90d of inactivity. Mark the issue as fresh with /remove-lifecycle stale. Stale issues rot after an additional 30d of inactivity and eventually close.

Prevent issues from auto-closing with an /lifecycle frozen comment.

If this issue is safe to close now please do so with /close.

Send feedback to sig-testing, kubernetes/test-infra and/or @fejta. /lifecycle stale