GoogleCloudPlatform / k8s-multicluster-ingress

kubemci: Command line tool to configure L7 load balancers using multiple kubernetes clusters
Apache License 2.0

Add an ingress conformance test for kubemci #131

Closed nikhiljindal closed 6 years ago

nikhiljindal commented 6 years ago

Filing this issue to track the work of adding an e2e test that checks kubemci's conformance with the ingress spec.

cc @G-Harmon @MrHohn @csbell

nikhiljindal commented 6 years ago

PR to add the test: https://github.com/kubernetes/kubernetes/pull/59234

Issue to provide kubemci to k/k e2e tests: https://github.com/kubernetes/test-infra/issues/6624

nikhiljindal commented 6 years ago

With https://github.com/kubernetes/test-infra/pull/6799, we now have a kubemci-image-push job that runs each time there is a new commit in this repository. The runs can be tracked at https://k8s-testgrid.appspot.com/sig-multicluster-kubemci#kubemci-image-push.

Once https://github.com/kubernetes/test-infra/pull/6813 merges, we will have a ci-kubemci-ingress-conformance job that runs every 60 mins and executes the tests added in kubernetes/kubernetes#59234, using the latest kubemci image pushed by the kubemci-image-push job.

nikhiljindal commented 6 years ago

https://k8s-testgrid.appspot.com/sig-multicluster-kubemci#kubemci-ingress-conformance is now running every 60 mins.

It is still failing, though.

nikhiljindal commented 6 years ago

Found out that the test is failing because it times out while getting the IP address for the ingress. We were silently ignoring the error rather than explicitly failing with the right error message. Sent https://github.com/kubernetes/kubernetes/pull/61234 to fail with the right error message.
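
For illustration, the difference between silently ignoring a timeout and failing explicitly can be sketched in Go as below. This is a minimal sketch, not the code from the PR above: `getIPFunc` and `waitForIngressIP` are hypothetical names, and the real e2e framework's helpers may look quite different.

```go
package main

import (
	"fmt"
	"time"
)

// getIPFunc is a hypothetical stand-in for whatever call reads the ingress
// IP address. It returns "" while no address has been assigned yet.
type getIPFunc func() (string, error)

// waitForIngressIP polls getIP until it returns a non-empty address or the
// timeout elapses. On timeout it returns an error rather than an empty
// string, so the caller cannot silently continue without an IP.
func waitForIngressIP(getIP getIPFunc, interval, timeout time.Duration) (string, error) {
	deadline := time.Now().Add(timeout)
	for {
		ip, err := getIP()
		if err != nil {
			return "", fmt.Errorf("getting ingress IP: %w", err)
		}
		if ip != "" {
			return ip, nil
		}
		if !time.Now().Before(deadline) {
			return "", fmt.Errorf("timed out after %v waiting for ingress IP", timeout)
		}
		time.Sleep(interval)
	}
}

func main() {
	// Simulate an ingress that never gets an address.
	neverReady := func() (string, error) { return "", nil }
	_, err := waitForIngressIP(neverReady, 10*time.Millisecond, 50*time.Millisecond)
	if err != nil {
		// A test would fail here with the real cause, not a later symptom.
		fmt.Println("failing test:", err)
	}
}
```

A caller that checks the returned error fails at the point of the timeout, which is the behaviour the comment above asks for.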

nikhiljindal commented 6 years ago

Found a couple of issues by running the tests locally. Sent https://github.com/kubernetes/kubernetes/pull/61379 to fix them all!

nikhiljindal commented 6 years ago

Job is still failing.

There seem to be 2 issues:

MrHohn commented 6 years ago
  • Create is also failing with IP address already in use. Need to debug that.

Reproduced this while trying to port a couple of e2e tests to kubemci. It seems kubemci tried to assign the same (pre-reserved) static IP to both the normal ingress and the multi-cluster ingress.

URL Map mci1-um--pre-shared-cert created successfully
Ensuring ssl cert
Ensuring http target proxy.
Ensuring target https proxy
Creating target HTTPS proxy mci1-tps--pre-shared-cert
Creating target https proxy mci1-tps--pre-shared-cert
Target https proxy mci1-tps--pre-shared-cert created successfully
Ensuring https forwarding rule
Creating forwarding rule mci1-fws--pre-shared-cert
Error ensuring https forwarding rule: googleapi: Error 400: Invalid value for field 'resource.IPAddress': 'X.X.X.X'. Specified IP address is in-use and would result in a conflict., invalid
Ensuring firewall rule
Creating firewall rule mci1-fr--pre-shared-cert
Firewall rule mci1-fr--pre-shared-cert created successfully

I can see two L7 LBs were created. Is that as expected?

k8s-um-e2e-tests-ingress-72zw8-pre-shared-cert--fd27e948a649f10    HTTPS    1 backend service (1 instance group)
mci1-um--pre-shared-cert    HTTPS    1 backend service (1 instance group)

mci.yaml

apiVersion: extensions/v1beta1
kind: Ingress
metadata:
  annotations:
    ingress.gcp.kubernetes.io/pre-shared-cert: test-pre-shared-cert
    kubernetes.io/ingress.allow-http: "false"
    kubernetes.io/ingress.class: gce-multi-cluster
    kubernetes.io/ingress.global-static-ip-name: kubemci-8f62f84e-32f5-11e8-9b00-480fcf446ae4
  creationTimestamp: null
  name: pre-shared-cert
  namespace: e2e-tests-ingress-72zw8
spec:
  backend:
    serviceName: echoheaders-https
    servicePort: 80
  rules:
  - host: test.ingress.com
    http:
      paths:
      - backend:
          serviceName: echoheaders-https
          servicePort: 80
        path: /test
status:
  loadBalancer: {}

nikhiljindal commented 6 years ago

No, that is not expected. For an ingress with the gce-multi-cluster class, ingress-gce should only create the instance group. It should not create the whole load balancer (backend service, url map, forwarding rule, etc.).
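
The expected behaviour can be sketched in Go as below. This is only an illustration of the rule, not ingress-gce's actual code: the annotation key and class value come from the mci.yaml above, while `isMultiCluster` and `resourcesToCreate` are hypothetical helpers.

```go
package main

import "fmt"

const (
	// Annotation key and value as they appear in the mci.yaml above.
	ingressClassKey   = "kubernetes.io/ingress.class"
	multiClusterClass = "gce-multi-cluster"
)

// isMultiCluster reports whether the annotations mark an ingress as
// belonging to the multi-cluster class.
func isMultiCluster(annotations map[string]string) bool {
	return annotations[ingressClassKey] == multiClusterClass
}

// resourcesToCreate is a hypothetical helper listing what a single-cluster
// controller should create for an ingress. For the multi-cluster class it
// stops at the instance group and leaves the rest of the load balancer
// to kubemci.
func resourcesToCreate(annotations map[string]string) []string {
	if isMultiCluster(annotations) {
		return []string{"instance group"}
	}
	return []string{
		"instance group",
		"backend service",
		"url map",
		"target proxy",
		"forwarding rule",
	}
}

func main() {
	mci := map[string]string{ingressClassKey: multiClusterClass}
	fmt.Println(resourcesToCreate(mci))
	fmt.Println(resourcesToCreate(map[string]string{}))
}
```

If the controller skips this check, both it and kubemci try to build a full load balancer on the same static IP, which matches the forwarding rule conflict in the log above.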

I wonder if this is a recent regression in ingress-gce. I had verified that the test was passing with https://github.com/kubernetes/kubernetes/pull/61379 on a 1.8.8-gke.0 GKE cluster. @MrHohn What cluster version (or specifically glbc version) are you using in testing?

cc @csbell @nicksardo @bowei

MrHohn commented 6 years ago

What cluster version (or specifically glbc version) are you using in testing?

I am using k8s.gcr.io/ingress-gce-glbc-amd64:1.0.0 on an E2E cluster at HEAD (built 2 days ago).

nikhiljindal commented 6 years ago

Great. I will try to run the test with a HEAD cluster. In the meantime, if you get a chance you can try running the test with an earlier version of glbc and see if the test passes.

nikhiljindal commented 6 years ago

Found this to be a regression in glbc controller. Tracking the fix in https://github.com/kubernetes/ingress-gce/issues/182

nikhiljindal commented 6 years ago

While the ingress regression (https://github.com/kubernetes/ingress-gce/issues/182) is now fixed, our job is still failing. Turns out that we are not using the latest glbc image from head. Sent https://github.com/kubernetes/test-infra/pull/7503 to fix that.

Found another issue in the logs where the test was crashing with assignment to entry in nil map. Sent https://github.com/kubernetes/kubernetes/pull/61988 to fix that.
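
For readers unfamiliar with this panic: in Go, reading from a nil map is safe, but writing to one panics with "assignment to entry in nil map". The sketch below shows the usual fix of allocating the map before the first write. The names (`recordStatus`, `statuses`) are hypothetical and are not taken from the PR above.

```go
package main

import "fmt"

// recordStatus stores a status for a cluster. It allocates the map when it
// is nil, which avoids the "assignment to entry in nil map" panic.
func recordStatus(statuses map[string]string, cluster, status string) map[string]string {
	if statuses == nil {
		statuses = make(map[string]string)
	}
	statuses[cluster] = status
	return statuses
}

func main() {
	var statuses map[string]string // declared but never initialized: nil

	// Reading from a nil map is fine and yields the zero value.
	fmt.Println(len(statuses), statuses["cluster-1"] == "")

	// Writing directly would panic:
	//   statuses["cluster-1"] = "ready"

	statuses = recordStatus(statuses, "cluster-1", "ready")
	fmt.Println(statuses["cluster-1"])
}
```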

Expecting the tests to turn green once both of those PRs merge.

nikhiljindal commented 6 years ago

The test is now failing due to leaking instance groups. Sent https://github.com/GoogleCloudPlatform/k8s-multicluster-ingress/pull/169 to fix that.

G-Harmon commented 6 years ago

Nikhil has kubernetes/kubernetes#62285 out for review to address an issue with "kubemci remove" in the test.

nikhiljindal commented 6 years ago

The tests have now been passing for the past few days: https://k8s-testgrid.appspot.com/sig-multicluster-kubemci#kubemci-ingress-conformance