bookingcom / shipper

Kubernetes native multi-cluster canary or blue-green rollouts using Helm
Apache License 2.0

Add additional Kubernetes service that is routing the traffic to the contender only (the new version of the service) #400

Open dmeytin opened 3 years ago

dmeytin commented 3 years ago

New Feature Request

The Problem Statement:

When a strategy contains a step that routes 0 traffic to the contender, and we still need to validate the new instance with an in-cluster validation job, it is hard to resolve the new instance's IP in order to send requests to it directly.

Example:


steps:
      - capacity:
          contender: 1
          incumbent: 100
        name: staging
        traffic:
          contender: 0
          incumbent: 100

With the definition above, client requests to the original service are not routed to the new instance. However, we would like to validate that the new instance behaves properly by running functional tests from a test job in the same cluster. For this purpose we need a dedicated service that routes traffic to the new instances only.

This service could optionally be removed once traffic is fully routed to the contender. The service name could use the following format: ${ORIGINAL-SVC-NAME}-{ACHIEVED STEP}

kanatohodets commented 3 years ago

Hey! Good use case. At Booking this was solved by adding a "staging" service object to the Helm chart. This service object has a selector for the contender shipper-release label.

This works because the service object has a stable name per application. When a new contender is first installed, Shipper overwrites the existing service's spec, which updates the selector to point to that contender. The convention at Booking is to use a stable, well-known URL for a given application's staging service, and point test jobs at that URL.
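For illustration, here is a rough sketch of what such a staging service template could look like in the chart. It is not the actual Booking chart. The name `myapp-staging`, the file name, and the ports are hypothetical, and it assumes that Shipper renders the chart with `.Release.Name` set to the Shipper release name and that the contender's pods carry a matching `shipper-release` label.

```yaml
# templates/service-staging.yaml (hypothetical file name)
apiVersion: v1
kind: Service
metadata:
  # Stable name: identical in every release of the application, so installing
  # a new contender overwrites this object instead of creating a second one.
  name: myapp-staging
spec:
  selector:
    # Assumption: pods of a release are labelled shipper-release=<release name>
    # and .Release.Name resolves to that release name at render time.
    shipper-release: "{{ .Release.Name }}"
  ports:
    - name: http
      port: 80          # hypothetical service port
      targetPort: 8080  # hypothetical container port
```

Because the selector only matches the most recently installed release, this service keeps reaching the contender even while the step's traffic weight for it is 0.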

Does that make sense? I wrapped up at Booking a little while ago, so I no longer have access to the charts to provide a more specific example. Perhaps @zoidbergwill can help on that front!

zoidyzoidzoid commented 3 years ago

I definitely can!

dmeytin commented 3 years ago

Will be much appreciated!

zoidyzoidzoid commented 3 years ago

So what you can do is have multiple Service objects instead of only one. Shipper will then adjust traffic only on the one Service object that has the shipper-lb: production label, which will be your normal app URL, e.g. service.com

https://github.com/bookingcom/shipper/blob/4d17f565d74419dab7d378184f5cdd862a696a67/docs/limitations.rst#services

So we have a second Service object with a staging DNS name, like staging-service.com, and we route testing jobs and similar things to that instead.
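As a sketch of the test-job side, an in-cluster Job could target the staging Service by its stable name. Everything here is an assumption for illustration: the Service name `myapp-staging`, the `/healthz` path, and the curl image are hypothetical, and a real functional test suite would replace the single request.

```yaml
apiVersion: batch/v1
kind: Job
metadata:
  name: myapp-staging-smoke-test   # hypothetical name
spec:
  backoffLimit: 0                  # do not retry: one failed check fails the Job
  template:
    spec:
      restartPolicy: Never
      containers:
        - name: smoke-test
          image: curlimages/curl:8.8.0   # assumption: any image that ships curl works
          # --fail makes curl exit non-zero on HTTP errors (4xx/5xx),
          # so a failing response marks the Job as failed.
          command:
            - curl
            - --fail
            - --silent
            - --show-error
            - http://myapp-staging/healthz   # hypothetical staging Service and path
```

The Job's success or failure can then gate the move to the next step of the rollout strategy.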

zoidyzoidzoid commented 3 years ago

How we configure DNS and external routing is quite customised and doesn't use any of the common off-the-shelf Kubernetes ingress controllers, so I'm not sure how that would work with whatever you're using. I can definitely try to help more if you get stuck after adding an additional Service.