projectcontour / contour

Contour is a Kubernetes ingress controller using Envoy proxy.
https://projectcontour.io
Apache License 2.0

Add metrics to indicate when Contour is connecting to an unsupported Envoy #995

Open stevesloka opened 5 years ago

stevesloka commented 5 years ago

Describe the solution you'd like

We've run into several users who run on master (which isn't a supported setup), and others who get their Contour and Envoy versions mixed up when upgrading. Sometimes this is OK, but at other times a specific version of Contour only supports a specific version of Envoy, leaving users with cryptic failure messages in Envoy's logs.

It would be neat if, when Envoy started, it could check with Contour that the versions are supported; if not, Envoy should block on starting. I think if this were implemented in the readiness probe, then should someone upgrade Contour without upgrading Envoy, the Envoy container would fail the probe and block the rollout without affecting normal traffic.

davecheney commented 5 years ago

Thank you for raising this issue. At the moment Contour and Envoy are in lock step: you cannot use a version older than what Contour expects, and you shouldn't use a version newer than what Contour expects.

This could perhaps be mitigated if we find a solution to #952

I'm not sure how to make Envoy check the Contour version; xDS doesn't have a notion of a server version.

stevesloka commented 5 years ago

I haven't looked too much into how to implement this, but I think we'd need another container to act as a health checker. It can look at Envoy's version, which is available on its admin page. Contour would need to expose a /version endpoint or something similar for this health-check container to query.

This would need more thought to make it better; the above solution is just a quick thought.
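
To make the health-check idea above concrete, here is a minimal sketch in Go of the comparison such a container might perform. It is an illustration, not Contour code: the function names `envoyVersion` and `supported` are hypothetical, the list of supported releases is assumed to come from the proposed (not existing) Contour /version endpoint, and the version string from Envoy's admin `/server_info` output is assumed to have the form `<sha>/<release>/<state>/<build>/<ssl>`.

```go
package main

import (
	"encoding/json"
	"fmt"
	"strings"
)

// serverInfo holds the single field this sketch reads from the JSON
// returned by Envoy's admin /server_info endpoint.
type serverInfo struct {
	Version string `json:"version"`
}

// envoyVersion extracts the release number from a /server_info payload.
// Assumption: the version field looks like "<sha>/<release>/<state>/<build>/<ssl>".
func envoyVersion(payload []byte) (string, error) {
	var info serverInfo
	if err := json.Unmarshal(payload, &info); err != nil {
		return "", fmt.Errorf("decoding server_info: %w", err)
	}
	parts := strings.Split(info.Version, "/")
	if len(parts) < 2 || parts[1] == "" {
		return "", fmt.Errorf("unexpected version string %q", info.Version)
	}
	return parts[1], nil
}

// supported reports whether the Envoy release is in the list of releases
// Contour supports. The list is hypothetical input, e.g. fetched from the
// proposed Contour /version endpoint.
func supported(envoy string, contourSupports []string) bool {
	for _, v := range contourSupports {
		if v == envoy {
			return true
		}
	}
	return false
}

func main() {
	// Sample payload for illustration only; a real checker would fetch it
	// from the Envoy admin interface.
	payload := []byte(`{"version":"abc123/1.16.0/Clean/RELEASE/BoringSSL"}`)
	v, err := envoyVersion(payload)
	if err != nil {
		fmt.Println("not ready:", err)
		return
	}
	fmt.Println("envoy:", v, "supported:", supported(v, []string{"1.16.0"}))
}
```

A readiness probe built on this would report failure whenever `supported` returns false, which is what would block the rollout in the scenario described in the issue.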

davecheney commented 5 years ago

@stevesloka do you think this is possible before Contour 1.0? If so, could you please move it to the appropriate milestone. If we can live without this til after 1.0, please move it to the unplanned milestone and I'll revisit it when we get closer to planning a 1.1 release.

stevesloka commented 5 years ago

@davecheney I'm going to stick to design first to determine how to progress, added to v0.15.0 milestone.

davecheney commented 5 years ago

Thanks @stevesloka. Let's talk more about this when we get to 0.15. I'm not sure if propagating the Envoy admin page is going to work in all deployment scenarios; I'm thinking about what happens when Envoy and Contour aren't in the same pod. But we might be able to set up a special listener on Envoy and ask it to return its full Server: string, something we normally suppress. That might be a way of getting the version number without having to expose the admin interface.

Let's talk more in July.

davecheney commented 5 years ago

@stevesloka do you have any suggestions on how we could implement this? If nothing comes to mind, would you consider moving this to the backlog milestone so we can revisit it after Contour 1.0?

stevesloka commented 5 years ago

@davecheney I'm going to backlog this for now.

davecheney commented 5 years ago

Thanks. Let's revisit it in November.

xaleeks commented 3 years ago

Bringing this back now: can we implement a controller in our operator for checking compatibility? I imagine it would be useful when upgrading through the operator.

youngnick commented 3 years ago

If there is a simple way for Contour to check that Envoys that connect to it are the supported version, that would be great.

We would only be able to log or increment a metric or something though, or else you would never be able to upgrade. But if there was a metric, the operator could check that the Envoy version is supported as part of its readiness checks.
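
As a rough illustration of the "log or increment a metric" approach, the following Go sketch counts connections from unsupported Envoy versions without ever rejecting them. Everything here is hypothetical: `versionChecker` and its methods are not part of Contour, a plain atomic counter stands in for a real metrics library, and the sketch assumes Contour has some way of learning the version of each connecting Envoy.

```go
package main

import (
	"fmt"
	"sync/atomic"
)

// versionChecker is a hypothetical helper, not an existing Contour type.
type versionChecker struct {
	supported   map[string]bool
	unsupported int64 // connections seen from unsupported Envoy versions
}

// newVersionChecker builds a checker for the given supported releases.
func newVersionChecker(versions ...string) *versionChecker {
	s := make(map[string]bool, len(versions))
	for _, v := range versions {
		s[v] = true
	}
	return &versionChecker{supported: s}
}

// observe records one connecting Envoy and reports whether its version is
// supported. It only counts; it never refuses the connection, so an
// upgrade can still proceed.
func (c *versionChecker) observe(envoyVersion string) bool {
	if c.supported[envoyVersion] {
		return true
	}
	atomic.AddInt64(&c.unsupported, 1)
	return false
}

// unsupportedTotal returns the current counter value, which an operator
// could read as part of its readiness checks.
func (c *versionChecker) unsupportedTotal() int64 {
	return atomic.LoadInt64(&c.unsupported)
}

func main() {
	c := newVersionChecker("1.16.0")
	c.observe("1.16.0") // supported, counter unchanged
	c.observe("1.15.0") // unsupported, counter incremented
	fmt.Println("unsupported connections:", c.unsupportedTotal())
}
```

Keeping the check observational matches the constraint in this comment: Contour keeps serving mismatched Envoys during an upgrade, and the decision about readiness is left to whoever reads the metric.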

xaleeks commented 3 years ago

Sounds good, we’ll leave it in parking lot1 if someone wants to pick this up

github-actions[bot] commented 2 weeks ago

The Contour project currently lacks enough contributors to adequately respond to all Issues.

This bot triages Issues according to the following rules:

You can:

Please send feedback to the #contour channel in the Kubernetes Slack