Open jpeach opened 4 years ago
Note that this doesn't solve configuration snapshot consistency issues (see various issues in the go-control-plane repo).
@stevesloka what do we have left to do here? I guess we need to let the go-control-plane impl bake for a while, and maybe do some perf/load testing of it before sunsetting the contour impl?
Yup, all of that. We should also update the feature tests to use this along with the integration tests. There was some work on versioning separate caches here that needs to be finished up: https://github.com/projectcontour/contour/pull/2917
Action items:
Not blocking for this issue, but worth doing:
Currently using the envoy xDS server in a daily build in CI to get confidence before we switch over, at least for SOTW mode
It has still been flaky in CI e2e tests; trying to run tests locally to discover the sources. In particular, Envoys are often slower to become healthy
Adding to 1.26.0 milestone to make some forward progress on resolving issues.
Bumping to 1.27
@davinci26 @clayton-gonsalves @izturn do any of you have non-production environments where you could try switching to using the go-control-plane xDS server instead of the legacy Contour impl and see if you encounter any problems? We've been running E2E's daily with it enabled with success but some more real-world testing (ideally looking at performance/scale in addition to correctness) would be great too before we consider flipping the default in Contour.
Specifically, this involves setting the following in the Contour config file:
server:
  xds-server-type: envoy
For reference, here is a PR that changes the default to be envoy: https://github.com/projectcontour/contour/pull/6146
@davinci26 @clayton-gonsalves @izturn (or anyone else) just a gentle nudge here, is this change something you could test in a non-prod environment?
@skriss sorry had this message on draft.
We are working on a bunch of items to improve the operational stability of Contour so we are not taking many upstream changes but I think we should be able to take it and test it out in a couple of weeks from now.
Does this work?
That'd be great, thanks! We may make the change upstream soon-ish anyway to let CI start running regularly on it. It has already been running in our nightly tests and seems pretty stable.
selfnote: consider effects of Endpoint updates
@skriss, we have some non-prod environments, but we don't put a lot of payloads on them, we will try it later
@skriss Based on our limited testing, everything is fine
Remaining work here is to fully remove the Contour xDS server option and implementation. We can plan to do this for the 1.31 release, assuming no major issues after the 1.29 release.
I looked at go-control-plane a bit and we ought to be able to use it to replace our custom xDS code. The interfaces are a bit different, but we should be able to bind the DAG in without a lot of trouble. This likely gives us ADS support for free.
Related #1286