operator-framework / operator-controller

A new and improved management framework for extending Kubernetes with Operators
https://operator-framework.github.io/operator-controller/
Apache License 2.0
61 stars 53 forks source link

:seedling: Add support for CA/certificate rotation #1062

Closed tmshort closed 2 months ago

tmshort commented 2 months ago

Fixes #915 Mounted secrets are automatically updated into pods, but...

So, update the certificate volume patch, which requires a change in how we look for certificates in the CA cert directory.

Add a watch, so when the certs do change, we update the cert pool. Also look at validity dates of certificates, and error on expired certs.

The default cert-manager certificates have 90 days validities.

Description

Reviewer Checklist

netlify[bot] commented 2 months ago

Deploy Preview for olmv1 ready!

Name Link
Latest commit 5eea0725abebc3df0430037d7a384a65be7762b2
Latest deploy log https://app.netlify.com/sites/olmv1/deploys/669ac31547b423000860d4e7
Deploy Preview https://deploy-preview-1062--olmv1.netlify.app
Preview on mobile
Toggle QR Code...

QR Code

Use your smartphone camera to open QR code link.

To edit notification comments on pull requests, go to your Netlify site configuration.

tmshort commented 2 months ago

This also adds a unit-test for an empty cert dir, and for an expired certificate. And also adds an e2e to make sure secret updates and rotations actually work.

codecov[bot] commented 2 months ago

Codecov Report

Attention: Patch coverage is 71.28713% with 29 lines in your changes missing coverage. Please review.

Project coverage is 72.67%. Comparing base (775613f) to head (5eea072). Report is 2 commits behind head on main.

Files Patch % Lines
internal/httputil/certpoolwatcher.go 68.96% 11 Missing and 7 partials :warning:
internal/httputil/certutil.go 78.26% 2 Missing and 3 partials :warning:
internal/catalogmetadata/cache/cache.go 66.66% 1 Missing and 1 partial :warning:
internal/httputil/httputil.go 60.00% 1 Missing and 1 partial :warning:
internal/rukpak/source/image_registry.go 60.00% 1 Missing and 1 partial :warning:
Additional details and impacted files ```diff @@ Coverage Diff @@ ## main #1062 +/- ## ========================================== - Coverage 72.85% 72.67% -0.19% ========================================== Files 31 32 +1 Lines 1864 1965 +101 ========================================== + Hits 1358 1428 +70 - Misses 371 388 +17 - Partials 135 149 +14 ``` | [Flag](https://app.codecov.io/gh/operator-framework/operator-controller/pull/1062/flags?src=pr&el=flags&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=operator-framework) | Coverage Δ | | |---|---|---| | [e2e](https://app.codecov.io/gh/operator-framework/operator-controller/pull/1062/flags?src=pr&el=flag&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=operator-framework) | `55.65% <46.53%> (+0.13%)` | :arrow_up: | | [unit](https://app.codecov.io/gh/operator-framework/operator-controller/pull/1062/flags?src=pr&el=flag&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=operator-framework) | `45.54% <61.38%> (+0.53%)` | :arrow_up: | Flags with carried forward coverage won't be shown. [Click here](https://docs.codecov.io/docs/carryforward-flags?utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=operator-framework#carryforward-flags-in-the-pull-request-comment) to find out more.

:umbrella: View full report in Codecov by Sentry.
:loudspeaker: Have feedback on the report? Share it here.

tmshort commented 2 months ago

TLDR: it looks like there is no equivalent of GetClientCertificate/GetCertificate, but for root CAs in TLS config in standard library, but it is possible to implement a workaround with VerifyPeerCertificate. This would require setting InsecureSkipVerify to true and re-implementing the standard verification in VerifyPeerCertificate as far as I understand - see this example from Go docs.

And we really don't want to do that (set InsecureSkipVerify) by default. And reimplementing the verification code is probably not worth it, when we can just offer up the updated certificate pool on demand.

m1kola commented 2 months ago

And we really don't want to do that (set InsecureSkipVerify) by default. And reimplementing the verification code is probably not worth it, when we can just offer up the updated certificate pool on demand.

Setting InsecureSkipVerify to true and VerifyPeerCertificate means that we take over the responsibility of verifying the certs from standard library. That's not great, but it is not a lot of code (see standard library code, and client example from the docs). It doesn't sounds too terrible to me, so I'm open to consider this.

As far as I understand in this implementation we are re-creating a client each time and will have to establish a connection each we use it which is not optimal.

I don't know how significantly it will affect our performance. One one hand - creating new connections every time is not great, but on the other hand - not sure if we are going to benefit from re-using the connection in these use cases.