kyma-project / lifecycle-manager

Controller that manages the lifecycle of Kyma Modules in your cluster.
http://kyma-project.io
Apache License 2.0

feat: [Zero-Downtime] Runtime-Watcher TLS configuration management #1507

Open Tomasz-Smelcerz-SAP opened 5 months ago

Tomasz-Smelcerz-SAP commented 5 months ago

Description

Zero-Downtime: Implement the Runtime-Watcher TLS configuration renewal logic.

The logic is based on the POC results, but it is simplified: the additional secret object is removed, and the Istio Gateway secret plays the key role in the migration process instead. This is based on the following observation: we must adjust the SKR watcher client TLS configuration if and ONLY IF the Istio Gateway TLS configuration changes. This has an impact on the design of the zero-downtime certificate rotation solution. The system is designed with two independent components, running asynchronously to each other:

Note: This issue describes the second component

Responsibility 1: Bootstrap

  1. No Watcher TLS secret exists in the KCP
  2. Wait until the Istio Gateway secret is available in the KCP
  3. Create Watcher TLS certificate in the KCP (using certificate CR - the Cert Manager creates the secret)
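
The bootstrap steps above can be read as a small decision function. The following is only a sketch under stated assumptions: the type `BootstrapAction`, its constants, and `NextBootstrapAction` are hypothetical names chosen for illustration, and the two boolean inputs stand in for whatever lookups the controller actually performs in the KCP.

```go
package main

import "fmt"

// BootstrapAction is a hypothetical name for the next step the reconciler takes.
type BootstrapAction string

const (
	// ActionNone: the Watcher TLS secret already exists, so bootstrap does not apply.
	ActionNone BootstrapAction = "none"
	// ActionWaitForGateway: requeue until the Istio Gateway secret is available.
	ActionWaitForGateway BootstrapAction = "wait-for-gateway"
	// ActionCreateCertificate: create the certificate CR; Cert Manager creates the secret.
	ActionCreateCertificate BootstrapAction = "create-certificate"
)

// NextBootstrapAction maps the two observations from the bootstrap list to one action.
func NextBootstrapAction(watcherSecretExists, gatewaySecretExists bool) BootstrapAction {
	switch {
	case watcherSecretExists:
		return ActionNone
	case !gatewaySecretExists:
		return ActionWaitForGateway
	default:
		return ActionCreateCertificate
	}
}

func main() {
	fmt.Println(NextBootstrapAction(false, false))
	fmt.Println(NextBootstrapAction(false, true))
	fmt.Println(NextBootstrapAction(true, true))
}
```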

Responsibility 2: Migration

  1. When both of the following happen:
    • Root certificate is more recent than the Watcher TLS secret in the KCP
    • Istio-Gateway secret is more recent than the Watcher TLS secret in the KCP
  2. Re-generate the Watcher TLS certificate in the KCP (already implemented but triggered differently)

Responsibility 3: Synchronization

  1. When any of the following conditions occurs:
    • Watcher TLS configuration is missing in SKR
    • Watcher TLS secret in KCP is more recent than the corresponding secret in the SKR
    • Istio-Gateway secret is more recent than secret in the SKR
  2. Then generate the Watcher TLS configuration secret in the SKR, taking tls.crt and tls.key from the corresponding secret in the KCP, but the ca.crt data from the Istio-Gateway secret

Note: Instead of checking the conditions in step 1, we can also just sync the data with every reconciliation (patch).

Implementation Notes:

Reasons

We need a robust, zero-downtime solution for the Watcher TLS certificate rotation

Acceptance Criteria

Feature Testing

Testing approach

unit tests, integration tests, e2e test(s)

Existing tests:

Attachments

watcher-certificate-migration3

Related Issues

https://github.com/kyma-project/lifecycle-manager/issues/1430

LeelaChacha commented 2 weeks ago

Blocked by: https://github.com/kyma-project/lifecycle-manager/issues/1890