openshift / api

Canonical location of the OpenShift API definition.
http://www.openshift.org
Apache License 2.0
94 stars 517 forks source link

API-1843: KMSEncryptionProvider Feature Gate #2071

Open swghosh opened 1 month ago

swghosh commented 1 month ago

Feature: KMS Encryption Provider for sensitive etcd Resources

A user-configurable interface to support encryption of data stored in etcd using a supported Key Management Service (KMS).

OpenShift would need to align closer with KMS evolution upstream with respect to the different Kubernetes Encryption Providers available today that can encrypt resources from APIServer EncryptionConfig, https://kubernetes.io/docs/tasks/administer-cluster/kms-provider/.

User Stories:

xref: OCPSTRAT-108, API-1684

openshift-ci[bot] commented 1 month ago

Skipping CI for Draft Pull Request. If you want CI signal for your change, please convert it to an actual PR. You can still manually trigger a test run with /test all

openshift-ci[bot] commented 1 month ago

Hello @swghosh! Some important instructions when contributing to openshift/api: API design plays an important part in the user experience of OpenShift and as such API PRs are subject to a high level of scrutiny to ensure they follow our best practices. If you haven't already done so, please review the OpenShift API Conventions and ensure that your proposed changes are compliant. Following these conventions will help expedite the api review process for your PR.

swghosh commented 1 month ago

/test all

swghosh commented 1 month ago

/test all

swghosh commented 1 month ago

/test all

swghosh commented 1 month ago

/test all

openshift-ci-robot commented 1 month ago

@swghosh: This pull request references API-1843 which is a valid jira issue.

Warning: The referenced jira issue has an invalid target version for the target branch this PR targets: expected the task to target the "4.18.0" version, but no target version was set.

In response to [this](https://github.com/openshift/api/pull/2071): > Instructions for interacting with me using PR comments are available [here](https://prow.ci.openshift.org/command-help?repo=openshift%2Fapi). If you have questions or suggestions related to my behavior, please file an issue against the [openshift-eng/jira-lifecycle-plugin](https://github.com/openshift-eng/jira-lifecycle-plugin/issues/new) repository.
swghosh commented 1 month ago

/retest

JoelSpeed commented 1 month ago

Please add an appropriate PR description to explain the motivation behind this change. Neither the PR, nor the card linked really explain what we are doing here, or why

openshift-ci-robot commented 1 month ago

@swghosh: This pull request references API-1843 which is a valid jira issue.

Warning: The referenced jira issue has an invalid target version for the target branch this PR targets: expected the task to target the "4.18.0" version, but no target version was set.

In response to [this](https://github.com/openshift/api/pull/2071): >**Feature**: KMS Encryption Provider for sensitive etcd Resources > >A user-configurable interface to support encryption of data stored in etcd using a supported Key Management Service (KMS). > >OpenShift would need to align closer with KMS evolution upstream with respect to the different Kubernetes Encryption Providers available today that can encrypt resources from APIServer EncryptionConfig, https://kubernetes.io/docs/tasks/administer-cluster/kms-provider/. > >**User Stories**: >- As an OpenShift administrator, I want to encrypt secrets in my cluster at rest using KMS keys so that I can comply with my organization's security requirements. >- As an OpenShift administrator, I want to let the KMS provider to manage the lifecycle of the encryption keys. > >xref: [OCPSTRAT-108](https://issues.redhat.com/browse/OCPSTRAT-108), [API-1684](https://issues.redhat.com/browse/API-1684) > Instructions for interacting with me using PR comments are available [here](https://prow.ci.openshift.org/command-help?repo=openshift%2Fapi). If you have questions or suggestions related to my behavior, please file an issue against the [openshift-eng/jira-lifecycle-plugin](https://github.com/openshift-eng/jira-lifecycle-plugin/issues/new) repository.
swghosh commented 1 month ago

@JoelSpeed added description about the feature. also, for context: #2035 Not sure though why some of the e2e(s) are still failing.

swghosh commented 1 month ago

/retest

swghosh commented 4 weeks ago

/retest

JoelSpeed commented 4 weeks ago

According to the upstream documentation, this feature is stable as of 1.29, which would be 4.16.

It is my understanding that if an upstream feature is declared stable, and we have not otherwise specified the gate status, then the feature is already enabled.

This PR appears to disable a feature that has previously been enabled.

/hold

deads2k commented 4 weeks ago

/test ci/prow/verify

openshift-ci[bot] commented 4 weeks ago

@deads2k: The specified target(s) for /test were not found. The following commands are available to trigger required jobs:

The following commands are available to trigger optional jobs:

Use /test all to run the following jobs that were automatically triggered:

In response to [this](https://github.com/openshift/api/pull/2071#issuecomment-2432382138): >/test ci/prow/verify Instructions for interacting with me using PR comments are available [here](https://git.k8s.io/community/contributors/guide/pull-requests.md). If you have questions or suggestions related to my behavior, please file an issue against the [kubernetes-sigs/prow](https://github.com/kubernetes-sigs/prow/issues/new?title=Prow%20issue:) repository.
swghosh commented 4 weeks ago

/test verify

swghosh commented 4 weeks ago

This PR appears to disable a feature that has previously been enabled.

Yes, that was the root cause causing those e2e failures err="cannot set feature gate KMSv2 to false, feature is locked to true". KMSv2 feature gate (https://github.com/kubernetes/enhancements/issues/3299) is enabled by default upstream and even HyperShift uses it today so we do not have much gating to do on the upstream side. Today, in OpenShift we do not have the supported encryption providers to use the KMS v2 behaviour in downstream. Renaming the feature gate to KMSEncryptionProvider so we could gate our downstream implementation of the EncryptionConfig(s) through TP.

Thanks @JoelSpeed!

xref: https://redhat-internal.slack.com/archives/C06DXMRLXQT/p1729766294150549

JoelSpeed commented 4 weeks ago

Ok, I'm happy with this change, but before we merge, I'd like to see some initial reviews on the EP so that we are confident this is agreeable enough within the team and other stakeholders

@tkashem, once you are happy enough with the EP that the implementation can start, please LGTM here and ping me on slack and I'll get the approval on

JoelSpeed commented 2 weeks ago

PR is good to go, but we should get the EP into a state where the first round of reviews are in before we merge the gate, if we don't have at least some consensus, merging this would add a gate that is unneeded

swghosh commented 2 weeks ago

/test all

dgrisonnet commented 1 day ago

/retest

tkashem commented 15 hours ago

@tkashem, once you are happy enough with the EP that the implementation can start, please LGTM here and ping me on slack and I'll get the approval on

this looks good to me for tech-preview /lgtm

tkashem commented 15 hours ago

/label acknowledge-critical-fixes-only

swghosh commented 15 hours ago

thanks @tkashem /remove-hold off to you @JoelSpeed

JoelSpeed commented 14 hours ago

/retest

Looks good, but I'd like to see if these test failures are related

JoelSpeed commented 13 hours ago

/lgtm

PR is inert, I don't believe the tests failures are related and relevant components (eg KAS) don't recognise the gate, so we've got past my concern about overlapping gates

openshift-ci[bot] commented 13 hours ago

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: JoelSpeed, swghosh, tkashem

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files: - ~~[OWNERS](https://github.com/openshift/api/blob/master/OWNERS)~~ [JoelSpeed] Approvers can indicate their approval by writing `/approve` in a comment Approvers can cancel approval by writing `/approve cancel` in a comment
openshift-ci-robot commented 13 hours ago

/retest-required

Remaining retests: 0 against base HEAD c98acdb0b66c1ba022db2bd79b7ac40316f61a3e and 2 for PR HEAD 1bc23b5cc8b05d2ff1bbda2215c7b41c9f1e8ed3 in total

JoelSpeed commented 13 hours ago

/hold It sounds like this feature might need a little more time to bake, I'd like to hold this until the API is ready to go and the EP has at least had some review.

From discussion with @wallylewis we don't need to rush this into 4.18 so lets wait until after branch cut

JoelSpeed commented 13 hours ago

/hold

openshift-ci[bot] commented 9 hours ago

@swghosh: The following tests failed, say /retest to rerun all failed tests or /retest-required to rerun all mandatory failed tests:

Test name Commit Details Required Rerun command
ci/prow/okd-scos-e2e-aws-ovn 1bc23b5cc8b05d2ff1bbda2215c7b41c9f1e8ed3 link false /test okd-scos-e2e-aws-ovn
ci/prow/e2e-azure 1bc23b5cc8b05d2ff1bbda2215c7b41c9f1e8ed3 link false /test e2e-azure

Full PR test history. Your PR dashboard.

Instructions for interacting with me using PR comments are available [here](https://git.k8s.io/community/contributors/guide/pull-requests.md). If you have questions or suggestions related to my behavior, please file an issue against the [kubernetes-sigs/prow](https://github.com/kubernetes-sigs/prow/issues/new?title=Prow%20issue:) repository. I understand the commands that are listed [here](https://go.k8s.io/bot-commands).