gardener / garden-setup

Describes Gardener components for installation of a Gardener landscape using sow
Other
78 stars 55 forks source link

[ci:component:github.com/gardener/gardener:v1.74.2->v1.84.0] #1263

Closed gardener-robot-ci-3 closed 11 months ago

gardener-robot-ci-3 commented 11 months ago

Release Notes:

The `MachineControllerManagerDeployment` has been promoted to GA and is now locked to "enabled by default". Make sure that all registered provider extensions support this feature gate before upgrading to this version of Gardener.
It is possible to delete a Shoot even if `shoot.gardener.cloud/ignore` annotation is set to true.
Control plane components `kube-apiserver`, `kube-controller-manager` and `kube-scheduler` now mount `key` files with `DefaultMode` set to `416`(`0640` permissions).
`maxSurge` for `kube-apiserver` and `gardener-apiserver` of the virtual garden cluster is set to `100%`.
The following image is updated:
- `quay.io/prometheus/prometheus`: `v2.43.1` -> `v2.47.0`
The `kube-apiserver` no longer mounts root CA bundles from the underlying host.
An issue causing nil pointer panic on scaleup of the machinedeployment along with trigger of rolling update, is fixed
The `ResourcesProgressing` condition appearing in the status of `ManagedResource`s now checks for non-terminated `Pod`s before reporting `status=False`.
Usage of the deprecated injection mechanisms in controller-runtime (like `InjectScheme`, `InjectLogger`, `InjectConfig`, `InjectClient`, `InjectCache` etc) as well as package `extensions/pkg/controller/common` are dropped in a preparation to upgrade to the next version where injection is removed entirely. With this, `Inject*` functions on controllers, predicates, actuators, delegates, and friends are not called anymore. When upgrading the `gardener/gardener` dependency to this version, all injection implementations need to be removed. As a replacement, you can get the needed clients and similar from the manager during initialisation of the component.
Two additional labels `worker.gardener.cloud/image-name` and `worker.gardener.cloud/image-version` are attached to worker nodes to identify which operating system they are running. This can then be used in selectors that target only workers with a specific operating system and is helpful for e.g. driver deployment.
The `kube-controller-manager` controllers are now disabled based on disabled APIs, which can be configured with `spec.kubernetes.kubeAPIServer.runtimeConfig` field in the Shoot API. All controllers are enabled by default for Shoot with workers. For workerless Shoots, some non-required APIs are disabled by default, which can be overridden by the above configuration.
Configure the value for the flag `metrics-scrape-wait-duration` for compaction controller to set a wait duration at the end of every compaction job, to allow for metrics to be scraped by a Prometheus instance.
During the `restore` phase of control plane migration, the `machine-controller-manager` is deployed with 0 replicas if it did not exist before or if it existed and was not scaled up yet. This fixes an issue that could cause the `Shoot`'s nodes to get recreated during control plane migration.
An issue has been fixed for highly-available `Shoot`s whose `etcd` clusters didn't get ready in the `Completing` phase of a CA credentials rotation.
Refactored `statefulset`, `service`, `poddisruptionbudget`, `lease`, and `configmap` components to use default labels and owner references from `etcd`.
A bug has been fixed that prevented users without permissions to list `CustomResourceDefinition`s from interacting with the Gardener APIs when using a `kubectl` version lower than `1.27`.
The `github.com/golang/mock/gomock` dependency is replaced by `go.uber.org/mock`.
New metrics introduced: 
- api_request_duration_seconds -> tracks time taken for successful invocation of provider APIs. This metric can be filtered by provider and service.
- driver_request_duration_seconds -> tracks total time taken to successfully complete driver method invocation. This metric can be filtered by provider and operation.
- driver_requests_failed_total -> records total number of failed driver API requests. This metric can be filtered by provider, operations and error_code.
Update gardener/gardener to 1.77.1.
Upgraded `etcd-backup-restore` from `v0.24.3` to `v0.24.6` for `etcd-custom-image`, and from `v0.25.1` to `v0.26.0` for `etcd-wrapper`
Change port of ssh reverse tunnel to 443
Support for `nip.io` shoot domains is discontinued.
The logging components: vali and valitail are now updated to v2.2.8.
Added new option to `./hack/generate-controller-registration.sh` script `[-e, --pod-security-enforce[=pod-security-standard]` which sets the `security.gardener.cloud/pod-security-enforce` annotation of the generated `ControllerRegistration`. When not set this option defaults to `baseline`.
A validation rule was added that forbids changing the primary DNS provider in `.spec.dns.providers` as soon as the shoot was scheduled.
`maintenance-controller` now disables `PodSecurityPolicy` admission controller when forcefully upgrading the Kubernetes version of a `Shoot` to `v1.25`. It also ensures maximum workers of each for group is greater or equal to its number of zone for forceful upgrades to `v1.27`.
Bump g/g version to remove stale client-go dependency
`nginx-ingress-controller` image is updated to `v1.9.3`.
An issue has been fixed that prevented setting the `UnauthenticatedHTTP2DOSMitigation` feature gate.
`kubectl get garden` now features additional printer column `Observability` providing information about the Observability components of the runtime cluster.
`hack/generate.sh` has been renamed to `hack/generate-sequential.sh`.
Before upgrading to this Gardener versions, you must make sure that the `Service`s of all registered provider extensions serving webhooks for the shoot cluster are annotated with `networking.resources.gardener.cloud/from-all-webhook-targets-allowed-ports=[{"protocol":"TCP","port":<port>}]`, `networking.resources.gardener.cloud/namespace-selectors=[{"matchLabels":{"gardener.cloud/role":"shoot"}}]`, and `networking.resources.gardener.cloud/pod-label-selector-namespace-alias=extensions`.
If you are using `provider-extension` setup you should adapt your files in `example/provider-extensions/garden/controlplane` because `default-domain` and `internal-domain` secrets are removed from `gardener-controlplane` Helm chart.
The following image is updated:
- `quay.io/prometheus/alertmanager`: `v0.24.0` -> `v0.26.0`
`AllMembersReady` condition has now been fixed to eventually show the correct overall readiness of an etcd cluster.
While scaling up a non-HA etcd cluster to HA skipping the scale-up checks for first member of etcd cluster as first member can never be a part of scale-up scenarios.
A new make target is introduced to add license headers.
Prepare shared `component_descriptor` script for migration from GCR to Artifact Registry.
The `DisableScalingClassesForShoots` feature gates has been promoted to GA (and is now always enabled).
An issue has been fixed which was causing a broken `ControlPlaneHealthy` condition report for `Shoot`s when the `MachineControllerManagerDeployment` feature gate gets enabled until their next reconciliation.
Update `vertical-pod-autoscaler` to `v0.14.0`.
Etcd-backup-restore now uses the user home directory to create files.
The `Worker` state reconciler has been dropped, i.e., updated provider extensions will no longer populate the machine state to the `.status.state` field of `Worker` resources. For a few releases, `gardenlet` will no longer persist any still existing data in the `.status.state` field of `Worker` resources during a control plane migration of a `Shoot`, and it will set `.status.state` to `nil` after a successful reconciliation or restore operation.
Gardener now uses 3072 bit RSA keys in order to generate TLS certificates.
Update alpine base image components to 3.18.3.
Package `pkg/utils/managedresources` now works with immutable secrets for managed resources under the hood. Existing secrets will be marked for garbage collection and replaced with immutable ones during the first reconciliation of the managed resource.
Added `errorCode` field in the `LastOperation` struct. This should be implemented only for the `CreateMachine` call in the `triggerCreationFlow`. This field will be utilized by Cluster autoscaler to do early backoff 
Gardener Scheduler's Minimal Distance strategy can take scheduling decisions based on region distances configured by operators. This especially improves the allocation for shoots of providers regions for which the standard Levenshtein distance is inappropriate. Please see `docs/concepts/scheduler.md` for more information.
The `hack/generate-crds.sh` script now receives the file name prefix via the `-p` option (previously, the prefix was the first argument to the script).
The `hack/check-docforge.sh` script is now removed. The repo based manifest are removed in favor of a centrally managed manifests. See https://github.com/gardener/documentation/issues/431. The manifests are now maintained centrally in https://github.com/gardener/documentation/tree/master/.docforge.
The following dependencies are updated:
- `k8s.io/*` : `v0.26.4` -> `v0.27.5`
- `sigs.k8s.io/controller-runtime`: `v0.14.6` -> `v0.15.2`
The registry of the prometheus-operator image is switched from ghcr (`ghcr.io/prometheus-operator/prometheus-config-reloader`) to `quay.io` (`quay.io/prometheus-operator/prometheus-config-reloader`) because the ghcr does not support image pulls over IPv6.
Extensions have to implement the `ForceDelete` function in the actuator with the logic of forcefully deleting all the resources deployed by them.
The `virtual-garden-kube-apiserver` service (for the `virtual-garden` cluster) was switched from type `LoadBalancer` to `ClusterIP`. Please make sure to migrate all DNS records from the `virtual-garden-kube-apiserver` to the `istio-ingressgateway` endpoint before upgrading to this Gardener version.
Gardener base image is updated to `gcr.io/distroless/static-debian12:nonroot`.
When scaling from single-node to multi-node etcd cluster, Etcd Druid will now first ensure that any change to the peer URL (e.g TLS enablement)  is seen by the existing etcd process running within the etcd member pod. Once that is confirmed then it will scale up the Etcd StatefulSet and add relevant annotations.
Upgraded Ginkgo v1 to v2 and updated other dependencies
Fixed a possibility for the `migrate` phase of control plane migration to become permanently stuck if the shoot was created when the `MachineControllerManagerDeployment` feature gate is disabled, control plane migration is triggered for the shoot and the feature gate is enabled during the migration phase.
Force drain and delete volume attachments for nodes un-healthy due to `ReadOnlyFileSystem` and `NotReady` for too long
`gardener-operator` is now managing the `nginx-ingress-controller` and `nginx-ingress-k8s-backend` components. Make sure that your `Garden` resource specifies the [`.spec.runtimeCluster.ingress` section](https://github.com/gardener/gardener/blob/ee3dd5d177be1bf3435534f194e25cef67177650/example/operator/20-garden.yaml#L16-L22).
extension library: State update for a Worker object can be now skipped by annotating it with `worker.gardener.cloud/skip-state-update=true`.
Makefile has been updated to use `Skaffold` for deploying `etcd-druid` with the `make deploy` target, simplifying the deployment process and eliminating the need to push the image to the container registry for each local development testing.
Stability of the ssh tunnel in the local extension setup should improve due to better failure handling.
Add CVE categorization for etcd-druid.
The `MachineClassKind()`, `MachineClass()`, and `MachineClassList()` methods have been dropped from the generic `Worker` actuator's interface and do not need to be implemented anymore.
It is now possible to enable disabled APIs for workerless shoot clusters via `spec.kubernetes.kubeAPIServer.runtimeConfig`.
Gardener refined the scope of the problematic webhook matcher for `endpoint` objects. Earlier, shoot clusters were assigned a constraint reporting a problem with a `failurePolocy: Fail` webhook acting on these objects. Now, only `endpoint`s in the `kube-system` and `defaults` namespaces are considered for this check.
The regression is now fixed and the control plane logs shall be visible in the Plutono dashboards.
Etcd druid will now not support `policy/v1beta1` for `PodDisruptionBudget`s and will only use `policy/v1` for `PodDisruptionBudget`s
Remove unneeded Monitor function from iptables implementation 
A bug where MCM removed a machine other than the one , CA wanted , is resolved.
`machinepriority.machine.sapcloud.io` annotation on machine is now reset to 3 by autoscaler if the corresponding node doesn't have `ToBeDeletedByClusterAutoscaler` taint
If the `kubeletCSRApprover` controller is enabled, it is now mandatory to specify the namespace in the source cluster in which the `Machine` resources reside via `.controllers.kubeletCSRApprover.machineNamespace`.
Included `UnavailableReplicas` in determining if a machine deployment status update is needed
It is no longer possible to configure `.spec.virtualCluster.kubernetes.kubeAPIServer.authorization` in the `Garden` API.
Removes `node.machine.sapcloud.io/not-managed-by-mcm` annotation from nodes managed by the MCM.
The plutono dashboards are now verified as part of `make check`.
Update golang image in verify step to 1.21.3.
During the `Migrate` phase of a control plane migration of a `Shoot`, the state is now only persisted after all extension resources have been migrated. Consequently, make sure that you have added all state to the `.status.state` field of the respective extension object when running `Migrate()`.
status.Status now captures underline cause, allowing consumers to introspect the error returned by the provider. WrapError() function could be used to wrap the provider error
`NewClientForShoot` creates a client with a rest mapper using `LazyDiscovery`.
Introduce DEP-04 [EtcdMember Custom Resource](https://github.com/gardener/etcd-druid/blob/master/docs/proposals/04-etcd-member-custom-resource.md).
The credentials (CA) rotation has been made more robust. In some cases, the `Shoot` reconciliation stuck at `Deploying main and events etcd` when the rotation was in `Preparing` phase.
Enabled the `node-exporter`'s  [textfile collector](https://github.com/prometheus/node_exporter#textfile-collector). It will parse files matching the `*.prom` glob in the `/var/lib/node-exporter/textfile-collector` directory and load metrics from them so that they can be scraped by prometheus.
`gardener-operator` now renews garden access secrets and the gardenlet kubeconfig on all `Seed`s during CA/service account signing key credentials rotation.
Shoot control plane prometheus is now scraping kubelet volume metrics (`kubelet_volume_stats_available_bytes`, `kubelet_volume_stats_capacity_bytes` and `kubelet_volume_stats_used_bytes`) from the kube-system namespace. This allows Gardener extensions deploying PVCs to the Shoot's kube-system namespace (such as the registry-cache extension) to build alerting and plutono dashboard panels using these kubelet volume metrics.
A bug is fixed in the Prometheus alert definitions that caused false positive KubePodNotReadyControlPlane alerts related to the etcd compaction job.
Introduce `Spec.Backup.DeltaSnapshotRetentionPeriod` in the `Etcd` resource to allow configuring retention period for delta snapshots.
The deprecated `extensions/pkg/controller/worker.{Options,ApplyMachineResources{ForConfig}}` symbols have been dropped since `gardenlet` takes over management of the `machine.gardener.cloud/v1alpha1` API CRDs since `gardener/gardener@v1.73`.
Makefile targets have changed: Introduced gardener-setup, gardener-restore, gardener-local-mcm-up, non-gardener-setup, non-gardener-restore,  non-gardener-local-mcm-up. Users can also directly use the scripts which are used by these makefile targets.
A `generate-admin-kubeconf.sh` script which can be used to generate an admin kubeconfig for a local shoot cluster was added in the `hack/usage` directory.
The target cache for `gardener-resource-manager` is now unconditionally enabled, leading to faster reconciliations and less network I/O.
`nginx-ingress-controller` image is updated to `v1.9.4`.
A new field `errorCodeCheckFunc` is introduced in the generic `Worker` actuator. This should be set to parse the Gardener error codes from the error returned in `Worker` reconciliation.
Bump alpine base version for Docker build to `3.18.2`. 
Removed dead metrics code and refactored the remaining metrics code
The obsolete `addons` `ManagedResource` is now properly cleaned up.
Condition handling was improved for `Shoot`s of `ManagedSeed`s. Earlier, when unknown conditions were removed from seeds (e.g. maintained by third-party components), the affected condition was still present in the shoot's conditions.
Resolved an issue where the Custodian Controller was not updating the `Replicas` field in the `etcd` status to reflect the `CurrentReplicas` from the StatefulSet status. This fix ensures consistent behavior with the `etcd` Controller in Druid.
A new optional constraint `CRDsWithProblematicConversionWebhooks` is introduced in the `Shoot` status. This constraint indicates that there is at least one CRD in the cluster which has multiple stored versions and a conversion webhook configured, which could break the reconciliation flow of a `Shoot` in some cases.
Feature gates have been introduced in etcd-druid, and can be specified using CLI flag `--feature-gate`.
Applying Gardener resources server-side has caused the `the server is currently unable to handle the request` error which is now fixed.
Bump builder image golang from `1.20.4` to `1.20.6` 
It is now possible to trigger gardenlet kubeconfig renewal for unmanaged `Seed`s by annotating them with `gardener.cloud/operation=renew-kubeconfig`. This was already supported for `ManagedSeed`s only.
An edge case where outdated DesiredReplicas annotation blocked a rolling update is fixed.
The following golang dependencies have been upgraded, please consult the upstream release notes and [this issue](https://github.com/gardener/gardener/issues/8382) for guidance on upgrading your golang dependencies when vendoring this gardener version:
- `k8s.io/*` to `v0.28.2`
- `sigs.k8s.io/controller-runtime` to `v0.16.2`
- `sigs.k8s.io/controller-tools` to `v0.13.0`
Add support for `Local` provider for e2e tests.
When the `ShootForceDeletion` featuregate in the apiserver is turned on, users will be able to force-delete the Shoot. You **MUST** ensure that all the resources created in the IaaS account are cleaned up to prevent orphaned resources. Gardener will **NOT** delete any resources in the Shoot cloud-provider account. See [Shoot Force Deletion](https://github.com/gardener/gardener/blob/master/docs/usage/shoot_operations.md#force-deletion) for more details.
`gardener-apiserver` and `gardener-admission-controller` now mount `key` files with `DefaultMode` set to `416`(`0640` permissions).
`nginx-ingress-controller` image is updated to `v1.8.1` for Kubernetes`v1.24+` clusters.
The `.{source,target}ClientConnection.namespace` field has been renamed to `namespaces` and now takes a list of namespaces. The `.targetClientConnection.disableCachedClient` field has been removed.
The `Shoot` maintenance controller now updates the CRI of worker pools from `docker` to `containerd` when force-upgrading from Kubernetes `v1.22` to `v1.23`.
An issue causing several tasks from the Shoot reconciliation flow to fail with transient errors of type `duplicate filename in registry` is now fixed.
Bumped up the custom image version to v3.4.13-bootstrap-11
Introduce flag `metrics-scrape-wait-duration` to `etcdbrctl compact` command, that specifies a wait duration at the end of a snapshot compaction, to allow Prometheus to scrape metrics related to compaction before the `etcdbrctl` process exits.
A bug causing the crd generation for `druid.gardener.cloud` group to fail in extensions is now fixed.
The `DisablingScalingClassesForShoots` feature gate has been promoted to beta.
The `gardener-apiserver` now drops expired `Kubernetes` and `MachineImage` versions from `Cloudprofile`s during creation.
The component checklist is enhanced with 2 new rules for container images:
- Do not use container images from registries that don't support IPv6 - registries such as GHCR, ECR, MCR don't support image pulls over IPv6
- Do not use Shoot container images that are not multi-arch
Extensions running on seed clusters can get access to the garden cluster by using the injected kubeconfig specified by the `GARDEN_KUBECONFIG` environment variable. You can read about the details in this [doc](https://github.com/gardener/gardener/blob/master/docs/extensions/garden-api-access.md).
Update Kubernetes dependencies (especially `k8s.io/client-go`) from `v0.26.3` to `v0.26.4` to resolve panic on working with special shoots.
A bug has been fixed which was causing the garbage collector in `gardener-resource-manager` to wrongfully collect `Secret`s related to `ManagedResource`s when the source and the target cluster are equal.
A bug in the local development environment has been fixed which prevented admission of Gardener resources by extension webhooks.
Use `ginkgolinter` instead of self baked `gomegacheck`
Backup-restore waits for its etcd to be ready before attempting to update peerUrl
`Shoot`s allow to optionally configure a specific scheduler via `.spec.schedulerName`. The `default-scheduler` is used in case non is configured. Please note, that `Shoot`s will remain `Pending` in case a scheduler name is configured but an adequate scheduler is not available in the landscape.
The Plutono version has been updated from `v7.5.23` to `v7.5.24`.
Bump alpine base version for Docker build to `3.18.2`.
Adding Gardener-managed finalizers (e.g., `gardener` or `gardener.cloud/reference-protection`) to the `Shoot` on creation is now forbidden. 
Validation has been added for `spec.kubernetes.kubeAPIServer.runtimeConfig` field in the Shoot API. Disabling APIs marked as "Required" by gardener is not permitted.
A bug is fixed that prevented scraping the metrics of etcd in the shoot control plane.
⚠️ The deprecated fields `spec.settings.dependencyWatchdog.endpoint` and `spec.settings.dependencyWatchdog.probe` have been removed from the Seed API. Please check your `Seed`s and remove any usage before upgrading to this Gardener version.
Gardenlet can now set feature gates for `etcd-druid`. They can be specified via the gardenlet configuration `GardenletConfiguration.EtcdConfig.FeatureGates`
There is now a new script (`hack/check-skaffold-deps-for-binary.sh`) that can be used by gardener extensions to validate their skaffold ko dependencies.
The no longer required `--gardenlet-manages-mcm` option has been removed. All code in provider extensions related to management/deployment of `machine-controller-manager` should be removed.
The skaffold version is updated from v2.7.0 to v2.8.0.
Print build version and go runtime info.
`uncachedObjects` under pkg/client/kubernetes/options.go is now removed from Config struct which is used to set options for new ClientSets. Now the uncached objects can be directly set under `clientOptions.Cache.DisableFor` field.
Etcd snapshot compaction jobs will now be named `<etcd-name>-compactor` for better readability for human operators.
`github.com/gardener/gardener/pkg/utils/gardener.ShootAccessSecret` was renamed to `AccessSecret`.
`leader-election-resource-lock` flag is dropped and the leader-election resource-lock is hard coded to leases.
Add new flag `metrics-scrape-wait-duration` for compaction controller to set a wait duration at the end of every compaction job, to allow for metrics to be scraped by a Prometheus instance.
Status of `garden` now includes the `ObservabilityComponentsHealthy` condition which show the health of observability components in the garden runtime-cluster.
Custodian controller no longer watches leases owned by the etcd resources, thus reducing frequency of etcd status updates and now honouring `custodian-sync-period` value.
Gardener can now support clusters with Kubernetes version 1.28. In order to allow creation/update of 1.28 clusters you will have to update the version of your provider extension(s) to a version that supports 1.28 as well. Please consult the respective releases and notes in the provider extension's repository.
The following Golang dependencies have been updated:
- `k8s.io/*` from `v0.28.2` to `v0.28.3`
- `sigs.k8s.io/controller-runtime` from `v0.16.2` to `v0.16.3`
`gardener-resource-manager` now disables cache only for `Secrets` and `ConfigMap` if `DisableCachedClient` set to true.
APIServer validation allows updating to expired Kubernetes and machine image versions.
`pkg/utils/chart` does now support embedded charts. The already deprecated methods in the `ChartApplier` and `ChartRenderer` will be removed in a few releases, so extensions should adapt to embedded charts.
Add CVE categorization for etcd-backup-restore.
Developer Action Required: The `make deploy` command has been replaced with `make deploy-via-kustomize`. Please update your deployment workflows accordingly.
`custodian-sync-period` value is set to `15s` in the Helm chart for etcd-druid.
The shoot namespace in seeds is redeployed during the shoot migration flow to update the zones in use.
Several default settings of Kubernetes feature gates have been corrected.
The testmachinery tests now use `AdminKubeconfig` of the `Shoot`s of `ManagedSeed`s to create seed client.
`nginx-ingress-controller` image is updated to `v1.9.0`.
`gardener-operator` now takes over management of `gardener-metrics-exporter`.
The `alpha.kube-apiserver.scaling.shoot.gardener.cloud/class` annotation on `Shoot`s has no effect anymore and should be removed.
The `Secrets` type as well as the `Delete` functions for secrets were removed from `pkg/utils/managedresources/builder` since their usage was prone to errors. The higher level package `pkg/utils/managedresources` should be used instead.
Operators can now use the annotation `gardener.cloud/operation=rotate-observability-credentials` on the `garden` resource to rotate the observability credentials. 
A bug preventing `plutono` ingress to use `wildcard-certificate` is fixed.
Vendoring has been removed from the project, i.e., there is no `vendor` folder anymore.
The `extensions/pkg/controller/operatingsystemconfig/oscommon` package is deprecated and will be removed as soon as the `UseGardenerNodeAgent` feature gate has been promoted to GA. OS extension developers should start adapting to this new feature, see [documentation](https://github.com/gardener/gardener/blob/master/docs/extensions/operatingsystemconfig.md#what-needs-to-be-implemented-to-support-a-new-operating-system) and [example](https://github.com/gardener/gardener/tree/master/pkg/provider-local/controller/operatingsystemconfig) based on `provider-local`.
The `pkg/utils/gardener.IntStrPtrFromInt` function has been renamed to `IntStrPtrFromInt32` since `intstr.FromInt` is deprecated.
Update etcd-custom-image to `v3.4.26-2`.
Multiple expanders for `cluster-autoscaler` can now be specified in the `Shoot` API via the `.spec.kubernetes.clusterAutoscaler.expander` field.
Add an alert for VPNHAShootNoPods when shoot in HA (high availability) mode.
Upgrade gardener/gardener from `1.65.0` to `1.76.0`
Operators can now view and manage dashboards for compaction jobs running in shoot control plane.
Druid now exposes metrics related to snapshot compaction, on default port 8080. Please expose the desired metrics port via the etcd-druid service to allow metrics to be scraped by a Prometheus instance.
Fixed a bug that caused HVPA reconciliation to fail with `expected pointer, but got v2beta1.MetricSpec type` when the HPA spec had changed.
Update base image of `ingress-default-backend` to alpine:3.18.3
Etcd-backup-restore now uses a distroless image as its base image. It is no longer compatible with [etcd-custom-image](https://github.com/gardener/etcd-custom-image), and must be used with [etcd-wrapper](https://github.com/gardener/etcd-wrapper) instead. 
Following dependency has been updated:- 
- github.com/gardener/etcd-druid v0.18.1 -> v0.18.4
Updated to go v1.20.5
Bump `k8s.io/*` deps to v0.27.2
The two additional labels `worker.gardener.cloud/image-name` and `worker.gardener.cloud/image-version` that were previously introduced and attached to worker nodes are removed again to fix a regression that causes the `kubelet` to restart on nodes that are due to be upgraded to a new OS but not rolled yet which causes their `Pod`s to become temporarily unready.
The default `machine-safety-orphan-vms-period` has been reduced from 30m to 15m.
update client-go version and exclude the old one in go.mod
Feature gate `APIServerFastRollout` for `gardenlet` is introduced and enabled by default. When enabled, `maxSurge` for `kube-apiservers` of `Shoot`s is set to `100%`. 
 Fix an issue, where DNS lookups for non-existing pods of a StatefulSet yielded one of the existing pods even when it should not have. 
An issue has been fixed which caused CoreDNS to not rewrite CNAME values in DNS answers.
`kubectl get garden` now features additional printer columns providing more information about the substantial configuration values and statuses.
Add memory and cpu limits (maxAllowed) to Prometheus (H)VPAs.
The following images are updated:
- registry.k8s.io/metrics-server/metrics-server: v0.6.3 -> v0.6.4
- registry.k8s.io/cpa/cluster-proportional-autoscaler: v1.8.8 -> v1.8.9
- registry.k8s.io/coredns/coredns: v1.10.0 -> v1.10.1
- quay.io/prometheus/blackbox-exporter: v0.23.0 -> v0.24.0
- quay.io/prometheus/node-exporter: v1.5.0 -> v1.6.1
- ghcr.io/credativ/plutono: v7.5.22 -> v7.5.23
- ghcr.io/prometheus-operator/prometheus-config-reloader: v0.61.1 -> v0.67.1
- registry.k8s.io/dns/k8s-dns-node-cache: 1.22.20 -> 1.22.23
`nginx-ingress-controller` now enables annotation validation.
Added a new metric that will allow to get the number of stale (due to unhealthiness) machines  that are getting terminated
Update alpine base image version to 3.18.3.
A new feature gate named `ContainerdRegistryHostsDir` is introduced to gardenlet. When enabled, the `/etc/containerd/certs.d` directory is created on the Node and containerd is configured to look up for registries/mirrors configuration in this directory (if there is any configuration applied). In future, the [registry-cache extension](https://github.com/gardener/gardener-extension-registry-cache/) will add such registries/mirrors configuration under this directory (via OperatingSystemConfig mutation).
:warning: `etcd.Status.ClusterSize`, `etcd.Status.ServiceName`, `etcd.Status.UpdatedReplicas` have been marked as deprecated and users should refrain from depending on these fields.
`gardenlet` no longer reports the `Bootstrapped` condition on `Seed`s. Instead, it now reports the progress in `.status.lastOperation`, similar to how it's done for `Shoot`s.
When deploying this version of `gardener-operator`, make sure that you update your `Garden` resources with the new `.spec.virtualCluster.gardener.clusterIdentity` field. If you already have a `gardener-apiserver` deployment, make sure that the value matches the `--cluster-identity` flag of the current `gardener-apiserver` deployment.
Add `Care` reconciler to `Garden` controller in `gardener-operator`.
Etcd-related secrets will now be mounted onto the `/var/` directory instead of `/root/`.
It is possible now to trigger a seed reconciliation by annotating the Seed with `gardener.cloud/operation=reconcile`.
Update golang 1.20.4 -> 1.21.3
The `Secret` reconciler in `gardener-resource-manager` will now always remove its finalizer (if present).
Shoot fields `.spec.dns.providers[].domains` and `.spec.dns.providers[].zones` are now deprecated and expected to be removed in version `v1.87`. Please plan ahead to drop using those fields in extensions.
⚠️ Gardener does no longer support garden, seed, or shoot clusters with Kubernetes versions < 1.24. Make sure to upgrade all existing clusters before upgrading to this Gardener version.
Update golang base container image to 1.21.0.
The following mapper funcs from the extension library no longer accept a `context.Context` arg - `ClusterToContainerResourceMapper`, `ClusterToControlPlaneMapper`, `ClusterToDNSRecordMapper`, `ClusterToExtensionMapper`, `ClusterToInfrastructureMapper`, `ClusterToNetworkMapper`, `ClusterToWorkerMapper` and `ClusterToObjectMapper`. The `context.Context` arg was redundant and not used.
The `gardener-scheduler` now populates scheduling failure reasons to the `Shoot`'s `.status.lastOperation.description` field.
Prometheus scrape job configs for targets in the shoot cluster have been improved.
An issue causing the `etcd-backup` Secret to be wrongly deleted for a Shoot cluster due to stale BackupEntry deletion from a previous Shoot creation with the same name is now fixed.
`gardener-operator` now runs a new controller which protects `Secret`s and `ConfigMap`s with a finalizer in case they are referenced in `Garden` resources.
Added an example for `AdminKubeconfigRequest` via the Python Kubernetes client.
Gardener autoscaler now backs-off early from a node-group (i.e. machinedeployment) in case of `ResourceExhausted` error. Refer docs at `https://github.com/gardener/autoscaler/blob/machine-controller-manager-provider/cluster-autoscaler/FAQ.md#when-does-autoscaler-backs-off-early-from-a-node-group` for details.
A bug is fixed that rendered the "CPU usage" panel of the "VPN" Plutono dashboard blank.
A bug causing incorrect volume mount path for `Etcd`s and `EtcdCopyBackupsTask`s using `Local` snapshot storage provider while using distroless etcd-backup-restore image `v0.25.x` has been resolved.
When `Shoot`s were updated from non high-availability to `zone` high-availability, it could happen that the control-plane was scheduled to two instead of three zones. This issue is relevant for cloud providers with an inconsistent zone naming (`Azure` is currently the only candidate to our knowledge).
Existing shoots with the before mentioned problem must be fixed manually be operators if required. An automatic move of `etcd`s and their volumes is not part of this fix due to availability reasons.
The garbage collection controller now also considers managed resources when deciding if secrets/configmaps should be garbage collected.
Introduced `delta-snapshot-retention-period` CLI flag to extend the configurable retention period for delta snapshots in `etcd-backup-restore`, enhancing flexibility for backup retention.
Provider extensions must now pass the `cluster.Cluster` object for the garden cluster to the `genericactuator.NewActuator` function. See [this](https://github.com/gardener/gardener/blob/8d2f116aa606e5181cd430e5063dd798629bdc78/cmd/gardener-extension-provider-local/app/app.go#L228-L246) for an example how to create such a `cluster.Cluster` object.
Introduce DEPs (Druid Enhancement Proposals) for proposing large design changes in etcd-druid.
`kubectl proxy` now works as expected in the local development setup in conjunction with highly available vpn
Backupbucket/backupentry controllers: watch secret metadata only
machine-controller-manager RBAC in the Shoot cluster does now allow MCM to delete volumeattachments. MCM provider extensions vendoring machine-controller-manager >= v0.50.0 (ref https://github.com/gardener/machine-controller-manager/pull/839) need to delete volumeattachments.
`gardener-operator` now takes over management of `plutono`.
All default images are now present in `images.yaml`
The `charts/images.yaml` file was moved to `imagevector/images.yaml`.
Add Prometheus alert for pending seed pods
A bug causing `EtcdCopyBackupsTask` jobs to fail to create temp snapshot directory while using distroless etcd-backup-restore image `v0.25.x` has been resolved.
The deprecated `.spec.virtualCluster.dns.domain` field has been dropped from the `Garden` API. Make use of `.spec.virtualCluster.dns.domains`.
Etcd-druid will now deploy distroless `etcd-wrapper` and `etcd-backup-restore` images. Please refer to [etcd-wrapper](https://github.com/gardener/etcd-wrapper) for more information.
`gardener-operator` is now managing the Gardener control plane components (`gardener-{apiserver,admission-controller,controller-manager,scheduler}`).
The `node-local-dns` `ConfigMap` now has a label `k8s-app=node-local-dns` for identifying it.
Control plane components `kube-apiserver`, `kube-controller-manager` and `kube-scheduler` now run as `nonroot` user and group `65532`.
gardenlet: A regression causing metering related recording rules for the aggregate-prometheus not to be applied is now fixed.
File ownership for `var/etcd/data` will be changed to non-root user (65532).
Revendors the bbolt from `v1.3.6` to `v1.3.7`
metrics exposed by `cluster autoscaler` now scraped by `prometheus`
Concurrent empty machines bulk deletion can now be configured for `cluster-autoscaler` via the field `.spec.kubernetes.clusterAutoscaler.maxEmptyBulkDelete` in the `Shoot` API .
CloudProfiles allow configuring update strategies {patch, minor, major} for machine images that affect update behavior during auto and force update.
The `deltaSnapshotRetentionPeriod` parameter has been introduced in the `etcdConfig` section of the `GardenletConfiguration`. This new feature allows users to configure the retention period for delta snapshots in the ETCD component. By making the delta snapshot retention period configurable, we provide a more flexible debugging experience. Delta snapshots can now be retained for a user-defined duration, offering a valuable window for reviewing changes in case of any issues. 
A bug has been fixed which caused `ServiceAccount`s related to garden access secrets for extensions to leak in the seed namespace in the garden cluster after uninstallation of said extensions.
unit tests framework introduced to test implemented methods of `Cloudprovider` and `Nodegroup` interface
Add failure tolerance option to the `CreateShoot` test.
Removed `service.beta.kubernetes.io/aws-load-balancer-type: nlb` annotation from istio-ingressgateway service template. Set this annotation in seed configuration. Note: Changing load balancer type creates a new one, old one requires manual clean-up.
A bug was fixed which was causing existing `Bastion` resources on the garden cluster to not be deleted when `SSHAccess` is disabled on a Shoot cluster.
`UseEtcdWrapper` feature gate has been introduced to allow users to opt for the new [etcd-wrapper](https://github.com/gardener/etcd-wrapper) image.
It is now possible to annotate managed resources part of `ManagedResource` objects with `resources.gardener.cloud/finalize-deletion-after=<duration>`, e.g., `resources.gardener.cloud/finalize-deletion-after=1h`. After this time, `gardener-resource-manager` will forcefully delete the resource by removing their finalizers.
Gardener now allows to omit or to only partially define Kubernetes versions in `Shoot`s. The version will automatically be defaulted to the latest minor and/or patch version found in the linked `CloudProfile`.
The deprecated `core.gardener.cloud/apiserver-exposure` label and handling has been dropped.
Initial implementation for `Refresh()` method of `CloudProvider` interface done
The admission controllers of common provider extensions are automatically installed in the local extensions development setup
The `shoots/adminkubeconfig` relies on the `ca-client` `InternalSecret` only and does not use the `ShootState` object anymore.
Plutono is updated to v7.5.26.
Vali is updated to v2.2.11.
Kube-rbac-proxy is updated to v0.15.0.
The `pkg/utils/secrets` package now signs certificates with 3072 bit RSA keys.
Partial Shoot maintenance errors are now reported as events on the Shoot and in the Shoot's `LastMaintenance` status.
Test-machinery integration tests are now using upstream K8s e2e test images such as `registry.k8s.io/e2e-test-images/busybox`, `registry.k8s.io/e2e-test-images/agnhost` instead Gardener images such as `eu.gcr.io/gardener-project/3rd/busybox`, `eu.gcr.io/gardener-project/3rd/alpine` and others.
`gardener-operator` now refuses to start if operators attempt to downgrade or skip minor Gardener versions. Please see [this document](https://github.com/gardener/gardener/blob/master/docs/deployment/version_skew_policy.md) for more information.
`gardener-operator` no longer reports the `Reconciled` condition. Instead, it now reports the progress in `.status.lastOperation`, similar to how it's done for `Shoot`s.
Gardener now reports `node`s for which the `checksum/cloud-config-data` hasn't been populated yet. This could point towards an error on the node and that not all Gardener related configuration happened successfully.
Use cgroupv2 fix for local-setup on macOS too.
The `WorkerlessShoots` feature gate has been promoted to beta and is now turned on by default. Before deploying this Gardener version, make sure that all your registered extensions support this feature gate.
A bug has been fixed that prevented `ControllerInstallation`s from getting deleted when the backing `ControllerRegistration` with `.spec.deployment.policy={Always,AlwaysExceptNoShoots}` was deleted.
With this release the obervability compoents are updated to the latest release versions. Plutono is now at v2.5.25 and Vali is now at v2.2.9
It is now possible to configure `.spec.virtualCluster.gardener.gardenerAPIServer.auditWebhook` in the `Garden` API.
Added pod security enforce level `baseline` label to Istio-related namespaces. The `garden` and shoot namespaces have the `privileged` level. For extension namespaces, the new `security.gardener.cloud/pod-security-standard-enforce` annotation on  `ControllerRegistration` resources specifies the level. When set, the `extension` namespace is created with `pod-security.kubernetes.io/enforce` label set to `security.gardener.cloud/pod-security-standard-enforce`'s value.
`gardener-operator` now takes over management of `fluent-operator` and `vali`.
Alpine image used in init containers is now part of the IMAGEVECTOR_OVERWRITE
Shoot fields `.spec.dns.providers[].domains` and `.spec.dns.providers[].zones` are now deprecated and expected to be removed in version `v1.87`. Please use the extensions' configuration to configure providers with this ability.
Update alpine base image version to 3.18.4.
Add CVE categorization for etcd-backup-restore.
`default-domain`, `internal-domain`, `alerting` and `openvpn-diffie-hellman` secrets are removed from `gardener-controlplane` Helm chart. Please ensure to update them in a different way before upgrading Gardener. If you would like to prevent Helm from deleting these secret during the upgrade, you could annotate them with `"helm.sh/resource-policy": keep`.
The GA-ed `DisableScalingClassesForShoots` feature gate has been removed.
A bug has been fixed which was allowing users to specify an extension of the same type in `.spec.extensions[].type` more than once in the `Shoot` API.
The deprecated `ChartRenderer.Render` and `ChartApplier.{Apply,Delete}` methods have been dropped. Use `ChartRendere.RenderEmbeddedFS` and `ChartApplier.{Apply,Delete}FromEmbeddedFS` instead.
`gardener-operator` maintains the two most recent `generic-token-kubeconfig` secrets in the runtime-cluster. In addition the latest secret name is published to the `garden` resource in `.metadata.annotations[generic-token-kubeconfig.secret.gardener.cloud/name]`. Third-party components referring to this secret should check this annotation value after a credentials or CA rotation for the virtual-garden cluster took place.
Go version is updated to 1.20.6.
Machine scale-up delay for new pods can now be configured for `cluster-autoscaler` via the field `.spec.kubernetes.clusterAutoscaler.newPodScaleupDelay` in the `Shoot` API .
Update Prometheus job `tunnel-probe-apiserver-proxy` to fix for HA VPN mode
Update to golang v1.21
Updated go to 1.20.7
A bug causing unnecessary reorder of extension in `Shoot` `spec.extensions` is fixed.
Gardener can now support clusters with Kubernetes version 1.28. Extension developers have to prepare individual extensions as well to work with 1.28.
New `Secret`s referenced in `ManagedResource`s will no longer be patched with the label `resources.gardener.cloud/garbage-collectable-reference` when the `ManagedResource` is reconciled. `Secret`s which already exist in the `ManagedResource` specification will still be patched if necessary.
The `eu.gcr.io/gardener-project/gardener/autoscaler/cluster-autoscaler` image has been updated from `v1.26.2` to `v1.27.0` (for Kubernetes `>= 1.27`).
Removed apiserver-proxy pod webhook as it is now included in Gardener Resource Manager.
The `.status.lastOperation` in `core.gardener.cloud/v1beta1.Seed` and `operator.gardener.cloud/v1alpha1.Garden` resources is now only updated each `5s` during a reconciliation. Previously, it was updated immediately when a task was finished.

The following images 

Release notes were shortened since they exceeded the maximum length allowed for a pull request body. The remaining release notes will be added as comments to this PR.
gardener-robot-ci-3 commented 11 months ago

are updated:

gardener-robot commented 11 months ago

@gardener-robot-ci-3 Thank you for your contribution.