opendatahub-io / opendatahub-operator

Open Data Hub operator to manage ODH component integrations
https://opendatahub.io
Apache License 2.0
59 stars 127 forks source link

fix: do not reconcile trusted bundle on ns deletion #1095

Closed bartoszmajsak closed 2 months ago

bartoszmajsak commented 2 months ago

Description

When the namespace managed by the operator is deleted, the reconcile is triggered trying to inject trusted CA bundle back to it, even though the namespace does not exist at this point (or is being terminated).

This is due to the default DeleteFunc being used that triggers reconcile.

In this commit the predicate has been changed to ignore all Delete events when handling Namespace events. It also enhances the check for injection to ensure it is only handling active namespaces. In addition, the redundant default DeleteFunc has been removed from ConfigMapChangedPredicate.

Without this patch reconcile errors similar to these can be observed:

{
  "level": "error",
  "ts": "2024-07-03T09:39:36Z",
  "msg": "Reconciler error",
  "controller": "cert-configmap-generator-controller",
  "namespace": "opendatahub-auth-provider",
  "name": "odh-trusted-ca-bundle",
  "reconcileID": "d931966e-4ddc-49d3-a24b-09e581b0d91e",
  "error": "error getting namespace to inject trustedCA bundle: Namespace \"opendatahub-auth-provider\" not found",
  "errorVerbose": "Namespace \"opendatahub-auth-provider\" not found\nerror getting namespace to inject trustedCA bundle",
  "stacktrace": "sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).reconcileHandler\n\t/opt/app-root/src/go/pkg/mod/sigs.k8s.io/controller-runtime@v0.14.6/pkg/internal/controller/controller.go:329\nsigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).processNextWorkItem\n\t/opt/app-root/src/go/pkg/mod/sigs.k8s.io/controller-runtime@v0.14.6/pkg/internal/controller/controller.go:274\nsigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).Start.func2.2\n\t/opt/app-root/src/go/pkg/mod/sigs.k8s.io/controller-runtime@v0.14.6/pkg/internal/controller/controller.go:235"
}
{
  "level": "error",
  "ts": "2024-07-03T10:04:41Z",
  "msg": "Reconciler error",
  "controller": "cert-configmap-generator-controller",
  "namespace": "opendatahub-auth-provider",
  "name": "odh-trusted-ca-bundle",
  "reconcileID": "b5e98460-1143-407c-a011-486bb0045b77",
  "error": "configmaps \"odh-trusted-ca-bundle\" is forbidden: unable to create new content in namespace opendatahub-auth-provider because it is being terminated",
  "stacktrace": "sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).reconcileHandler\n\t/opt/app-root/src/go/pkg/mod/sigs.k8s.io/controller-runtime@v0.14.6/pkg/internal/controller/controller.go:329\nsigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).processNextWorkItem\n\t/opt/app-root/src/go/pkg/mod/sigs.k8s.io/controller-runtime@v0.14.6/pkg/internal/controller/controller.go:274\nsigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).Start.func2.2\n\t/opt/app-root/src/go/pkg/mod/sigs.k8s.io/controller-runtime@v0.14.6/pkg/internal/controller/controller.go:235"
}

Related issues:

Follow-up for #900 PR.

How Has This Been Tested?

[!IMPORTANT] I cannot find any existing test cases where I could add some new ones capturing this buggy behaviour. Are there any?

Screenshot or short clip

Merge criteria

bartoszmajsak commented 2 months ago

@VaishnaviHire @zdtsw How should we proceed with JIRA here? Re-open issues.redhat.com/browse/RHOAIENG-4233 or new one is needed?

zdtsw commented 2 months ago

@VaishnaviHire @zdtsw How should we proceed with JIRA here? Re-open issues.redhat.com/browse/RHOAIENG-4233 or new one is needed?

we can create another jira for this PR i read the original jira description again, and i did not think about the reconcile part at that time.

we can give a bit more detail in the new jira if QE would like to verify it

bartoszmajsak commented 2 months ago

@VaishnaviHire @zdtsw How should we proceed with JIRA here? Re-open issues.redhat.com/browse/RHOAIENG-4233 or new one is needed?

we can create another jira for this PR i read the original jira description again, and i did not think about the reconcile part at that time.

we can give a bit more detail in the new jira if QE would like to verify it

That's done now https://issues.redhat.com/browse/RHOAIENG-9431. I will let you set priorities and other fields as you wish. I hope the description is precise enough.

zdtsw commented 2 months ago

/LGTM leave for @VaishnaviHire for another round of check.

bartoszmajsak commented 2 months ago

/retest

openshift-ci[bot] commented 2 months ago

New changes are detected. LGTM label has been removed.

openshift-ci[bot] commented 2 months ago

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by: Once this PR has been reviewed and has the lgtm label, please ask for approval from zdtsw. For more information see the Kubernetes Code Review Process.

The full list of commands accepted by this bot can be found here.

Needs approval from an approver in each of these files: - **[OWNERS](https://github.com/opendatahub-io/opendatahub-operator/blob/incubation/OWNERS)** Approvers can indicate their approval by writing `/approve` in a comment Approvers can cancel approval by writing `/approve cancel` in a comment
bartoszmajsak commented 2 months ago

@VaishnaviHire that is fixed now, thanks for reviewing

bartoszmajsak commented 2 months ago

/retest