opendatahub-io / opendatahub-operator

Open Data Hub operator to manage ODH component integrations
https://opendatahub.io
Apache License 2.0

fix: ensures SMCP is created before any other features #1118

Closed bartoszmajsak closed 1 month ago

bartoszmajsak commented 1 month ago

Description

With the #1052 refactoring, the order in which features are added to the Registry was accidentally changed. As a result, the metrics collection feature fails: it expects the SMCP to be created first, but the creation now runs afterwards. The setup is eventually consistent, because the reconcile will retry, so this is not a bug per se, but it results in unnecessary errors.

This fix ensures features are ordered as before and leverages `.EnabledWhen` instead of wrapping features in `if`s.
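To illustrate why the registration order matters and what a predicate-style guard buys over wrapping registrations in `if`s, here is a minimal, self-contained sketch. It is not the operator's code: `cluster`, `feature`, `registry`, `enabledWhen`, `applyAll`, `smcpCreation` and `metricsCollection` are hypothetical stand-ins, and the real `.EnabledWhen` very likely has a different signature. Only the two feature names are taken from the logs in this PR.

```go
package main

import (
	"errors"
	"fmt"
)

// cluster is a hypothetical stand-in for cluster state.
type cluster struct{ smcpCreated bool }

// feature is a simplified, hypothetical model of an operator feature.
type feature struct {
	name    string
	enabled func() bool            // nil means always enabled
	apply   func(c *cluster) error // performs the feature's work
}

// enabledWhen returns a copy of the feature guarded by the given predicate.
func (f feature) enabledWhen(pred func() bool) feature {
	f.enabled = pred
	return f
}

// registry applies features strictly in the order they were added.
type registry struct{ features []feature }

func newRegistry(fs ...feature) *registry { return &registry{features: fs} }

// applyAll runs every enabled feature once and reports which ones
// succeeded and which ones failed on this pass.
func (r *registry) applyAll(c *cluster) (applied, failed []string) {
	for _, f := range r.features {
		if f.enabled != nil && !f.enabled() {
			continue // skipped, but its slot in the ordering is unchanged
		}
		if err := f.apply(c); err != nil {
			failed = append(failed, f.name)
			continue
		}
		applied = append(applied, f.name)
	}
	return applied, failed
}

// smcpCreation models the feature that creates the control plane.
func smcpCreation() feature {
	return feature{
		name:  "mesh-control-plane-creation",
		apply: func(c *cluster) error { c.smcpCreated = true; return nil },
	}
}

// metricsCollection models a feature that requires the control plane to exist.
func metricsCollection(enabled bool) feature {
	return feature{
		name: "mesh-metrics-collection",
		apply: func(c *cluster) error {
			if !c.smcpCreated {
				return errors.New("service mesh control plane not found")
			}
			return nil
		},
	}.enabledWhen(func() bool { return enabled })
}

func main() {
	// Accidental order: metrics runs before the SMCP exists, so the first
	// pass reports a failure and only the retry (second pass) is clean.
	wrong := newRegistry(metricsCollection(true), smcpCreation())
	c := &cluster{}
	for pass := 1; pass <= 2; pass++ {
		applied, failed := wrong.applyAll(c)
		fmt.Printf("wrong order, pass %d: applied=%v failed=%v\n", pass, applied, failed)
	}

	// Restored order: SMCP first, so a single pass succeeds.
	applied, failed := newRegistry(smcpCreation(), metricsCollection(true)).applyAll(&cluster{})
	fmt.Printf("fixed order: applied=%v failed=%v\n", applied, failed)

	// Predicate false: the feature stays registered but is skipped.
	applied, failed = newRegistry(smcpCreation(), metricsCollection(false)).applyAll(&cluster{})
	fmt.Printf("metrics disabled: applied=%v failed=%v\n", applied, failed)
}
```

In this sketch the guard lives on the feature itself, so the list of registrations reads top to bottom in the order it will be applied, and toggling a feature cannot silently reorder the others.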

How Has This Been Tested?

BEFORE: Reconcile log

```json
[
  { "level": "info", "ts": "2024-07-12T07:52:53Z", "logger": "features", "msg": "waiting for control plane components to be ready", "feature": "mesh-metrics-collection", "control-plane": "data-science-smcp", "namespace": "istio-system", "duration (s)": 300 },
  { "level": "error", "ts": "2024-07-12T07:52:55Z", "logger": "features", "msg": "failed waiting for control plane being ready", "feature": "mesh-metrics-collection", "control-plane": "data-science-smcp", "namespace": "istio-system", "error": "failed to find Service Mesh Control Plane: servicemeshcontrolplanes.maistra.io \"data-science-smcp\" not found", "stacktrace": "github.com/opendatahub-io/opendatahub-operator/v2/pkg/feature/servicemesh.EnsureServiceMeshInstalled\n\t/workspace/pkg/feature/servicemesh/conditions.go:55\ngithub.com/opendatahub-io/opendatahub-operator/v2/pkg/feature.(*Feature).applyFeature\n\t/workspace/pkg/feature/feature.go:110\ngithub.com/opendatahub-io/opendatahub-operator/v2/pkg/feature.(*Feature).Apply\n\t/workspace/pkg/feature/feature.go:93\ngithub.com/opendatahub-io/opendatahub-operator/v2/pkg/feature.(*FeaturesHandler).Apply\n\t/workspace/pkg/feature/handler.go:66\ngithub.com/opendatahub-io/opendatahub-operator/v2/pkg/feature.HandlerWithReporter[...].Apply\n\t/workspace/pkg/feature/handler.go:138\ngithub.com/opendatahub-io/opendatahub-operator/v2/controllers/dscinitialization.(*DSCInitializationReconciler).configureServiceMesh\n\t/workspace/controllers/dscinitialization/servicemesh_setup.go:41\ngithub.com/opendatahub-io/opendatahub-operator/v2/controllers/dscinitialization.(*DSCInitializationReconciler).Reconcile\n\t/workspace/controllers/dscinitialization/dscinitialization_controller.go:267\nsigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).Reconcile\n\t/opt/app-root/src/go/pkg/mod/sigs.k8s.io/controller-runtime@v0.14.6/pkg/internal/controller/controller.go:122\nsigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).reconcileHandler\n\t/opt/app-root/src/go/pkg/mod/sigs.k8s.io/controller-runtime@v0.14.6/pkg/internal/controller/controller.go:323\nsigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).processNextWorkItem\n\t/opt/app-root/src/go/pkg/mod/sigs.k8s.io/controller-runtime@v0.14.6/pkg/internal/controller/controller.go:274\nsigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).Start.func2.2\n\t/opt/app-root/src/go/pkg/mod/sigs.k8s.io/controller-runtime@v0.14.6/pkg/internal/controller/controller.go:235" },
  { "level": "info", "ts": "2024-07-12T07:52:55Z", "logger": "features", "msg": "waiting for pods to become ready", "feature": "mesh-control-plane-creation", "namespace": "istio-system", "duration (s)": 300 },
  { "level": "info", "ts": "2024-07-12T07:53:21Z", "logger": "features", "msg": "done waiting for pods to become ready", "feature": "mesh-control-plane-creation", "namespace": "istio-system" },
  { "level": "error", "ts": "2024-07-12T07:53:21Z", "logger": "opendatahub.controllers.DSCInitialization", "msg": "failed applying service mesh resources", "error": "1 error occurred:\n\t* failed applying FeatureHandler features. cause: 1 error occurred:\n\t* 2 errors occurred:\n\t* failed to find Service Mesh Control Plane: servicemeshcontrolplanes.maistra.io \"data-science-smcp\" not found\n\t* service mesh control plane is not ready\n\n\n\n\n\n", "stacktrace": "github.com/opendatahub-io/opendatahub-operator/v2/controllers/dscinitialization.(*DSCInitializationReconciler).configureServiceMesh\n\t/workspace/controllers/dscinitialization/servicemesh_setup.go:43\ngithub.com/opendatahub-io/opendatahub-operator/v2/controllers/dscinitialization.(*DSCInitializationReconciler).Reconcile\n\t/workspace/controllers/dscinitialization/dscinitialization_controller.go:267\nsigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).Reconcile\n\t/opt/app-root/src/go/pkg/mod/sigs.k8s.io/controller-runtime@v0.14.6/pkg/internal/controller/controller.go:122\nsigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).reconcileHandler\n\t/opt/app-root/src/go/pkg/mod/sigs.k8s.io/controller-runtime@v0.14.6/pkg/internal/controller/controller.go:323\nsigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).processNextWorkItem\n\t/opt/app-root/src/go/pkg/mod/sigs.k8s.io/controller-runtime@v0.14.6/pkg/internal/controller/controller.go:274\nsigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).Start.func2.2\n\t/opt/app-root/src/go/pkg/mod/sigs.k8s.io/controller-runtime@v0.14.6/pkg/internal/controller/controller.go:235" },
  { "level": "error", "ts": "2024-07-12T07:53:21Z", "msg": "Reconciler error", "controller": "dscinitialization", "controllerGroup": "dscinitialization.opendatahub.io", "controllerKind": "DSCInitialization", "DSCInitialization": { "name": "default-dsci" }, "namespace": "", "name": "default-dsci", "reconcileID": "929c88d0-cec4-4466-b844-3e3cb288ddd2", "error": "1 error occurred:\n\t* failed applying FeatureHandler features. cause: 1 error occurred:\n\t* 2 errors occurred:\n\t* failed to find Service Mesh Control Plane: servicemeshcontrolplanes.maistra.io \"data-science-smcp\" not found\n\t* service mesh control plane is not ready\n\n\n\n\n\n", "stacktrace": "sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).reconcileHandler\n\t/opt/app-root/src/go/pkg/mod/sigs.k8s.io/controller-runtime@v0.14.6/pkg/internal/controller/controller.go:329\nsigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).processNextWorkItem\n\t/opt/app-root/src/go/pkg/mod/sigs.k8s.io/controller-runtime@v0.14.6/pkg/internal/controller/controller.go:274\nsigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).Start.func2.2\n\t/opt/app-root/src/go/pkg/mod/sigs.k8s.io/controller-runtime@v0.14.6/pkg/internal/controller/controller.go:235" },
  { "level": "info", "ts": "2024-07-12T07:53:21Z", "logger": "features", "msg": "waiting for control plane components to be ready", "feature": "mesh-metrics-collection", "control-plane": "data-science-smcp", "namespace": "istio-system", "duration (s)": 300 },
  { "level": "info", "ts": "2024-07-12T07:53:23Z", "logger": "features", "msg": "done waiting for control plane components to be ready", "feature": "mesh-metrics-collection", "control-plane": "data-science-smcp", "namespace": "istio-system" },
  { "level": "info", "ts": "2024-07-12T07:53:23Z", "logger": "features", "msg": "waiting for pods to become ready", "feature": "mesh-control-plane-creation", "namespace": "istio-system", "duration (s)": 300 },
  { "level": "info", "ts": "2024-07-12T07:53:25Z", "logger": "features", "msg": "done waiting for pods to become ready", "feature": "mesh-control-plane-creation", "namespace": "istio-system" }
]
```

AFTER: Reconcile log

```json
[
  { "level": "info", "ts": "2024-07-12T10:11:48+02:00", "logger": "features", "msg": "waiting for pods to become ready", "feature": "mesh-control-plane-creation", "namespace": "istio-system", "duration (s)": 300 },
  { "level": "info", "ts": "2024-07-12T10:11:54+02:00", "logger": "features", "msg": "done waiting for pods to become ready", "feature": "mesh-control-plane-creation", "namespace": "istio-system" },
  { "level": "info", "ts": "2024-07-12T10:11:56+02:00", "logger": "features", "msg": "waiting for control plane components to be ready", "feature": "mesh-metrics-collection", "control-plane": "data-science-smcp", "namespace": "istio-system", "duration (s)": 300 },
  { "level": "info", "ts": "2024-07-12T10:11:58+02:00", "logger": "features", "msg": "done waiting for control plane components to be ready", "feature": "mesh-metrics-collection", "control-plane": "data-science-smcp", "namespace": "istio-system" },
  { "level": "info", "ts": "2024-07-12T10:12:03+02:00", "logger": "features", "msg": "waiting for control plane components to be ready", "feature": "mesh-control-plane-external-authz", "control-plane": "data-science-smcp", "namespace": "istio-system", "duration (s)": 300 },
  { "level": "info", "ts": "2024-07-12T10:12:06+02:00", "logger": "features", "msg": "done waiting for control plane components to be ready", "feature": "mesh-control-plane-external-authz", "control-plane": "data-science-smcp", "namespace": "istio-system" }
]
```

Screenshot or short clip

Merge criteria

openshift-ci[bot] commented 1 month ago

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: zdtsw

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

- ~~[OWNERS](https://github.com/opendatahub-io/opendatahub-operator/blob/incubation/OWNERS)~~ [zdtsw]

Approvers can indicate their approval by writing `/approve` in a comment.
Approvers can cancel approval by writing `/approve cancel` in a comment.

zdtsw commented 1 month ago

/test opendatahub-operator-e2e