opendatahub-io / data-science-pipelines-operator

Apache License 2.0
13 stars 54 forks

handle the missing mlmd service-ca cert gracefully #728

Open HumairAK opened 1 month ago

HumairAK commented 1 month ago

This PR handles the following DSPO log message more gracefully:

2024-10-18T16:25:29-04:00       INFO    Applying ML-Metadata (MLMD) Resources   {"namespace": "dspa2", "dspa_name": "sample"}
2024-10-18T16:25:29-04:00       INFO    Updating components endpoints   {"namespace": "dspa2", "dspa_name": "sample"}
2024-10-18T16:25:29-04:00       ERROR   Reconciler error        {"controller": "datasciencepipelinesapplication", "controllerGroup": "datasciencepipelinesapplications.opendatahub.io", "controllerKind": "DataSciencePipelinesApplication", "DataSciencePipelinesApplication": {"name":"sample","namespace":"dspa2"}, "namespace": "dspa2", "name": "sample", "reconcileID": "0a19a4b2-fd85-4094-8be8-c7a57eddd4c3", "error": "secret containing the certificate for MLMD gRPC Server was not created yet"}
sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).reconcileHandler
        /home/hukhan/go/1.21.3/pkg/mod/sigs.k8s.io/controller-runtime@v0.15.0/pkg/internal/controller/controller.go:324
sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).processNextWorkItem
        /home/hukhan/go/1.21.3/pkg/mod/sigs.k8s.io/controller-runtime@v0.15.0/pkg/internal/controller/controller.go:265
sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).Start.func2.2
        /home/hukhan/go/1.21.3/pkg/mod/sigs.k8s.io/controller-runtime@v0.15.0/pkg/internal/controller/controller.go:226

This is an expected state and should not print a stack trace, but we still want stack traces for other errors, so I've introduced a custom error for this scenario. In the future we should handle this via component states (for example "state: loading" or "state: waitingOnDependency") instead of errors. That should be approached consistently for all components; this PR is a simple fix until then.

Result:

2024-10-18T16:29:32-04:00       INFO    Applying ML-Metadata (MLMD) Resources   {"namespace": "dspa2", "dspa_name": "sample"}
2024-10-18T16:29:32-04:00       INFO    MLMD gRPC Server cert secret not found, this is likely because it has not been created yet      {"namespace": "dspa2", "dspa_name": "sample"}
2024-10-18T16:29:32-04:00       INFO    Updating components endpoints   {"namespace": "dspa2", "dspa_name": "sample"}

openshift-ci[bot] commented 1 month ago

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by:

Once this PR has been reviewed and has the lgtm label, please ask for approval from humairak. For more information see the Kubernetes Code Review Process.

The full list of commands accepted by this bot can be found here.

Needs approval from an approver in each of these files:

- **[OWNERS](https://github.com/opendatahub-io/data-science-pipelines-operator/blob/main/OWNERS)**

Approvers can indicate their approval by writing `/approve` in a comment

Approvers can cancel approval by writing `/approve cancel` in a comment

dsp-developers commented 1 month ago

Change to PR detected. A new PR build was completed. A new image has been built to help with testing out this PR: quay.io/opendatahub/data-science-pipelines-operator:pr-728

VaniHaripriya commented 1 month ago

/lgtm

openshift-merge-robot commented 1 month ago

PR needs rebase.

Instructions for interacting with me using PR comments are available [here](https://git.k8s.io/community/contributors/guide/pull-requests.md). If you have questions or suggestions related to my behavior, please file an issue against the [kubernetes-sigs/prow](https://github.com/kubernetes-sigs/prow/issues/new?title=Prow%20issue:) repository.

diegolovison commented 3 weeks ago

Is it possible to have a test for this?