Closed: tmshort closed this 1 month ago
Attention: Patch coverage is 56.66667% with 13 lines in your changes missing coverage. Please review.
Project coverage is 77.93%. Comparing base (9f0c6a9) to head (928e1ba).
Files | Patch % | Lines |
---|---|---|
internal/controllers/common_controller.go | 0.00% | 8 Missing :warning: |
...nternal/controllers/clusterextension_controller.go | 77.27% | 5 Missing :warning: |
@m1kola (GitHub doesn't support comment threads), I do believe we need to distinguish between resolution errors, which likely involve errors in the `ClusterExtension` resource, and other errors that can occur on the cluster. In the first case, the user may need to fix it, and can likely do so themselves. In the other case, it may be a transient error on the cluster, or something else that the user cannot handle.
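One way to make that distinction without inspecting error strings is a dedicated error type checked with `errors.As`. The following is a minimal sketch, not the code in this PR: `ResolutionError`, `userFixable`, and the sample error builders are hypothetical names introduced only for illustration.

```go
package main

import (
	"errors"
	"fmt"
)

// ResolutionError is a hypothetical type marking failures caused by the
// ClusterExtension spec itself (for example, a package that cannot be found).
type ResolutionError struct {
	PackageName string
	Reason      string
}

func (e *ResolutionError) Error() string {
	return fmt.Sprintf("cannot resolve package %q: %s", e.PackageName, e.Reason)
}

// userFixable reports whether err, or any error it wraps, is a *ResolutionError.
func userFixable(err error) bool {
	var resErr *ResolutionError
	return errors.As(err, &resErr)
}

// wrappedResolutionError builds a sample spec-related error with extra context.
func wrappedResolutionError() error {
	return fmt.Errorf("reconcile failed: %w",
		&ResolutionError{PackageName: "example-operator", Reason: "no matching bundle"})
}

// clusterError builds a sample error that the user cannot fix in the spec.
func clusterError() error {
	return errors.New("connection refused")
}

func main() {
	for _, err := range []error{wrappedResolutionError(), clusterError()} {
		fmt.Printf("%v -> user fixable: %t\n", err, userFixable(err))
	}
}
```

Because `errors.As` walks the wrap chain, callers can add context with `%w` without breaking the classification.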
PR needs rebase.
Name | Link |
---|---|
Latest commit | 928e1baeb73a5948332d1da0a7ee1c5bbaa62099 |
Latest deploy log | https://app.netlify.com/sites/olmv1/deploys/665f2fbc41879700082961cb |
Deploy Preview | https://deploy-preview-878--olmv1.netlify.app |
@tmshort `handleResolutionErrors` only handles resolution errors (only errors returned from `r.resolve(ctx, *ext)`).
How I see it:
For errors in the `ClusterExtension` spec (e.g. we can't find a package), we want to make sure that the error is very human readable and appears on the condition, so that a human can take action based on the error message. In all these cases we can fail the resolution condition and write the error into the condition message. I don't think that setting the reason to `InstallationStatusUnknown` aids UX.
There is a chance that I'm missing something (e.g. maybe you expect programmatic clients to take advantage of a distinct combination of reasons and take some action?). Happy to discuss it further here or jump on a quick call if you like.
@ankitathomas's PR is super relevant to the discussion of temporal/transient vs permanent. https://github.com/operator-framework/operator-controller/pull/842
I think the `Progressing` condition that we've discussed should make it much easier for us to tell users "we're going to try again" vs "we're not trying again, there is something you need to fix".
@m1kola
> `handleResolutionErrors` only handles resolution errors (only errors returned from `r.resolve(ctx, *ext)`).
Non-resolution errors (e.g. timeouts) can occur in `r.resolve()`, and thus any kind of error is handled by `handleResolutionErrors`. Would it make more sense to remove the function (since it's a lot smaller now) and put the functionality into the main function?
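To illustrate the point about timeouts, here is a small sketch of how a timeout surfacing from a resolve call can be told apart from other failures with `errors.Is`. The `resolve` function below is a hypothetical stand-in for `r.resolve(ctx, *ext)`, not the real implementation, and the helper names are assumptions.

```go
package main

import (
	"context"
	"errors"
	"fmt"
	"time"
)

// resolve is a hypothetical stand-in for r.resolve(ctx, *ext): it waits for
// simulated work and returns the wrapped context error if ctx ends first.
func resolve(ctx context.Context, work time.Duration) error {
	select {
	case <-time.After(work):
		return nil
	case <-ctx.Done():
		return fmt.Errorf("resolving bundle: %w", ctx.Err())
	}
}

// isTimeout reports whether err wraps context.DeadlineExceeded.
func isTimeout(err error) bool {
	return errors.Is(err, context.DeadlineExceeded)
}

// timedOutResolve runs resolve with a deadline far shorter than the work.
func timedOutResolve() error {
	ctx, cancel := context.WithTimeout(context.Background(), 5*time.Millisecond)
	defer cancel()
	return resolve(ctx, time.Second)
}

// fastResolve runs resolve with a deadline far longer than the work.
func fastResolve() error {
	ctx, cancel := context.WithTimeout(context.Background(), time.Second)
	defer cancel()
	return resolve(ctx, time.Millisecond)
}

func main() {
	err := timedOutResolve()
	fmt.Printf("err=%v timeout=%t\n", err, isTimeout(err))
	fmt.Printf("fast err=%v\n", fastResolve())
}
```

A check like `isTimeout` works the same whether it lives in a separate handler or inline in the reconcile function, so it does not by itself settle where the handling should go.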
This is just trying to clean up error processing. It's not meant to clean up the status conditions (though it has to do something with them). It looks as though #842 changed its title, so it's doing something different now; issue #880 is meant to clean up the status conditions. The hope is that we can have smaller changes (which is why I'm trying to keep this small).
This PR is focused on removing the string checks for error types, and all the aggregated errors that were being used - issues noted in the big Helm PR #846. It was not meant to fix all the problems with status conditions.
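As a rough illustration of why string checks on errors are fragile, the sketch below compares a substring match with an `errors.Is` check against a sentinel. `ErrPackageNotFound` and the helper functions are hypothetical and are not taken from the operator-controller code base.

```go
package main

import (
	"errors"
	"fmt"
	"strings"
)

// ErrPackageNotFound is a hypothetical sentinel error used for illustration.
var ErrPackageNotFound = errors.New("package not found")

// isNotFoundByString is the fragile style: it depends on the message text.
func isNotFoundByString(err error) bool {
	return err != nil && strings.Contains(err.Error(), "package not found")
}

// isNotFound is the sturdier style: it follows the wrap chain with errors.Is.
func isNotFound(err error) bool {
	return errors.Is(err, ErrPackageNotFound)
}

// lookalike returns an unrelated error whose message contains the same text.
func lookalike() error {
	return errors.New("webhook rejected request: package not found in allowlist")
}

// wrapped returns the sentinel wrapped with extra context.
func wrapped() error {
	return fmt.Errorf("resolving %q: %w", "example-operator", ErrPackageNotFound)
}

func main() {
	// The string check misfires on the lookalike; errors.Is does not.
	fmt.Println(isNotFoundByString(lookalike()), isNotFound(lookalike())) // true false
	// Both checks accept the wrapped sentinel.
	fmt.Println(isNotFoundByString(wrapped()), isNotFound(wrapped())) // true true
}
```

The string check also breaks silently if the message is ever reworded, whereas the sentinel comparison keeps working.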