thoth-station / meteor-operator

Project Meteor Operator for OpenShift
GNU General Public License v3.0
4 stars 11 forks source link

CNBi conditions are not declarative #130

Open VannTen opened 1 year ago

VannTen commented 1 year ago

Problem statement

Many of the conditions defined in api/v1alpha/cnbi_conditions_consts.go describe state changes instead of state. This goes against the recommendations in the kubernetes API conventions relative to conditions, particularly:

Condition type names should describe the current observed state of the resource, rather than describing the current state transitions.

Take a look at controllers/cnbi/controller.go:

meta.SetStatusCondition(&cnbi.Status.Conditions, metav1.Condition{
    Type:    meteorv1alpha1.ErrorPipelineRunCreate,
    Status:  metav1.ConditionTrue,
    Reason:  "PipelineRunCreateFailed",
    Message: err.Error(),
})
...
meta.SetStatusCondition(&cnbi.Status.Conditions, metav1.Condition{
    Type:    meteorv1alpha1.PipelineRunCreated,
    Status:  metav1.ConditionTrue,
    Reason:  "PipelineRunCreated",
    Message: fmt.Sprintf("%s PipelineRun created successfully", pipelineRun.Name),
...
meta.SetStatusCondition(&cnbi.Status.Conditions, metav1.Condition{
    Type:    meteorv1alpha1.GenericPipelineError,
    Status:  metav1.ConditionTrue,
    Reason:  "PipelineRunGenericError",
    Message: err.Error()},
)
...
meta.SetStatusCondition(&cnbi.Status.Conditions, metav1.Condition{
    Type:    meteorv1alpha1.PipelineRunCompleted,
    Status:  metav1.ConditionTrue,
    Reason:  "PipelineRunCompleted",
    Message: "The PipelineRun has been completed successfully.",
})
...
meta.SetStatusCondition(&cnbi.Status.Conditions, metav1.Condition{
    Type:    meteorv1alpha1.ImageImportReady,
    Status:  metav1.ConditionTrue,
    Reason:  "ImageImportReady",
    Message: "Import succeeded, the image is ready to be used.",
})
...
meta.SetStatusCondition(&cnbi.Status.Conditions, metav1.Condition{
    Type:    meteorv1alpha1.PipelineRunCompleted,
    Status:  metav1.ConditionTrue,
    Reason:  "PipelineRunCompleted",
    Message: "The PipelineRun has been completed with a failure!",
})
...
meta.SetStatusCondition(&cnbi.Status.Conditions, metav1.Condition{
    Type:    meteorv1alpha1.ImageImportReady,
    Status:  metav1.ConditionFalse,
    Reason:  "ImageImportNotReady",
    Message: "Import failed, this could be due to the repository to import from does not exist or is not accessible",
...
meta.SetStatusCondition(&cnbi.Status.Conditions, metav1.Condition{
    Type:    meteorv1alpha1.ErrorBuildingImage,
    Status:  metav1.ConditionTrue,
    Reason:  "ErrorBuildingImage",
    Message: "Build failed!",

(this is the code in #129, but it only changes the syntax, not the conditions name or the reasons) For nearly all the conditions, we use a Reason which is the same that the condition type; I think our Reasons are fine (describe state transitions), but our conditions types are not.

High-level Goals

Be more Kubernetes-like when implementing the Conditions.

Proposal description

List of possible alternative conditions:

BaseImageAvailable
ImageAvailable
ImageIsValid
ThothAdviseScheduled (see K8s API conventions on long state transitions)
ThothAdviseAvailable

Additional context

126 is relevant; tackling the too together might be a good idea.

Consider in particular the following scenario:

  1. A CNBi has been created, the ImageStream has been created and the image built and pushed.
  2. The ImageStream get deleted. We should propagate that change, and none of the existing conditions fits. (we should also re-launch a pipeline to reconcile the state).

Acceptance Criteria

References

/wg cnbi /priority important-longterm /group-programming

codificat commented 1 year ago

Related: https://github.com/thoth-station/meteor-operator/issues/103

codificat commented 1 year ago

A requirement here is to make sure that the odh-dashboard works with the changes made to the conditions

VannTen commented 1 year ago

One point what's not clear to me and can impact the way we design the set of conditions:

What should be the relation to the Shower/Meteor resources ? If I'm correct, with CRE we handle the build part, and the run part should stay in those resources ? Or should we handle those as "subresources" and hides them behind the CRE ?

In particular, I think this should impact whether or not we have a Running condition for in the CRE status.

wdyt @goern @codificat ? (this might be a good point to discuss on CNBi meeting Wednesday.

goern commented 1 year ago

Meteor CRD & friend should stay for compatibility, cre does not need to take control over them, if the meteor build can be handled by CRE it could (prio--)

From my pov meteor and cre is pretty disjoint and decoupled

Does this help? In line with other minds?

VannTen commented 1 year ago

On Tue, Jan 10, 2023 at 02:12:04AM -0800, Christoph Görn wrote:

Meteor CRD & friend should stay for compatibility, cre does not need to take control over them, if the meteor build can be handled by CRE it could (prio--)

From my pov meteor and cre is pretty disjoint and decoupled

Does this help? In line with other minds?

My understanding of the Meteor resources is that, given a predefined image, it uses it to present the notebook to the user, correct ?

If so, inside the CRE, do we only need to care about building, and just leave the running part exactly the same, handled by the Meteor resource ?

TL;DR: