scylladb / scylla-operator

The Kubernetes Operator for ScyllaDB
https://operator.docs.scylladb.com/
Apache License 2.0

NodeConfig needs to aggregate status conditions so they can be queried #1557

Closed · tnozicka closed this issue 1 week ago

tnozicka commented 10 months ago

What should the feature do?

There needs to be a way to assess whether a NodeConfig has succeeded, but we only have conditions like RaidControllerNodeubuntu-2204Degraded, where the node name is variable and impractical/impossible to query.

What is the use case behind this feature?

Visibility and debuggability. When, say, a RAID setup or a mount fails, it needs to fail at the correct place, where we wait for the NodeConfig to apply, and not 5 commands later when the manager or the ScyllaCluster breaks.
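
For illustration, a minimal sketch of how a caller could gate on such an aggregated condition, assuming the conditions are exposed as []metav1.Condition (the actual NodeConfig status types may differ); isHealthy is a hypothetical helper:

    import (
        "k8s.io/apimachinery/pkg/api/meta"
        metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
    )

    // isHealthy is a hypothetical helper: the NodeConfig has finished applying
    // successfully once the aggregated Degraded condition is present and
    // explicitly false, no matter which nodes the NodeConfig matched.
    func isHealthy(conditions []metav1.Condition) bool {
        degraded := meta.FindStatusCondition(conditions, "Degraded")
        return degraded != nil && degraded.Status == metav1.ConditionFalse
    }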

Anything else we need to know?

No response

tnozicka commented 10 months ago

Ideally, add a test with a broken NodeConfig: a) on RAID, b) on mounts, so we know the final conditions also work.

tnozicka commented 7 months ago

When we have the aggregated conditions, we should update the printers as well, see #1730.

rzetelskik commented 4 months ago

Ideally, add a test with a broken NodeConfig: a) on RAID, b) on mounts, so we know the final conditions also work.

@tnozicka do you know any simple reproducers of the two cases?

tnozicka commented 4 months ago

I suppose a) referencing a nonexistent or broken device, b) mounting over an existing mount.
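
For illustration, a sketch of what such reproducers could look like in a NodeConfig spec. This assumes the v1alpha1 localDiskSetup API; the field names are approximated from pkg/api/scylla/v1alpha1/types_nodeconfig.go and the device names and mount point are made up:

    import scyllav1alpha1 "github.com/scylladb/scylla-operator/pkg/api/scylla/v1alpha1"

    localDiskSetup := &scyllav1alpha1.LocalDiskSetup{
        // a) RAID0 array whose device filter matches nothing on the node.
        RAIDs: []scyllav1alpha1.RAIDConfiguration{{
            Name: "broken-raid",
            Type: scyllav1alpha1.RAID0Type,
            RAID0: &scyllav1alpha1.RAID0Options{
                Devices: scyllav1alpha1.DeviceDiscovery{
                    NameRegex: "^/dev/does-not-exist$",
                },
            },
        }},
        // b) mount over a path that is already an active mount point.
        Mounts: []scyllav1alpha1.MountConfiguration{{
            Device:     "/dev/md/broken-raid",
            MountPoint: "/var/lib",
            FSType:     "xfs",
        }},
    }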

rzetelskik commented 4 months ago

@tnozicka I need some insight into the intended semantics of the current conditions.

  1. There's NodeConfigReconciledConditionType now: https://github.com/scylladb/scylla-operator/blob/c3e79162e1d6f24f2eb392bb1d26ab81c0876426/pkg/api/scylla/v1alpha1/types_nodeconfig.go#L29. What is its exact meaning? The comment says Reconciled indicates that the NodeConfig is fully deployed and available - what does "fully deployed" mean exactly? The status calculation (https://github.com/scylladb/scylla-operator/blob/c3e79162e1d6f24f2eb392bb1d26ab81c0876426/pkg/controller/nodeconfig/status.go#L19) suggests that it refers to the required DaemonSets being deployed - I assume that with DaemonSet controller conditions this will no longer be needed?
  2. What does the Available condition mean in the context of NodeConfig? It's not really intuitive, and I don't see any of the controllers setting it in the current implementation.
  3. Should the aggregated node conditions be taken into account when aggregating the general conditions? In other words, should the aggregated NodeConfig conditions also reflect conditions of particular node setups?

tnozicka commented 4 months ago

The status calculation suggests that it refers to the required DaemonSets being deployed

Updated + available.

I assume that with DaemonSet controller conditions this will no longer be needed? What does the Available condition mean in the context of NodeConfig? It's not really intuitive, and I don't see any of the controllers setting it in the current implementation.

The thing with NodeConfig is that it technically may not have a long-running workload to call it "available", so I chose "Reconciled" initially, before we had the generic sync functions or the other patterns. I suppose we can use Available here as well, in a broader sense. I guess implementation-wise we'll always have to have something long-running to make it level-based, unless Kubernetes exposes those through an API.

Should the aggregated node conditions be taken into account when aggregating the general conditions? In other words, should the aggregated NodeConfig conditions also reflect conditions of particular node setups?

Can you elaborate here or give me an example? I'm not sure I got the point.

rzetelskik commented 4 months ago

I suppose we can use Available here as well, in a broader sense.

Wouldn't Progressing make more sense? It seems better aligned with ScyllaCluster and StatefulSets, for example.

Can you elaborate here or give me an example? I'm not sure I got the point.

For example, should RaidControllerNodeubuntu-2204Degraded be aggregated into the general Degraded? Same for Available/Progressing.

tnozicka commented 4 months ago

Wouldn't Progressing make more sense? It seems better aligned with ScyllaCluster and StatefulSets, for example.

Progressing and Available are a given. I thought we were talking Reconciled vs. Available.

For example, should RaidControllerNodeubuntu-2204Degraded be aggregated into the general Degraded?

RaidControllerNodeubuntu-2204Degraded => RaidControllerDegraded => Degraded

But there is also the aggregation per node. It's all a bit weird since some conditions come from the workload, but for aggregation purposes, I'd just do:

*Available => Available
*Progressing => Progressing
*Degraded => Degraded
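
A minimal sketch of this wildcard roll-up idea, assuming []metav1.Condition; anyConditionWithSuffixTrue is a hypothetical name (the real suffix-matching helpers live in the operator's helper packages):

    import (
        "strings"

        metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
    )

    // anyConditionWithSuffixTrue captures the essence of the wildcard roll-up:
    // if any condition whose type ends in the given suffix (e.g. "Degraded")
    // is true, the aggregated condition of that suffix should be true as well.
    func anyConditionWithSuffixTrue(conditions []metav1.Condition, suffix string) bool {
        for _, c := range conditions {
            if strings.HasSuffix(c.Type, suffix) && c.Status == metav1.ConditionTrue {
                return true
            }
        }
        return false
    }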

rzetelskik commented 4 months ago

Thanks for all the responses so far.

RaidControllerNodeubuntu-2204Degraded => RaidControllerDegraded => Degraded

With how this is implemented now, I was thinking this should be more of a RaidControllerNodeubuntu-2204Degraded => Nodeubuntu-2204Degraded => Degraded flow. Because the node conditions are aggregated per node now (RaidControllerNodeubuntu-2204Degraded => Nodeubuntu-2204Degraded), a more natural approach to me would be to aggregate all the NodeConfig controller conditions (like DaemonSetControllerDegraded, which are not set at all now) plus all the node aggregates (Nodeubuntu-2204Degraded etc.). So the NodeSetup controller would do the per-node aggregation, and the NodeConfig controller would do the top-level aggregation. Otherwise the NodeConfig controller would have to aggregate partial conditions coming from independent controllers.

tnozicka commented 4 months ago

I don't find the "mid" aggregations particularly useful (and there are a lot of options), so I'd likely only generically aggregate *Degraded => Degraded and leave the rest until we need it. The only aggregation case I see as useful at this point is to reasonably wait for it to be successfully done, which is covered by the wildcard aggregation without any intermediate steps.

rzetelskik commented 4 months ago

I don't find the "mid" aggregations particularly useful (and there are a lot of options)

I think it's useful because the NodeConfig controller has to read these conditions from the status, since they come from different controllers (NodeSetup), so if they're not there, we have to assume a worst-case default. I guess it's easier to do it per matching node than per matching node AND per each controller of NodeSetup. But OK, I understand the conclusion is that the intermediate conditions are an implementation detail.

*Degraded => Degraded

I don't think I follow what this is about.

tnozicka commented 4 months ago

I was thinking about just taking the existing conditions on the status and doing

    // Collect every condition whose type ends in "Degraded" and fold it into
    // a single top-level Degraded condition, defaulting to the happy path
    // when none of the matched conditions is degraded.
    degradedCondition, err := AggregateStatusConditions(
        FindStatusConditionsWithSuffix(*allConditions, "Degraded"),
        metav1.Condition{
            Type:               "Degraded",
            Status:             metav1.ConditionFalse,
            Reason:             internalapi.AsExpectedReason,
            Message:            "",
            ObservedGeneration: generation,
        },
    )
    if err != nil {
        return fmt.Errorf("can't aggregate degraded conditions: %w", err)
    }

Probably fixing the logic to check observed generations, since the conditions come from different places.
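
A hedged sketch of such a generation check, again assuming []metav1.Condition; conditionsCurrent is a hypothetical name:

    import metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"

    // conditionsCurrent treats a condition written for an older generation of
    // the object as stale; since the node-scoped conditions are reported
    // asynchronously by separate controllers, they can lag behind the latest
    // spec.
    func conditionsCurrent(conditions []metav1.Condition, generation int64) bool {
        for _, c := range conditions {
            if c.ObservedGeneration < generation {
                return false
            }
        }
        return true
    }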

scylla-operator-bot[bot] commented 2 months ago

The Scylla Operator project currently lacks enough contributors to adequately respond to all issues.

This bot triages un-triaged issues according to the following rules:

  - After 30d of inactivity, lifecycle/stale is applied
  - After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
  - After 30d of inactivity since lifecycle/rotten was applied, the issue is closed

You can:

  - Mark this issue as fresh with /remove-lifecycle stale
  - Close this issue with /close
  - Offer to help out with Issue Triage

/lifecycle stale

rzetelskik commented 2 months ago

/remove-lifecycle stale
/triage accepted

tnozicka commented 2 weeks ago

We also need a way of figuring out whether all the asynchronous statuses have been reported already (from the node setup pods).
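
A hedged sketch of one possible completeness check, reusing the thread's per-node condition naming; allNodesReported and the way the matching nodes are obtained are hypothetical:

    import (
        "k8s.io/apimachinery/pkg/api/meta"
        metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
    )

    // allNodesReported returns false until every node matched by the NodeConfig
    // has its per-node aggregate condition present, i.e. the node setup pod on
    // that node has reported its status at least once.
    func allNodesReported(matchingNodes []string, conditions []metav1.Condition) bool {
        for _, nodeName := range matchingNodes {
            if meta.FindStatusCondition(conditions, "Node"+nodeName+"Degraded") == nil {
                return false
            }
        }
        return true
    }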