SovereignCloudStack / cluster-stack-operator

The SCS Cluster Stack Operator takes care of lifecycle management, configuration, and provider-specific tasks of Kubernetes clusters created with SCS Cluster Stacks.
https://scs.community/
Apache License 2.0

BeforeClusterUpgrade hook blocks Cluster upgrades #259

Closed by chess-knight 1 month ago

chess-knight commented 2 months ago

/kind bug

What steps did you take and what happened: I upgraded the cluster by changing the topology.class and topology.version fields in the Cluster object. Cluster API does not start the upgrade process; instead, I see the following in the CAPI logs:

Calling all extensions of hook "BeforeClusterUpgrade"
Calling extension handler "before-cluster-upgrade.cso-hook-server-extensionconfig"
extension handler returned blocking response with retryAfterSeconds of 10
Cluster upgrade to version "v1.28.13" is blocked by "BeforeClusterUpgrade" hook

and CSO logs:

"file":"extension/hooks.go:56","message":"DoBeforeClusterUpgrade is called"

What did you expect to happen: Successful upgrade of the cluster.

Anything else you would like to add: I do not use multi-stage addons. The ClusterAddon CR is READY, and its .spec already contains the upgraded clusterStack and version fields. Also, the caddon is missing the HelmChartApplied condition in .status.conditions.

janiskemper commented 2 months ago

I think the controller uses the cluster stack assets as the basis for deciding whether there are multi-stage cluster addons or not. Apparently it concluded here that there are. There might be something off with the folder structure of your clusteraddon/ folder or with the clusteraddon.yaml.

chess-knight commented 2 months ago

> I think the controller uses the cluster stack assets as the basis for deciding whether there are multi-stage cluster addons or not. Apparently it concluded here that there are. There might be something off with the folder structure of your clusteraddon/ folder or with the clusteraddon.yaml.

I am using the https://github.com/chess-knight/cluster-stacks/releases/tag/openstack-scs-1-28-v2 release and trying to upgrade to https://github.com/chess-knight/cluster-stacks/releases/tag/openstack-scs-1-28-v3. These releases are based on the https://github.com/SovereignCloudStack/cluster-stacks/tree/release-1.28/providers/openstack/scs structure, so IMO these are not multi-stage cluster addons. I created them as follows:

csctl create providers/openstack/scs/ -m custom --cluster-stack-version v2 --cluster-addon-version v2 --node-image-version v1
csctl create providers/openstack/scs/ -m custom --cluster-stack-version v3 --cluster-addon-version v3 --node-image-version v2

janiskemper commented 2 months ago

tbh I also don't see the reason right now. Did you find the place in the code where it is decided whether we follow the multi-stage path or not? I don't see why it makes the wrong decision in your case.

chess-knight commented 2 months ago

> tbh I also don't see the reason right now. Did you find the place in the code where it is decided whether we follow the multi-stage path or not? I don't see why it makes the wrong decision in your case.

I can see that the BeforeClusterUpgrade hook https://github.com/SovereignCloudStack/cluster-stack-operator/blob/v0.1.0-alpha.7/extension/hooks.go#L54 is also called for legacy cluster addons. I also see that something removes the "HelmChartApplied" condition, and that's why it fails. But I don't know whether it is the hook itself or the caddon controller.
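
To make the suspected mechanism concrete, here is a minimal Go sketch of a blocking hook that keeps returning a retry until a condition is present. It is not the code from extension/hooks.go: the types `Condition` and `HookResponse` and the function `beforeClusterUpgrade` are hypothetical stand-ins, and the assumption that the hook gates on "HelmChartApplied" comes only from the observation above. The 10 second retry value is taken from the CAPI log line.

```go
package main

import "fmt"

// Condition is a hypothetical, simplified status condition (type + status).
type Condition struct {
	Type   string
	Status string // "True", "False" or "Unknown"
}

// HookResponse is a hypothetical stand-in for a blocking hook response.
// A non-zero RetryAfterSeconds means "not yet, call the hook again later".
type HookResponse struct {
	RetryAfterSeconds int32
	Message           string
}

// retrySeconds matches the value seen in the CAPI logs of this issue.
const retrySeconds int32 = 10

// isConditionTrue reports whether a condition of the given type exists
// and has status "True". A missing condition counts as not true.
func isConditionTrue(conds []Condition, condType string) bool {
	for _, c := range conds {
		if c.Type == condType {
			return c.Status == "True"
		}
	}
	return false
}

// beforeClusterUpgrade blocks the upgrade (assumed behavior) as long as
// the "HelmChartApplied" condition is missing or not "True".
func beforeClusterUpgrade(conds []Condition) HookResponse {
	if !isConditionTrue(conds, "HelmChartApplied") {
		return HookResponse{
			RetryAfterSeconds: retrySeconds,
			Message:           "HelmChartApplied is not true yet",
		}
	}
	return HookResponse{} // zero RetryAfterSeconds: upgrade may proceed
}

func main() {
	// Ready, but HelmChartApplied is missing, as described in this issue.
	missing := []Condition{{Type: "Ready", Status: "True"}}
	applied := append(missing, Condition{Type: "HelmChartApplied", Status: "True"})

	fmt.Println(beforeClusterUpgrade(missing).RetryAfterSeconds) // 10
	fmt.Println(beforeClusterUpgrade(applied).RetryAfterSeconds) // 0
}
```

Under these assumptions, a ClusterAddon that is READY but has lost the HelmChartApplied condition would be blocked on every call, which would match the repeated "blocked by BeforeClusterUpgrade hook" messages.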

janiskemper commented 2 months ago

Sorry, I was on the wrong track here. The hooks are of course always called! The question of multi-stage or not is irrelevant for the hooks; it is just a question of how many Helm charts there are and what format they have.

chess-knight commented 2 months ago

> Sorry, I was on the wrong track here. The hooks are of course always called! The question of multi-stage or not is irrelevant for the hooks; it is just a question of how many Helm charts there are and what format they have.

Yeah, there is one Helm chart with 5 dependencies, and all of them are successfully applied to the workload cluster.

janiskemper commented 2 months ago

So does it work or not?

chess-knight commented 2 months ago

As I said, the Kubernetes upgrade is blocked by the BeforeClusterUpgrade hook.

janiskemper commented 2 months ago

So what does the ClusterAddon object look like? The status should show what's going on, as should the events, logs, etc.

chess-knight commented 1 month ago

This is strange, I cannot reproduce it. Upgrade just works!

janiskemper commented 1 month ago

That's great to know! Maybe you saw the above logs and thought that whatever problem you had came from them. However, as I realized later, they shouldn't be a problem for you at all. I guess something else was causing the problem. The hooks should just work!

Closing this because I think there are no issues anymore.