strimzi / strimzi-kafka-operator

Apache Kafka® running on Kubernetes
https://strimzi.io/
Apache License 2.0

[Enhancement]: Support single step multi version downgrade for Zookeeper based clusters #10801

Open MichaelMorrisEst opened 2 weeks ago

MichaelMorrisEst commented 2 weeks ago

Related problem

No response

Suggested solution

Strimzi supports multi-version single-step upgrades for ZooKeeper-based clusters. Downgrades, however, must be handled step by step, ensuring that the 'to' Strimzi version in each step supports the current Kafka version. During a downgrade, the 'to' version of the operator attempts to read the Kafka information for the 'from' Kafka version. When the 'from' Kafka version is not supported by the 'to' Strimzi version, an error is thrown because the version is unknown. Information such as the ZooKeeper version, inter-broker protocol version, and log message format version is read from kafka-versions.yaml when a KafkaVersion object is created to represent the 'from' Kafka version, and the error is generated because this information cannot be found for the unknown version.

However, while this information is important for the 'to' version, and for the 'from' version in an upgrade, it is not important for the 'from' version in a downgrade. It is therefore possible to support multi-version single-step downgrade simply by adding handling that does not throw an exception when the 'from' version is unknown in a downgrade. The only place any of this information for the 'from' version is used in a downgrade is a log message that reports whether ZooKeeper needs to be downgraded. While this might be a useful log message when the information is available, I don't think it is important enough to be a reason for not supporting multi-version single-step downgrade.

I propose introducing the necessary handling for an unknown version encountered during a downgrade, in order to enable single-step multi-version downgrade. I will submit a PR with a proposed implementation to demonstrate the impacts I have identified.
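
To illustrate the idea, here is a minimal sketch, not the actual Strimzi code. The names `DowngradeSketch`, `VersionInfo` and `planChange` are hypothetical, and the version table is illustrative data rather than a copy of kafka-versions.yaml. It assumes the 'to' version must always be known, the 'from' version must be known for an upgrade, and an unknown 'from' version is tolerated in a downgrade, where it only affects the ZooKeeper log message.

```java
import java.util.Map;
import java.util.Optional;

class DowngradeSketch {
    // Hypothetical stand-in for one entry of kafka-versions.yaml.
    static final class VersionInfo {
        final String protocolVersion;
        final String messageFormatVersion;
        final String zookeeperVersion;

        VersionInfo(String protocolVersion, String messageFormatVersion, String zookeeperVersion) {
            this.protocolVersion = protocolVersion;
            this.messageFormatVersion = messageFormatVersion;
            this.zookeeperVersion = zookeeperVersion;
        }
    }

    // Illustrative data: the Kafka versions known to the 'to' operator (assumed values).
    static final Map<String, VersionInfo> KNOWN = Map.of(
            "3.7.0", new VersionInfo("3.7", "3.7", "3.8.3"),
            "3.8.0", new VersionInfo("3.8", "3.8", "3.8.4"));

    // Compares dotted numeric versions such as "3.8.0" component by component.
    static int compare(String a, String b) {
        String[] x = a.split("\\.");
        String[] y = b.split("\\.");
        for (int i = 0; i < Math.max(x.length, y.length); i++) {
            int xi = i < x.length ? Integer.parseInt(x[i]) : 0;
            int yi = i < y.length ? Integer.parseInt(y[i]) : 0;
            if (xi != yi) {
                return Integer.compare(xi, yi);
            }
        }
        return 0;
    }

    // Describes the version change; throws only where the missing data is really needed.
    static String planChange(String from, String to) {
        VersionInfo target = KNOWN.get(to);
        if (target == null) {
            throw new IllegalArgumentException("Unsupported target Kafka version " + to);
        }
        Optional<VersionInfo> source = Optional.ofNullable(KNOWN.get(from));
        int cmp = compare(from, to);
        if (cmp == 0) {
            return "no change";
        }
        if (cmp < 0) {
            // Upgrade: the 'from' version details are required, so unknown is an error.
            source.orElseThrow(() ->
                    new IllegalArgumentException("Unsupported source Kafka version " + from));
            return "upgrade " + from + " -> " + to;
        }
        // Downgrade: an unknown 'from' version only degrades the ZooKeeper log message.
        String zk = source
                .map(s -> s.zookeeperVersion.equals(target.zookeeperVersion)
                        ? "ZooKeeper unchanged"
                        : "ZooKeeper " + s.zookeeperVersion + " -> " + target.zookeeperVersion)
                .orElse("source ZooKeeper version unknown");
        return "downgrade " + from + " -> " + to + " (" + zk + ")";
    }

    public static void main(String[] args) {
        System.out.println(planChange("3.8.0", "3.7.0")); // known 'from' version
        System.out.println(planChange("3.9.0", "3.7.0")); // unknown 'from' version
    }
}
```

In this sketch a downgrade from the unknown version 3.9.0 no longer fails. It only loses the detail in the ZooKeeper message, which is the trade-off described above.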

Alternatives

No response

Additional context

No response

scholzj commented 2 weeks ago

I think this needs to be carefully considered and probably needs a proposal. It opens a wide range of possible issues because the old versions might not understand what needs to be done to downgrade the unknown versions.

fvaleri commented 2 weeks ago

I also think this is tricky and requires a proposal. The message format could change between releases, even if this is unlikely at this point, but most importantly we should handle IBP and feature downgrade (KIP-584, KIP-1022). Currently, KRaft metadata.version cannot be downgraded, as it is still evolving (read point 3 here: https://kafka.apache.org/documentation/#upgrade_380_kraft). Maybe this would be easier in Kafka 4.x, but who knows.

im-konge commented 1 week ago

Triaged on 14.11.2024: @MichaelMorrisEst we agreed that this needs a proposal. Could you please look into writing one before proceeding with the PR?

MichaelMorrisEst commented 1 week ago

Yes, no problem, I will write and submit a proposal.

scholzj commented 1 week ago

I wasn't able to join the community call because of KubeCon. But ZooKeeper will be removed soon, so is this still worth it? Will we be able to get it into Strimzi 0.45 (as ZooKeeper will be gone right afterwards)?

scholzj commented 1 week ago

Also ... should this be handled properly and include KRaft based clusters?

scholzj commented 1 week ago

Actually, it is even worse ... the change will at best be introduced only in Strimzi 0.45. But there will be no downgrade of ZooKeeper-based clusters anymore, because there will be no Kafka 3.10 and ZooKeeper support will be dropped in Strimzi 0.46. So I think this is pointless for ZooKeeper-based clusters at this point and should be considered only for KRaft-based clusters?