rabbitmq / rabbitmq-server

Open source RabbitMQ: core server and tier 1 (built-in) plugins
https://www.rabbitmq.com/

Feature flags need quality of life improvements #9677

Closed by dumbbell 1 month ago

dumbbell commented 1 year ago

Why

Since the introduction of the first required feature flags, upgrades have become more painful for users who did not pay attention to the state of their feature flags. Things like:

There is room for improvement in the current subsystem and I would like to follow several routes:

  1. better communicate from RabbitMQ itself that users have to enable feature flags
  2. make changes to the subsystem to handle common situations that are problematic today
  3. prevent foot-shooting when upgrading RabbitMQ using our packages (Debian and RPM)
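
As a rough illustration of the third route, a package pre-upgrade step could refuse to proceed while any feature flag is still disabled. The following is only a sketch in Python: it assumes the flag listing is available as tab-separated `name`/`state` lines, and the `disabled_flags` helper and the sample flag names are hypothetical, not part of RabbitMQ or its packages.

```python
def disabled_flags(listing: str) -> list[str]:
    """Return the names of feature flags whose state is not 'enabled'."""
    disabled = []
    for line in listing.splitlines():
        parts = line.split("\t")
        # Skip blank lines, banners and the column header.
        if len(parts) != 2 or parts == ["name", "state"]:
            continue
        name, state = (p.strip() for p in parts)
        if state != "enabled":
            disabled.append(name)
    return disabled


# Hypothetical listing; real flag names and output layout may differ.
sample = "name\tstate\nflag_a\tenabled\nflag_b\tdisabled\n"
print(disabled_flags(sample))
```

A pre-upgrade script could abort with a clear message whenever the returned list is non-empty, instead of letting the new version fail to start.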

How

Here is a list of improvements that I plan to make:

wast commented 1 year ago

I'd like to add my 2 cents. There's an error when using 1 replica in a RabbitMQ Cluster Kubernetes operator: "Feature flags: refuse to enable feature flags while clustered nodes are missing, stopped or unreachable"

dumbbell commented 1 year ago

> I'd like to add my 2 cents. There's an error when using 1 replica in a RabbitMQ Cluster Kubernetes operator: "Feature flags: refuse to enable feature flags while clustered nodes are missing, stopped or unreachable"

All RabbitMQ nodes in a cluster need to run before a feature flag can be enabled. Could you please expand on your use case?

wast commented 1 year ago

"All nodes" in my scenario is 1 single node (as defined in yaml: replicas: 1), so why is it expecting more?

michaelklishin commented 1 year ago

@wast please start a separate GitHub Discussion. We will not let well-defined issues be turned into open-ended discussions and troubleshooting sessions.

michaelklishin commented 1 year ago

> "All nodes" in my scenario is 1 single node (as defined in yaml: replicas: 1), so why is it expecting more?

Most likely because there were more nodes in the cluster at some point and existing nodes still have knowledge of their prior peers. The Cluster Operator does not support shrinking the cluster, at least not in all cases, IIRC. There is a certain workaround but in general, shrinking member count should not be considered a supported operation.
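
To make the reasoning concrete, here is a toy model in Python of the rule described in this thread: a feature flag can only be enabled when every node the cluster knows about is running. This is a sketch and not RabbitMQ's actual implementation; the `can_enable_feature_flag` function and the node names are hypothetical.

```python
def can_enable_feature_flag(known_members: set[str], running_members: set[str]) -> bool:
    """A flag may be enabled only if every known cluster member is running."""
    return known_members <= running_members


# One replica is running, but it still remembers two former peers
# (hypothetical node names).
known = {"rabbit@server-0", "rabbit@server-1", "rabbit@server-2"}
running = {"rabbit@server-0"}

print(can_enable_feature_flag(known, running))    # stale peers block the flag
print(can_enable_feature_flag(running, running))  # a true single-node cluster
```

In this model the single running replica is refused because its membership record still lists peers that no longer exist, which would match the error quoted above.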

This is a topic for a separate discussion. This issue has a well-defined and specific scope.

dumbbell commented 1 month ago

This issue was "converted" to a GitHub project: https://github.com/orgs/rabbitmq/projects/4