bitnami / charts

[bitnami/rabbitmq] Cluster doesn't work. Management UI shows only one node. #28133

Closed matsudayoji closed 2 months ago

matsudayoji commented 4 months ago

Name and Version

bitnami/rabbitmq 14.5.0

What architecture are you using?

None

What steps will reproduce the bug?

Clean install (services, PVCs, etc. removed beforehand). After deploying to k8s with the values.yaml below, only one node is displayed in the management UI. The displayed node alternates between rabbitmq-0 and rabbitmq-1. Chart 12.5.6 shows the same behaviour.
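
A minimal sketch of how the symptom could be checked outside the browser, assuming the management API is reachable (for example through the ingress or a port-forward to 15672) and that `/api/nodes` returns one JSON object per cluster member. The helper names and sample payloads are illustrative and not part of the chart. If every poll returns a single node but the name differs between polls, that would be consistent with two separate single-node brokers behind one Service rather than one two-node cluster.

```python
import base64
import json
import urllib.request


def node_names(api_nodes_json: str) -> list[str]:
    """Return the sorted node names from a /api/nodes response body."""
    return sorted(node["name"] for node in json.loads(api_nodes_json))


def fetch_nodes(base_url: str, user: str, password: str) -> list[str]:
    """Query the management API once (base_url is an assumption, e.g. a port-forward)."""
    token = base64.b64encode(f"{user}:{password}".encode()).decode()
    request = urllib.request.Request(
        f"{base_url}/api/nodes",
        headers={"Authorization": f"Basic {token}"},
    )
    with urllib.request.urlopen(request, timeout=10) as response:
        return node_names(response.read().decode())


def looks_split(polls: list[list[str]]) -> bool:
    """True if every poll saw exactly one node but the polls disagree on which one."""
    single = all(len(names) == 1 for names in polls)
    distinct = {names[0] for names in polls if names}
    return single and len(distinct) > 1
```

Calling `fetch_nodes` several times against the same URL and passing the collected results to `looks_split` would reproduce the "changes from rabbitmq-0 to rabbitmq-1 and back" observation in a scriptable form.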

Are you using any custom parameters or values?

values.yaml

replicaCount: 2

auth:
  user:
    username: my_user
    password: my_user
  admin:
    username: admin
    password: admin
  erlangCookie: 9TVM5GXANJ6XG4S9N55X
  password: "null"

rbac:
  create: false

clustering:
  enabled: true
  rebalance: true
  forceBoot: false

clusterDomain: my-k8s-domain.local

serviceAccount:
  create: true
  name: my-rabbit

loadDefinition:
  enabled: true
  existingSecret: my-test-rmq-load-definition

extraConfiguration: |-
  load_definitions = /app/load_definition.json
  prometheus.return_per_object_metrics = true
  vm_memory_high_watermark.relative = 0.7
  total_memory_available_override_value = 1GB

extraSecrets:
  my-test-rmq-load-definition:
    load_definition.json: |
      {
        "users": [
          {
            "name": "{{.Values.auth.user.username}}",
            "password": "{{.Values.auth.user.password}}",
            "tags": ""
          },
          {
            "name": "{{.Values.auth.admin.username}}",
            "password": "{{.Values.auth.admin.password}}",
            "tags": "administrator"
          }
        ],
        "vhosts": [
          {
            "name": "/"
          }
        ],
        "permissions": [
          {
            "user": "{{.Values.auth.user.username}}",
            "vhost": "/",
            "configure": ".*",
            "write": ".*",
            "read": ".*"
          },
          {
            "user": "{{.Values.auth.admin.username}}",
            "vhost": "/",
            "configure": ".*",
            "write": ".*",
            "read": ".*"
          }
        ],
        "policies": [
          {
            "name": "HA",
            "pattern": ".*",
            "vhost": "/",
            "definition": {
              "ha-mode": "all",
              "ha-sync-mode": "automatic"
            }
          }
        ]
      }

metrics:
  enabled: true

  serviceMonitor:
    enabled: true
    interval: 10s

  prometheusRule:
    enabled: true
    rules:
      - alert: RabbitmqNodeDown
        expr: |
          sum(rabbitmq_build_info{service="{{ template "common.names.fullname" . }}"}) < 2
        for: 0m
        labels:
          severity: critical
        annotations:
          summary: Rabbitmq node down (instance {{ "{{ $labels.instance }}" }})
          description: |
            Less than 2 nodes running in RabbitMQ cluster
            VALUE = {{ "{{ $value }}" }}\n  LABELS: {{ "{{ $labels }}" }}
      - alert: RabbitmqNodeNotDistributed
        expr: erlang_vm_dist_node_state{service="{{ template "common.names.fullname" . }}"} < 3
        for: 0m
        labels:
          severity: critical
        annotations:
          summary: Rabbitmq node not distributed (instance {{ "{{ $labels.instance }}" }})
          description: |
            Distribution link state is not 'up'
            VALUE = {{ "{{ $value }}" }}\n  LABELS: {{ "{{ $labels }}" }}
      - alert: RabbitmqInstancesDifferentVersions
        expr: |
          count(count(rabbitmq_build_info{service="{{ template "common.names.fullname" . }}"})
          by (rabbitmq_version)) > 1
        for: 1h
        labels:
          severity: warning
        annotations:
          summary: |
            Rabbitmq instance different versions (instance {{ "{{ $labels.instance }}" }})
          description: |
            Running different version of Rabbitmq in the same cluster, can lead to failure.
            VALUE = {{ "{{ $value }}" }}\n  LABELS: {{ "{{ $labels }}" }}
      - alert: RabbitmqMemoryHigh
        expr: |
          rabbitmq_process_resident_memory_bytes{service="{{ template "common.names.fullname" . }}"}
          / rabbitmq_resident_memory_limit_bytes{service="{{ template "common.names.fullname" . }}"}
          * 100 > 90
        for: 2m
        labels:
          severity: warning
        annotations:
          summary: Rabbitmq memory high (instance {{ "{{ $labels.instance }}" }})
          description: |
            A node use more than 90% of allocated RAM
            VALUE = {{ "{{ $value }}" }}\n  LABELS: {{ "{{ $labels }}" }}
      - alert: RabbitmqFileDescriptorsUsage
        expr: |
          rabbitmq_process_open_fds{service="{{ template "common.names.fullname" . }}"}
          / rabbitmq_process_max_fds{service="{{ template "common.names.fullname" . }}"}
          * 100 > 90
        for: 2m
        labels:
          severity: warning
        annotations:
          summary: |
            Rabbitmq file descriptors usage (instance {{ "{{ $labels.instance }}" }})
          description: |
            A node use more than 90% of file descriptors
            VALUE = {{ "{{ $value }}" }}\n  LABELS: {{ "{{ $labels }}" }}
      - alert: RabbitmqTooMuchUnack
        expr: |
          sum(rabbitmq_queue_messages_unacked{service="{{ template "common.names.fullname" . }}"})
          BY (queue) > 1000
        for: 1m
        labels:
          severity: warning
        annotations:
          summary: |
            Rabbitmq too much unack (instance {{ "{{ $labels.instance }}" }})
          description: |
            Too much unacknowledged messages
            VALUE = {{ "{{ $value }}" }}\n  LABELS: {{ "{{ $labels }}" }}
      - alert: RabbitmqTooMuchConnections
        expr: rabbitmq_connections{service="{{ template "common.names.fullname" . }}"} > 1000
        for: 2m
        labels:
          severity: warning
        annotations:
          summary: Rabbitmq too much connections (instance {{ "{{ $labels.instance }}" }})
          description: |
            The total connections of a node is too high
            VALUE = {{ "{{ $value }}" }}\n  LABELS: {{ "{{ $labels }}" }}
      - alert: RabbitmqNoQueueConsumer
        expr: rabbitmq_queue_consumers{service="{{ template "common.names.fullname" . }}"} < 1
        for: 1m
        labels:
          severity: warning
        annotations:
          summary: Rabbitmq no queue consumer (instance {{ "{{ $labels.instance }}" }})
          description: |
            A queue has less than 1 consumer
            VALUE = {{ "{{ $value }}" }}\n  LABELS: {{ "{{ $labels }}" }}
      - alert: RabbitmqUnroutableMessages
        expr: |
          increase(rabbitmq_channel_messages_unroutable_returned_total{service="{{ template "common.names.fullname" . }}"}[1m])
          > 0 or increase(rabbitmq_channel_messages_unroutable_dropped_total{service="{{ template "common.names.fullname" . }}"}[1m]) > 0
        for: 2m
        labels:
          severity: warning
        annotations:
          summary: Rabbitmq unroutable messages (instance {{ "{{ $labels.instance }}" }})
          description: |
            A queue has unroutable messages
            VALUE = {{ "{{ $value }}" }}\n  LABELS: {{ "{{ $labels }}" }}

resources:
  limits:
    cpu: 1
    memory: 1Gi
  requests:
    cpu: 0.5
    memory: 1Gi

persistence:
  enabled: true
  size: 4Gi

ingress:
  enabled: true
  hostname: my-test-rabbitmq.my-name-space.my-k8s-domain.it.local
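
One of the "most common reasons" listed in the log below is an Erlang cookie mismatch. A small sketch for ruling that out, assuming (this is my understanding of RabbitMQ, not something the chart documents) that the logged "cookie hash" is the Base64-encoded MD5 digest of the cookie string. The function name is made up for illustration.

```python
import base64
import hashlib


def cookie_hash(cookie: str) -> str:
    """Base64-encoded MD5 digest of the cookie (assumed to match RabbitMQ's logged hash)."""
    return base64.b64encode(hashlib.md5(cookie.encode()).digest()).decode()


if __name__ == "__main__":
    # auth.erlangCookie from the values.yaml above
    print(cookie_hash("9TVM5GXANJ6XG4S9N55X"))
```

If the printed value equals the `Erlang cookie hash` and `cookie hash` lines in the log, both the CLI tool and the server are using the configured cookie, and a cookie mismatch is unlikely to be the cause.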

What is the expected behavior?

Cluster works as expected with 2 nodes
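
For reference, a sketch of how "works as expected" could be verified from the broker's own point of view, assuming `kubectl` access and that `rabbitmqctl cluster_status --formatter json` reports members under a `running_nodes` key. The helper functions are hypothetical; the pod and namespace names in the usage note are the ones that appear in the log below.

```python
import json
import subprocess


def running_nodes(cluster_status_json: str) -> list[str]:
    """Extract the sorted running node names from cluster_status JSON output."""
    return sorted(json.loads(cluster_status_json).get("running_nodes", []))


def cluster_ok(cluster_status_json: str, expected: int = 2) -> bool:
    """True if the broker reports exactly the expected number of running nodes."""
    return len(running_nodes(cluster_status_json)) == expected


def status_from_pod(pod: str, namespace: str) -> str:
    """Run cluster_status inside a pod via kubectl (requires cluster access)."""
    result = subprocess.run(
        ["kubectl", "exec", "-n", namespace, pod, "--",
         "rabbitmqctl", "cluster_status", "--formatter", "json"],
        check=True, capture_output=True, text=True,
    )
    return result.stdout
```

With the deployment described here, `cluster_ok(status_from_pod("my-test-rabbit-rabbitmq-0", "my-name-space"))` would be expected to return `True` on a healthy two-node cluster; running it against each pod in turn shows whether both pods agree on the membership.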

What do you see instead?

log

rabbitmq 07:21:09.49 INFO  ==> 
rabbitmq 07:21:09.49 INFO  ==> Welcome to the Bitnami rabbitmq container
rabbitmq 07:21:09.50 INFO  ==> Subscribe to project updates by watching https://github.com/bitnami/containers
rabbitmq 07:21:09.50 INFO  ==> Submit issues and feature requests at https://github.com/bitnami/containers/issues
rabbitmq 07:21:09.50 INFO  ==> Upgrade to Tanzu Application Catalog for production environments to access custom-configured and pre-packaged software components. Gain enhanced features, including Software Bill of Materials (SBOM), CVE scan result reports, and VEX documents. To learn more, visit https://bitnami.com/enterprise
rabbitmq 07:21:09.50 INFO  ==> 
rabbitmq 07:21:09.51 INFO  ==> ** Starting RabbitMQ setup **
rabbitmq 07:21:09.53 INFO  ==> Validating settings in RABBITMQ_* env vars..
rabbitmq 07:21:09.54 WARN  ==> A definition file with "users" was found. The RABBITMQ_SECURE_PASSWORD environment variables will be ignored.
rabbitmq 07:21:09.55 INFO  ==> Initializing RabbitMQ...
rabbitmq 07:21:09.58 INFO  ==> Starting RabbitMQ in background...
rabbitmq 07:21:48.98 INFO  ==> No custom scripts in /docker-entrypoint-initdb.d
rabbitmq 07:21:48.99 INFO  ==> Stopping RabbitMQ...
rabbitmq 07:21:51.95 INFO  ==> ** RabbitMQ setup finished! **

rabbitmq 07:21:51.97 INFO  ==> ** Starting RabbitMQ **
=INFO REPORT==== 17-Jul-2024::07:21:52.756929 ===
    alarm_handler: {set,{{disk_almost_full,"/"},[]}}
=INFO REPORT==== 17-Jul-2024::07:21:52.762203 ===
    alarm_handler: {set,{{disk_almost_full,"/etc/hostname"},[]}}
Error: unable to perform an operation on node 'rabbit@my-test-rabbit-rabbitmq-0.my-test-rabbit-rabbitmq-headless.my-name-space.svc.my-k8s-domain.local'. Please see diagnostics information and suggestions below.

Most common reasons for this are:

 * Target node is unreachable (e.g. due to hostname resolution, TCP connection or firewall issues)
 * CLI tool fails to authenticate with the server (e.g. due to CLI tool's Erlang cookie not matching that of the server)
 * Target node is not running

In addition to the diagnostics info below:

 * See the CLI, clustering and networking guides on https://rabbitmq.com/documentation.html to learn more
 * Consult server logs on node rabbit@my-test-rabbit-rabbitmq-0.my-test-rabbit-rabbitmq-headless.my-name-space.svc.my-k8s-domain.local
 * If target node is configured to use long node names, don't forget to use --longnames with CLI tools

DIAGNOSTICS
===========

attempted to contact: ['rabbit@my-test-rabbit-rabbitmq-0.my-test-rabbit-rabbitmq-headless.my-name-space.svc.my-k8s-domain.local']

rabbit@my-test-rabbit-rabbitmq-0.my-test-rabbit-rabbitmq-headless.my-name-space.svc.my-k8s-domain.local:
  * connected to epmd (port 4369) on my-test-rabbit-rabbitmq-0.my-test-rabbit-rabbitmq-headless.my-name-space.svc.my-k8s-domain.local
  * epmd reports: node 'rabbit' not running at all
                  no other nodes on my-test-rabbit-rabbitmq-0.my-test-rabbit-rabbitmq-headless.my-name-space.svc.my-k8s-domain.local
  * suggestion: start the node

Current node details:
 * node name: 'rabbitmqcli-477-rabbit@my-test-rabbit-rabbitmq-0.my-test-rabbit-rabbitmq-headless.my-name-space.svc.my-k8s-domain.local'
 * effective user's home directory: /opt/bitnami/rabbitmq/.rabbitmq
 * Erlang cookie hash: CN0c8LnmXgEzk+DkB7w60Q==

Error: this command requires the 'rabbit' app to be running on the target node. Start it with 'rabbitmqctl start_app'.
Arguments given:
    cluster_status

Usage

rabbitmqctl [--node <node>] [--longnames] [--quiet] cluster_status [--timeout <timeout>]
Error: this command requires the 'rabbit' app to be running on the target node. Start it with 'rabbitmqctl start_app'.
Arguments given:
    cluster_status

Usage

rabbitmqctl [--node <node>] [--longnames] [--quiet] cluster_status [--timeout <timeout>]
2024-07-17 07:21:58.162004+00:00 [notice] <0.44.0> Application syslog exited with reason: stopped
2024-07-17 07:21:58.166072+00:00 [notice] <0.254.0> Logging: switching to configured handler(s); following messages may not be visible in this log output
2024-07-17 07:21:58.166578+00:00 [notice] <0.254.0> Logging: configured log handlers are now ACTIVE
2024-07-17 07:21:58.237121+00:00 [info] <0.254.0> ra: starting system quorum_queues
2024-07-17 07:21:58.237218+00:00 [info] <0.254.0> starting Ra system: quorum_queues in directory: /opt/bitnami/rabbitmq/.rabbitmq/mnesia/rabbit@my-test-rabbit-rabbitmq-0.my-test-rabbit-rabbitmq-headless.my-name-space.svc.my-k8s-domain.local/quorum/rabbit@my-test-rabbit-rabbitmq-0.my-test-rabbit-rabbitmq-headless.my-name-space.svc.my-k8s-domain.local
2024-07-17 07:21:58.346737+00:00 [info] <0.276.0> ra system 'quorum_queues' running pre init for 0 registered servers
2024-07-17 07:21:58.353969+00:00 [info] <0.277.0> ra: meta data store initialised for system quorum_queues. 0 record(s) recovered
2024-07-17 07:21:58.363448+00:00 [notice] <0.282.0> WAL: ra_log_wal init, open tbls: ra_log_open_mem_tables, closed tbls: ra_log_closed_mem_tables
2024-07-17 07:21:58.443512+00:00 [info] <0.254.0> ra: starting system coordination
2024-07-17 07:21:58.443579+00:00 [info] <0.254.0> starting Ra system: coordination in directory: /opt/bitnami/rabbitmq/.rabbitmq/mnesia/rabbit@my-test-rabbit-rabbitmq-0.my-test-rabbit-rabbitmq-headless.my-name-space.svc.my-k8s-domain.local/coordination/rabbit@my-test-rabbit-rabbitmq-0.my-test-rabbit-rabbitmq-headless.my-name-space.svc.my-k8s-domain.local
2024-07-17 07:21:58.444649+00:00 [info] <0.290.0> ra system 'coordination' running pre init for 1 registered servers
2024-07-17 07:21:58.451115+00:00 [info] <0.291.0> ra: meta data store initialised for system coordination. 1 record(s) recovered
2024-07-17 07:21:58.451298+00:00 [notice] <0.296.0> WAL: ra_coordination_log_wal init, open tbls: ra_coordination_log_open_mem_tables, closed tbls: ra_coordination_log_closed_mem_tables
2024-07-17 07:21:58.456207+00:00 [info] <0.254.0> ra: starting system coordination
2024-07-17 07:21:58.456253+00:00 [info] <0.254.0> starting Ra system: coordination in directory: /opt/bitnami/rabbitmq/.rabbitmq/mnesia/rabbit@my-test-rabbit-rabbitmq-0.my-test-rabbit-rabbitmq-headless.my-name-space.svc.my-k8s-domain.local/coordination/rabbit@my-test-rabbit-rabbitmq-0.my-test-rabbit-rabbitmq-headless.my-name-space.svc.my-k8s-domain.local
2024-07-17 07:21:58.646982+00:00 [info] <0.254.0> Waiting for Khepri leader for 30000 ms, 9 retries left
2024-07-17 07:21:58.664809+00:00 [info] <0.254.0> Khepri leader elected
Error: this command requires the 'rabbit' app to be running on the target node. Start it with 'rabbitmqctl start_app'.
Arguments given:
    cluster_status

Usage

rabbitmqctl [--node <node>] [--longnames] [--quiet] cluster_status [--timeout <timeout>]
2024-07-17 07:21:59.164120+00:00 [notice] <0.301.0> RabbitMQ metadata store: candidate -> leader in term: 2 machine version: 0
2024-07-17 07:21:59.937991+00:00 [info] <0.254.0> 
2024-07-17 07:21:59.937991+00:00 [info] <0.254.0>  Starting RabbitMQ 3.13.4 on Erlang 26.2.5 [jit]
2024-07-17 07:21:59.937991+00:00 [info] <0.254.0>  Copyright (c) 2007-2024 Broadcom Inc and/or its subsidiaries
2024-07-17 07:21:59.937991+00:00 [info] <0.254.0>  Licensed under the MPL 2.0. Website: https://rabbitmq.com

  ##  ##      RabbitMQ 3.13.4
  ##  ##
  ##########  Copyright (c) 2007-2024 Broadcom Inc and/or its subsidiaries
  ######  ##
  ##########  Licensed under the MPL 2.0. Website: https://rabbitmq.com

  Erlang:      26.2.5 [jit]
  TLS Library: OpenSSL - OpenSSL 3.0.13 30 Jan 2024
  Release series support status: see https://www.rabbitmq.com/release-information

  Doc guides:  https://www.rabbitmq.com/docs
  Support:     https://www.rabbitmq.com/docs/contact
  Tutorials:   https://www.rabbitmq.com/tutorials
  Monitoring:  https://www.rabbitmq.com/docs/monitoring
  Upgrading:   https://www.rabbitmq.com/docs/upgrade

  Logs: <stdout>

  Config file(s): /opt/bitnami/rabbitmq/etc/rabbitmq/rabbitmq.conf

  Starting broker...2024-07-17 07:21:59.939331+00:00 [info] <0.254.0> 
2024-07-17 07:21:59.939331+00:00 [info] <0.254.0>  node           : rabbit@my-test-rabbit-rabbitmq-0.my-test-rabbit-rabbitmq-headless.my-name-space.svc.my-k8s-domain.local
2024-07-17 07:21:59.939331+00:00 [info] <0.254.0>  home dir       : /opt/bitnami/rabbitmq/.rabbitmq
2024-07-17 07:21:59.939331+00:00 [info] <0.254.0>  config file(s) : /opt/bitnami/rabbitmq/etc/rabbitmq/rabbitmq.conf
2024-07-17 07:21:59.939331+00:00 [info] <0.254.0>  cookie hash    : CN0c8LnmXgEzk+DkB7w60Q==
2024-07-17 07:21:59.939331+00:00 [info] <0.254.0>  log(s)         : <stdout>
2024-07-17 07:21:59.939331+00:00 [info] <0.254.0>  data dir       : /opt/bitnami/rabbitmq/.rabbitmq/mnesia/rabbit@my-test-rabbit-rabbitmq-0.my-test-rabbit-rabbitmq-headless.my-name-space.svc.my-k8s-domain.local
Error: this command requires the 'rabbit' app to be running on the target node. Start it with 'rabbitmqctl start_app'.
Arguments given:
    cluster_status

Usage

rabbitmqctl [--node <node>] [--longnames] [--quiet] cluster_status [--timeout <timeout>]
2024-07-17 07:22:01.049576+00:00 [info] <0.254.0> Running boot step pre_boot defined by app rabbit
2024-07-17 07:22:01.049648+00:00 [info] <0.254.0> Running boot step rabbit_global_counters defined by app rabbit
2024-07-17 07:22:01.049871+00:00 [info] <0.254.0> Running boot step rabbit_osiris_metrics defined by app rabbit
2024-07-17 07:22:01.049949+00:00 [info] <0.254.0> Running boot step rabbit_core_metrics defined by app rabbit
2024-07-17 07:22:01.050909+00:00 [info] <0.254.0> Running boot step rabbit_alarm defined by app rabbit
2024-07-17 07:22:01.062411+00:00 [info] <0.432.0> Memory high watermark set to 716 MiB (751619276 bytes) of 1024 MiB (1073741824 bytes) total
2024-07-17 07:22:01.064981+00:00 [info] <0.434.0> Enabling free disk space monitoring (disk free space: 3911847936, total memory: 1073741824)
2024-07-17 07:22:01.065023+00:00 [info] <0.434.0> Disk free limit set to 50MB
2024-07-17 07:22:01.066634+00:00 [info] <0.254.0> Running boot step code_server_cache defined by app rabbit
2024-07-17 07:22:01.066722+00:00 [info] <0.254.0> Running boot step file_handle_cache defined by app rabbit
2024-07-17 07:22:01.136769+00:00 [info] <0.437.0> Limiting to approx 65438 file handles (58892 sockets)
2024-07-17 07:22:01.136976+00:00 [info] <0.438.0> FHC read buffering: OFF
2024-07-17 07:22:01.137018+00:00 [info] <0.438.0> FHC write buffering: ON
2024-07-17 07:22:01.137323+00:00 [info] <0.254.0> Running boot step worker_pool defined by app rabbit
2024-07-17 07:22:01.137395+00:00 [info] <0.359.0> Will use 16 processes for default worker pool
2024-07-17 07:22:01.137425+00:00 [info] <0.359.0> Starting worker pool 'worker_pool' with 16 processes in it
2024-07-17 07:22:01.137945+00:00 [info] <0.254.0> Running boot step database defined by app rabbit
2024-07-17 07:22:01.138341+00:00 [info] <0.254.0> Peer discovery: configured backend: rabbit_peer_discovery_k8s
2024-07-17 07:22:01.139419+00:00 [info] <0.254.0> Waiting for Mnesia tables for 30000 ms, 9 retries left
2024-07-17 07:22:01.142102+00:00 [info] <0.254.0> Successfully synced tables from a peer
2024-07-17 07:22:01.149607+00:00 [info] <0.254.0> Waiting for Mnesia tables for 30000 ms, 9 retries left
2024-07-17 07:22:01.149739+00:00 [info] <0.254.0> Successfully synced tables from a peer
2024-07-17 07:22:01.149861+00:00 [info] <0.254.0> Peer discovery: will register with peer discovery backend rabbit_peer_discovery_k8s
2024-07-17 07:22:01.766996+00:00 [info] <0.254.0> Running boot step tracking_metadata_store defined by app rabbit
2024-07-17 07:22:01.767145+00:00 [info] <0.470.0> Setting up a table for connection tracking on this node: tracked_connection
2024-07-17 07:22:01.767209+00:00 [info] <0.470.0> Setting up a table for per-vhost connection counting on this node: tracked_connection_per_vhost
2024-07-17 07:22:01.767299+00:00 [info] <0.470.0> Setting up a table for per-user connection counting on this node: tracked_connection_per_user
2024-07-17 07:22:01.767334+00:00 [info] <0.470.0> Setting up a table for channel tracking on this node: tracked_channel
2024-07-17 07:22:01.767391+00:00 [info] <0.470.0> Setting up a table for channel tracking on this node: tracked_channel_per_user
2024-07-17 07:22:01.767458+00:00 [info] <0.254.0> Running boot step networking_metadata_store defined by app rabbit
2024-07-17 07:22:01.767512+00:00 [info] <0.254.0> Running boot step feature_flags defined by app rabbit
2024-07-17 07:22:01.767640+00:00 [info] <0.254.0> Running boot step codec_correctness_check defined by app rabbit
2024-07-17 07:22:01.767665+00:00 [info] <0.254.0> Running boot step external_infrastructure defined by app rabbit
2024-07-17 07:22:01.767716+00:00 [info] <0.254.0> Running boot step rabbit_event defined by app rabbit
2024-07-17 07:22:01.767814+00:00 [info] <0.254.0> Running boot step rabbit_registry defined by app rabbit
2024-07-17 07:22:01.767870+00:00 [info] <0.254.0> Running boot step rabbit_auth_mechanism_amqplain defined by app rabbit
2024-07-17 07:22:01.767921+00:00 [info] <0.254.0> Running boot step rabbit_auth_mechanism_cr_demo defined by app rabbit
2024-07-17 07:22:01.767954+00:00 [info] <0.254.0> Running boot step rabbit_auth_mechanism_plain defined by app rabbit
2024-07-17 07:22:01.767990+00:00 [info] <0.254.0> Running boot step rabbit_exchange_type_direct defined by app rabbit
2024-07-17 07:22:01.768034+00:00 [info] <0.254.0> Running boot step rabbit_exchange_type_fanout defined by app rabbit
2024-07-17 07:22:01.768068+00:00 [info] <0.254.0> Running boot step rabbit_exchange_type_headers defined by app rabbit
2024-07-17 07:22:01.768105+00:00 [info] <0.254.0> Running boot step rabbit_exchange_type_topic defined by app rabbit
2024-07-17 07:22:01.768140+00:00 [info] <0.254.0> Running boot step rabbit_mirror_queue_mode_all defined by app rabbit
2024-07-17 07:22:01.768175+00:00 [info] <0.254.0> Running boot step rabbit_mirror_queue_mode_exactly defined by app rabbit
2024-07-17 07:22:01.768244+00:00 [info] <0.254.0> Running boot step rabbit_mirror_queue_mode_nodes defined by app rabbit
2024-07-17 07:22:01.768316+00:00 [info] <0.254.0> Running boot step rabbit_priority_queue defined by app rabbit
2024-07-17 07:22:01.768348+00:00 [info] <0.254.0> Priority queues enabled, real BQ is rabbit_variable_queue
2024-07-17 07:22:01.768434+00:00 [info] <0.254.0> Running boot step rabbit_queue_location_client_local defined by app rabbit
2024-07-17 07:22:01.768581+00:00 [info] <0.254.0> Running boot step rabbit_queue_location_min_masters defined by app rabbit
2024-07-17 07:22:01.768653+00:00 [info] <0.254.0> Running boot step rabbit_queue_location_random defined by app rabbit
2024-07-17 07:22:01.768706+00:00 [info] <0.254.0> Running boot step kernel_ready defined by app rabbit
2024-07-17 07:22:01.768733+00:00 [info] <0.254.0> Running boot step ldap_pool defined by app rabbitmq_auth_backend_ldap
2024-07-17 07:22:01.768784+00:00 [info] <0.359.0> Starting worker pool 'ldap_pool' with 64 processes in it
2024-07-17 07:22:01.770736+00:00 [info] <0.254.0> Running boot step rabbit_sysmon_minder defined by app rabbit
2024-07-17 07:22:01.770870+00:00 [info] <0.254.0> Running boot step rabbit_epmd_monitor defined by app rabbit
2024-07-17 07:22:01.771826+00:00 [info] <0.544.0> epmd monitor knows us, inter-node communication (distribution) port: 25672
2024-07-17 07:22:01.771912+00:00 [info] <0.254.0> Running boot step guid_generator defined by app rabbit
2024-07-17 07:22:01.778316+00:00 [info] <0.254.0> Running boot step rabbit_node_monitor defined by app rabbit
2024-07-17 07:22:01.778566+00:00 [info] <0.548.0> Starting rabbit_node_monitor (in autoheal mode)
2024-07-17 07:22:01.778657+00:00 [info] <0.254.0> Running boot step delegate_sup defined by app rabbit
2024-07-17 07:22:01.778995+00:00 [info] <0.254.0> Running boot step rabbit_memory_monitor defined by app rabbit
2024-07-17 07:22:01.779143+00:00 [info] <0.254.0> Running boot step rabbit_fifo_dlx_sup defined by app rabbit
2024-07-17 07:22:01.779197+00:00 [info] <0.254.0> Running boot step core_initialized defined by app rabbit
2024-07-17 07:22:01.779219+00:00 [info] <0.254.0> Running boot step rabbit_channel_tracking_handler defined by app rabbit
2024-07-17 07:22:01.835046+00:00 [info] <0.254.0> Running boot step rabbit_connection_tracking_handler defined by app rabbit
2024-07-17 07:22:01.835141+00:00 [info] <0.254.0> Running boot step rabbit_definitions_hashing defined by app rabbit
2024-07-17 07:22:01.835206+00:00 [info] <0.254.0> Running boot step rabbit_exchange_parameters defined by app rabbit
2024-07-17 07:22:01.845394+00:00 [info] <0.254.0> Running boot step rabbit_mirror_queue_misc defined by app rabbit
2024-07-17 07:22:01.845793+00:00 [info] <0.254.0> Running boot step rabbit_policies defined by app rabbit
2024-07-17 07:22:01.846052+00:00 [info] <0.254.0> Running boot step rabbit_policy defined by app rabbit
2024-07-17 07:22:01.846100+00:00 [info] <0.254.0> Running boot step rabbit_queue_location_validator defined by app rabbit
2024-07-17 07:22:01.846156+00:00 [info] <0.254.0> Running boot step rabbit_quorum_memory_manager defined by app rabbit
2024-07-17 07:22:01.846201+00:00 [info] <0.254.0> Running boot step rabbit_quorum_queue defined by app rabbit
2024-07-17 07:22:01.846278+00:00 [info] <0.254.0> Running boot step rabbit_stream_coordinator defined by app rabbit
2024-07-17 07:22:01.846350+00:00 [info] <0.254.0> Running boot step rabbit_vhost_limit defined by app rabbit
2024-07-17 07:22:01.846399+00:00 [info] <0.254.0> Running boot step rabbit_federation_parameters defined by app rabbitmq_federation
2024-07-17 07:22:01.846479+00:00 [info] <0.254.0> Running boot step rabbit_federation_supervisor defined by app rabbitmq_federation
2024-07-17 07:22:01.852418+00:00 [info] <0.254.0> Running boot step rabbit_federation_queue defined by app rabbitmq_federation
2024-07-17 07:22:01.852569+00:00 [info] <0.254.0> Running boot step rabbit_federation_upstream_exchange defined by app rabbitmq_federation
2024-07-17 07:22:01.852636+00:00 [info] <0.254.0> Running boot step rabbit_mgmt_reset_handler defined by app rabbitmq_management
2024-07-17 07:22:01.852679+00:00 [info] <0.254.0> Running boot step rabbit_mgmt_db_handler defined by app rabbitmq_management_agent
2024-07-17 07:22:01.852735+00:00 [info] <0.254.0> Management plugin: using rates mode 'basic'
2024-07-17 07:22:01.852927+00:00 [info] <0.254.0> Running boot step recovery defined by app rabbit
2024-07-17 07:22:01.863854+00:00 [info] <0.598.0> Making sure data directory '/opt/bitnami/rabbitmq/.rabbitmq/mnesia/rabbit@my-test-rabbit-rabbitmq-0.my-test-rabbit-rabbitmq-headless.my-name-space.svc.my-k8s-domain.local/msg_stores/vhosts/628WB79CIFDYO9LJI6DKMI09L' for vhost '/' exists
2024-07-17 07:22:01.939954+00:00 [info] <0.598.0> Starting message stores for vhost '/'
2024-07-17 07:22:01.940308+00:00 [info] <0.607.0> Message store "628WB79CIFDYO9LJI6DKMI09L/msg_store_transient": using rabbit_msg_store_ets_index to provide index
2024-07-17 07:22:01.942558+00:00 [info] <0.598.0> Started message store of type transient for vhost '/'
2024-07-17 07:22:01.942767+00:00 [info] <0.611.0> Message store "628WB79CIFDYO9LJI6DKMI09L/msg_store_persistent": using rabbit_msg_store_ets_index to provide index
2024-07-17 07:22:01.944287+00:00 [info] <0.598.0> Started message store of type persistent for vhost '/'
2024-07-17 07:22:01.944436+00:00 [info] <0.598.0> Recovering 0 queues of type rabbit_classic_queue took 79ms
2024-07-17 07:22:01.944480+00:00 [info] <0.598.0> Recovering 0 queues of type rabbit_quorum_queue took 0ms
2024-07-17 07:22:01.944518+00:00 [info] <0.598.0> Recovering 0 queues of type rabbit_stream_queue took 0ms
2024-07-17 07:22:01.946110+00:00 [info] <0.254.0> Running boot step empty_db_check defined by app rabbit
2024-07-17 07:22:01.946173+00:00 [info] <0.254.0> Will not seed default virtual host and user: have definitions to load...
2024-07-17 07:22:01.946214+00:00 [info] <0.254.0> Running boot step rabbit_observer_cli defined by app rabbit
2024-07-17 07:22:01.946295+00:00 [info] <0.254.0> Running boot step rabbit_looking_glass defined by app rabbit
2024-07-17 07:22:01.946320+00:00 [info] <0.254.0> Running boot step rabbit_core_metrics_gc defined by app rabbit
2024-07-17 07:22:01.946453+00:00 [info] <0.254.0> Running boot step background_gc defined by app rabbit
2024-07-17 07:22:01.946560+00:00 [info] <0.254.0> Running boot step routing_ready defined by app rabbit
2024-07-17 07:22:01.946585+00:00 [info] <0.254.0> Running boot step pre_flight defined by app rabbit
2024-07-17 07:22:01.946619+00:00 [info] <0.254.0> Running boot step notify_cluster defined by app rabbit
2024-07-17 07:22:01.946655+00:00 [info] <0.254.0> Running boot step networking defined by app rabbit
2024-07-17 07:22:01.946775+00:00 [info] <0.254.0> Running boot step rabbit_quorum_queue_periodic_membership_reconciliation defined by app rabbit
2024-07-17 07:22:01.947060+00:00 [info] <0.254.0> Running boot step definition_import_worker_pool defined by app rabbit
2024-07-17 07:22:01.947116+00:00 [info] <0.359.0> Starting worker pool 'definition_import_pool' with 16 processes in it
2024-07-17 07:22:01.947841+00:00 [info] <0.254.0> Running boot step cluster_name defined by app rabbit
2024-07-17 07:22:01.947893+00:00 [info] <0.254.0> Setting cluster name to 'my-test-rabbit-rabbitmq' as configured
2024-07-17 07:22:01.950915+00:00 [info] <0.254.0> Running boot step virtual_host_reconciliation defined by app rabbit
2024-07-17 07:22:01.951157+00:00 [info] <0.254.0> Running boot step direct_client defined by app rabbit
2024-07-17 07:22:01.951274+00:00 [info] <0.254.0> Running boot step rabbit_federation_exchange defined by app rabbitmq_federation
2024-07-17 07:22:01.951427+00:00 [info] <0.254.0> Running boot step rabbit_management_load_definitions defined by app rabbitmq_management
2024-07-17 07:22:01.951548+00:00 [info] <0.662.0> Resetting node maintenance status
2024-07-17 07:22:02.235658+00:00 [warning] <0.690.0> Deprecated features: `management_metrics_collection`: Feature `management_metrics_collection` is deprecated.
2024-07-17 07:22:02.235658+00:00 [warning] <0.690.0> By default, this feature can still be used for now.
2024-07-17 07:22:02.235658+00:00 [warning] <0.690.0> Its use will not be permitted by default in a future minor RabbitMQ version and the feature will be removed from a future major RabbitMQ version; actual versions to be determined.
2024-07-17 07:22:02.235658+00:00 [warning] <0.690.0> To continue using this feature when it is not permitted by default, set the following parameter in your configuration:
2024-07-17 07:22:02.235658+00:00 [warning] <0.690.0>     "deprecated_features.permit.management_metrics_collection = true"
2024-07-17 07:22:02.235658+00:00 [warning] <0.690.0> To test RabbitMQ as if the feature was removed, set this in your configuration:
2024-07-17 07:22:02.235658+00:00 [warning] <0.690.0>     "deprecated_features.permit.management_metrics_collection = false"
Error: this command requires the 'rabbit' app to be running on the target node. Start it with 'rabbitmqctl start_app'.
Arguments given:
    cluster_status

Usage

rabbitmqctl [--node <node>] [--longnames] [--quiet] cluster_status [--timeout <timeout>]
Error: this command requires the 'rabbit' app to be running on the target node. Start it with 'rabbitmqctl start_app'.
Arguments given:
    cluster_status

Usage

rabbitmqctl [--node <node>] [--longnames] [--quiet] cluster_status [--timeout <timeout>]
Error: this command requires the 'rabbit' app to be running on the target node. Start it with 'rabbitmqctl start_app'.
Arguments given:
    cluster_status

Usage

rabbitmqctl [--node <node>] [--longnames] [--quiet] cluster_status [--timeout <timeout>]
Error: this command requires the 'rabbit' app to be running on the target node. Start it with 'rabbitmqctl start_app'.
Arguments given:
    cluster_status

Usage

rabbitmqctl [--node <node>] [--longnames] [--quiet] cluster_status [--timeout <timeout>]
2024-07-17 07:22:09.558836+00:00 [info] <0.727.0> Management plugin: HTTP (non-TLS) listener started on port 15672
2024-07-17 07:22:09.558989+00:00 [info] <0.777.0> Statistics database started.
2024-07-17 07:22:09.559069+00:00 [info] <0.776.0> Starting worker pool 'management_worker_pool' with 3 processes in it
2024-07-17 07:22:09.566452+00:00 [info] <0.792.0> Peer discovery: enabling node cleanup (will only log warnings). Check interval: 10 seconds.
2024-07-17 07:22:09.566927+00:00 [warning] <0.803.0> LDAP plugin loaded, but rabbit_auth_backend_ldap is not in the list of auth_backends. LDAP auth will not work.
2024-07-17 07:22:09.570173+00:00 [info] <0.811.0> Prometheus metrics: HTTP (non-TLS) listener started on port 9419
2024-07-17 07:22:09.570551+00:00 [info] <0.662.0> Applying definitions from regular file at /app/load_definition.json
2024-07-17 07:22:09.570854+00:00 [info] <0.662.0> Applying definitions from file at '/app/load_definition.json'
2024-07-17 07:22:09.570900+00:00 [info] <0.662.0> Asked to import definitions. Acting user: rmq-internal
2024-07-17 07:22:09.571018+00:00 [info] <0.662.0> Importing concurrently 2 users...
2024-07-17 07:22:09.573798+00:00 [info] <0.644.0> Successfully changed password for user 'my_user'
2024-07-17 07:22:09.573912+00:00 [info] <0.645.0> Successfully changed password for user 'admin'
2024-07-17 07:22:09.573873+00:00 [info] <0.644.0> Successfully set user tags for user 'my_user' to []
2024-07-17 07:22:09.573951+00:00 [info] <0.645.0> Successfully set user tags for user 'admin' to [administrator]
2024-07-17 07:22:09.574126+00:00 [info] <0.662.0> Importing concurrently 1 vhosts...
2024-07-17 07:22:09.637574+00:00 [info] <0.662.0> Importing concurrently 2 permissions...
2024-07-17 07:22:09.641923+00:00 [info] <0.644.0> Successfully set permissions for user 'my_user' in virtual host '/' to '.*', '.*', '.*'
2024-07-17 07:22:09.642060+00:00 [info] <0.645.0> Successfully set permissions for user 'admin' in virtual host '/' to '.*', '.*', '.*'
2024-07-17 07:22:09.642181+00:00 [info] <0.662.0> Importing sequentially 1 policies...
2024-07-17 07:22:09.642824+00:00 [warning] <0.662.0> Deprecated features: `classic_queue_mirroring`: Classic mirrored queues are deprecated.
2024-07-17 07:22:09.642824+00:00 [warning] <0.662.0> By default, they can still be used for now.
2024-07-17 07:22:09.642824+00:00 [warning] <0.662.0> Their use will not be permitted by default in the next minorRabbitMQ version (if any) and they will be removed from RabbitMQ 4.0.0.
2024-07-17 07:22:09.642824+00:00 [warning] <0.662.0> To continue using classic mirrored queues when they are not permitted by default, set the following parameter in your configuration:
2024-07-17 07:22:09.642824+00:00 [warning] <0.662.0>     "deprecated_features.permit.classic_queue_mirroring = true"
2024-07-17 07:22:09.642824+00:00 [warning] <0.662.0> To test RabbitMQ as if they were removed, set this in your configuration:
2024-07-17 07:22:09.642824+00:00 [warning] <0.662.0>     "deprecated_features.permit.classic_queue_mirroring = false"
2024-07-17 07:22:09.650283+00:00 [info] <0.662.0> There are fewer than target cluster size (2) nodes online, skipping queue and binding import from definitions
2024-07-17 07:22:09.650366+00:00 [info] <0.662.0> Ready to start client connection listeners
2024-07-17 07:22:09.656703+00:00 [info] <0.869.0> started TCP listener on [::]:5672
 completed with 8 plugins.
2024-07-17 07:22:09.845431+00:00 [info] <0.662.0> Server startup complete; 8 plugins started.
2024-07-17 07:22:09.845431+00:00 [info] <0.662.0>  * rabbitmq_prometheus
2024-07-17 07:22:09.845431+00:00 [info] <0.662.0>  * rabbitmq_federation
2024-07-17 07:22:09.845431+00:00 [info] <0.662.0>  * rabbitmq_auth_backend_ldap
2024-07-17 07:22:09.845431+00:00 [info] <0.662.0>  * rabbitmq_peer_discovery_k8s
2024-07-17 07:22:09.845431+00:00 [info] <0.662.0>  * rabbitmq_peer_discovery_common
2024-07-17 07:22:09.845431+00:00 [info] <0.662.0>  * rabbitmq_management
2024-07-17 07:22:09.845431+00:00 [info] <0.662.0>  * rabbitmq_management_agent
2024-07-17 07:22:09.845431+00:00 [info] <0.662.0>  * rabbitmq_web_dispatch
2024-07-17 07:22:10.040353+00:00 [info] <0.9.0> Time to start RabbitMQ: 17488 ms
Re-balancing leaders of all queues...
2024-07-17 07:22:11.244336+00:00 [info] <0.903.0> Starting queue rebalance operation: 'all' for vhosts matching '.*' and queues matching '.*'
2024-07-17 07:22:11.244692+00:00 [info] <0.903.0> All queue leaders are balanced
2024-07-17 07:22:11.244802+00:00 [info] <0.903.0> Finished queue rebalance operation
┌────────┐
│ Output │
├────────┤
│ []     │
└────────┘
rabbitmq 07:22:11.27 INFO  ==> Cluster rebalanced successfully

Additional information

No response

rafariossaa commented 3 months ago

Hi, in the configuration file you provided, you are setting the admin password in a secret that is later used by the load definition, and also setting the password to "null". So my guess is that the initialization scripts are not aware of the password set by the secret and the load definition.

matsudayoji commented 3 months ago

@rafariossaa I removed the password, but the two nodes still did not form a single cluster; same behaviour.

jotamartos commented 3 months ago

Hi,

I simplified the issue to the minimum and simply deployed the latest version of the chart with replicaCount=2:

diff --git a/bitnami/rabbitmq/values.yaml b/bitnami/rabbitmq/values.yaml
index ca3a283e12..87a3fa0df9 100644
--- a/bitnami/rabbitmq/values.yaml
+++ b/bitnami/rabbitmq/values.yaml
@@ -646,7 +646,7 @@ extraSecretsPrependReleaseName: false

 ## @param replicaCount Number of RabbitMQ replicas to deploy
 ##
-replicaCount: 1
+replicaCount: 2
 ## @param schedulerName Use an alternate scheduler, e.g. "stork".
 ## ref: https://kubernetes.io/docs/tasks/administer-cluster/configure-multiple-schedulers/
 ##

Once the pods were ready, I could access the UI and confirmed that both nodes appeared there. Could you please try this approach and add the rest of the changes little by little until you find the root cause of the issue?

Screenshot 2024-08-02 at 12 03 17
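For anyone repeating this minimal test from the command line, the following is a rough sketch rather than the exact procedure used above. It assumes the chart is installed from the public OCI registry with a `--set` override instead of an edited values.yaml, and that the StatefulSet is named `rabbitmq` in the `default` namespace, so the pods are `rabbitmq-0` and `rabbitmq-1` as in the original report. The variables `RELEASE`, `NAMESPACE`, `STATEFULSET` and `REPLICAS` and the function names are hypothetical; adjust them to your deployment.

```shell
#!/bin/sh
# Sketch only: every name below is an assumption, not taken from the chart docs.
RELEASE="${RELEASE:-rabbitmq}"          # Helm release name (assumed)
NAMESPACE="${NAMESPACE:-default}"       # target namespace (assumed)
STATEFULSET="${STATEFULSET:-rabbitmq}"  # StatefulSet name; pods are <name>-0, <name>-1, ...
REPLICAS="${REPLICAS:-2}"               # same value as replicaCount

# Print the pod names a StatefulSet with the given name and size creates.
pod_names() {
  name="$1"
  count="$2"
  i=0
  while [ "$i" -lt "$count" ]; do
    echo "${name}-${i}"
    i=$((i + 1))
  done
}

# Install the chart with only replicaCount overridden (the minimal case).
install_minimal() {
  helm install "$RELEASE" oci://registry-1.docker.io/bitnamicharts/rabbitmq \
    --namespace "$NAMESPACE" \
    --set replicaCount="$REPLICAS"
}

# Ask each pod for its own view of the cluster, so the views can be compared.
check_cluster() {
  for pod in $(pod_names "$STATEFULSET" "$REPLICAS"); do
    echo "== ${pod} =="
    kubectl exec -n "$NAMESPACE" "$pod" -- rabbitmqctl cluster_status
  done
}
```

Source the file, run `install_minimal`, wait for the pods to become ready, then run `check_cluster`. If each pod lists only itself under running nodes, the nodes have most likely started as separate single-node clusters, which would match a management UI that alternates between rabbitmq-0 and rabbitmq-1. From there, the remaining values.yaml settings can be reintroduced one at a time, re-running `check_cluster` after each change.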

github-actions[bot] commented 3 months ago

This Issue has been automatically marked as "stale" because it has not had recent activity (for 15 days). It will be closed if no further activity occurs. Thanks for the feedback.

github-actions[bot] commented 2 months ago

Due to the lack of activity in the last 5 days since it was marked as "stale", we proceed to close this Issue. Do not hesitate to reopen it later if necessary.

junminahn commented 2 months ago

I'm encountering the same issue, even with the latest Helm chart and replicaCount set to 2.