Closed Patrick-Remy closed 2 years ago
Hello,
I just tested the provided Dockerfile and everything seems to work fine.
## ## RabbitMQ 3.8.10
## ##
########## Copyright (c) 2007-2020 VMware, Inc. or its affiliates.
###### ##
########## Licensed under the MPL 2.0. Website: https://rabbitmq.com
Doc guides: https://rabbitmq.com/documentation.html
Support: https://rabbitmq.com/contact.html
Tutorials: https://rabbitmq.com/getstarted.html
Monitoring: https://rabbitmq.com/monitoring.html
...
2021-08-15 20:45:52.673 [info] <0.745.0> Server startup complete; 6 plugins started.
* rabbitmq_prometheus
* rabbitmq_delayed_message_exchange
* rabbitmq_message_deduplication
* rabbitmq_management
* rabbitmq_web_dispatch
* rabbitmq_management_agent
Same with 3.8.9
and 3.8.12
. Broker boots fine and I can create deduplication exchanges and queues.
It seems that it only crashes, if I enable also the plugin delayed_message_exchange
, with a configured exchange:
[
'type' => 'x-delayed-message',
'arguments' => [
'x-delayed-type' => 'x-message-deduplication',
'x-cache-size' => 10000
]
],
I am afraid the above is not well supported as it expects the two plugins to be started in the correct sequence. RabbitMQ has been recently changing the plugins start order and this might be the core problem leading to the issue.
I would recommend you to create a delayed exchange and a deduplication one separately and then bind them one to the other. In this way, you can achieve the same result without risking to suffer from plugins started in the wrong order.
You can see some discussions related to the plugins boot order messup in here: https://groups.google.com/g/rabbitmq-users/c/m7jgGmIk4oY/m/4tmLsVM0AwAJ
The commit you might be affected from is likely the following: https://github.com/rabbitmq/rabbitmq-server/commit/a0cd2e5fd033a7490e97ee46ea42b7b3581c3629
Thanks for your fast investigation, and response. I'll have to try your suggestion!
I put together a build which should be solving your issue. Would you mind giving it a try? You will need Erlang 24.
Yes, it works perfectly (for RabbitMQ >=3.8.16)! Many thanks for that!
Is there a roadmap for the next bugfix release, that includes this fix?
Any updates for a next release build?
Hello,
I am very busy atm. I will have more time from the beginning of November. You can expect the next release around mid-November.
Release 0.5.1
fixes this issue. Please re-open this ticket otherwise.
Again I am unfortunately not sure if it's related to this plugin, sorry :/
But with the new setup the „auto migration” by just recreating the container works well. But if I manually stop one node and try to start it again, rabbitmq now starts consuming multiple GB of RAM (like going in an endless loop). I posted the full debug log here:
https://pastebin.com/pfj3XfNB MPCVbDbfT6
Dockerfile
FROM rabbitmq:3.9.9-management-alpine AS production
RUN apk --no-cache add tzdata
# See https://github.com/noxdafox/rabbitmq-message-deduplication/releases
ARG PLUGIN_ELIXIR_VERSION=1.12.2
ARG PLUGIN_DEDUPLICATION_VERSION=0.5.1
# See https://github.com/rabbitmq/rabbitmq-delayed-message-exchange/releases
ARG PLUGIN_DELAYED_MESSAGES_VERSION=3.9.0
# Install and enable plugins
RUN wget -P $RABBITMQ_HOME/plugins/ https://github.com/noxdafox/rabbitmq-message-deduplication/releases/download/${PLUGIN_DEDUPLICATION_VERSION}/elixir-${PLUGIN_ELIXIR_VERSION}.ez \
&& wget -P $RABBITMQ_HOME/plugins/ https://github.com/noxdafox/rabbitmq-message-deduplication/releases/download/${PLUGIN_DEDUPLICATION_VERSION}/rabbitmq_message_deduplication-${PLUGIN_DEDUPLICATION_VERSION}.ez \
&& wget -P $RABBITMQ_HOME/plugins/ https://github.com/rabbitmq/rabbitmq-delayed-message-exchange/releases/download/${PLUGIN_DELAYED_MESSAGES_VERSION}/rabbitmq_delayed_message_exchange-${PLUGIN_DELAYED_MESSAGES_VERSION}.ez \
&& chown -R rabbitmq:rabbitmq $RABBITMQ_HOME/plugins/ \
&& rabbitmq-plugins enable rabbitmq_management rabbitmq_message_deduplication rabbitmq_delayed_message_exchange
I executed
/ # rabbitmqctl stop_app
Stopping rabbit application on node rabbit@worker1-rabbitmq ...
/ # rabbitmqctl join_cluster rabbit@core-rabbitmq
Clustering node rabbit@worker1-rabbitmq with rabbit@core-rabbitmq
The node is already a member of this cluster
/ # rabbitmqctl start_app
Starting node rabbit@worker1-rabbitmq ...
# After some seconds and consuming many GB of RAM..
Error: unable to perform an operation on node 'rabbit@worker1-rabbitmq'. Please see diagnostics information and suggestions below.
Most common reasons for this are:
* Target node is unreachable (e.g. due to hostname resolution, TCP connection or firewall issues)
* CLI tool fails to authenticate with the server (e.g. due to CLI tool's Erlang cookie not matching that of the server)
* Target node is not running
In addition to the diagnostics info below:
* See the CLI, clustering and networking guides on https://rabbitmq.com/documentation.html to learn more
* Consult server logs on node rabbit@worker1-rabbitmq
* If target node is configured to use long node names, don't forget to use --longnames with CLI tools
DIAGNOSTICS
===========
attempted to contact: ['rabbit@worker1-rabbitmq']
rabbit@worker1-rabbitmq:
* connected to epmd (port 4369) on worker1-rabbitmq
* epmd reports: node 'rabbit' not running at all
no other nodes on worker1-rabbitmq
* suggestion: start the node
Current node details:
* node name: 'rabbitmqcli-307-rabbit@worker1-rabbitmq'
* effective user's home directory: /var/lib/rabbitmq
* Erlang cookie hash: xxxxxxxx
/ # ERROR: 137
I would be really happy if you could check if this comes from the deduplication plugin, I see a bad_return_value
in context of the deduplication boot step (full log at pastebin, link above)
rabbitmq_1 | 2021-11-12 12:34:28.517122+01:00 [info] <0.929.0> Deduplication queues enabled, real BQ is rabbit_priority_queue
rabbitmq_1 | 2021-11-12 12:34:28.517122+01:00 [info] <0.929.0>
rabbitmq_1 | 2021-11-12 12:34:28.517122+01:00 [dbug] <0.929.0> logger: removed_failing_filter
rabbitmq_1 | 2021-11-12 12:34:28.517122+01:00 [dbug] <0.929.0> filter: {rmqlog_filter,{#Fun<rabbit_prelaunch_logging.14.111361164>,
rabbitmq_1 | 2021-11-12 12:34:28.517122+01:00 [dbug] <0.929.0> #{global => debug,upgrade => none}}}
rabbitmq_1 | 2021-11-12 12:34:28.517122+01:00 [dbug] <0.929.0> owner: rmq_2_stdout
rabbitmq_1 | 2021-11-12 12:34:28.517122+01:00 [dbug] <0.929.0> log_event: #{level => info,
rabbitmq_1 | 2021-11-12 12:34:28.517122+01:00 [dbug] <0.929.0> meta =>
rabbitmq_1 | 2021-11-12 12:34:28.517122+01:00 [dbug] <0.929.0> #{domain => [rabbitmq],
rabbitmq_1 | 2021-11-12 12:34:28.517122+01:00 [dbug] <0.929.0> gl => <0.928.0>,pid => <0.929.0>,
rabbitmq_1 | 2021-11-12 12:34:28.517122+01:00 [dbug] <0.929.0> time => 1636716868517122},
rabbitmq_1 | 2021-11-12 12:34:28.517122+01:00 [dbug] <0.929.0> msg =>
rabbitmq_1 | 2021-11-12 12:34:28.517122+01:00 [dbug] <0.929.0> {<<"Deduplication queues enabled, real BQ is ~s~n">>,
rabbitmq_1 | 2021-11-12 12:34:28.517122+01:00 [dbug] <0.929.0> [rabbit_priority_queue]}}
rabbitmq_1 | 2021-11-12 12:34:28.517122+01:00 [dbug] <0.929.0> reason: {bad_return_value,#{level => info,
rabbitmq_1 | 2021-11-12 12:34:28.517122+01:00 [dbug] <0.929.0> meta =>
rabbitmq_1 | 2021-11-12 12:34:28.517122+01:00 [dbug] <0.929.0> #{domain => [rabbitmq],
rabbitmq_1 | 2021-11-12 12:34:28.517122+01:00 [dbug] <0.929.0> gl => <0.928.0>,pid => <0.929.0>,
rabbitmq_1 | 2021-11-12 12:34:28.517122+01:00 [dbug] <0.929.0> time => 1636716868517122},
rabbitmq_1 | 2021-11-12 12:34:28.517122+01:00 [dbug] <0.929.0> msg =>
rabbitmq_1 | 2021-11-12 12:34:28.517122+01:00 [dbug] <0.929.0> {<<"Deduplication queues enabled, real BQ is ~s~n">>,
Thanks for your great support and this awesome plugin! :)
After upgrading my rabbitmq instance from 3.8.9 to 3.8.10 (or later), it crashes on startup with following log messages: I am using the latest v0.5.0 of this plugin.
Dockerfile