Closed: RobertFloor closed this issue 1 year ago
This is the unit file that works in version 1.3.1
[ansible@amq2 system]$ cat amq-broker.service
# Ansible managed
[Unit]
Description=amq-broker Apache ActiveMQ Service
After=network.target
RequiresMountsFor=/data/amq-broker/shared/mount
[Service]
Type=forking
EnvironmentFile=-/etc/sysconfig/amq-broker
PIDFile=/opt/amq/amq-broker/data/artemis.pid
ExecStart=/opt/amq/amq-broker/bin/artemis-service start
ExecStop=/opt/amq/amq-broker/bin/artemis-service stop
SuccessExitStatus = 0 143
RestartSec = 120
Restart = on-failure
LimitNOFILE=102642
TimeoutSec=600
ExecStartPost=/usr/bin/timeout 60 sh -c 'tail -f /opt/amq/amq-broker/log/artemis.log | sed "/AMQ221034/ q"'
[Install]
WantedBy=multi-user.target
The difference seems to be in the line ExecStartPost=/usr/bin/timeout 60 sh -c 'tail -f /opt/amq/amq-broker/log/artemis.log | sed "/AMQ221034/ q"'
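For comparison, my understanding is that 1.3.2 generates the same line but waits for a different log code, which the second broker presumably never logs within the timeout:
# generated by 1.3.2:
ExecStartPost=/usr/bin/timeout 60 sh -c 'tail -f /opt/amq/amq-broker/log/artemis.log | sed "/AMQ221001/ q"'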
[ansible@amq2 sysconfig]$ cat amq-broker
# Ansible managed
JAVA_ARGS='-Xms512M -Xmx2G -XX:+PrintClassHistogram -XX:+UseG1GC -XX:+UseStringDeduplication -Dhawtio.disableProxy=true -Dhawtio.realm=activemq -Dhawtio.offline=true -Dhawtio.rolePrincipalClasses=org.apache.activemq.artemis.spi.core.security.jaas.RolePrincipal -Djolokia.policyLocation=file:/opt/amq/amq-broker/etc/jolokia-access.xml'
JAVA_HOME=/usr/lib/jvm/java-11-openjdk-11.0.19.0.7-1.el8_7.x86_64
HAWTIO_ROLE='amq'
ARTEMIS_INSTANCE_URI='file:/opt/amq/amq-broker/'
ARTEMIS_INSTANCE_ETC_URI='file:/opt/amq/amq-broker/etc/'
ARTEMIS_HOME='/opt/amq/amq-broker-7.10.2'
ARTEMIS_INSTANCE='/opt/amq/amq-broker'
ARTEMIS_DATA_DIR='/data/amq-broker/shared/mount'
ARTEMIS_ETC_DIR='/opt/amq/amq-broker/etc'
Why was the log code changed from AMQ221034 to AMQ221001? I can confirm that changing it back to AMQ221034 fixes the problem.
Hello @RobertFloor; with 1.3.2 a master/backup shared-store policy is implemented. Formerly, two live-only masters would race on the live lock (in the shared store). That is still the default, and with this config you have found a bug, namely the wrong AMQ code; sorry about that.
But if you wish, you can switch to a proper master/backup setup by setting activemq_ha_role: 'master' for one node and activemq_ha_role: 'slave' for the other node.
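Something along these lines in the inventory host_vars would do (a sketch; the host names are only an example):
# host_vars/amq1.yml (first node)
activemq_ha_role: 'master'
# host_vars/amq2.yml (second node)
activemq_ha_role: 'slave'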
In that scenario, the systemd unit will respectively wait for:
AMQ221001: Apache ActiveMQ Artemis Message Broker version 2.21.0.redhat-00030 [amq-broker, nodeID=...] (for the master)
AMQ221109: Apache ActiveMQ Artemis Backup Server version 2.21.0.redhat-00030 [..] started, waiting live to fail before it gets active (for the backup)
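Concretely, for the backup node the generated ExecStartPost would then wait for AMQ221109 instead, roughly like this (a sketch based on the unit above):
ExecStartPost=/usr/bin/timeout 60 sh -c 'tail -f /opt/amq/amq-broker/log/artemis.log | sed "/AMQ221109/ q"'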
Hi, thanks for the answer and the fix. However, I believe this behavior is not desirable. Say we have an emergency and the master is down and stays down (OS corruption, hardware failure or something similar). The new setting means we would never be able to manage the slave using systemctl in that case. It is not guaranteed that the slave will always be the backup: sometimes the slave needs to be the active broker, and we still want to control it using systemctl (say, systemctl restart on the slave while the master is down).
The new setting means we would never be able to manage the slave using systemctl in that case.
I am not sure I follow the requirement here: the backup node would be managed the same as before, it will just emit different logging when its master goes down and it starts picking up the connections. In any case, the pre-change default will stay, both because the artemis create command does not set up an ha-policy when no HA role is passed (meaning it defaults to the XSD 'live-only'), and for backwards compatibility. There are a few issues at the moment between GitHub Actions, Molecule and Docker in our CI; as soon as that gets back under control, merging the linked PR should fix this issue.
I can't answer for the Ansible side of things, but I think your brokers are misconfigured: you can't have two live-only brokers using a shared store as a pair, or even clustered with other brokers. The choices you have are:
Clustered Masters
This is a group of master brokers, all live, clustered to distribute messages between them, where each broker has its own journal.
HA Pairs (Shared store or replicated)
A master and a slave broker where only the master is live; in the shared-store case the pair shares the same journal on a shared filesystem.
Live Only
A single distinct broker with its own journal, that is, with no backup and no other brokers.
Of course you can also have a cluster of HA Pairs
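If it helps, the ha-policy in each broker.xml for a shared-store HA pair looks roughly like this (a sketch, not taken from your generated configuration):
<!-- master broker.xml, inside <core> -->
<ha-policy>
  <shared-store>
    <master>
      <failover-on-shutdown>true</failover-on-shutdown>
    </master>
  </shared-store>
</ha-policy>
<!-- slave broker.xml, inside <core> -->
<ha-policy>
  <shared-store>
    <slave>
      <allow-failback>true</allow-failback>
    </slave>
  </shared-store>
</ha-policy>
A live-only broker instead ends up with <live-only/> (or no ha-policy at all), which is effectively what your two nodes have now.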
Hope that helps
SUMMARY
After updating to version 1.3.2 of the playbook I can no longer start both brokers via systemctl. I can start the master via systemctl, but the second broker fails to start. We are deploying a two-node shared-storage setup and have set up an NFS mount. With versions 1.3.0 and 1.3.1 the same playbook works fine.
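For reference, the shared journal directory is an NFS mount along these lines (a sketch; the server and export path are placeholders, not our real ones):
nfs-server.example.com:/export/amq-shared  /data/amq-broker/shared/mount  nfs4  defaults,_netdev  0 0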
ISSUE TYPE
ANSIBLE VERSION
COLLECTION VERSION
STEPS TO REPRODUCE
Install the brokers by running the playbook with the command ansible-playbook -e "activemq_version=7.10.2" -i hostfiles/AMQ-dev-shared-storage.yml playbooks/mount-nfs-install-broker.yml
EXPECTED RESULTS
I expect both brokers to start via systemctl during the playbook run.
ACTUAL RESULTS
The second broker fails to start via systemctl. I could not find a specific reason why the broker would not start.