ansible-middleware / amq

A collection to manage AMQ brokers
Apache License 2.0
17 stars 11 forks source link

Difference in ownership between different log files at different days. #75

Closed RobertFloor closed 1 year ago

RobertFloor commented 1 year ago
SUMMARY

Difference in ownership between different log files. AMQ creates a new log file each day. The log file in the directory /var/log/activemq/amq-broker are owner by either amq-broker or by root. For our monitoring solution it would be best if they are not owned by root. The initial file is owned by amq-broker, but later ones are owned by root.

We are running AMQ 7.11

#------
Collection                                Version
----------------------------------------- -------
ansible.posix                             1.5.2
community.general                         6.0.1
middleware_automation.amq                 1.3.5
middleware_automation.common              1.1.0
middleware_automation.redhat_csp_download 1.2.2
/var/log/activemq/amq-broker
bash-4.4$ ls -alh
total 172K
drwxr-xr-x. 2 amq-broker amq-broker  102 Jun  5 10:20 .
drwxrwxr-x. 9 amq-broker amq-broker   85 May 22 16:48 ..
-rw-r--r--. 1 root       root        49K Jun  5 13:43 artemis.log
-rw-r--r--. 1 amq-broker amq-broker  16K May 22 16:51 artemis.log.2023-05-22
-rw-r--r--. 1 root       root       103K Jun  2 15:38 artemis.log.2023-06-02
-rw-r--r--. 1 root       root          0 Jun  5 10:20 audit.log
ISSUE TYPE

improvement

RobertFloor commented 1 year ago

A second issue with logging, I cannot see failed login attempts in the artemis.log and the audit log is empty.

guidograzioli commented 1 year ago

It may mean the process runs as the root user after it's restarted in systemd, not good; about the audit, have you turned it on with activemq_enable_audit?

RobertFloor commented 1 year ago

It seems to be a difference between AMQ version 7.10 and 7.11. On 7.10 is is correct:

$ pwd
/opt/amq/amq-broker/log
$ ll
total 780
-rw-r--r--. 1 amq-broker amq-broker  18877 May 12 15:38 artemis.log
-rw-r--r--. 1 amq-broker amq-broker 778182 May 12 16:39 audit.log
RobertFloor commented 1 year ago

But this broker was not restarted with systemctl so it seems a systemctl restart amq-broker chnages ownership of the log files

RobertFloor commented 1 year ago

Any idear how we can fix this?

guidograzioli commented 1 year ago

Working on it; unfortunately it depends on the logging library change on 7.11 which went to Log4j 2 logging utility instead of the JBoss Logging framework.

guidograzioli commented 1 year ago

Hello Robert, there were several issues related to this one; first and foremost the log4j2 template would not be deployed at all (so the audit log would not honor the variable to enable it). Plus, the service was running as the root user, because at some point in time the ACTIVEMQ_USER environment variable has disappeared from the configuration file for the systemd unit. The attached PR, which I merged, tries to put everything back in place, while trying to be backwards compatible, I'd really appreciate if you could test the current HEAD in your environment (or I can make a release on galaxy eventually)

RobertFloor commented 1 year ago

I will check it

RobertFloor commented 1 year ago

Unfortuately systemd does not want to start it says amq-broker.service: Refusing to accept PID outside of service control group, acquired through unsafe symlink chain: /opt/amq/amq-broker/data/artemis.pid

I confirmed this is a bug in 1.3.6 because if I install 1.3.5 there is no problem.

RobertFloor commented 1 year ago
RUNNING HANDLER [middleware_automation.amq.activemq : Restart and enable instance amq-broker for activemq service] ***********************************************************************************
task path: /home/robert/.ansible/collections/ansible_collections/middleware_automation/amq/roles/activemq/tasks/restart.yml:2
fatal: [amq1]: FAILED! => changed=false
  msg: |-
    Unable to start service amq-broker: Job for amq-broker.service failed because the service did not take the steps required by its unit configuration.
    See "systemctl status amq-broker.service" and "journalctl -xe" for details.
fatal: [amq2]: FAILED! => changed=false
  msg: |-
    Unable to start service amq-broker: Job for amq-broker.service failed because the service did not take the steps required by its unit configuration.
    See "systemctl status amq-broker.service" and "journalctl -xe" for details.

    -------------------------------------------------------------------------------------------

    [ansible@amq1 ~]$ sudo systemctl status amq-broker.service
● amq-broker.service - amq-broker Apache ActiveMQ Service
   Loaded: loaded (/etc/systemd/system/amq-broker.service; enabled; vendor preset: disabled)
   Active: activating (auto-restart) (Result: protocol) since Wed 2023-06-21 15:20:32 UTC; 44s ago
  Process: 56327 ExecStop=/opt/amq/amq-broker/bin/artemis-service stop (code=exited, status=0/SUCCESS)
  Process: 56295 ExecStartPost=/usr/bin/timeout 60 sh -c tail -n 15 -f /opt/amq/amq-broker/log/artemis.log | sed "/AMQ221034/ q" && /bin/sleep 10 (code=exited, status=0/SUCCESS)
  Process: 56247 ExecStart=/opt/amq/amq-broker/bin/artemis-service start (code=exited, status=0/SUCCESS)
 Main PID: 43056 (code=exited, status=143)

Jun 21 15:20:32 amq1.test.local systemd[1]: amq-broker.service: Failed with result 'protocol'.
Jun 21 15:20:32 amq1.test.local systemd[1]: Failed to start amq-broker Apache ActiveMQ Service.
-------------------------------------------------------------------------------------------------

[ansible@amq1 ~]$ sudo journalctl -xe
-- Unit session-c21.scope has finished starting up.
--
-- The start-up result is done.
Jun 21 15:22:32 amq1.test.local sudo[56402]: pam_unix(sudo:session): session opened for user amq-broker by (uid=0)
Jun 21 15:22:33 amq1.test.local artemis-service[56399]: artemis-service is now running (56402)
Jun 21 15:22:33 amq1.test.local systemd[1]: amq-broker.service: Refusing to accept PID outside of service control group, acquired through unsafe symlink chain: /opt/amq/amq-broker/data/artemis.pid
Jun 21 15:22:33 amq1.test.local timeout[56449]: 2023-06-21 15:20:23,183 WARN  [org.apache.activemq.artemis.core.server] AMQ222269: Please use a fixed value for "journal-pool-files". Default changed>
Jun 21 15:22:33 amq1.test.local timeout[56449]: 2023-06-21 15:20:23,200 INFO  [org.apache.activemq.artemis.core.server] AMQ221082: Initializing metrics plugin com.redhat.amq.broker.core.server.metr>
Jun 21 15:22:33 amq1.test.local timeout[56449]: 2023-06-21 15:20:23,240 INFO  [org.apache.activemq.artemis.integration.bootstrap] AMQ101000: Starting ActiveMQ Artemis Server version 2.28.0.redhat-0>
Jun 21 15:22:33 amq1.test.local timeout[56449]: 2023-06-21 15:20:23,333 INFO  [org.apache.activemq.artemis.core.server] AMQ221000: live Message Broker is starting with configuration Broker Configur>
Jun 21 15:22:33 amq1.test.local timeout[56449]: 2023-06-21 15:20:23,334 INFO  [org.apache.activemq.artemis.core.server] AMQ221006: Waiting to obtain live lock
Jun 21 15:22:33 amq1.test.local timeout[56449]: 2023-06-21 15:20:23,402 INFO  [org.apache.activemq.artemis.core.server] AMQ221012: Using AIO Journal
Jun 21 15:22:33 amq1.test.local timeout[56449]: 2023-06-21 15:20:23,468 INFO  [org.apache.activemq.artemis.core.server] AMQ221057: Global Max Size is being adjusted to 1/2 of the JVM max size (-Xmx>
Jun 21 15:22:33 amq1.test.local timeout[56449]: 2023-06-21 15:20:23,547 INFO  [org.apache.activemq.artemis.core.server] AMQ221043: Protocol module found: [artemis-server]. Adding protocol support f>
Jun 21 15:22:33 amq1.test.local timeout[56449]: 2023-06-21 15:20:23,548 INFO  [org.apache.activemq.artemis.core.server] AMQ221043: Protocol module found: [artemis-amqp-protocol]. Adding protocol su>
Jun 21 15:22:33 amq1.test.local timeout[56449]: 2023-06-21 15:20:23,548 INFO  [org.apache.activemq.artemis.core.server] AMQ221043: Protocol module found: [artemis-hornetq-protocol]. Adding protocol>
Jun 21 15:22:33 amq1.test.local timeout[56449]: 2023-06-21 15:20:23,549 INFO  [org.apache.activemq.artemis.core.server] AMQ221043: Protocol module found: [artemis-mqtt-protocol]. Adding protocol su>
Jun 21 15:22:33 amq1.test.local timeout[56449]: 2023-06-21 15:20:23,549 INFO  [org.apache.activemq.artemis.core.server] AMQ221043: Protocol module found: [artemis-openwire-protocol]. Adding protoco>
Jun 21 15:22:33 amq1.test.local timeout[56449]: 2023-06-21 15:20:23,550 INFO  [org.apache.activemq.artemis.core.server] AMQ221043: Protocol module found: [artemis-stomp-protocol]. Adding protocol s>
Jun 21 15:22:33 amq1.test.local timeout[56449]: 2023-06-21 15:20:23,899 INFO  [org.apache.activemq.artemis.core.server] AMQ221034: Waiting indefinitely to obtain live lock
Jun 21 15:22:43 amq1.test.local systemd[1]: amq-broker.service: Refusing to accept PID outside of service control group, acquired through unsafe symlink chain: /opt/amq/amq-broker/data/artemis.pid
Jun 21 15:22:43 amq1.test.local systemd[1]: amq-broker.service: Refusing to accept PID outside of service control group, acquired through unsafe symlink chain: /opt/amq/amq-broker/data/artemis.pid
Jun 21 15:22:43 amq1.test.local artemis-service[56476]: Gracefully Stopping artemis-service
Jun 21 15:22:44 amq1.test.local sudo[56402]: pam_unix(sudo:session): session closed for user amq-broker
Jun 21 15:22:44 amq1.test.local systemd[1]: session-c21.scope: Succeeded.
-- Subject: Unit succeeded
-- Defined-By: systemd
-- Support: https://access.redhat.com/support
--
-- The unit session-c21.scope has successfully entered the 'dead' state.
Jun 21 15:22:44 amq1.test.local systemd[1]: amq-broker.service: Failed with result 'protocol'.
-- Subject: Unit failed
-- Defined-By: systemd
-- Support: https://access.redhat.com/support
--
-- The unit amq-broker.service has entered the 'failed' state with result 'protocol'.
Jun 21 15:22:44 amq1.test.local systemd[1]: Failed to start amq-broker Apache ActiveMQ Service.
-- Subject: Unit amq-broker.service has failed
-- Defined-By: systemd
-- Support: https://access.redhat.com/support
--
-- Unit amq-broker.service has failed.
--
-- The result is failed.
Jun 21 15:22:44 amq1.test.local sudo[56491]:  ansible : TTY=pts/0 ; PWD=/home/ansible ; USER=root ; COMMAND=/bin/journalctl -xe
Jun 21 15:22:44 amq1.test.local sudo[56491]: pam_unix(sudo:session): session opened for user root by ansible(uid=0)
RobertFloor commented 1 year ago
[ansible@amq1 system]$ cat amq-broker.service
# Ansible managed
[Unit]
Description=amq-broker Apache ActiveMQ Service
After=network.target
RequiresMountsFor=/data/amq-broker/shared/mount

[Service]
Type=forking
EnvironmentFile=-/etc/sysconfig/amq-broker
PIDFile=/opt/amq/amq-broker/data/artemis.pid
ExecStart=/opt/amq/amq-broker/bin/artemis-service start
ExecStop=/opt/amq/amq-broker/bin/artemis-service stop
SuccessExitStatus = 0 143
RestartSec = 120
Restart = on-failure
LimitNOFILE=102642
TimeoutSec=600
ExecStartPost=/usr/bin/timeout 60 sh -c 'tail -n 15 -f /opt/amq/amq-broker/log/artemis.log | sed "/AMQ221034/ q" && /bin/sleep 10'

[Install]
WantedBy=multi-user.target
[ansible@amq1 system]$ cat /opt/amq/amq-broker/data/artemis.pid
cat: /opt/amq/amq-broker/data/artemis.pid: Permission denied
[ansible@amq1 system]$ sudo cat /opt/amq/amq-broker/data/artemis.pid
cat: /opt/amq/amq-broker/data/artemis.pid: No such file or directory
[ansible@amq1 system]$ ls  /opt/amq/amq-broker/data/
ls: cannot access '/opt/amq/amq-broker/data/': Permission denied
[ansible@amq1 system]$ sudo ls  /opt/amq/amq-broker/data/
[ansible@amq1 system]$ sudo ls -alh  /opt/amq/amq-broker/data/
total 0
drwxr-xr-x. 2 root       root        6 Jun 21 15:27 .
drwxr-x---. 9 amq-broker amq-broker 85 Jun 21 15:18 ..
[ansible@amq1 system]$
guidograzioli commented 1 year ago

I am trying to reproduce but at the moment I couldn't. Can you verify no symlink is in the path to pidfile please? /opt/amq/amq-broker/data/artemis.pid

Also, I'd need to know the owner:group of the data directory and the contents of /etc/sysconfig/amq-broker if possible

RobertFloor commented 1 year ago
[root@amq2 data]# cat /etc/sysconfig/amq-broker
# Ansible managed
JAVA_ARGS='-Xms512M -Xmx2G -XX:+PrintClassHistogram -XX:+UseG1GC -XX:+UseStringDeduplication -Dhawtio.disableProxy=true -Dhawtio.realm=activemq -Dhawtio.offline=true -Dhawtio.rolePrincipalClasses=org.apache.activemq.artemis.spi.core.security.jaas.RolePrincipal -Djolokia.policyLocation=file:/opt/amq/amq-broker/etc/jolokia-access.xml'
JAVA_HOME=/usr/lib/jvm/java-11-openjdk-11.0.19.0.7-4.el8.x86_64
HAWTIO_ROLE='amq'
ARTEMIS_INSTANCE_URI='file:/opt/amq/amq-broker/'
ARTEMIS_INSTANCE_ETC_URI='file:/opt/amq/amq-broker/etc/'
ARTEMIS_HOME='/opt/amq/amq-broker-7.11.0'
ARTEMIS_INSTANCE='/opt/amq/amq-broker'
ARTEMIS_DATA_DIR='/data/amq-broker/shared/mount'
ARTEMIS_ETC_DIR='/opt/amq/amq-broker/etc'
RobertFloor commented 1 year ago

I only see a symlink here but not in the directories itself.

/opt/amq
[root@amq2 amq]# ls -l
total 59640
drwxr-x---. 9 amq-broker amq-broker       85 Jun 21 18:53 amq-broker
lrwxrwxrwx. 1 root       root             44 Jun 21 18:51 amq-broker-7.11.0 -> /opt/amq/apache-artemis-2.28.0.redhat-00003/
-rwxr-x---. 1 amq-broker amq-broker 61070103 Jun 21 18:51 amq-broker-7.11.0-bin.zip
drwxr-xr-x. 9 amq-broker amq-broker      149 Jun 21 18:51 apache-artemis-2.28.0.redhat-00003

Owner of the data dir

[root@amq2 data]# ls -alh
total 0
drwxr-xr-x. 2 root       root        6 Jun 21 19:08 .
drwxr-x---. 9 amq-broker amq-broker 85 Jun 21 18:53 ..
[root@amq2 data]# pwd
/opt/amq/amq-broker/data
[root@amq2 data]#
RobertFloor commented 1 year ago

I can confirm again that 1.3.5 is working fine with the same playbook. Is there a way to compare the two releases file by file?

RobertFloor commented 1 year ago

If I removed this line from/etc/sysconfig/amq-broker, systemctl start amq-broker starts again and I see no adverse affects ARTEMIS_USER=amq-broker

Likely linked to this part ofbin/artemis-service

  status
  if [ $? -eq 0 ] ; then
    echo "Already running."
    return 1
  fi

  if [ -z "$ARTEMIS_USER" -o `id -un` = "$ARTEMIS_USER" ] ; then
    nohup "${ARTEMIS_INSTANCE}/bin/artemis" run > /dev/null 2> /dev/null &
  else
    sudo -n -u ${ARTEMIS_USER} nohup "${ARTEMIS_INSTANCE}/bin/artemis" run > /dev/null 2> /dev/null &
  fi
guidograzioli commented 1 year ago

So the line: ARTEMIS_USER=amq-broker is in your /etc/sysconfig/amq-broker.conf or not? Ideally, the pidfile would be owned by root, the systemd unit would be run by root too, while the artemis scripts does the user switching from root to amq-broker.

But at this point, I am tending towards making all run under amq-broker user

Can you please report:

uname -a
rpm -qa| grep systemd
guidograzioli commented 1 year ago

Can you try HEAD after this merged PR?

RobertFloor commented 1 year ago

Thanks I can confirm it works now :) Also the log files are owned by amq-broker now and I see data in the audit log after setting activemq_enable_audit: true

[root@amq1 log]# pwd
/opt/amq/amq-broker/log
[root@amq1 log]# ls -alh
total 12K
drwxr-xr-x. 2 amq-broker amq-broker   42 Jun 22 14:35 .
drwxr-xr-x. 9 amq-broker amq-broker   85 Jun 22 14:35 ..
-rw-r--r--. 1 amq-broker amq-broker 8.0K Jun 22 14:38 artemis.log
-rw-r--r--. 1 amq-broker amq-broker 3.7K Jun 22 14:38 audit.log
guidograzioli commented 1 year ago

Fair enough; CI tests are also good so I'll release. Thanks for the help!