Open hyderaliva opened 2 months ago
Can you validate this on a current version of Graylog (v6.0)?
Took a quick look at this and can confirm that the `graylog-sidecar` service does stay in an active (running) state even if it loses connectivity with its Graylog cluster. This appears to be working as designed though, as the service itself remains running so that it can continue to retry communication with its Graylog cluster.
I suggest 2 things to monitor Graylog cluster health:
`/var/log/graylog-sidecar/sidecar.log` (for example, a script running as a cron job to check for errors). When the sidecar is unable to connect to its Graylog cluster, it logs the following:
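As a rough sketch of the cron-job check suggested above: the exact error text is not shown here, so the `level=error` pattern is an assumption about the log format and should be replaced with whatever the sidecar actually writes on a connection failure. The names `recent_errors`, `main`, and `TAIL_LINES` are hypothetical helpers, not part of Graylog Sidecar.

```python
"""Sketch of a cron-friendly check for errors in the Graylog Sidecar log."""
import re
import sys
from collections import deque

# Log path as given in the comment above.
LOG_PATH = "/var/log/graylog-sidecar/sidecar.log"
# ASSUMPTION: error lines contain "level=error"; adjust to the real message.
ERROR_PATTERN = re.compile(r"level=error")
# Hypothetical window: only inspect the most recent lines of the file,
# so old errors do not trigger an alert forever.
TAIL_LINES = 200


def recent_errors(lines, pattern=ERROR_PATTERN, tail=TAIL_LINES):
    """Return the lines among the last `tail` lines that match `pattern`."""
    last = deque(lines, maxlen=tail)
    return [line.rstrip("\n") for line in last if pattern.search(line)]


def main(path=LOG_PATH):
    """Print matching lines; return 0 if none, 1 if any, 2 if unreadable."""
    try:
        with open(path, encoding="utf-8", errors="replace") as fh:
            errors = recent_errors(fh)
    except OSError as exc:
        print(f"cannot read {path}: {exc}", file=sys.stderr)
        return 2
    for line in errors:
        print(line)
    return 1 if errors else 0
```

Ending the script with `sys.exit(main())` and running it from cron would give an exit status that a wrapper or alerting tool can act on. Looking only at the tail of the file is a simplification; tracking the last-read offset would avoid both repeated and missed alerts.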
Problem description
The Graylog Sidecar `systemctl status` always shows 'running,' even when there is a connectivity issue between the Graylog API server and the Graylog Sidecar agent. As a result, we lose critical event logs in the Graylog web console. We currently use `systemctl status` to monitor the Graylog Sidecar agent, but this approach seems ineffective.
Please suggest a suitable method to monitor the graylog-sidecar agent, ensuring issues are addressed promptly and critical events are not missed in the Graylog console.
The graylog-sidecar config and systemd unit files are as follows:
sidecar.yml
graylog-sidecar.service
```
[Unit]
Description=Wrapper service for Graylog controlled collector
ConditionFileIsExecutable=/usr/bin/graylog-sidecar

[Service]
StartLimitInterval=5
StartLimitBurst=10
ExecStart=/usr/bin/graylog-sidecar
Restart=always
RestartSec=120
EnvironmentFile=-/etc/sysconfig/graylog-sidecar

[Install]
WantedBy=multi-user.target
```
Environment
Thanks, Hyder