codership / galera-manager-support

Galera Manager Support Repository

No monitoring or logs after upgrade from 1.6.5 to 1.8.3 #92

Open dribblecastle opened 5 months ago

dribblecastle commented 5 months ago

I've been testing a few upgrades in our staging cluster (monitoring only) and have found that I can't get any monitoring stats or logs to populate when running Galera Manager. This includes the green status text (synced/donor). I've tried several times to delete and recreate the cluster, but the issue persists.

The deployment logs seem to indicate success, but when I check the status of the telegraf service on my nodes, it shows the output below. The username (gmd) and password seem to be set properly. Note that I've also tried purging telegraf and deleting /etc/galera-manager-node in an attempt to get a "fresh" start for the node. The logs below seem to indicate that telegraf hasn't been able to authenticate.

● telegraf.service - Telegraf
     Loaded: loaded (/lib/systemd/system/telegraf.service; enabled; vendor preset: enabled)
     Active: active (running) since Tue 2024-02-13 17:01:47 PST; 6s ago
       Docs: https://github.com/influxdata/telegraf
   Main PID: 450163 (telegraf)
      Tasks: 20 (limit: 14249)
     Memory: 39.1M
     CGroup: /system.slice/telegraf.service
             ├─450163 /usr/bin/telegraf -config /etc/telegraf/telegraf.conf -config-directory /etc/telegraf/telegraf.d
             └─450185 /usr/local/bin/mysql_wsrep -config /etc/telegraf/mysql_wsrep-telegraf-plugin.conf

Feb 13 17:01:47 mariadb-stage01 systemd[1]: Started Telegraf.
Feb 13 17:01:47 mariadb-stage01 telegraf[450163]: 2024-02-14T01:01:47Z I! [agent] Config: Interval:1s, Quiet:false, Hostname:"mariadb-stage01", Flush Interval:1s
Feb 13 17:01:47 mariadb-stage01 telegraf[450163]: 2024-02-14T01:01:47Z W! [outputs.influxdb] When writing to [http://mariadb-stage-manager.contoso.com:8081]: database "gmd" creation failed: 401 Unauthorized
Feb 13 17:01:47 mariadb-stage01 telegraf[450163]: 2024-02-14T01:01:47Z I! [inputs.execd] Starting process: /usr/local/bin/mysql_wsrep [-config /etc/telegraf/mysql_wsrep-telegraf-plugin.conf]
Feb 13 17:01:48 mariadb-stage01 telegraf[450163]: 2024-02-14T01:01:48Z E! [outputs.influxdb] E! [outputs.influxdb] Failed to write metric (will be dropped: 401 Unauthorized):
Feb 13 17:01:49 mariadb-stage01 telegraf[450163]: 2024-02-14T01:01:49Z E! [outputs.influxdb] E! [outputs.influxdb] Failed to write metric (will be dropped: 401 Unauthorized):
Feb 13 17:01:50 mariadb-stage01 telegraf[450163]: 2024-02-14T01:01:50Z E! [outputs.influxdb] E! [outputs.influxdb] Failed to write metric (will be dropped: 401 Unauthorized):
Feb 13 17:01:51 mariadb-stage01 telegraf[450163]: 2024-02-14T01:01:51Z E! [outputs.influxdb] E! [outputs.influxdb] Failed to write metric (will be dropped: 401 Unauthorized):
Feb 13 17:01:52 mariadb-stage01 telegraf[450163]: 2024-02-14T01:01:52Z E! [outputs.influxdb] E! [outputs.influxdb] Failed to write metric (will be dropped: 401 Unauthorized):
Feb 13 17:01:53 mariadb-stage01 telegraf[450163]: 2024-02-14T01:01:53Z E! [outputs.influxdb] E! [outputs.influxdb] Failed to write metric (will be dropped: 401 Unauthorized):

Has anyone had any issues with the versions below? I know they are relatively new releases.

Before upgrade: MariaDB 10.4.28, Galera4 26.4.14, GMD 1.6.5

After upgrade (where I have the problem): MariaDB 10.4.33, Galera4 26.4.16, GMD 1.8.3
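
For anyone hitting the same 401 lines, one way to check the credentials outside of telegraf is to send the same kind of query yourself. The sketch below is only an illustration: it assumes the endpoint on port 8081 accepts InfluxDB 1.x style `/query` requests with HTTP basic auth (inferred from telegraf's outputs.influxdb plugin trying to create the "gmd" database), and the function names `build_query_request` and `query_status` are made up for this example.

```python
import base64
import urllib.error
import urllib.parse
import urllib.request


def build_query_request(base_url, username, password):
    """Build a GET request for SHOW DATABASES with HTTP basic auth.

    Assumption: the endpoint speaks the InfluxDB 1.x /query API.
    """
    query = urllib.parse.urlencode({"q": "SHOW DATABASES"})
    url = base_url.rstrip("/") + "/query?" + query
    token = base64.b64encode(f"{username}:{password}".encode()).decode()
    return urllib.request.Request(url, headers={"Authorization": "Basic " + token})


def query_status(base_url, username, password, timeout=5.0):
    """Return the HTTP status code the endpoint gives for these credentials."""
    request = build_query_request(base_url, username, password)
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return response.status
    except urllib.error.HTTPError as err:
        # A 401 here would match the "401 Unauthorized" lines in the telegraf log.
        return err.code


# Example call from a node (the password is a placeholder):
# query_status("http://mariadb-stage-manager.contoso.com:8081", "gmd", "<password>")
```

A 401 from this call would point at the credentials or at whatever is answering on that address. If the call raises `urllib.error.URLError` instead of returning a status, the request never reached an HTTP server, which points at name resolution or connectivity rather than the password.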

dribblecastle commented 5 months ago

Well, I believe I found my issue. I had an odd DNS config set on my staging servers, which prevented proper communication between the services.
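
In case it helps someone else, a quick check along these lines can be run on each node to see what the manager hostname resolves to and whether the port is reachable. It is a minimal sketch using only the Python standard library; the helper names `resolve_host` and `can_connect` are hypothetical, and the hostname and port in the example call are simply the ones from the telegraf log above.

```python
import socket


def resolve_host(host, port):
    """Return the sorted IP addresses the local resolver gives for host."""
    try:
        infos = socket.getaddrinfo(host, port, proto=socket.IPPROTO_TCP)
    except socket.gaierror:
        # The name did not resolve at all on this machine.
        return []
    # Each entry is (family, type, proto, canonname, sockaddr); sockaddr[0] is the IP.
    return sorted({info[4][0] for info in infos})


def can_connect(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


# Example calls from a node:
# resolve_host("mariadb-stage-manager.contoso.com", 8081)
# can_connect("mariadb-stage-manager.contoso.com", 8081)
```

Comparing the addresses returned on each node with the manager's real address shows whether a node is resolving the name to the wrong host, which is the kind of DNS misconfiguration described above.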

byte commented 1 month ago

@dribblecastle I presume that this is now fixed for you and you're happy using Galera Manager?