ClusterLabs / anvil

The Anvil! Intelligent Availability™ Platform, mark 3

striker-ui-api daemon dies on anvil-striker install #517

Closed by digimer 9 months ago

digimer commented 10 months ago

I did a fresh build, and after installing the anvil-striker RPM on two hosts, the striker-ui-api daemon was in a failed state on both:

[root@an-striker01 ~]# systemctl status striker-ui-api.service 
● striker-ui-api.service - Anvil! Intelligent Availability Platform - Striker UI API
   Loaded: loaded (/usr/lib/systemd/system/striker-ui-api.service; enabled; vendor preset: disabled)
   Active: failed (Result: exit-code) since Thu 2023-11-02 19:08:21 EDT; 11min ago
 Main PID: 32319 (code=exited, status=0/SUCCESS)

Nov 02 19:08:21 an-striker01.alteeve.com systemd[1]: striker-ui-api.service: Service RestartSec=100ms expired, scheduling restart.
Nov 02 19:08:21 an-striker01.alteeve.com systemd[1]: striker-ui-api.service: Scheduled restart job, restart counter is at 6.
Nov 02 19:08:21 an-striker01.alteeve.com systemd[1]: Stopped Anvil! Intelligent Availability Platform - Striker UI API.
Nov 02 19:08:21 an-striker01.alteeve.com systemd[1]: striker-ui-api.service: Start request repeated too quickly.
Nov 02 19:08:21 an-striker01.alteeve.com systemd[1]: striker-ui-api.service: Failed with result 'exit-code'.
Nov 02 19:08:21 an-striker01.alteeve.com systemd[1]: Failed to start Anvil! Intelligent Availability Platform - Striker UI API.

The journald logs will be in the first comment.
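
The status output above shows systemd giving up once the restart counter reached 6. As a rough aid for confirming that count from the journal, the sketch below tallies the "Scheduled restart job" lines. `count_scheduled_restarts` is a hypothetical helper name, not part of the Anvil! tooling.

```shell
# Hypothetical helper: read journal text on stdin and print how many
# restarts systemd scheduled for the unit.
count_scheduled_restarts() {
    # grep -c prints 0 but exits non-zero when nothing matches,
    # so "|| true" keeps the helper's exit status clean.
    grep -c 'Scheduled restart job' || true
}

# Example (run on the affected host):
#   journalctl -b 0 -u striker-ui-api | count_scheduled_restarts
```

Piping the boot's journal for the unit through the helper should print the same number that the last "restart counter is at N" line reports.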

digimer commented 10 months ago

[root@an-striker01 ~]# journalctl -b 0 -u striker-ui-api
-- Logs begin at Thu 2023-11-02 18:41:20 EDT, end at Thu 2023-11-02 19:19:27 EDT. --
Nov 02 19:08:13 an-striker01.alteeve.com systemd[1]: Started Anvil! Intelligent Availability Platform - Striker UI API.
Nov 02 19:08:13 an-striker01.alteeve.com node[31920]: Starting anvil-access-module daemon with options: {
Nov 02 19:08:13 an-striker01.alteeve.com node[31920]:   "gid": 1001,
Nov 02 19:08:13 an-striker01.alteeve.com node[31920]:   "stdio": "pipe",
Nov 02 19:08:13 an-striker01.alteeve.com node[31920]:   "timeout": 10000,
Nov 02 19:08:13 an-striker01.alteeve.com node[31920]:   "uid": 1001
Nov 02 19:08:13 an-striker01.alteeve.com node[31920]: }
Nov 02 19:08:13 an-striker01.alteeve.com node[31920]: Starting anvil-access-module daemon with options: {
Nov 02 19:08:13 an-striker01.alteeve.com node[31920]:   "gid": 0,
Nov 02 19:08:13 an-striker01.alteeve.com node[31920]:   "stdio": "pipe",
Nov 02 19:08:13 an-striker01.alteeve.com node[31920]:   "timeout": 10000,
Nov 02 19:08:13 an-striker01.alteeve.com node[31920]:   "uid": 0
Nov 02 19:08:13 an-striker01.alteeve.com node[31920]: }
Nov 02 19:08:13 an-striker01.alteeve.com node[31920]: Access interact: {
Nov 02 19:08:13 an-striker01.alteeve.com node[31920]:   "script": "21b1106c-326d-4210-9127-188e267b6e65 r SELECT variable_value FROM variables WHERE variable_name = 'striker-ui-api::session::secret';\n"
Nov 02 19:08:13 an-striker01.alteeve.com node[31920]: }
Nov 02 19:08:13 an-striker01.alteeve.com node[31920]: Starting process with ownership 0:0
Nov 02 19:08:14 an-striker01.alteeve.com kill[32029]: kill: cannot find process ""
Nov 02 19:08:14 an-striker01.alteeve.com systemd[1]: striker-ui-api.service: Control process exited, code=exited status=1
Nov 02 19:08:14 an-striker01.alteeve.com systemd[1]: striker-ui-api.service: Failed with result 'exit-code'.
Nov 02 19:08:14 an-striker01.alteeve.com systemd[1]: striker-ui-api.service: Service RestartSec=100ms expired, scheduling restart.
Nov 02 19:08:14 an-striker01.alteeve.com systemd[1]: striker-ui-api.service: Scheduled restart job, restart counter is at 1.
Nov 02 19:08:14 an-striker01.alteeve.com systemd[1]: Stopped Anvil! Intelligent Availability Platform - Striker UI API.
Nov 02 19:08:14 an-striker01.alteeve.com systemd[1]: Started Anvil! Intelligent Availability Platform - Striker UI API.
Nov 02 19:08:15 an-striker01.alteeve.com node[32033]: Starting anvil-access-module daemon with options: {
Nov 02 19:08:15 an-striker01.alteeve.com node[32033]:   "gid": 1001,
Nov 02 19:08:15 an-striker01.alteeve.com node[32033]:   "stdio": "pipe",
Nov 02 19:08:15 an-striker01.alteeve.com node[32033]:   "timeout": 10000,
Nov 02 19:08:15 an-striker01.alteeve.com node[32033]:   "uid": 1001
Nov 02 19:08:15 an-striker01.alteeve.com node[32033]: }
Nov 02 19:08:15 an-striker01.alteeve.com node[32033]: Starting anvil-access-module daemon with options: {
Nov 02 19:08:15 an-striker01.alteeve.com node[32033]:   "gid": 0,
Nov 02 19:08:15 an-striker01.alteeve.com node[32033]:   "stdio": "pipe",
Nov 02 19:08:15 an-striker01.alteeve.com node[32033]:   "timeout": 10000,
Nov 02 19:08:15 an-striker01.alteeve.com node[32033]:   "uid": 0
Nov 02 19:08:15 an-striker01.alteeve.com node[32033]: }
Nov 02 19:08:15 an-striker01.alteeve.com node[32033]: Access interact: {
Nov 02 19:08:15 an-striker01.alteeve.com node[32033]:   "script": "5d2c795c-5434-4554-8776-88223b250d92 r SELECT variable_value FROM variables WHERE variable_name = 'striker-ui-api::session::secret';\n"
Nov 02 19:08:15 an-striker01.alteeve.com node[32033]: }
Nov 02 19:08:15 an-striker01.alteeve.com node[32033]: Starting process with ownership 0:0
Nov 02 19:08:15 an-striker01.alteeve.com kill[32078]: kill: cannot find process ""
Nov 02 19:08:15 an-striker01.alteeve.com systemd[1]: striker-ui-api.service: Control process exited, code=exited status=1
Nov 02 19:08:15 an-striker01.alteeve.com systemd[1]: striker-ui-api.service: Failed with result 'exit-code'.
Nov 02 19:08:15 an-striker01.alteeve.com systemd[1]: striker-ui-api.service: Service RestartSec=100ms expired, scheduling restart.
Nov 02 19:08:15 an-striker01.alteeve.com systemd[1]: striker-ui-api.service: Scheduled restart job, restart counter is at 2.
Nov 02 19:08:15 an-striker01.alteeve.com systemd[1]: Stopped Anvil! Intelligent Availability Platform - Striker UI API.
Nov 02 19:08:15 an-striker01.alteeve.com systemd[1]: Started Anvil! Intelligent Availability Platform - Striker UI API.
Nov 02 19:08:16 an-striker01.alteeve.com node[32080]: Starting anvil-access-module daemon with options: {
Nov 02 19:08:16 an-striker01.alteeve.com node[32080]:   "gid": 1001,
Nov 02 19:08:16 an-striker01.alteeve.com node[32080]:   "stdio": "pipe",
Nov 02 19:08:16 an-striker01.alteeve.com node[32080]:   "timeout": 10000,
Nov 02 19:08:16 an-striker01.alteeve.com node[32080]:   "uid": 1001
Nov 02 19:08:16 an-striker01.alteeve.com node[32080]: }
Nov 02 19:08:16 an-striker01.alteeve.com node[32080]: Starting anvil-access-module daemon with options: {
Nov 02 19:08:16 an-striker01.alteeve.com node[32080]:   "gid": 0,
Nov 02 19:08:16 an-striker01.alteeve.com node[32080]:   "stdio": "pipe",
Nov 02 19:08:16 an-striker01.alteeve.com node[32080]:   "timeout": 10000,
Nov 02 19:08:16 an-striker01.alteeve.com node[32080]:   "uid": 0
Nov 02 19:08:16 an-striker01.alteeve.com node[32080]: }
Nov 02 19:08:16 an-striker01.alteeve.com node[32080]: Access interact: {
Nov 02 19:08:16 an-striker01.alteeve.com node[32080]:   "script": "a50a9057-57c1-42dc-899e-5fd90469013d r SELECT variable_value FROM variables WHERE variable_name = 'striker-ui-api::session::secret';\n"
Nov 02 19:08:16 an-striker01.alteeve.com node[32080]: }
Nov 02 19:08:16 an-striker01.alteeve.com node[32080]: Starting process with ownership 0:0
Nov 02 19:08:17 an-striker01.alteeve.com systemd[1]: striker-ui-api.service: Control process exited, code=exited status=1
Nov 02 19:08:17 an-striker01.alteeve.com kill[32138]: kill: cannot find process ""
Nov 02 19:08:17 an-striker01.alteeve.com systemd[1]: striker-ui-api.service: Failed with result 'exit-code'.
Nov 02 19:08:17 an-striker01.alteeve.com systemd[1]: striker-ui-api.service: Service RestartSec=100ms expired, scheduling restart.
Nov 02 19:08:17 an-striker01.alteeve.com systemd[1]: striker-ui-api.service: Scheduled restart job, restart counter is at 3.
Nov 02 19:08:17 an-striker01.alteeve.com systemd[1]: Stopped Anvil! Intelligent Availability Platform - Striker UI API.
Nov 02 19:08:17 an-striker01.alteeve.com systemd[1]: Started Anvil! Intelligent Availability Platform - Striker UI API.
Nov 02 19:08:18 an-striker01.alteeve.com node[32140]: Starting anvil-access-module daemon with options: {
Nov 02 19:08:18 an-striker01.alteeve.com node[32140]:   "gid": 1001,
Nov 02 19:08:18 an-striker01.alteeve.com node[32140]:   "stdio": "pipe",
Nov 02 19:08:18 an-striker01.alteeve.com node[32140]:   "timeout": 10000,
Nov 02 19:08:18 an-striker01.alteeve.com node[32140]:   "uid": 1001
Nov 02 19:08:18 an-striker01.alteeve.com node[32140]: }
Nov 02 19:08:18 an-striker01.alteeve.com node[32140]: Starting anvil-access-module daemon with options: {
Nov 02 19:08:18 an-striker01.alteeve.com node[32140]:   "gid": 0,
Nov 02 19:08:18 an-striker01.alteeve.com node[32140]:   "stdio": "pipe",
Nov 02 19:08:18 an-striker01.alteeve.com node[32140]:   "timeout": 10000,
Nov 02 19:08:18 an-striker01.alteeve.com node[32140]:   "uid": 0
Nov 02 19:08:18 an-striker01.alteeve.com node[32140]: }
Nov 02 19:08:18 an-striker01.alteeve.com node[32140]: Access interact: {
Nov 02 19:08:18 an-striker01.alteeve.com node[32140]:   "script": "bfc82275-96fc-4cd2-a9bf-353424f67add r SELECT variable_value FROM variables WHERE variable_name = 'striker-ui-api::session::secret';\n"
Nov 02 19:08:18 an-striker01.alteeve.com node[32140]: }
Nov 02 19:08:18 an-striker01.alteeve.com node[32140]: Starting process with ownership 0:0
Nov 02 19:08:18 an-striker01.alteeve.com kill[32213]: kill: cannot find process ""
Nov 02 19:08:18 an-striker01.alteeve.com systemd[1]: striker-ui-api.service: Control process exited, code=exited status=1
Nov 02 19:08:18 an-striker01.alteeve.com systemd[1]: striker-ui-api.service: Failed with result 'exit-code'.
Nov 02 19:08:19 an-striker01.alteeve.com systemd[1]: striker-ui-api.service: Service RestartSec=100ms expired, scheduling restart.
Nov 02 19:08:19 an-striker01.alteeve.com systemd[1]: striker-ui-api.service: Scheduled restart job, restart counter is at 4.
Nov 02 19:08:19 an-striker01.alteeve.com systemd[1]: Stopped Anvil! Intelligent Availability Platform - Striker UI API.
Nov 02 19:08:19 an-striker01.alteeve.com systemd[1]: Started Anvil! Intelligent Availability Platform - Striker UI API.
Nov 02 19:08:19 an-striker01.alteeve.com node[32245]: Starting anvil-access-module daemon with options: {
Nov 02 19:08:19 an-striker01.alteeve.com node[32245]:   "gid": 1001,
Nov 02 19:08:19 an-striker01.alteeve.com node[32245]:   "stdio": "pipe",
Nov 02 19:08:19 an-striker01.alteeve.com node[32245]:   "timeout": 10000,
Nov 02 19:08:19 an-striker01.alteeve.com node[32245]:   "uid": 1001
Nov 02 19:08:19 an-striker01.alteeve.com node[32245]: }
Nov 02 19:08:19 an-striker01.alteeve.com node[32245]: Starting anvil-access-module daemon with options: {
Nov 02 19:08:19 an-striker01.alteeve.com node[32245]:   "gid": 0,
Nov 02 19:08:19 an-striker01.alteeve.com node[32245]:   "stdio": "pipe",
Nov 02 19:08:19 an-striker01.alteeve.com node[32245]:   "timeout": 10000,
Nov 02 19:08:19 an-striker01.alteeve.com node[32245]:   "uid": 0
Nov 02 19:08:19 an-striker01.alteeve.com node[32245]: }
Nov 02 19:08:19 an-striker01.alteeve.com node[32245]: Access interact: {
Nov 02 19:08:19 an-striker01.alteeve.com node[32245]:   "script": "d974305b-5754-4843-81b3-df77e3c78fc3 r SELECT variable_value FROM variables WHERE variable_name = 'striker-ui-api::session::secret';\n"
Nov 02 19:08:19 an-striker01.alteeve.com node[32245]: }
Nov 02 19:08:19 an-striker01.alteeve.com node[32245]: Starting process with ownership 0:0
Nov 02 19:08:20 an-striker01.alteeve.com kill[32317]: kill: cannot find process ""
Nov 02 19:08:20 an-striker01.alteeve.com systemd[1]: striker-ui-api.service: Control process exited, code=exited status=1
Nov 02 19:08:20 an-striker01.alteeve.com systemd[1]: striker-ui-api.service: Failed with result 'exit-code'.
Nov 02 19:08:20 an-striker01.alteeve.com systemd[1]: striker-ui-api.service: Service RestartSec=100ms expired, scheduling restart.
Nov 02 19:08:20 an-striker01.alteeve.com systemd[1]: striker-ui-api.service: Scheduled restart job, restart counter is at 5.
Nov 02 19:08:20 an-striker01.alteeve.com systemd[1]: Stopped Anvil! Intelligent Availability Platform - Striker UI API.
Nov 02 19:08:20 an-striker01.alteeve.com systemd[1]: Started Anvil! Intelligent Availability Platform - Striker UI API.
Nov 02 19:08:21 an-striker01.alteeve.com node[32319]: Starting anvil-access-module daemon with options: {
Nov 02 19:08:21 an-striker01.alteeve.com node[32319]:   "gid": 1001,
Nov 02 19:08:21 an-striker01.alteeve.com node[32319]:   "stdio": "pipe",
Nov 02 19:08:21 an-striker01.alteeve.com node[32319]:   "timeout": 10000,
Nov 02 19:08:21 an-striker01.alteeve.com node[32319]:   "uid": 1001
Nov 02 19:08:21 an-striker01.alteeve.com node[32319]: }
Nov 02 19:08:21 an-striker01.alteeve.com node[32319]: Starting anvil-access-module daemon with options: {
Nov 02 19:08:21 an-striker01.alteeve.com node[32319]:   "gid": 0,
Nov 02 19:08:21 an-striker01.alteeve.com node[32319]:   "stdio": "pipe",
Nov 02 19:08:21 an-striker01.alteeve.com node[32319]:   "timeout": 10000,
Nov 02 19:08:21 an-striker01.alteeve.com node[32319]:   "uid": 0
Nov 02 19:08:21 an-striker01.alteeve.com node[32319]: }
Nov 02 19:08:21 an-striker01.alteeve.com node[32319]: Access interact: {
Nov 02 19:08:21 an-striker01.alteeve.com node[32319]:   "script": "a2f2921c-25bb-46ab-882a-6b13b3f8a4e4 r SELECT variable_value FROM variables WHERE variable_name = 'striker-ui-api::session::secret';\n"
Nov 02 19:08:21 an-striker01.alteeve.com node[32319]: }
Nov 02 19:08:21 an-striker01.alteeve.com node[32319]: Starting process with ownership 0:0
Nov 02 19:08:21 an-striker01.alteeve.com kill[32361]: kill: cannot find process ""
Nov 02 19:08:21 an-striker01.alteeve.com systemd[1]: striker-ui-api.service: Control process exited, code=exited status=1
Nov 02 19:08:21 an-striker01.alteeve.com systemd[1]: striker-ui-api.service: Failed with result 'exit-code'.
Nov 02 19:08:21 an-striker01.alteeve.com systemd[1]: striker-ui-api.service: Service RestartSec=100ms expired, scheduling restart.
Nov 02 19:08:21 an-striker01.alteeve.com systemd[1]: striker-ui-api.service: Scheduled restart job, restart counter is at 6.
Nov 02 19:08:21 an-striker01.alteeve.com systemd[1]: Stopped Anvil! Intelligent Availability Platform - Striker UI API.
Nov 02 19:08:21 an-striker01.alteeve.com systemd[1]: striker-ui-api.service: Start request repeated too quickly.
Nov 02 19:08:21 an-striker01.alteeve.com systemd[1]: striker-ui-api.service: Failed with result 'exit-code'.
Nov 02 19:08:21 an-striker01.alteeve.com systemd[1]: Failed to start Anvil! Intelligent Availability Platform - Striker UI API.

digimer commented 10 months ago

Restarting the striker-ui-api daemon fixed the problem:

[root@an-striker01 ~]# systemctl restart striker-ui-api.service 

This generates the following log output:

[root@an-striker01 ~]# journalctl -f -n 0 -u striker-ui-api 
-- Logs begin at Thu 2023-11-02 18:41:20 EDT. --
Nov 02 19:23:44 an-striker01.alteeve.com systemd[1]: Started Anvil! Intelligent Availability Platform - Striker UI API.
Nov 02 19:23:45 an-striker01.alteeve.com node[92790]: Starting anvil-access-module daemon with options: {
Nov 02 19:23:45 an-striker01.alteeve.com node[92790]:   "gid": 1001,
Nov 02 19:23:45 an-striker01.alteeve.com node[92790]:   "stdio": "pipe",
Nov 02 19:23:45 an-striker01.alteeve.com node[92790]:   "timeout": 10000,
Nov 02 19:23:45 an-striker01.alteeve.com node[92790]:   "uid": 1001
Nov 02 19:23:45 an-striker01.alteeve.com node[92790]: }
Nov 02 19:23:45 an-striker01.alteeve.com node[92790]: Starting anvil-access-module daemon with options: {
Nov 02 19:23:45 an-striker01.alteeve.com node[92790]:   "gid": 0,
Nov 02 19:23:45 an-striker01.alteeve.com node[92790]:   "stdio": "pipe",
Nov 02 19:23:45 an-striker01.alteeve.com node[92790]:   "timeout": 10000,
Nov 02 19:23:45 an-striker01.alteeve.com node[92790]:   "uid": 0
Nov 02 19:23:45 an-striker01.alteeve.com node[92790]: }
Nov 02 19:23:45 an-striker01.alteeve.com node[92790]: Access interact: {
Nov 02 19:23:45 an-striker01.alteeve.com node[92790]:   "script": "11f06c44-74c6-43e1-974d-86e09bb41582 r SELECT variable_value FROM variables WHERE variable_name = 'striker-ui-api::session::secret';\n"
Nov 02 19:23:45 an-striker01.alteeve.com node[92790]: }
Nov 02 19:23:45 an-striker01.alteeve.com node[92790]: Starting process with ownership 0:0
Nov 02 19:23:45 an-striker01.alteeve.com node[92790]: Access interact 11f06c44-74c6-43e1-974d-86e09bb41582 returns: {
Nov 02 19:23:45 an-striker01.alteeve.com node[92790]:   "result": []
Nov 02 19:23:45 an-striker01.alteeve.com node[92790]: }
Nov 02 19:23:45 an-striker01.alteeve.com node[92790]: Failed to get session secret from database; CAUSE: AssertionError [ERR_ASSERTION]: No existing session secret found.
Nov 02 19:23:45 an-striker01.alteeve.com node[92790]: Generated a new session secret.
Nov 02 19:23:45 an-striker01.alteeve.com node[92790]: Access interact: {
Nov 02 19:23:45 an-striker01.alteeve.com node[92790]:   "script": "803ad6d0-9aae-498b-ae95-e1ce139f4d21 x Database->insert_or_update_variables \"{\\\"file\\\":\\\"/usr/share/striker-ui-api/index.js\\\",\\\"variable_name\\\":\\\"striker-ui-api::session::secret\\\",\\\"variable_value\\\":\\\"OsKTqFB9BANX0g9mj0GWtX3fcqGY+uC0CGRseReycqg=\\\"}\"\n"
Nov 02 19:23:45 an-striker01.alteeve.com node[92790]: }
Nov 02 19:23:45 an-striker01.alteeve.com node[92790]: Access interact 803ad6d0-9aae-498b-ae95-e1ce139f4d21 returns: {
Nov 02 19:23:45 an-striker01.alteeve.com node[92790]:   "result": {
Nov 02 19:23:45 an-striker01.alteeve.com node[92790]:     "sub_results": [
Nov 02 19:23:45 an-striker01.alteeve.com node[92790]:       "8c60af65-0659-4746-81d0-ca6f04199787"
Nov 02 19:23:45 an-striker01.alteeve.com node[92790]:     ]
Nov 02 19:23:45 an-striker01.alteeve.com node[92790]:   }
Nov 02 19:23:45 an-striker01.alteeve.com node[92790]: }
Nov 02 19:23:45 an-striker01.alteeve.com node[92790]: Recorded session secret as variable identified by 8c60af65-0659-4746-81d0-ca6f04199787.
Nov 02 19:23:45 an-striker01.alteeve.com node[92790]: Set up route /api/anvil with 2 handler(s)
Nov 02 19:23:45 an-striker01.alteeve.com node[92790]: Set up route /api/command with 2 handler(s)
Nov 02 19:23:45 an-striker01.alteeve.com node[92790]: Set up route /api/fence with 2 handler(s)
Nov 02 19:23:45 an-striker01.alteeve.com node[92790]: Set up route /api/file with 2 handler(s)
Nov 02 19:23:45 an-striker01.alteeve.com node[92790]: Set up route /api/host with 2 handler(s)
Nov 02 19:23:45 an-striker01.alteeve.com node[92790]: Set up route /api/job with 2 handler(s)
Nov 02 19:23:45 an-striker01.alteeve.com node[92790]: Set up route /api/manifest with 2 handler(s)
Nov 02 19:23:45 an-striker01.alteeve.com node[92790]: Set up route /api/network-interface with 2 handler(s)
Nov 02 19:23:45 an-striker01.alteeve.com node[92790]: Set up route /api/server with 2 handler(s)
Nov 02 19:23:45 an-striker01.alteeve.com node[92790]: Set up route /api/ssh-key with 2 handler(s)
Nov 02 19:23:45 an-striker01.alteeve.com node[92790]: Set up route /api/ups with 2 handler(s)
Nov 02 19:23:45 an-striker01.alteeve.com node[92790]: Set up route /api/user with 2 handler(s)
Nov 02 19:23:45 an-striker01.alteeve.com node[92790]: Set up route /api/auth with 1 handler(s)
Nov 02 19:23:45 an-striker01.alteeve.com node[92790]: Set up route /api/echo with 1 handler(s)
Nov 02 19:23:45 an-striker01.alteeve.com node[92790]: Set up route /api/init with 1 handler(s)
Nov 02 19:23:45 an-striker01.alteeve.com node[92790]: Process ownership changed to 1001:1001.
Nov 02 19:23:45 an-striker01.alteeve.com node[92790]: Listening on localhost:80.
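
The successful start above follows a fallback path: the query for the session secret returns an empty result, so a new secret is generated and recorded. A minimal shell sketch of that "use the stored value, else generate one" pattern follows. It is not the striker-ui-api code: `session_secret_or_new` is a hypothetical name, and the 32-byte length is only an assumption inferred from the 44-character base64 value in the log.

```shell
# Hypothetical helper: print the stored secret if one was found,
# otherwise generate a fresh random secret encoded as base64.
session_secret_or_new() {
    existing="$1"
    if [ -n "$existing" ]; then
        printf '%s\n' "$existing"
        return 0
    fi
    # Log the fallback on stderr so stdout carries only the secret.
    echo "No existing session secret found; generating a new one." >&2
    # Assumption: 32 random bytes, which encode to 44 base64 characters.
    head -c 32 /dev/urandom | base64
}
```

Calling `session_secret_or_new ""` takes the generate branch, while passing a non-empty value simply echoes it back.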

digimer commented 9 months ago

Addressed with more logging and fault management; the failure is not very reproducible.
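
One form such fault management could take is guarding the stop step. The `kill: cannot find process ""` lines in the journal suggest that `kill` was handed an empty PID, which made the control process exit with status 1. The sketch below shows the guard under that assumption. `stop_if_running` is a hypothetical helper, not the change actually made in the Anvil! repository.

```shell
# Hypothetical stop helper: only signal a PID that was actually supplied
# and that still refers to a running process.
stop_if_running() {
    pid="$1"
    if [ -z "$pid" ]; then
        # An empty PID is what produces 'kill: cannot find process ""'.
        echo "stop_if_running: no PID supplied, nothing to stop"
        return 0
    fi
    # kill -0 sends no signal; it only checks that the process can be signalled.
    if ! kill -0 "$pid" 2>/dev/null; then
        echo "stop_if_running: process $pid is not running"
        return 0
    fi
    kill -s TERM "$pid"
}
```

With a guard like this, an already-exited main process leads to a clean, logged no-op instead of a failed control process.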