canonical / hotsos

Software analysis toolkit. Define checks in a high-level language and leverage the library to perform analysis of common cloud applications.
Apache License 2.0

Octavia Amphora: Logging Issue #981

Open lathiat opened 1 month ago

lathiat commented 1 month ago

There appears to be a bug in Python versions before 3.11 that sometimes causes amphora-agent to error out, which then puts the load balancer (LB) into an error state. It is not yet fixed, but it would be useful to detect the error.

Related: gunicorn #2073
Related: https://bugs.python.org/issue43196

amphora-agent[1759]: FileNotFoundError: [Errno 2] No such file or directory
amphora-agent[1759]: Message: 'PUT /1.0/loadbalancer/0fca9ddb-53bc-47d9-ae90-ddf27a9c0567/reload'
amphora-agent[1759]: Message: 'PUT /1.0/vrrp/upload'
amphora-agent[1759]:  File "/usr/lib/python3.8/logging/handlers.py", line 855, in _connect_unixsocket
amphora-agent[1759]:   self.socket.connect(address)
amphora-agent[1759]: FileNotFoundError: [Errno 2] No such file or directory
Keepalived_vrrp[2478]: VRRP_Script(check_script) failed (exited with status 3)
Keepalived_vrrp[2478]: (6b8112f598854d9aacc8a593fe8bab4a) Entering FAULT STATE
Keepalived_amphora-haproxy[2138]: Stopping
Keepalived_healthcheckers_amphora-haproxy[2139]: Shutting down service [192.168.0.193]:udp:36006 from VS [192.168.0.244]:udp:20101
Keepalived_healthcheckers_amphora-haproxy[2139]: Shutting down service [192.168.0.61]:udp:36006 from VS [192.168.0.244]:udp:20101
Keepalived_healthcheckers_amphora-haproxy[2139]: Shutting down service [192.168.0.71]:udp:36006 from VS [192.168.0.244]:udp:20101
Keepalived_healthcheckers_amphora-haproxy[2139]: Stopped
Keepalived_amphora-haproxy[2138]: Stopped Keepalived v2.0.19 (10/19,2019)
Keepalived_amphora-haproxy[2138]: unmount of /run/keepalived/ failed - errno 22
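
As a starting point for detecting this, here is a minimal sketch in plain Python. It is not a hotsos check definition. It flags an amphora-agent process when a traceback frame in `_connect_unixsocket` from `logging/handlers.py` is followed shortly by `FileNotFoundError: [Errno 2]` from the same PID. The function name, the regular expressions and the `window` parameter are assumptions derived only from the log excerpt above, so they may need adjusting for other log formats.

```python
import re

# Patterns are assumptions based on the log excerpt in this issue.
AGENT_RE = re.compile(r"amphora-agent\[(?P<pid>\d+)\]:\s*(?P<msg>.*)")
FRAME_RE = re.compile(r'logging/handlers\.py", line \d+, in _connect_unixsocket')
ERROR_RE = re.compile(r"FileNotFoundError: \[Errno 2\]")


def find_log_socket_failures(lines, window=5):
    """Return sorted PIDs of amphora-agent processes whose logging
    handler failed to connect to its Unix socket.

    A PID is reported only if the _connect_unixsocket frame is followed
    by the FileNotFoundError within `window` lines from the same PID.
    """
    pending = {}  # pid -> number of lines left to see the error
    hits = set()
    for line in lines:
        match = AGENT_RE.search(line)
        if not match:
            continue  # not an amphora-agent line
        pid, msg = match.group("pid"), match.group("msg")
        if FRAME_RE.search(msg):
            pending[pid] = window
        elif pid in pending:
            if ERROR_RE.search(msg):
                hits.add(pid)
                del pending[pid]
            else:
                pending[pid] -= 1
                if pending[pid] == 0:
                    del pending[pid]
    return sorted(hits)


if __name__ == "__main__":
    import sys

    # Usage: python3 detect.py < syslog
    for pid in find_log_socket_failures(sys.stdin):
        print(f"amphora-agent[{pid}]: logging socket connect failed")
```

Requiring the traceback frame before the error avoids matching unrelated `FileNotFoundError` lines from amphora-agent. A fuller check might also correlate the hit with the subsequent Keepalived "Entering FAULT STATE" message.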