canonical / charm-openstack-service-checks

Collection of Nagios checks and other utilities that can be used to verify the operation of an OpenStack cluster

NRPE incorrectly returns OK from rally test #45

Closed sudeephb closed 7 months ago

sudeephb commented 7 months ago

NRPE is reporting status OK, but the actual status should be CRITICAL:

/home/nagiososc# cat rally.status
{"message": "CRITICAL: fcbtest.rally command failed. Command '['fcbtest.rally', '--use-json', 'verify', 'start', '--load-list', '/home/nagiososc/ostests.txt', '--detailed']' returned non-zero exit status 220 - b'{\"message\": \"Configuring verifier \'tempestverifier\' (UUID=3f50d161-4909-4779-a138-7e9d49634120) for deployment \'snap_generated\' (UUID=106bc4b5-7d74-49ee-a956-09879435666d).\", \"asctime\": \"2019-09-29 23:45:12\", \"name\": \"rally.api\", \"msg\": \"Configuring verifier \'tempestverifier\' (UUID=3f50d161-4909-4779-a138-7e9d49634120) for deployment \'snap_generated\' (UUID=106bc4b5-7d74-49ee-a956-09879435666d).\", \"args\": [], \"levelname\": \"INFO\", \"levelno\": 20, \"pathname\": \"/snap/fcbtest/7/lib/python3.6/site-packages/rally/common/logging.py\", \"filename\": \"logging.py\", \"module\": \"logging\", \"lineno\": 99, \"funcname\": \"info\", \"created\": 1569800712.0627506, \"msecs\": 62.75057792663574, \"relative_created\": 4828.602313995361, \"thread\": 140643060741952, \"thread_name\": \"MainThread\", \"process_name\": \"MainProcess\", \"process\": 1420826, \"traceback\": null, \"hostname\": \"juju-2b11a1-77-lxd-3\", \"error_summary\": \"\", \"context\": {}, \"extra\": {\"project\": \"rally\", \"version\": \"unknown\"}}\x1b[00m\nSSL exception connecting to https://keystone.url:35357/v3: HTTPSConnectionPool(host=\'keystone.url\', port=35357): Max retries exceeded with url: /v3 (Caused by SSLError(SSLError(\"bad handshake: Error([(\'SSL routines\', \'tls_process_server_certificate\', \'certificate verify failed\')],)\",),))\n'"}


Imported from Launchpad using lp2gh.

sudeephb commented 7 months ago

(by woutervb) I checked the code. In this case the check exits with return code 0 (which means OK) instead of 2 (CRITICAL), so NRPE reports OK even though the status message says CRITICAL.
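To illustrate the expected behaviour, here is a minimal sketch of deriving the exit code from the message in a rally.status-style file. It is not the charm's actual code: the names `exit_code_for` and `check_status_file` are hypothetical, and it assumes only what the output above shows, namely a JSON object with a "message" key whose value begins with the status word. The numeric codes are the standard Nagios plugin ones (0 OK, 1 WARNING, 2 CRITICAL, 3 UNKNOWN).

```python
import json
import sys

# Standard Nagios plugin exit codes.
NAGIOS_CODES = {"OK": 0, "WARNING": 1, "CRITICAL": 2, "UNKNOWN": 3}


def exit_code_for(message):
    """Map a message such as 'CRITICAL: ...' to a Nagios exit code (hypothetical helper)."""
    prefix = message.split(":", 1)[0].strip().upper()
    # An unrecognised prefix is reported as UNKNOWN rather than silently as OK.
    return NAGIOS_CODES.get(prefix, NAGIOS_CODES["UNKNOWN"])


def check_status_file(path):
    """Return (exit_code, message) for a rally.status-style JSON file (hypothetical helper)."""
    try:
        with open(path) as handle:
            message = json.load(handle)["message"]
    except (OSError, ValueError, KeyError, TypeError) as error:
        return NAGIOS_CODES["UNKNOWN"], "UNKNOWN: cannot read {}: {}".format(path, error)
    if not isinstance(message, str):
        return NAGIOS_CODES["UNKNOWN"], "UNKNOWN: unexpected message type in {}".format(path)
    return exit_code_for(message), message


if __name__ == "__main__" and len(sys.argv) > 1:
    code, text = check_status_file(sys.argv[1])
    print(text)
    sys.exit(code)
```

Run against a file like the one above, this would exit with 2 because the message starts with "CRITICAL". Falling back to UNKNOWN for unreadable or unrecognised input is a design choice in this sketch, so that a failure never shows up as OK.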