Closed Neha-Sinha2305 closed 4 years ago
url: http://localhost:8500/v1/health/checks/count-dashboard
Earlier, I proposed not filtering on "passing" and just fetching all healthchecks, then failing if the resulting json contains failing
or warning
. I cannot quite remember who were in that discussion (I'll go looking for it now), but are you all still for explicitly checking for a number of tasks? I like not checking for a number because it needs less maintenance (changing number of checks won't break any ansible-healthchecks), but I remember at least Nikita wanted to use that method of checking because it is explicit. Any thoughts?
url: http://localhost:8500/v1/health/checks/count-dashboard
Earlier, I proposed not filtering on "passing" and just fetching all healthchecks, then failing if the resulting json contains
failing
orwarning
. I cannot quite remember who were in that discussion (I'll go looking for it now), but are you all still for explicitly checking for a number of tasks? I like not checking for a number because it needs less maintenance (changing number of checks won't break any ansible-healthchecks), but I remember at least Nikita wanted to use that method of checking because it is explicit. Any thoughts?
Do you mean changing the code to :
- name: countdash healthcheck test without consul token (when consul acl default policy is allow)
uri:
url: http://localhost:8500/v1/health/checks/count-dashboard
method: GET
return_content: yes
status_code: 200
body_format: json
when: not lookup('env', 'consul_acl') | bool
register: result_countdashboard
retries: 15
delay: 15
until: '"critical" not in result_countdashboard | string'
tags: test
If yes, then I would better go with the current implementation where we explicitly filter for "passing" status. (I can explain/prove it with code commits that we can revert later)
If yes, then I would better go with the current implementation where we explicitly filter for "passing" status. (I can explain/prove it with code commits that we can revert later)
Yup, that is what I was thinking of, except checking for warning as well. Yes, I would very much like an explanation as to why! 😀
If yes, then I would better go with the current implementation where we explicitly filter for "passing" status. (I can explain/prove it with code commits that we can revert later)
Yup, that is what I was thinking of, except checking for warning as well. Yes, I would very much like an explanation as to why! 😀
Here you go. The latest commit has the following two ansible tasks that does the healthcheck when consul_acl_default_policy is set to deny. Inorder to show the difference in behavior i have taken away the consul token and so expect both of these to fail, however, you will notice that the first one does not fail and behaves as a false positive. Reason being that the api returns empty list when called without the token and so the first test passes as the verification point is not so explicit.Note: It would behave the same even if we check for "warning" or any other string. Please have a look at the latest github actions check for more details.
And if it is unclear then surely we should discuss it further(may be over a call). :)
Here you go. The latest commit has the following two ansible tasks that does the healthcheck when consul_acl_default_policy is set to deny. Inorder to show the difference in behavior i have taken away the consul token and so expect both of these to fail, however, you will notice that the first one does not fail and behaves as a false positive. Reason being that the api returns empty list when called without the token and so the first test passes as the verification point is not so explicit.Note: It would behave the same even if we check for "warning" or any other string. Please have a look at the latest github actions check for more details.
Thank you so much! :) If I understand correctly it's a false positive because the list is empty, and therefore does not contain either critical
or warning
, but we could solve that by adding and length != 0
? Then we have a type of check that could be universally used by all services in the future (which is valuable IMO). I realise I'm adding more and more complexity to my "simple" solution though. 😄
I'm open to the fact that I might be missing even more problems with my approach, but to me it seems easier, quicker, and more universal. It's not a huge deal, and I am OK with keeping it as is if you (and/or others) think this is a better solution! I realise this is the way we've done it until now as well.
Here you go. The latest commit has the following two ansible tasks that does the healthcheck when consul_acl_default_policy is set to deny. Inorder to show the difference in behavior i have taken away the consul token and so expect both of these to fail, however, you will notice that the first one does not fail and behaves as a false positive. Reason being that the api returns empty list when called without the token and so the first test passes as the verification point is not so explicit.Note: It would behave the same even if we check for "warning" or any other string. Please have a look at the latest github actions check for more details.
Thank you so much! :) If I understand correctly it's a false positive because the list is empty, and therefore does not contain either
critical
orwarning
, but we could solve that by addingand length != 0
? Then we have a type of check that could be universally used by all services in the future (which is valuable IMO). I realise I'm adding more and more complexity to my "simple" solution though. 😄I'm open to the fact that I might be missing even more problems with my approach, but to me it seems easier, quicker, and more universal. It's not a huge deal, and I am OK with keeping it as is if you (and/or others) think this is a better solution! I realise this is the way we've done it until now as well.
Since this test is added under template_example and is going to be an example code for users to get inspiration/clues from, so I have made the check/test generic as you suggested. The code now looks like this:
- name: countdash healthcheck test with consul token (when consul acl default policy is deny)
uri:
url: http://localhost:8500/v1/health/checks/count-dashboard
method: GET
headers:
X-Consul-Token: "{{ consul_token }}"
return_content: yes
status_code: 200
body_format: json
when:
- lookup('env', 'consul_acl') | bool
- lookup('env', 'consul_acl_default_policy') == 'deny'
register: result_countdashboard
retries: 1
delay: 1
until:
- result_countdashboard.json | length == 1
- '"warning" not in result_countdashboard | string'
- '"critical" not in result_countdashboard | string'
tags: test
Closes issue?
closes #56
What was added?
Added ansible tasks to perform healthcheck for countdash nomad job
Checklist