Closed moritzheiber closed 6 years ago
Hi! Adding this now, this is definitely something I know to be useful (we just don't use it for Crisp).
The response check will be an exact content check, eg. if you match "OK" and the server replies "200 OK" it won't match. Hope that will do for you.
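To make the difference concrete, here is a minimal Python sketch contrasting an exact content check with a substring check. The function names are hypothetical and this is not Vigil's code; it only illustrates why "OK" would not match a "200 OK" reply under an exact comparison.

```python
def exact_match(expected: str, body: str) -> bool:
    """Healthy only when the whole response body equals the expected string."""
    return body == expected


def substring_match(expected: str, body: str) -> bool:
    """Healthy when the expected string appears anywhere in the body."""
    return expected in body


if __name__ == "__main__":
    print(exact_match("OK", "200 OK"))      # exact check: no match
    print(exact_match("OK", "OK"))          # exact check: match
    print(substring_match("OK", "200 OK"))  # substring check: match
```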
By the way, may I ask how you found out about Vigil? It's quite recent (it was fully released yesterday), so I'm quite surprised to see such early feedback 😄
Also, before I start coding this, do you see this being useful as a global configuration common to all HTTP poll probes, or rather as a per-node configuration option?
Regarding your first question (re: matching), I think it would be useful to support at least two different types of matches, or even regular expression matching. For example, I have a service returning a JSON response with a few pieces of service information in there. Now, I could match on all of them being "Okay" or I could match several different conditions. The latter appears to be more useful to me.
I love the way Hashicorp structures their health endpoints, and I'd be glad to be able to make use of the data it provides, i.e. differentiating between "the cluster isn't reachable" and "I'm sealed" or "I'm uninitialized" using the example in the Vault API docs.
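As a rough illustration of what matching several conditions could look like, here is a Python sketch that classifies a health response with regular expressions. The `initialized` and `sealed` field names are assumed from the Vault health example mentioned above, the sample bodies are made up, and `classify` is a hypothetical helper, not anything Vigil provides.

```python
import re
from typing import Optional

# Assumed field names, modelled on a Vault-style health response.
INITIALIZED = re.compile(r'"initialized"\s*:\s*true')
UNINITIALIZED = re.compile(r'"initialized"\s*:\s*false')
SEALED = re.compile(r'"sealed"\s*:\s*true')
UNSEALED = re.compile(r'"sealed"\s*:\s*false')


def classify(body: Optional[str]) -> str:
    """Map a response body (None when no response arrived) to a coarse state."""
    if body is None:
        return "unreachable"
    if UNINITIALIZED.search(body):
        return "uninitialized"
    if SEALED.search(body):
        return "sealed"
    if INITIALIZED.search(body) and UNSEALED.search(body):
        return "healthy"
    return "unknown"


if __name__ == "__main__":
    print(classify('{"initialized":true,"sealed":false,"standby":false}'))
    print(classify('{"initialized":true,"sealed":true,"standby":true}'))
    print(classify('{"initialized":false,"sealed":true}'))
    print(classify(None))
```

A single "healthy" regex covers only the first case; telling the failure states apart needs either several patterns like these or a proper JSON parse.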
Second question (re: Vigil quite recent): I regularly scour GitHub for interesting repositories which could help me with my job, and I actually thought about writing something similar in Rust myself. I'm glad I found Vigil :smile:
Third question (re: all/node configuration): IMHO, it should be a per-node configuration option, as most likely, different HTTP nodes are going to respond to requests differently (at least that's the model I'm used to).
Got that. That makes sense, yes. I'm going for the Regex version as this one definitely provides more flexibility.
Didn't know you did Rust also, I'm glad I spared you a few days of work 😄
As this is a feature you need on your end, and thus you know the edge-case details, feel free to open a PR on Vigil for this. Otherwise I can do it, but it might not be 100% tuned to what you need / what other people need (I'm not familiar with e.g. Hashicorp, though I knew them by name). Let me know what's best!
I won't have the time for it right now, as I just started managing a new project internally. I'd be happy to send PRs later should I need any additional functionality.
Here you go: 65e1b1c794357d85f38b937c4832a1b965b7fa34
Let me know if that works for you. When the http_body_healthy_match option is configured for a poll node in HTTP mode, the regex will be used to check for health if and only if the status code check passed.
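The two-stage behaviour described here can be sketched in Python as follows. This is an illustration under stated assumptions, not Vigil's implementation: the healthy status range (2xx) and the use of an unanchored regex search are both guesses, and `is_healthy` is a hypothetical name.

```python
import re
from typing import Optional


def is_healthy(status: int, body: str, body_pattern: Optional[str] = None) -> bool:
    """Status code check first; the body regex is consulted only if it passed."""
    # Assumption: any 2xx status counts as a passing status code check.
    if not 200 <= status < 300:
        return False
    # No body pattern configured: the status code alone decides.
    if body_pattern is None:
        return True
    # Assumption: the pattern may match anywhere in the body (unanchored search).
    return re.search(body_pattern, body) is not None


if __name__ == "__main__":
    pattern = r'"status":"ok"'
    print(is_healthy(200, '{"status":"ok"}', pattern))    # status and body pass
    print(is_healthy(200, '{"status":"down"}', pattern))  # body check fails
    print(is_healthy(500, '{"status":"ok"}', pattern))    # status check fails first
```

Under this model, a reply whose status passes but whose body does not match the pattern would be reported as unhealthy, which is the behaviour to look for when testing the option.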
Also, I didn't release a version for this, waiting for feedback. So pull master and compile it on your side 🎉
Sorry, I was too busy yesterday, I'll take a look at it today!
Short feedback: Either I'm unable to correctly use the functionality as the documentation suggests or it doesn't work. Here's the configuration snippet I tried:
[...]
[[probe.service.node]]
id = "Test"
label = "Test"
mode = "poll"
replicas = ["https://www.ping.eu"]
http_body_healthy_match = ".*\"status\":\"blibber\".*"
Trying to match some JSON here
$ curl -L https://www.ping.eu | grep blibber
% Total % Received % Xferd Average Speed Time Time Time Current
Dload Upload Total Spent Left Speed
100 12295 0 12295 0 0 12295 0 --:--:-- --:--:-- --:--:-- 141k
$
The debug log:
$ vigil -c config.cfg
(INFO) - starting up
(DEBUG) - prober store: got service web
(DEBUG) - prober store: got node web:Test
(DEBUG) - prober store: got replica web:Test:https://www.ping.eu
(INFO) - initialized prober store
(DEBUG) - spawn managed thread: responder
(DEBUG) - spawn managed thread: aggregator
(INFO) - 🔧 Configured for production.
(INFO) - address: 0.0.0.0
(INFO) - port: 8080
(INFO) - log: critical
(INFO) - workers: 4
(INFO) - secret key: generated
(INFO) - limits: forms = 32KiB
(INFO) - tls: disabled
(WARN) - environment is 'production', but no `secret_key` is configured
(DEBUG) - spawn managed thread: prober
(DEBUG) - running an aggregate operation...
(DEBUG) - aggregate probe: web
(INFO) - [extra] template_dir: "./res/assets/./templates"
(DEBUG) - running a probe operation...
(DEBUG) - aggregate node: web:Test
(DEBUG) - aggregated status for replica: web:Test:https://www.ping.eu => Healthy
(DEBUG) - aggregated status for node: web:Test => Healthy
(DEBUG) - aggregated status for probe: web => Healthy
(INFO) - ran aggregate operation (notified: false)
(INFO) - 🛰 Mounting '/':
(DEBUG) - will probe replica: HTTPS("https://www.ping.eu/") with retry count: 1
(INFO) - GET /
(INFO) - POST /reporter/<probe_id>/<node_id> application/json
(INFO) - GET /robots.txt
(INFO) - GET /badge/<kind>
(INFO) - GET /assets/fonts/<file..>
(INFO) - GET /assets/images/<file..>
(INFO) - GET /assets/stylesheets/<file..>
(ERROR) - 🚀 Rocket has launched from http://0.0.0.0:8080
(DEBUG) - threads = 4
(DEBUG) - prober poll will fire for http target: https://www.ping.eu/?1516825618
(DEBUG) - loop poll - Duration { secs: 0, nanos: 6948 }
(DEBUG) - loop time - Instant { tv_sec: 66171, tv_nsec: 288289089 }
(DEBUG) - consuming notification queue
(DEBUG) - loop process - 2 events, Duration { secs: 0, nanos: 74145 }
(DEBUG) - loop poll - Duration { secs: 0, nanos: 1451 }
(DEBUG) - loop time - Instant { tv_sec: 66171, tv_nsec: 288389255 }
(DEBUG) - loop process - 0 events, Duration { secs: 0, nanos: 14223 }
(DEBUG) - resolving host="www.ping.eu", port=443
(DEBUG) - loop poll - Duration { secs: 0, nanos: 240185359 }
(DEBUG) - loop time - Instant { tv_sec: 66171, tv_nsec: 528595814 }
(DEBUG) - connecting to 88.198.46.60:443
(DEBUG) - adding a new I/O source
(DEBUG) - scheduling direction for: 0
(DEBUG) - blocking
(DEBUG) - loop process - 1 events, Duration { secs: 0, nanos: 175897 }
(DEBUG) - loop poll - Duration { secs: 0, nanos: 1687 }
(DEBUG) - loop time - Instant { tv_sec: 66171, tv_nsec: 528781315 }
(DEBUG) - loop process - 0 events, Duration { secs: 0, nanos: 14774 }
(DEBUG) - loop poll - Duration { secs: 0, nanos: 12246940 }
(DEBUG) - loop time - Instant { tv_sec: 66171, tv_nsec: 541049805 }
(DEBUG) - notifying a task handle
(DEBUG) - loop process - 1 events, Duration { secs: 0, nanos: 49209 }
(DEBUG) - loop poll - Duration { secs: 0, nanos: 10510 }
(DEBUG) - loop time - Instant { tv_sec: 66171, tv_nsec: 541116542 }
(DEBUG) - scheduling direction for: 0
(DEBUG) - blocking
(DEBUG) - scheduling direction for: 0
(DEBUG) - blocking
(DEBUG) - loop process - 1 events, Duration { secs: 0, nanos: 337896 }
(DEBUG) - loop poll - Duration { secs: 0, nanos: 1897 }
(DEBUG) - loop time - Instant { tv_sec: 66171, tv_nsec: 541467956 }
(DEBUG) - loop process - 0 events, Duration { secs: 0, nanos: 16506 }
(DEBUG) - loop poll - Duration { secs: 0, nanos: 17082591 }
(DEBUG) - loop time - Instant { tv_sec: 66171, tv_nsec: 558575835 }
(DEBUG) - notifying a task handle
(DEBUG) - loop process - 1 events, Duration { secs: 0, nanos: 117994 }
(DEBUG) - loop poll - Duration { secs: 0, nanos: 7420 }
(DEBUG) - loop time - Instant { tv_sec: 66171, tv_nsec: 558728341 }
(DEBUG) - scheduling direction for: 0
(DEBUG) - blocking
(DEBUG) - loop process - 1 events, Duration { secs: 0, nanos: 2141955 }
(DEBUG) - loop poll - Duration { secs: 0, nanos: 11151 }
(DEBUG) - loop time - Instant { tv_sec: 66171, tv_nsec: 561141937 }
(DEBUG) - loop process - 0 events, Duration { secs: 0, nanos: 385756 }
(DEBUG) - loop poll - Duration { secs: 0, nanos: 18178047 }
(DEBUG) - loop time - Instant { tv_sec: 66171, tv_nsec: 579822400 }
(DEBUG) - notifying a task handle
(DEBUG) - loop process - 1 events, Duration { secs: 0, nanos: 51545 }
(DEBUG) - loop poll - Duration { secs: 0, nanos: 6761 }
(DEBUG) - loop time - Instant { tv_sec: 66171, tv_nsec: 579890640 }
(DEBUG) - loop process - 1 events, Duration { secs: 0, nanos: 215515 }
(DEBUG) - loop poll - Duration { secs: 0, nanos: 1738 }
(DEBUG) - loop time - Instant { tv_sec: 66171, tv_nsec: 580117793 }
(DEBUG) - flushed 111 bytes
(DEBUG) - scheduling direction for: 0
(DEBUG) - blocking
(DEBUG) - loop process - 1 events, Duration { secs: 0, nanos: 131926 }
(DEBUG) - loop poll - Duration { secs: 0, nanos: 1631 }
(DEBUG) - loop time - Instant { tv_sec: 66171, tv_nsec: 580259435 }
(DEBUG) - loop process - 1 events, Duration { secs: 0, nanos: 15243 }
(DEBUG) - loop poll - Duration { secs: 0, nanos: 1089 }
(DEBUG) - loop time - Instant { tv_sec: 66171, tv_nsec: 580282968 }
(DEBUG) - loop process - 0 events, Duration { secs: 0, nanos: 13774 }
(DEBUG) - loop poll - Duration { secs: 0, nanos: 14213772 }
(DEBUG) - loop time - Instant { tv_sec: 66171, tv_nsec: 594518178 }
(DEBUG) - notifying a task handle
(DEBUG) - loop process - 1 events, Duration { secs: 0, nanos: 58797 }
(DEBUG) - loop poll - Duration { secs: 0, nanos: 8769 }
(DEBUG) - loop time - Instant { tv_sec: 66171, tv_nsec: 594592973 }
(DEBUG) - read 418 bytes
(DEBUG) - parsed 8 headers (418 bytes)
(DEBUG) - incoming body is content-length (0 bytes)
(DEBUG) - loop process - 2 events, Duration { secs: 0, nanos: 365164 }
(DEBUG) - loop poll - Duration { secs: 0, nanos: 2834 }
(DEBUG) - loop time - Instant { tv_sec: 66171, tv_nsec: 594980020 }
(DEBUG) - Response: '200 OK' for https://www.ping.eu/?1516825618
(DEBUG) - consuming notification queue
(DEBUG) - dropping I/O source: 0
(DEBUG) - loop process - 2 events, Duration { secs: 0, nanos: 73139 }
(DEBUG) - loop poll - Duration { secs: 0, nanos: 1660 }
(DEBUG) - loop time - Instant { tv_sec: 66171, tv_nsec: 595062100 }
(DEBUG) - loop process - 0 events, Duration { secs: 0, nanos: 28137 }
(DEBUG) - prober poll result received for url: https://www.ping.eu/?1516825618 with status: 200
(DEBUG) - replica probe result: web:Test:https://www.ping.eu => Healthy
(INFO) - ran probe operation
It doesn't seem to me the regexp is used at all? Or is it a hidden error?
Note: I find it interesting that "The rocket has launched from" is classified as ERROR
PS: The configuration parser really doesn't like missing parts (like [notify] or [plugins]) in the configuration file
Mhh, strange. Worked fine for me. I guess the response cannot be unpacked, which is why it stays silent.
I've added more log points, can you pull master, compile and test again? (added on 9c8aa828171c56b22252a32816372145601f0c8b).
Also, I've addressed your 'PS' to make [notify] and [plugins] optional, as they are certainly not needed in some setups (in 6ba5b57430c8a56b79f58d184cd6d4b4bae57fd9).
I'll release a new version (with Debian 8 builds) when everything is all good for your use case.
I took a look at my checkout again and it appears I had the wrong tree checked out. With HEAD everything is working as expected, I thoroughly apologize. :+1:
Optional configuration parameters are working as well, I'd consider this solved! :tada:
All right, issuing the release 👍
Done, v1.2.0 is out!
FYI on your remark on Rocket's startup log visible in Vigil logs, I've issued a report on SergioBenitez/Rocket/issues/553.
One of the greatest features I see implemented by comparable services is the ability to check the output of the probe sent to an HTTP endpoint for arbitrary strings instead of just HTTP response codes. I would love to have this feature implemented for Vigil.