Since the early inception of hush-house, we've been capturing error rates for
workers through the use of Stackdriver's user-defined logs-based metrics (see
Overview of logs-based metrics).
That has served us very well: it puts zero load on our Prometheus server, since
those metrics come entirely from Stackdriver, and it shifted our approach from
searching for "when did errors start?" to a much more direct "just look at the
dashboard".
Now that we've added nci to the hush-house GKE cluster, which is also hooked up
with Stackdriver for logs, it'd be great to leverage the same capabilities.
The only problem is that the existing metric has hush-house hardcoded in its
log filter.
It might be possible to use a wildcard there instead (on Stackdriver) and
perform the filtering at the client (Grafana), but I've never personally
verified that.
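
To make the idea a bit more concrete, here's a rough sketch of what a non-hardcoded filter could look like. I haven't tried this against our setup: the resource type, the `namespace_name` label, the `severity>=ERROR` clause, and the assumption that each deployment lives in a namespace of the same name are all guesses on my part, and `worker_error_filter` is just a hypothetical helper for generating the filter string.

```python
def worker_error_filter(namespaces):
    """Build a Stackdriver logs filter string covering several namespaces.

    Assumptions (not verified against the real metric definition):
    - logs use the "k8s_container" monitored resource type
    - each deployment (hush-house, nci, ...) maps to one namespace
    - worker errors are identified by severity alone
    """
    if not namespaces:
        raise ValueError("at least one namespace is required")

    # Quote each namespace and group them as ("a" OR "b") so a single
    # metric matches every deployment instead of only hush-house.
    grouped = " OR ".join(f'"{name}"' for name in namespaces)

    return (
        'resource.type="k8s_container" '
        f"AND resource.labels.namespace_name=({grouped}) "
        "AND severity>=ERROR"
    )


if __name__ == "__main__":
    print(worker_error_filter(["hush-house", "nci"]))
```

If the namespace were also exposed as a label on the metric, the per-deployment split could then happen in Grafana rather than in the filter itself.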
Thanks!