omgnetwork / omg-childchain-v2

pronounced /Ch-ch/
Apache License 2.0
5 stars 2 forks source link

feat: gather stats on x-watcher-version header #180

Closed pgebal closed 3 years ago

pgebal commented 3 years ago

References https://github.com/omgnetwork/elixir-omg/issues/1710.

Changes:

pgebal commented 3 years ago

We don't differentiate between different endpoints being called. We gather occurrences for all endpoints. I think it can stay like that. Let me know if we need more granularity, and for example count also by endpoints.

macaptain commented 3 years ago

Instead of counting responses in the Childchain, have you considered counting requests in the Watcher? It should be straightforward to add this metric in Watcher. You won't have to worry about validating the value in the header, and you will be able to get the response code, error and version of the Childchain at the site of the call as tags for the metric.

pgebal commented 3 years ago

Instead of counting responses in the Childchain, have you considered counting requests in the Watcher? It should be straightforward to add this metric in Watcher. You won't have to worry about validating the value in the header, and you will be able to get the response code, error and version of the Childchain at the site of the call as tags for the metric.

@macaptain What about watchers that aren't run by us? We can't gather metrics on those.

macaptain commented 3 years ago

Instead of counting responses in the Childchain, have you considered counting requests in the Watcher? It should be straightforward to add this metric in Watcher. You won't have to worry about validating the value in the header, and you will be able to get the response code, error and version of the Childchain at the site of the call as tags for the metric.

@macaptain What about watchers that aren't run by us? We can't gather metrics on those.

That is a great point.

The only thing I have to add then is that I recently added a way to collect metrics based on telemetry events to Watcher and Childchain v1: https://github.com/omgnetwork/elixir-omg/pull/1766. I will port this to Childchain v2. The Phoenix framework already spits a telemetry event including conn as metadata whenever the Router dispatches (see https://hexdocs.pm/phoenix/Phoenix.Logger.html).

Rather than defining a new Plug, you could instead add a metric item like this: https://github.com/omgnetwork/elixir-omg/blob/d0b86c8e75d5bd4db766e86c3d527a2b276adf87/apps/omg_status/lib/omg_status/metric/telemetry.ex#L73 together with :tag_values to get the header you want out of the conn. You'd get the latency, error code, Childchain version as additional tags as well. Documentation for converting Phoenix's telemetry event to a metric is here: https://hexdocs.pm/telemetry_metrics_statsd/TelemetryMetricsStatsd.html.

Let me know if you think this approach is worthwhile.

pgebal commented 3 years ago

Instead of counting responses in the Childchain, have you considered counting requests in the Watcher? It should be straightforward to add this metric in Watcher. You won't have to worry about validating the value in the header, and you will be able to get the response code, error and version of the Childchain at the site of the call as tags for the metric.

@macaptain What about watchers that aren't run by us? We can't gather metrics on those.

That is a great point.

The only thing I have to add then is that I recently added a way to collect metrics based on telemetry events to Watcher and Childchain v1: omgnetwork/elixir-omg#1766. I will port this to Childchain v2. The Phoenix framework already spits a telemetry event including conn as metadata whenever the Router dispatches (see https://hexdocs.pm/phoenix/Phoenix.Logger.html).

Rather than defining a new Plug, you could instead add a metric item like this: https://github.com/omgnetwork/elixir-omg/blob/d0b86c8e75d5bd4db766e86c3d527a2b276adf87/apps/omg_status/lib/omg_status/metric/telemetry.ex#L73 together with :tag_values to get the header you want out of the conn. You'd get the latency, error code, Childchain version as additional tags as well. Documentation for converting Phoenix's telemetry event to a metric is here: https://hexdocs.pm/telemetry_metrics_statsd/TelemetryMetricsStatsd.html.

Let me know if you think this approach is worthwhile.

Thanks! I'll take a look on that.

macaptain commented 3 years ago

Thanks! I'll take a look on that.

The new metrics using telemetry have been merged to this repo here: https://github.com/omgnetwork/omg-childchain-v2/pull/181