erlef / observability-wg

Project for tracking the work of the Observability Working Group
Creative Commons Attribution Share Alike 4.0 International

Survey existing metrics definitions across existing libraries #3

Open tsloughter opened 5 years ago

tsloughter commented 5 years ago

From the meeting notes where this action item was created:

hauleth commented 5 years ago

Currently using Structs and Protocols, so hard to convert to Erlang

I haven't found any usage of protocols in telemetry_metrics. Have I missed something? As for structs, Elixir provides fairly easy support for records (though without support for protocols), so I think it shouldn't be much of a problem.

Intention with Phoenix 1.5 is to include this by default, so might not be as seamless

An Erlang implementation can still provide an Elixir-like API. BTW the same should be done for telemetry itself to provide more seamless migration for consumers.

How will this interact with OpenTelemetry’s metrics feature set?

I would suggest that we ignore the direct API in OT and instead "force" users to always use telemetry for sending data to OT, which should only be a consumer. That way we would sacrifice some part of the OT spec for a better user experience.


As for existing metric types, the most common ones I am aware of are:

Some other tools also provide metrics such as a meter, which works like taking the derivative of a gauge, but I think that is out of scope for telemetry_metrics.
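To make the "derivative of a gauge" description concrete, here is a minimal sketch in Python. It assumes a meter is approximated by a finite difference between two gauge samples; the function name `rate` and the `(time, value)` sample shape are hypothetical and are not taken from any of the libraries discussed here.

```python
def rate(prev, curr):
    """Approximate the derivative of a gauge from two (time_seconds, value) samples.

    Hypothetical helper: a real meter would typically smooth over many samples.
    """
    (t0, v0), (t1, v1) = prev, curr
    if t1 <= t0:
        raise ValueError("samples must be strictly increasing in time")
    # Change in value divided by elapsed time gives units per second.
    return (v1 - v0) / (t1 - t0)
```

For example, a gauge that moves from 10 to 16 over two seconds has a rate of 3 units per second.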

arkgil commented 5 years ago

BTW the same should be done for telemetry itself to provide more seamless migration for consumers.

Do you mean creating an Elixir module delegating to the Erlang one?

hauleth commented 5 years ago

@arkgil yes. It could even be written in Erlang, but in general it should be made easy for consumers to "migrate" to newer versions.

arkgil commented 5 years ago

@hauleth I'm not sure what you mean, or maybe I don't see the problem we're trying to solve here 😄

Regarding use of records, I would vote against it, because IMO they are problematic when they show up in stacktraces. I would say that if we aim to have a common structure for both Erlang and Elixir, then maps are the way to go (they might be structs on the Elixir side, although that too might confuse folks when debugging from Erlang).

arkgil commented 5 years ago

As Łukasz wrote in a comment above, the metric types supported by the existing libraries fall into the following buckets:

When it comes to defining metrics, most of the libraries use a "registry" approach. You call a function, the metric is registered somewhere globally, and the registry is queried whenever the metric is updated or needs to be exported. I haven't found a library other than Telemetry.Metrics which uses plain data structures for defining metrics and passing them around.

bryannaegele commented 5 years ago

I haven't found a library other than Telemetry.Metrics which uses plain data structures for defining metrics and passing them around.

How many of those are attempting to interact with multiple implementations without the use of an agent though? I see one of the benefits of using data structures to define metrics is the flexibility they provide for simple migrations via reporters. OpenCensus is the only one I'm aware of that attempts abstracting the destination but moves that abstraction to the agent.

arkgil commented 5 years ago

exometer, folsom and metrics (which uses the first two as backends) are all quite popular (judging by the number of downloads on Hex) and allow exporting metrics to multiple external systems. The idea is that reporters subscribe to metric updates and are notified every x seconds that they should export the metric.
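A rough sketch of that subscribe-and-notify pattern, in Python for illustration only. It does not reproduce the exometer or folsom APIs; `PollingRegistry` and its methods are hypothetical, and time is passed in explicitly instead of using real timers so the behaviour is easy to follow.

```python
class PollingRegistry:
    """Hypothetical registry that notifies subscribed reporters on an interval."""

    def __init__(self):
        self.values = {}
        self._subs = []  # each entry: [callback, interval, next_due_time]

    def update(self, name, value):
        self.values[name] = value

    def subscribe(self, callback, interval, now=0.0):
        # The reporter is first notified one full interval after subscribing.
        self._subs.append([callback, interval, now + interval])

    def tick(self, now):
        # Notify every reporter whose interval has elapsed with a snapshot.
        for sub in self._subs:
            callback, interval, due = sub
            if now >= due:
                callback(dict(self.values))
                sub[2] = now + interval
```

Each reporter receives a copy of the current values, so several reporters can export the same metrics to different external systems.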

arkgil commented 5 years ago

To me, the difference between using a registry and data structures boils down to these two things:

  1. With data structures, we need to tell the reporter which metrics it shall export. With a registry we can register metrics earlier and either tell the reporter which ones it should use or which ones it should ignore.
  2. With data structures it's not possible for libraries to register metrics, only to emit events using Telemetry, which gives more control to the user. With a registry, libraries could register metrics directly so that the user can export them.
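The two points above can be sketched side by side. This is an illustrative Python sketch under stated assumptions, not the behaviour of Telemetry or of any existing registry library: `Registry`, `Reporter`, `library_setup`, `library_emit` and all metric and event names are hypothetical.

```python
class Registry:
    """Registry style: libraries register metrics, the user filters at export."""

    def __init__(self):
        self.metrics = set()

    def register(self, name):
        self.metrics.add(name)

    def exportable(self, only=None, ignore=()):
        # The user either names the metrics to use or the ones to ignore.
        names = self.metrics if only is None else self.metrics & set(only)
        return sorted(names - set(ignore))

def library_setup(registry):
    # A library registers its own metrics directly.
    registry.register("http.request.count")
    registry.register("http.request.duration")

class Reporter:
    """Data-structure style: the user passes in the metrics to export."""

    def __init__(self, metrics):
        self.metrics = metrics  # list of (metric_name, event_name) pairs
        self.counts = {name: 0 for name, _ in metrics}

    def __call__(self, event, measurements):
        for name, wanted_event in self.metrics:
            if wanted_event == event:
                self.counts[name] += 1

def library_emit(handlers, event, measurements):
    # A library only emits events; it defines no metrics itself.
    for handler in handlers:
        handler(event, measurements)
```

In the first style the library decides which metrics exist and the user can only narrow the set. In the second the library is unaware of metrics, and events with no matching definition are simply not counted.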