Dynamically creating metrics

MaxGabriel commented 8 years ago

Sometimes you want to generate metrics on the fly. Example use cases:

You don't want to register metrics beforehand since you have a lot of them; instead you'd like a fire-and-forget "create this counter if it doesn't exist, and increment it" operation. (Example code). I think this would help for use cases like this, as well.
Your metric name may be parameterized by some value in your function (this is pretty common)

Currently ekg-core makes it difficult to dynamically create metrics. Reasons:

If you try to register a metric that already exists, you'll get a runtime error.
There isn't a way of getting a reference to an already created Counter from your Store, you need to track that yourself.

As a workaround, for example, @tfausak keeps a record of IORefs, each holding a map from metric name to metric

data Metrics = Metrics
    { metricsCounters :: IORef.IORef (Map.Map Text.Text Counter.Counter)
    , metricsDistributions :: IORef.IORef (Map.Map Text.Text Distribution.Distribution)
    , metricsGauges :: IORef.IORef (Map.Map Text.Text Gauge.Gauge)
    , metricsServer :: Maybe Monitoring.Server
    , metricsStore :: Metrics.Store
    }

in his Blunt package (used for pointfree.info).

Is this a problem? I think so: it's a hassle for users to keep a map of all their metrics, especially when the Store is pretty much doing that already.
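
The bookkeeping behind that workaround can be sketched generically. This is a minimal illustration, not ekg-core API: Registry, newRegistry and getOrCreate are hypothetical names, and the registry is kept by the caller, separate from the Store. An MVar is used so that the creation action runs at most once per name, even with concurrent callers.

```haskell
import Control.Concurrent.MVar
import qualified Data.Map.Strict as Map

-- | Hypothetical name-indexed cache of metric handles (not part of ekg-core).
newtype Registry k v = Registry (MVar (Map.Map k v))

-- | Start with no cached handles.
newRegistry :: IO (Registry k v)
newRegistry = Registry <$> newMVar Map.empty

-- | Return the handle cached under 'key'. If there is none, run 'create',
-- cache its result, and return it. The MVar is held while 'create' runs,
-- so two threads can never both create a handle for the same key.
getOrCreate :: Ord k => Registry k v -> k -> IO v -> IO v
getOrCreate (Registry var) key create =
    modifyMVar var $ \handles ->
        case Map.lookup key handles of
            Just existing -> pure (handles, existing)
            Nothing -> do
                fresh <- create
                pure (Map.insert key fresh handles, fresh)
```

With ekg-core, the creation action would presumably be something like createCounter name store, followed by Counter.inc on the returned handle. The cost is one lock acquisition per lookup, which is the kind of overhead a built-in lookupOrCreate would also have to weigh.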

Proposed solution

ekg-core should provide an API to look up already-registered metrics. Probably a lookupOrCreate type of API should be added as well. The main complication is that ekg-core doesn't actually track the Counters, Distributions, Labels or Gauges you make; it tracks this data:

-- TODO: Rename this to Metric and Metric to SampledMetric.
data MetricSampler = CounterS !(IO Int64)
                   | GaugeS !(IO Int64)
                   | LabelS !(IO T.Text)
                   | DistributionS !(IO Distribution.Stats)

Thoughts? I imagine the choice to track CounterS !(IO Int64) instead of the actual Counter was intentional, so maybe I'm missing something.

tfausak commented 8 years ago

I think this would be nice to have. I could extract the stuff in Blunt out to a separate library as a proof of concept.

tfausak commented 8 years ago

I haven't tested it yet, but I was thinking something like this: https://gist.github.com/tfausak/ff1a59b91ba540912fa90cd71bf5f76b

The unsafe IORefs are a problem. Without being able to hang extra information on the Store, I couldn't think of a good way to track the current metrics.

tibbe commented 8 years ago

ekg used to do this for its getX functions, but I changed it at some point and decided to let others add that kind of functionality on top of ekg. I do have plans to add multidimensional counters however, which would let you do things like this:

c <- createCounter "request_by_url" :: Counter Text  -- every counter is indexed by a string
Counter.increment "some-url" c

I think this is the right way to support dynamism in metric names.

tfausak commented 8 years ago

That approach does look nice. I think my use cases would be covered by that.

But why don't you want to allow non-namespaced dynamic metrics?

tibbe commented 8 years ago

@tfausak I don't know if I would go as far as "not allow", but at work we've found that all dynamically named metrics were really just single metrics with dimension(s). The grouping is very useful for analysis, storage, etc.

lucasdicioccio commented 7 years ago

Does anyone have some unpublished progress/design about the multidimensional metrics? I don't mind sketching something. The coding bit is probably easy enough. The hard part is finding a good tradeoff. I've spent some time thinking about the hard choices to make:

(A) do we add a new field to the State record? This question is more about API stability than design (e.g., if we do (B) we don't need a new field except if we want to be extra-careful).
(B) do we encode keys in names (e.g., after a delimiter)? If we encode keys in ekg-core then the API changes are limited. Dimensions seamlessly become new names. Then, there's some benefit in computing a new key once for the whole lifetime of the program. It's already the case that EKG users must track the name <> ref mapping by their own means. However, with multiple dimensions the typical use case will not be to register metrics for the cartesian product of dimensions ahead of time (if dimensions are even bounded). It's fine to tell users that EKG trades off some usability for efficiency.
(C) do we want to provide ways to garbage-collect these labels? There's already a risk of unbounded memory growth (and hence DoS) if someone names metrics based on user-controlled input. The use case for dimensional metrics makes the trap more attractive. A bit like (B), I'd say this is a tradeoff and we can add a big warning in the documentation.
(D) do we want to add key-value tags (like Prometheus), a list of tags, or a set of tags? Lists/sets of tags would force users to encode/decode information if they want key/value semantics, but they could be cheaper implementation-wise (in particular, if we do (B) then one needs to pay attention to ordering).
(E) do we want users to pass dimensions via a type or via a value? We haskellers love to put some information in the type system. If we encode tag dimensions in the type system and values in the value world, there's a chance we can find a trick to gain performance by never representing dimensions at runtime, but I think the real gain is to get an expressive Metrics signature -- which ensures users don't mix dimensions. It could be limiting though if we decide the API should be able to craft new dimensions at runtime (I believe it's wrong). Also, when sampling values, type information will (in all likelihood) be lost.

tfausak commented 7 years ago

I don't have any answers for you, but check out this package for multidimensional metrics: https://github.com/sellerlabs/monad-metrics

lucasdicioccio commented 7 years ago

I don't believe monad-metrics introduces what tibbe refers to as multidimensional metrics. My understanding of a multidimensional metric is that counters are identified by Text+dimensions rather than just Text. Glancing at the code, monad-metrics provides a mechanism to identify Metrics by name in a monadic context (i.e., avoid the boilerplate of passing Metrics around -- it reads way better but costs a bit of overhead). However, monad-metrics does not address multidimensional counters. We already have some mechanism like monad-metrics at my job, with a very similar solution; so too bad I learn about this package only now :-).

What I think would be useful is a way to build, say, a counter of the number of requests per domain and per response code. In a typed Counter world that would give something like Counter '[Domain, HTTPResponseCode] (e.g., with a type-level list) to ensure counters are properly typed. We can sugar-coat a nicely-typed API on top of a unityped module.

Hence, I believe the big design tradeoff is standardizing a way to identify dimensions in the sampleAll output (and then in the JSON output). In my example we could encode dimensions in Metric names such as counter/<domain>/<response-code> (ordered tags); counter{domain=<domain>,response-code=<response-code>} (named tags); counter {domain=<domain>,response-code=<response-code>} (text-space-JSONdict) or anything else I've not thought about. Then implementation-wise there could be some tricks to play to avoid overhead, making sure dict key ordering does not matter, etc.

I don't mind reserving some syntax in counter names for dimension tags. That said, I think dimension support should be built into ekg-core, in a way that retains some of the dimension information in the Sample datatype. (Otherwise one has to parse counter names to extract dimensions back.)
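
As an illustration of the "named tags" option only, here is a minimal sketch of such an encoding. encodeName is a hypothetical helper, not something ekg-core provides, and it assumes tag keys and values never contain the reserved characters `{`, `}`, `,` or `=` (no escaping is attempted). Sorting by key is what makes the order in which tags are supplied irrelevant.

```haskell
import Data.List (sortOn)
import qualified Data.Text as T

-- | Hypothetical encoding of a base metric name plus named dimension tags,
-- producing e.g. counter{domain=...,response-code=...}.
-- Tags are sorted by key so the same set of tags always yields the same name.
-- Assumption: keys and values contain none of '{', '}', ',' or '='.
encodeName :: T.Text -> [(T.Text, T.Text)] -> T.Text
encodeName base []   = base
encodeName base tags = T.concat [base, T.pack "{", body, T.pack "}"]
  where
    body = T.intercalate (T.pack ",")
        [ T.concat [key, T.pack "=", value] | (key, value) <- sortOn fst tags ]
```

A name built this way still has to be parsed to recover the dimensions, which is the cost that keeping dimension information in the Sample datatype would avoid.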

tibbe commented 7 years ago

Basically what @lucasdicioccio said.

I imagine a new API in ekg-core, let's call it System.Metrics.MultiDim.Counter for the sake of this example. (None of this compiles as is, but hopefully it gives the idea.)

new :: [Text]  -- ^ Name of dimensions
    -> IO Counter

inc :: Counter -> [Text] -> IO ()

example = do
    requests <- new ["domain", "response_code"]
    inc requests ["www.somedomain.com", "200"]

This is probably not as strongly typed as we'd like, so we can play e.g. HList tricks (but we should keep the dimension names around, perhaps in the counter implementation, because they are useful for serialization, etc). I think we need to require that the dimension names can all be converted to strings, for serialization to JSON and generally sending metrics around in systems.

I think the dimensions should be ordered, because being able to easily reduce the number of dimensions by one, by "summing" over the last one, is useful. For example, in the above code we could count the number of requests (with any status code) by summing over the last dimension.
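
To make the idea concrete, here is a self-contained sketch of such a counter with ordered dimensions and a "sum over the last dimension" operation. It follows the new/inc signatures above, but everything else is an assumption made for illustration: the IORef-of-Map representation, the readCounts and sumLast helpers, and the choice to reject a wrong number of values at runtime. It is not the ekg-core implementation.

```haskell
import Data.Int (Int64)
import Data.IORef
import qualified Data.Map.Strict as Map
import qualified Data.Text as T

-- | Hypothetical multidimensional counter: the ordered dimension names,
-- plus one count per list of dimension values.
data Counter = Counter
    { counterDims   :: [T.Text]
    , counterCounts :: IORef (Map.Map [T.Text] Int64)
    }

-- | Create a counter with the given ordered dimension names.
new :: [T.Text] -> IO Counter
new dims = Counter dims <$> newIORef Map.empty

-- | Increment the cell addressed by one value per dimension, in order.
-- Supplying the wrong number of values raises an IO error.
inc :: Counter -> [T.Text] -> IO ()
inc (Counter dims ref) values
    | length values /= length dims =
        ioError (userError "inc: wrong number of dimension values")
    | otherwise =
        atomicModifyIORef' ref $ \m -> (Map.insertWith (+) values 1 m, ())

-- | Snapshot of all cells.
readCounts :: Counter -> IO (Map.Map [T.Text] Int64)
readCounts = readIORef . counterCounts

-- | Drop the last dimension, summing the counts that collapse together.
sumLast :: Map.Map [T.Text] Int64 -> Map.Map [T.Text] Int64
sumLast = Map.mapKeysWith (+) dropLast
  where
    dropLast values = take (length values - 1) values
```

With dimensions ["domain", "response_code"], applying sumLast to a snapshot gives requests per domain regardless of status code. Keeping counterDims in the record reflects the point above about needing the dimension names for serialization.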

There are questions about how we push this data to statsd etc., but I think this is the general direction we'd like to go in.

lucasdicioccio commented 7 years ago

Thanks for your input, @tibbe. I'll sketch something soon using Text as counter-creation keys and a type-aware sugar coating (which can go in other packages, so people can experiment with a KnownSymbol or a Typeable layer).

lucasdicioccio commented 7 years ago

I've started a PoC at https://github.com/lucasdicioccio/ekg-core . I think it's fine for ekg-core, though ekg and ekg-json will need an update at the same time.

As for testing: I don't have a way to test the full compatibility matrix with base/hashable, and I use Stack. I've added a stack.yaml (not committed) using lts-8.5.
