haskell-github-trust / ekg-core

Library for tracking system metrics
BSD 3-Clause "New" or "Revised" License
40 stars 39 forks source link

Implement tagged metrics #17

Closed lucasdicioccio closed 6 years ago

lucasdicioccio commented 7 years ago

This is an experiment to address #6 and could help for #12 . This change uses a non-empty list of tags instead of single tag name for metrics.

The main design choice to figure out is whether tags should be key/values or values only. In this experiment I use values only.

Pros:

Cons:

tibbe commented 7 years ago

Thanks for putting this together. I've added some high-level thoughts below. When you read through these it might help having this example use case in mind: the user wants to count the number of HTTP status codes for each URL requested in some (Haskell) web server. The counter would be used as follows (in pseudo-ish code):

main = do
    counter <- createCounter "myapp.request-status" ["url", "status"]
    setStatusCounter counter
    ...

requestEntryPoint :: Url -> IO ()
requestEntryPoint url = do
    counter <- getStatusCounter
    statusCode <- handle url
    MultiCounter.inc counter [url, status]

The latter suggests that we add new counter, etc. types that have some internal structure (e.g. they might internally be implemented as maps of regular counters).

lucasdicioccio commented 7 years ago

Sorry for the delay, pretty busy :-). So far I went for the simplest diff I could imagine for lack of time and a goal to keep the overhead minimal.

Even if we iterate, I still want to keep the ability to make a ref <- MultiCounter.getCounter counter [url, status] and use inc ref later in the code. Also for performance, I'd like to have a single hashmap for all dimensional breakdowns of all counters when we sampleAll rather than having nested hashmaps (maybe we can have both at a small RAM cost? -- I'll have to dive in the code again to tell).

Then, regarding the lookupOrCreate >=> increment functions I agree they belong to EKG but I'll focus on them in a 2nd step as I feel they are sugar above primitives. I agree that keeping the dimension names separate from the tags is useful as well.

tibbe commented 7 years ago

I think supporting ref <- MultiCounter.getCounter counter [url, status] makes sense. If we implement multi-dimensional metrics with a map from the key to a regular Counter (or Distribution or whatever) you should be able to do just that.

sampleAll needs to be reasonably cheap (e.g. cheap enough to be called once per second or so), but I don't want the returned data to be ambiguous (i.e. does "a.b.c" refer to a metric "a.b" with a dimension with the value "c" or a metric "a.b.c" that has no dimension?). One reason is that we might want to push it to other backends than statsd.

lucasdicioccio commented 6 years ago

Hi, it's me again. I've created https://github.com/tibbe/ekg-core/pull/22 which should supersede this MR, taking a pretty different approach and providing an API closer to what you described.