Closed lucasdicioccio closed 6 years ago
Thanks for putting this together. I've added some high-level thoughts below. When you read through these it might help having this example use case in mind: the user wants to count the number of HTTP status codes for each URL requested in some (Haskell) web server. The counter would be used as follows (in pseudo-ish code):
main = do
counter <- createCounter "myapp.request-status" ["url", "status"]
setStatusCounter counter
...
requestEntryPoint :: Url -> IO ()
requestEntryPoint url = do
counter <- getStatusCounter
statusCode <- handle url
MultiCounter.inc counter [url, status]
Distribution
) given a value for each dimension (as in the example above).The latter suggests that we add new counter, etc. types that have some internal structure (e.g. they might internally be implemented as maps of regular counters).
Sorry for the delay, pretty busy :-). So far I went for the simplest diff I could imagine for lack of time and a goal to keep the overhead minimal.
Even if we iterate, I still want to keep the ability to make a ref <- MultiCounter.getCounter counter [url, status]
and use inc ref
later in the code. Also for performance, I'd like to have a single hashmap for all dimensional breakdowns of all counters when we sampleAll
rather than having nested hashmaps (maybe we can have both at a small RAM cost? -- I'll have to dive in the code again to tell).
Then, regarding the lookupOrCreate >=> increment
functions I agree they belong to EKG but I'll focus on them in a 2nd step as I feel they are sugar above primitives. I agree that keeping the dimension names separate from the tags is useful as well.
I think supporting ref <- MultiCounter.getCounter counter [url, status]
makes sense. If we implement multi-dimensional metrics with a map from the key to a regular Counter
(or Distribution
or whatever) you should be able to do just that.
sampleAll
needs to be reasonably cheap (e.g. cheap enough to be called once per second or so), but I don't want the returned data to be ambiguous (i.e. does "a.b.c" refer to a metric "a.b" with a dimension with the value "c" or a metric "a.b.c" that has no dimension?). One reason is that we might want to push it to other backends than statsd.
Hi, it's me again. I've created https://github.com/tibbe/ekg-core/pull/22 which should supersede this MR, taking a pretty different approach and providing an API closer to what you described.
This is an experiment to address #6 and could help for #12 . This change uses a non-empty list of tags instead of single tag name for metrics.
The main design choice to figure out is whether tags should be key/values or values only. In this experiment I use values only.
Pros:
Cons: