eclipse / microprofile-metrics

microprofile-metrics
Apache License 2.0
101 stars 66 forks source link

New exporter format less usable #508

Open rmannibucau opened 4 years ago

rmannibucau commented 4 years ago

The tags are appended to "properties" in v2, not sure what was the rational but tags are related to metrics (since it is owned by the ID) so it should be queryable on the metric so at "name"/key level and not each property level. The main blocker with new format is to ability to create rules on *.<property>.

jmartisk commented 4 years ago

I'm sorry I'm not quite sure I understand your description. We've been appending the tags in the OpenMetrics export in Metrics 1.x as well, so nothing changed in this regard. The main difference in 2.x is that multiple metrics with the same name can be registered (but with different tags), so appending the tags is now necessary to be able to distinguish these different metrics. Filtering of metrics based on their tags (for visualisation and querying purposes) should usually be work of the consumer (typically Prometheus, which supports exactly this approach: https://prometheus.io/docs/concepts/data_model/#metric-names-and-labels - we're just trying to mimic the recommended ways how to present data to Prometheus).

rmannibucau commented 4 years ago

Point is leaf name should be constant whatever tag is there for json exporter, tag should never be in keys but as a key/value array on the object. For prometheus there is no real choice except preppend/append one which should likely be a config but it is less broken than json so less important for me.

donbourne commented 4 years ago

IIRC we put the tags at the end of the name, even in JSON format, so that tools like collectd/graphite could consume them properly. If we were to put the tags into another field in the json then collectd/graphite wouldn't be able to piece together which tags went with each metric. So, while I agree the JSON would be cleaner/better if we factored out the tags, it wouldn't work with the simple scraping tools that are the main consumers of that JSON.

See Feb. 26, 2019 discussion in https://docs.google.com/document/d/1HJOg-NOVpcE5X7qhELnyBa_A7OaVTwbOPCsAOp6F1Zc/edit# for more background.

rmannibucau commented 4 years ago

I strongly disagree, it would be more usable cause dashboards wouldnt depends on the tags and also json are used with pointers/jsonpaths so it does not change anything. Finally for the tools you mentionned, openmetrics endpoint can be more accurate (1-1 in text structure since it does not use a hierarchical object based structure).

donbourne commented 4 years ago

@rmannibucau the issue is, that if the tags aren't part of the key name then collectd can't associate the tag with the metric it collected in any way. that means that multiple metrics with the same key name but different tags would collide. We tried not including the tags in the metric name without success.

rmannibucau commented 4 years ago

Ok, then it means there is no generic format so spec must get a query param specifying the format and either manage a simple templating, an enum with main formats or a spi (query param being the name of the spi impl, spi impl would be looked up in the cdi context). Wdyt?

donbourne commented 4 years ago

@rmannibucau , we do have a mechanism today for specifying which format you want the output to be (using the accept header in the HTTP request). It would fit with the architecture to add more supported formats to that list. Do you have a use case where you are intending to use the JSON output with json paths? IMO the choice would be between:

  1. offering a new JSON format that works well for json query tools
  2. offering SPI for developers to add any new custom formatter/exporter they want to add
  3. do nothing

If we think there's a strong case for JSON query tools to consume the data, then number 1 makes sense. Do we have a use case for that?
If we think we're likely to see a number of new format requests, then number 2 makes sense. If neither of the above is true, then number 3 makes sense.

rmannibucau commented 4 years ago

For me it is 1 but also note that as a spec it is a breaking change - or regression ;) - so there is no real choice to fix it for the community which didnt upgrade yet IMHO. Side note: the header is about the format but not its content (this issue) so a query param works too IMHO and is easier to integrate in pollers (accept stays the same, just the url changes).

donbourne commented 4 years ago

Practically, I wonder if 1 or 2 makes more sense -- it's hard to know if someone will want another format down the road and not all formats will make sense to add to the spec.

What we did for MP Metrics 2.x was tested to work with collectd, but I get your point that some people that were able to work with the 1.x metrics JSON format may not be able to work with the 2.x metrics JSON format. In 1.x we only put the (global) tags in the OPTIONS output. In 2.x we put the (global or time-series-specific) tags on the end of each metric. The 2.x change was to be able to support multi-dimensioned metrics, but of course there were different options for how to do that. Had we chosen to not put the tags on each metric, tools like collectd would not have been able to differentiate multi-dimensioned metrics.

On your point about accept header vs. query string, perhaps we should have an optional query string parm that would allow for choosing the format (would have to give thought to how to handle the case where someone provides both a query string and an accept header).

rmannibucau commented 4 years ago

Thinking out loud but enabling a bean which is a mapper - Function - mapping the metric meta to the key would work to. It is just a cdi lookup + @priority sorting in terms of impl and fixes it (one app never needs both formats at the same deployment time).

So giving some callbacks on the serialization can be better overall.

Just some food for thoughts.

donbourne commented 4 years ago

@rmannibucau for the record, can you state what your use case is for having another JSON format (or perhaps for wanting to use json query tools with the metrics)? I'm not pushing back on that -- just want it to be documented to help explain this capability if we deliver it.

rmannibucau commented 4 years ago

@donbourne tools are configured to query/extract data using the suffix (the part holding the actual information the prefix varying depending the actual instance - like cpu). This was working fince until last major, i'd just like the ability to keep the tools/dashboard/aggregators configured the same whatever the env and tags are.

donbourne commented 4 years ago

are you using a specific tool? or something more ad-hoc like curl and scripting? I do get your point about suffixes.

rmannibucau commented 4 years ago

@donbourne the 3 tools are based on curl likes solution (one is an actual sh with curl, another one is a java based solution with a json extractor, I know less about the last one but from what I understood from the ops it is a graphite http client - but I'm not sure it is on the dashboard or storage side the issue appears)