prometheus / OpenMetrics

Evolving the Prometheus exposition format into a standard.
https://openmetrics.io
Apache License 2.0

Standalone OpenMetrics Text Parser #169

Open rakyll opened 3 years ago

rakyll commented 3 years ago

OpenMetrics, when adopted widely, will be the export format of various endpoints that need to be auto-discovered and scraped. Currently, there isn't an official parser library for OpenMetrics other than the ongoing work in Prometheus. Prometheus provides discovery and scraping libraries to enable non-Prometheus programs to discover and scrape Prometheus endpoints. But these libraries report data points as deltas and make the consumer build a state machine and aggregate the deltas in order to report the collected series. This sometimes makes it hard to write new tools to discover, parse, and ingest metrics. To avoid some of these problems, I propose that we build a standalone OpenMetrics parser so that adoption is not limited.

Challenges

A standalone parser would be useful in the following cases:

Proposal

Let’s create a standalone parser library that parses the text format into Protocol Buffers. Consumers of OpenMetrics can rely on the parser library if proto exposition is not available. If metrics are exposed in protos, a scraper should prefer to fetch them in proto instead of parsing the text format.
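As a rough illustration of what such a standalone parser does, here is a minimal sketch that turns a small subset of the OpenMetrics text format into plain Python records. Everything in it is an assumption made for illustration: the `Sample` and `MetricFamily` types and the `parse_openmetrics` function are hypothetical names, the records stand in for the Protocol Buffers messages the proposal targets, and escaping, exemplars, and validation are not handled.

```python
import re
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class Sample:
    # Hypothetical record type; stands in for a protobuf message.
    name: str
    labels: Dict[str, str]
    value: float
    timestamp: Optional[float] = None


@dataclass
class MetricFamily:
    # Hypothetical record type; stands in for a protobuf message.
    name: str
    type: str = "unknown"
    help: str = ""
    samples: List[Sample] = field(default_factory=list)


# Sample line: name, optional {labels}, value, optional timestamp.
SAMPLE_RE = re.compile(
    r'^([a-zA-Z_:][a-zA-Z0-9_:]*)(?:\{(.*)\})?\s+(\S+)(?:\s+(\S+))?$'
)
LABEL_RE = re.compile(r'([a-zA-Z_][a-zA-Z0-9_]*)="((?:[^"\\]|\\.)*)"')
# Sample-name suffixes that map back to a declared family name.
SUFFIXES = ("_total", "_created", "_bucket", "_count", "_sum",
            "_gcount", "_gsum", "_info")


def _family_for(families, sample_name):
    """Find the declared family a sample belongs to, or create one."""
    if sample_name in families:
        return families[sample_name]
    for suffix in SUFFIXES:
        if sample_name.endswith(suffix):
            base = sample_name[: -len(suffix)]
            if base in families:
                return families[base]
    return families.setdefault(sample_name, MetricFamily(sample_name))


def parse_openmetrics(text):
    """Parse a subset of OpenMetrics text into a list of MetricFamily."""
    families = {}
    saw_eof = False
    for line in text.splitlines():
        if line == "# EOF":
            saw_eof = True
            break
        if line.startswith("# "):
            parts = line.split(" ", 3)
            if len(parts) < 3:
                raise ValueError("malformed metadata line: %r" % line)
            keyword, name = parts[1], parts[2]
            rest = parts[3] if len(parts) > 3 else ""
            family = families.setdefault(name, MetricFamily(name))
            if keyword == "TYPE":
                family.type = rest
            elif keyword == "HELP":
                family.help = rest
            continue
        match = SAMPLE_RE.match(line)
        if match is None:
            raise ValueError("malformed sample line: %r" % line)
        name, raw_labels, value, timestamp = match.groups()
        labels = dict(LABEL_RE.findall(raw_labels or ""))
        _family_for(families, name).samples.append(
            Sample(name, labels, float(value),
                   float(timestamp) if timestamp is not None else None)
        )
    if not saw_eof:
        raise ValueError("exposition must end with '# EOF'")
    return list(families.values())


EXAMPLE = """# TYPE http_requests counter
# HELP http_requests Total HTTP requests.
http_requests_total{method="GET",code="200"} 1027 1600000000
http_requests_total{method="POST",code="500"} 3
# EOF
"""
```

Calling `parse_openmetrics(EXAMPLE)` yields a single `http_requests` family of type `counter` holding two samples. The parser keeps no state between calls, which is the property that makes it usable outside a full scraper.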

Alternatives Considered

RichiH commented 3 years ago

Datapoint/FYI: Making protobuf mandatory has been discussed, in particular for large-scale deployments, and discarded for precisely that reason. Anyone with the scaling needs will be able to implement proto, most likely on both ends.

That being said, a standard parser makes sense.

I am unsure if this should live in prometheus/ or openobservability/ and can see good reasons for either. As long as it's done in consensus with the Prometheus team, it's just a name in a URL anyway.

juliusv commented 3 years ago

My 2c from the peanut gallery: Sounds good, and I'm for putting it into the OM org vs. the Prometheus org. If you compare it with https://github.com/open-telemetry, OT also has reference client library implementations for various languages, and a similar thing could make sense for OM as well. Then OM could eventually become a home for OM parsers (and serializers) for all kinds of languages.

SuperQ commented 3 years ago

> Prometheus provides discovery and scraping libraries to enable non-Prometheus programs to discover and scrape Prometheus endpoints. But these libraries report data points as deltas and make the consumer build a state machine and aggregate the deltas in order to report the collected series.

Can you be more specific about what libraries you're talking about here? Prometheus specifically does not use or store deltas, and discourages the use of deltas in favor of raw data. The Prometheus design is very much centered around collecting and storing raw data without interpretation.

rakyll commented 3 years ago

@SuperQ, I was referring to the scraping library from Prometheus. It reports data points via storage.Appendable, see https://godoc.org/github.com/prometheus/prometheus/scrape#NewManager. The non-Prometheus scrapers end up implementing a state machine to handle the deltas.

SuperQ commented 3 years ago

I'm pretty sure that appends samples, not deltas. In OpenMetrics terms this would be a MetricPoint. This is a raw value, not a delta.

@juliusv, can you answer this question?
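To make the sample-versus-delta distinction concrete: a scraper that appends samples hands over the cumulative value observed at each scrape, and a delta is something a consumer derives afterward if it wants one. A minimal sketch follows; `deltas_from_samples` is a hypothetical helper, not part of any Prometheus API, and treating any decrease as a counter restart from zero is a simplifying assumption.

```python
def deltas_from_samples(samples):
    """Derive per-interval deltas from cumulative counter samples.

    Assumption: a decrease means the counter restarted from zero,
    so the new value itself is taken as the delta for that interval.
    """
    deltas = []
    previous = None
    for value in samples:
        if previous is not None:
            if value < previous:
                deltas.append(value)  # counter reset
            else:
                deltas.append(value - previous)
        previous = value
    return deltas


# Raw samples as a scraper would append them: one cumulative value per scrape.
RAW_SAMPLES = [10.0, 15.0, 22.0, 3.0]
```

The raw list is what gets appended; the deltas only exist once a consumer computes them from consecutive samples.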

juliusv commented 3 years ago

The scraping layer in https://github.com/prometheus/prometheus/blob/master/scrape/scrape.go does a lot of stuff in addition to the core parser to perform Prometheus-specific actions such as staleness tracking, health metrics reporting, and caching of time series IDs for faster subsequent appends. As far as I know, there is no sample value delta tracking anywhere in there, but maybe the time series ID tracking or a similar thing was meant by that. In any case, the scrape layer keeps a lot of state between scrapes to accomplish those things.

However, the core OM parser (not the full scraper) is at https://github.com/prometheus/prometheus/blob/master/pkg/textparse/openmetricsparse.go and AFAIK doesn't require any such state tracking at all. So I hope it would be relatively easily reusable outside of Prometheus. Still, it can make sense to move it to this org IMO. Or fork it here, in case we need a slightly different implementation in Prometheus.
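The kind of between-scrape state described above can be sketched as follows. This is not how Prometheus implements it: `ScrapeCache` and its `ingest` method are hypothetical names, and the sketch only mimics two of the mentioned behaviours, caching a numeric ID per series and noticing series that vanished since the previous scrape.

```python
import itertools


class ScrapeCache:
    """Hypothetical per-target state kept between scrapes."""

    def __init__(self):
        self._ids = {}                      # series key -> cached numeric ID
        self._next_id = itertools.count(1)  # ID generator for new series
        self._seen_last = set()             # series keys seen in the previous scrape

    def ingest(self, samples):
        """Process one scrape.

        samples: iterable of (name, labels_dict, value) tuples.
        Returns (appended, stale): the (series_id, value) pairs to append,
        and the sorted IDs of series present last time but missing now.
        """
        seen_now = set()
        appended = []
        for name, labels, value in samples:
            key = (name, tuple(sorted(labels.items())))
            if key not in self._ids:
                self._ids[key] = next(self._next_id)
            seen_now.add(key)
            appended.append((self._ids[key], value))
        stale = sorted(self._ids[k] for k in self._seen_last - seen_now)
        self._seen_last = seen_now
        return appended, stale
```

A parser has no need for any of this. The state lives in the layer that calls the parser repeatedly, which is why the parser can be lifted out on its own while the scrape layer cannot.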

brian-brazil commented 3 years ago

> Currently, there isn't an official parser library for OpenMetrics

> That being said, a standard parser makes sense.

There is already a standard standalone parser for OpenMetrics. https://github.com/prometheus/client_python is the reference implementation for both parsing and exposition, and is already used outside of Prometheus.

> It reports data points via storage.Appendable, see https://godoc.org/github.com/prometheus/prometheus/scrape#NewManager.

That's an internal library of Prometheus, and not suitable for general usage as it is very tightly bound to the Prometheus TSDB for performance reasons. I would not recommend using it as a base for a general OpenMetrics parser; among other things, it doesn't do the validation and data modelling that an official OpenMetrics parser should.

rakyll commented 3 years ago

> does a lot of stuff in addition to the core parser to perform Prometheus-specific actions such as staleness tracking, health metrics reporting, and caching of time series IDs for faster subsequent appends. As far as I know, there is no sample value delta tracking anywhere in there, but maybe the time series ID tracking or a similar thing was meant by that. In any case, the scrape layer keeps a lot of state between scrapes to accomplish those things.

My choice of vocabulary was wrong when calling them deltas. This is the situation I was describing.

brian-brazil commented 3 years ago

> My choice of vocabulary was wrong when calling them deltas. This is the situation I was describing.

Yeah, there's a lot of specialised logic in there.

For Go I'm currently expecting that we'll end up with a full parser and exposition in https://github.com/prometheus/common/tree/master/expfmt, which is where the official Prometheus text format ones live - Prometheus itself stopped using the official Prometheus text format parser in 2.0.0. Prometheus will want a full parser for promtool.

Java exposition is next on my todo list, but I'd be happy to review a PR to add a full parser into common. We'll probably want to share the validation between the proto and text formats.

SuperQ commented 3 years ago

@rakyll Thanks for the clarifications. I agree, it's a good idea to get some "easy to import" libraries to make interoperability easier.