Standalone OpenMetrics Text Parser

rakyll commented 3 years ago

OpenMetrics, when widely adopted, will be the export format of various endpoints that need to be auto-discovered and scraped. Currently, there isn't an official parser library for OpenMetrics other than the ongoing work in Prometheus. Prometheus provides discovery and scraping libraries to enable non-Prometheus programs to discover and scrape Prometheus endpoints. But these libraries report data points as deltas and make the consumer build a state machine and aggregate the deltas in order to report the collected series. This sometimes makes it hard to write new tools to discover, parse, and ingest metrics. To avoid some of these problems, I propose that we build a standalone OpenMetrics parser so that adoption is not limited.

Challenges

A standalone parser would be useful in the following cases:

When metrics need to be scraped by intermediate tools. There are a variety of tools that want to rely on user metrics for better decisions. For example, load balancers can rely on custom user metrics. Autoscalers can dynamically scale the number of replicas based on user metrics.
In large clusters, aggregating collected metrics before reporting them is a common approach. Collecting OpenMetrics metrics, aggregating them in a custom intermediate component, and exposing them to a metrics collection backend is a path we should enable.
When running very small workloads or working in limited compute environments, running a collection backend is not always an option. It would be good to be able to parse the metrics in lightweight intermediate components that temporarily store, aggregate, and export them when running a metrics collection backend locally is not possible.
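As a rough illustration of the aggregation case above, here is a minimal sketch of an intermediate component that sums already-parsed samples across replicas. The `ParsedSample` tuple shape, the `aggregate_counters` function, and the `instance` label name are all hypothetical and chosen for the example. Summing is only meaningful for some metric types (for instance counters), so this is not a general aggregation rule.

```python
from collections import defaultdict
from typing import Dict, FrozenSet, Iterable, Tuple

# Hypothetical shape of a parsed sample: (metric name, labels, value).
ParsedSample = Tuple[str, Dict[str, str], float]
SeriesKey = Tuple[str, FrozenSet[Tuple[str, str]]]


def aggregate_counters(
    scrapes: Iterable[Iterable[ParsedSample]],
    drop_label: str = "instance",
) -> Dict[SeriesKey, float]:
    """Sum sample values across scrapes, ignoring one per-replica label.

    Series that differ only in `drop_label` are merged into one total.
    """
    totals: Dict[SeriesKey, float] = defaultdict(float)
    for scrape in scrapes:
        for name, labels, value in scrape:
            # Build a hashable key from every label except the dropped one.
            key_labels = frozenset(
                (k, v) for k, v in labels.items() if k != drop_label
            )
            totals[(name, key_labels)] += value
    return dict(totals)
```

The aggregated totals could then be re-exposed to a collection backend. How that exposition happens is outside the scope of this sketch.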

Proposal

Let’s create a standalone parser library that will parse the text format to Protocol Buffers. Consumers of OpenMetrics can rely on the parser library if proto exposition is not available. If metrics are exposed as protos, scrapers should prefer to fetch them as protos instead of parsing the text format.
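To make the idea concrete, here is a minimal sketch of what a stateless text parser could look like. It is not the proposed library and not a conforming OpenMetrics parser. The `Sample` dataclass is a hypothetical stand-in for a protobuf message, and `parse_samples` is an illustrative name. The sketch skips metadata lines, does not unescape label values, and rejects exemplars and label values containing `}`.

```python
import re
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class Sample:
    """Hypothetical record standing in for a protobuf message."""
    name: str
    labels: Dict[str, str]
    value: float
    timestamp: Optional[float] = None


# name, optional {labels}, value, optional timestamp, single-space separated.
_SAMPLE_RE = re.compile(
    r'^(?P<name>[a-zA-Z_:][a-zA-Z0-9_:]*)'
    r'(?:\{(?P<labels>[^}]*)\})?'
    r' (?P<value>\S+)'
    r'(?: (?P<ts>\S+))?$'
)
_LABEL_RE = re.compile(r'([a-zA-Z_][a-zA-Z0-9_]*)="((?:[^"\\]|\\.)*)"')


def parse_samples(text: str) -> List[Sample]:
    """Parse sample lines from an exposition, requiring a final '# EOF'."""
    samples: List[Sample] = []
    saw_eof = False
    for line in text.split("\n"):
        if line == "# EOF":
            saw_eof = True
            break
        if not line or line.startswith("#"):
            continue  # blank lines and TYPE/HELP/UNIT metadata are skipped here
        match = _SAMPLE_RE.match(line)
        if match is None:
            raise ValueError(f"unparseable line: {line!r}")
        labels = dict(_LABEL_RE.findall(match.group("labels") or ""))
        ts = match.group("ts")
        samples.append(
            Sample(
                name=match.group("name"),
                labels=labels,
                value=float(match.group("value")),
                timestamp=float(ts) if ts is not None else None,
            )
        )
    if not saw_eof:
        raise ValueError("missing '# EOF' terminator")
    return samples
```

The function keeps no state between calls, so each scrape can be parsed independently. A real library would also need the validation and data modelling that the format requires.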

Alternatives Considered

Building a standalone parser library based on components from the Prometheus source code. But maintainability-wise, this is not the right approach and it may require forks of unexported components. Forking the Prometheus project should be discouraged.
Making protobuf a mandatory export format. But this would limit the adoption of OpenMetrics.

RichiH commented 3 years ago

Datapoint/FYI: Making protobuf mandatory has been discussed, in particular for large-scale deployments, and discarded for precisely that reason. Anyone with the scaling needs will be able to implement proto, most likely on both ends.

That being said, a standard parser makes sense.

I am unsure if this should live in prometheus/ or openobservability/ and can see good reasons for either. As long as it's done in consensus with the Prometheus team, it's just a name in a URL anyway.

juliusv commented 3 years ago

My 2c from the peanut gallery: Sounds good, and I'm for putting it into the OM org vs. the Prometheus org. If you compare it with https://github.com/open-telemetry, OT also has reference client library implementations for various languages, and a similar thing could make sense for OM as well. Then OM could eventually become a home for OM parsers (and serializers) for all kinds of languages.

SuperQ commented 3 years ago

Prometheus provides discovery and scraping libraries to enable non-Prometheus programs to discover and scrape Prometheus endpoints. But these libraries report data points as deltas and make the consumer build a state machine and aggregate the deltas in order to report the collected series.

Can you be more specific about what libraries you're talking about here? Prometheus specifically does not use or store deltas, and discourages the use of deltas in favor of raw data. The Prometheus design is very much centered around collecting and storing raw data without interpretation.

rakyll commented 3 years ago

@SuperQ, I was referring to the scraping library from Prometheus. It reports data points via storage.Appendable, see https://godoc.org/github.com/prometheus/prometheus/scrape#NewManager. The non-Prometheus scrapers end up implementing a state machine to handle the deltas.

SuperQ commented 3 years ago

I'm pretty sure that appends samples, not deltas. In OpenMetrics terms this would be a MetricPoint. This is a raw value, not a delta.
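To illustrate the distinction, a scraper that receives raw cumulative values only needs state if the consumer wants per-interval increases. The sketch below is purely illustrative. The function name and the reset handling (treating a drop in value as a counter restart) are assumptions for the example, not part of any Prometheus API.

```python
from typing import List, Sequence


def increases_from_raw(points: Sequence[float]) -> List[float]:
    """Derive per-interval increases from raw cumulative counter values.

    Assumption for this sketch: a value lower than the previous one means
    the counter was reset, so the new value itself is taken as the increase.
    """
    increases: List[float] = []
    previous = None
    for value in points:
        if previous is not None:
            if value >= previous:
                increases.append(value - previous)
            else:
                increases.append(value)  # counter reset
        previous = value
    return increases
```

For raw values 10, 15, 15, 3 this yields increases of 5, 0, and 3. The raw values are what gets appended, and the deltas exist only if a consumer computes them.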

@juliusv, can you answer this question?

juliusv commented 3 years ago

The scraping layer in https://github.com/prometheus/prometheus/blob/master/scrape/scrape.go does a lot of stuff in addition to the core parser to perform Prometheus-specific actions such as staleness tracking, health metrics reporting, and caching of time series IDs for faster subsequent appends. As far as I know, there is no sample value delta tracking anywhere in there, but maybe the time series ID tracking or a similar thing was meant by that. In any case, the scrape layer keeps a lot of state between scrapes to accomplish those things.

However, the core OM parser (not the full scraper) is at https://github.com/prometheus/prometheus/blob/master/pkg/textparse/openmetricsparse.go and AFAIK doesn't require any such state tracking at all. So I hope it would be relatively easily reusable outside of Prometheus. Still, it can make sense to move it to this org IMO. Or fork it here, in case we need a slightly different implementation in Prometheus.

brian-brazil commented 3 years ago

Currently, there isn't an official parser library for OpenMetrics

That being said, a standard parser makes sense.

There is already a standard standalone parser for OpenMetrics. https://github.com/prometheus/client_python is the reference implementation for both parsing and exposition, and is already used outside of Prometheus.

It reports data points via storage.Appendable, see https://godoc.org/github.com/prometheus/prometheus/scrape#NewManager.

That's an internal library of Prometheus, and not suitable for general usage as it is very tightly bound to the Prometheus TSDB for performance reasons. I would not recommend using it as a base for a general OpenMetrics parser. Among other things, it doesn't do the validation and data modelling that an official OpenMetrics parser should.

rakyll commented 3 years ago

does a lot of stuff in addition to the core parser to perform Prometheus-specific actions such as staleness tracking, health metrics reporting, and caching of time series IDs for faster subsequent appends. As far as I know, there is no sample value delta tracking anywhere in there, but maybe the time series ID tracking or a similar thing was meant by that. In any case, the scrape layer keeps a lot of state between scrapes to accomplish those things.

My choice of vocabulary was wrong when calling them deltas. This is the situation I was describing.

brian-brazil commented 3 years ago

My choice of vocabulary was wrong when calling them deltas. This is the situation I was describing.

Yeah, there's a lot of specialised logic in there.

For Go I'm currently expecting that we'll end up with a full parser and exposition in https://github.com/prometheus/common/tree/master/expfmt, which is where the official Prometheus text format ones live - Prometheus itself stopped using the official Prometheus text format parser in 2.0.0. Prometheus will want a full parser for promtool.

Java exposition is next on my todo list, but I'd be happy to review a PR to add a full parser into common. We'll probably want to share the validation between the proto and text formats.

SuperQ commented 3 years ago

@rakyll Thanks for the clarifications. I agree, it's a good idea to get some "easy to import" libraries to make interoperability easier.
