grafana / alloy

OpenTelemetry Collector distribution with programmable pipelines
https://grafana.com/oss/alloy
Apache License 2.0
1.43k stars 210 forks

Manually stored metrics do not have metadata #547

Open rfratto opened 2 years ago

rfratto commented 2 years ago

Samples which are manually written to the WAL (i.e., written outside of the normal scraping process of a metrics instance) do not support metadata being sent over remote write.

Metadata is used to display metric type and help info in Prometheus query frontends like Grafana.

Remote write currently expects that metadata comes from an instance of scrape.Manager, which is not always the case for code relying on remote write to send samples.

This impacts at least the following:

This went mostly unnoticed because most metrics still have metadata. I first became aware of this when prototyping grafana/agent#1261, where scraping and remote_write were initially fully decoupled, causing all metadata to disappear.

It's not immediately obvious what the best solution here is. Some way of being able to customize what metadata gets collected by remote_write would be ideal though.
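
One possible shape for that kind of customization, as a rough Go sketch only: `Metadata`, `MetadataProvider`, `staticProvider`, and `collect` are hypothetical names invented for illustration, not existing Prometheus or agent APIs. The idea is that remote_write would ask a set of providers for metadata instead of assuming a scrape manager, so code that writes samples manually could register its own entries.

```go
package main

import (
	"fmt"
	"sort"
)

// Metadata is a hypothetical, simplified stand-in for the per-metric
// type and help text that remote write sends alongside samples.
type Metadata struct {
	Metric string
	Type   string
	Help   string
}

// MetadataProvider is a hypothetical abstraction: anything that can list
// metadata, whether or not it is backed by a scrape manager.
type MetadataProvider interface {
	ListMetadata() []Metadata
}

// staticProvider holds metadata registered by code that writes samples
// directly, without scraping a target.
type staticProvider struct {
	entries map[string]Metadata
}

func newStaticProvider() *staticProvider {
	return &staticProvider{entries: make(map[string]Metadata)}
}

// Register records metadata for one metric name, replacing any earlier entry.
func (p *staticProvider) Register(m Metadata) {
	p.entries[m.Metric] = m
}

// ListMetadata returns the entries sorted by metric name for stable output.
func (p *staticProvider) ListMetadata() []Metadata {
	out := make([]Metadata, 0, len(p.entries))
	for _, m := range p.entries {
		out = append(out, m)
	}
	sort.Slice(out, func(i, j int) bool { return out[i].Metric < out[j].Metric })
	return out
}

// collect merges metadata from every provider. When two providers describe
// the same metric, the provider listed first wins.
func collect(providers ...MetadataProvider) []Metadata {
	seen := make(map[string]bool)
	var out []Metadata
	for _, p := range providers {
		for _, m := range p.ListMetadata() {
			if seen[m.Metric] {
				continue
			}
			seen[m.Metric] = true
			out = append(out, m)
		}
	}
	return out
}

func main() {
	manual := newStaticProvider()
	manual.Register(Metadata{Metric: "example_requests_total", Type: "counter", Help: "Example counter."})
	manual.Register(Metadata{Metric: "example_latency_seconds", Type: "histogram", Help: "Example histogram."})

	for _, m := range collect(manual) {
		fmt.Printf("%s type=%s help=%q\n", m.Metric, m.Type, m.Help)
	}
}
```

In this sketch a scrape-backed provider would simply be another `MetadataProvider` passed to `collect`; how that maps onto the real remote write code is left open.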

rfratto commented 2 years ago

(cc @mapno for the heads up wrt the spanmetrics part)

rfratto commented 2 years ago

There was a Prometheus PR (prometheus/prometheus#7771) that adds metadata to the WAL. From what I understand, having metadata stored in the WAL is still of interest to the Prometheus team (or at least to @cstyan) and would resolve this issue.

github-actions[bot] commented 2 years ago

This issue has been automatically marked as stale because it has not had any activity in the past 30 days. The next time this stale check runs, the stale label will be removed if there is new activity. The issue will be closed in 7 days if there is no new activity. Thank you for your contributions!

srclosson commented 1 year ago

Is the referenced Prometheus PR (https://github.com/prometheus/prometheus/pull/7771) also relevant to Mimir? As in, will Mimir also need a similar fix?

rfratto commented 1 year ago

> As in, will Mimir also need a similar fix?

No, this is only relevant for the agent.

bboreham commented 1 year ago

Much of prometheus/prometheus#7771 was merged as prometheus/prometheus#10312; there is another PR, prometheus/prometheus#11640, which might complete the job.

doanbutar commented 10 months ago

Hi @rfratto, I am using the static mode with node_exporter integration (not integrations-next), but I can't see the description and type of the metric. Is there something I missed in the config?

I am sending metrics from grafana agent to my grafana cloud instance.


Agent version: 0.38.1

Agent config:

server:
  log_level: debug

metrics:
  global:
    scrape_interval: 1m
    remote_write:
      - basic_auth:
          password: xxxx
          username: xxxx
        url: xxxx
  configs:
    - name: default
      scrape_configs:
        - job_name: agent
          static_configs:
            - targets: ['127.0.0.1:12345']

integrations:
  node_exporter:
    enabled: true

krajorama commented 9 months ago

This is an issue for supporting native histogram related features in UI frontends (e.g. https://github.com/grafana/grafana/issues/81971), unless we can get Prometheus to introduce a new endpoint that shares the sample type (as opposed to the metric type), which has not happened since we started native histograms 2 years ago.

tpaschalis commented 9 months ago

To add some context on the upstream status of this issue: we've had support for recording metadata as WAL records, and for storing them from the scrape loop, for a while now. The final piece, how to communicate this over remote write, was still a point of contention.

The original design doc was agreed upon, but to achieve more efficient metadata delivery, it was decided to wait and group this feature with the interning table implementation planned for remote write v2.0.

Metadata over remote write is currently implemented on the remote-write-2.0 branch, so we're waiting for that feature to be completed before solving this issue on the Agent side as well.
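
For readers unfamiliar with the interning idea mentioned above, here is a minimal Go sketch of a string interning table. It is an illustration only, not the remote write 2.0 implementation: `symbolTable` and `internedMetadata` are hypothetical names. Each distinct string is stored once, and records carry small integer references instead of repeating the text, which is where the delivery savings would come from.

```go
package main

import "fmt"

// symbolTable is a minimal string interning table: each distinct string
// is stored once and referred to by its index in symbols.
type symbolTable struct {
	symbols []string
	index   map[string]uint32
}

func newSymbolTable() *symbolTable {
	return &symbolTable{index: make(map[string]uint32)}
}

// Intern returns the reference for s, adding s to the table on first use.
func (t *symbolTable) Intern(s string) uint32 {
	if ref, ok := t.index[s]; ok {
		return ref
	}
	ref := uint32(len(t.symbols))
	t.symbols = append(t.symbols, s)
	t.index[s] = ref
	return ref
}

// Lookup resolves a reference back to its string; ok is false when the
// reference is out of range.
func (t *symbolTable) Lookup(ref uint32) (s string, ok bool) {
	if int(ref) >= len(t.symbols) {
		return "", false
	}
	return t.symbols[ref], true
}

// internedMetadata is a hypothetical record that carries references into
// a symbolTable instead of the strings themselves.
type internedMetadata struct {
	NameRef uint32
	TypeRef uint32
	HelpRef uint32
}

func main() {
	table := newSymbolTable()

	// Two hypothetical metrics that share the same type and help strings.
	a := internedMetadata{
		NameRef: table.Intern("example_a_total"),
		TypeRef: table.Intern("counter"),
		HelpRef: table.Intern("Example help text."),
	}
	b := internedMetadata{
		NameRef: table.Intern("example_b_total"),
		TypeRef: table.Intern("counter"),
		HelpRef: table.Intern("Example help text."),
	}

	name, _ := table.Lookup(b.NameRef)
	fmt.Println("distinct strings stored:", len(table.symbols))
	fmt.Println("shared type ref:", a.TypeRef == b.TypeRef)
	fmt.Println("resolved name:", name)
}
```

The two records hold six string fields between them, but the table stores only the four distinct strings.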

rfratto commented 7 months ago

Hi there :wave:

On April 9, 2024, Grafana Labs announced Grafana Alloy, the spiritual successor to Grafana Agent and the final form of Grafana Agent flow mode. As a result, Grafana Agent has been deprecated and will only be receiving bug and security fixes until its end-of-life around November 1, 2025.

To make things easier for maintainers, we're in the process of migrating all issues tagged variant/flow to the Grafana Alloy repository to have a single home for tracking issues. This issue is likely something we'll want to address in both Grafana Alloy and Grafana Agent, so just because it's being moved doesn't mean we won't address the issue in Grafana Agent :)

srclosson commented 6 months ago

Just ran into this in another POV. Would love to add a +1 to this issue