
[prometheus][remote_write] Metrics are not grouped by labels #7533

Open tetianakravchenko opened 1 year ago

tetianakravchenko commented 1 year ago

For remote_write it is not possible to define a list of dimensions that would prevent document duplication and, ultimately, dropped documents (when enabling TSDB). Metrics are not grouped by the unique list of labels, as they are for the collector datastream.

Example:

Note that for the same timestamp, Aug 24, 2023 @ 16:58:06.491, there are 5 documents with the same set of labels (prometheus.labels_id is a fingerprint of prometheus.labels); for some of them the ingestion time differs, but mostly even event.ingested is the same:

[Screenshot 2023-08-24 at 16:59:37 showing the duplicated documents described above]

prometheus.labels_id is a fingerprint of the prometheus.labels object (this approach is used for the collector datastream):

processors:
  - fingerprint:
      fields: ["prometheus.labels"]
      target_field: "prometheus.labels_id"
      ignore_failure: true
      ignore_missing: true

Document sample 1:

``` { "_index": ".ds-metrics-prometheus.remote_write-default-2023.08.24-000001", "_id": "qAINKIoBmUHtrs2qn6Rd", "_version": 1, "_score": 0, "_source": { "agent": { "name": "test-worker", "id": "f0909073-8c8a-4d85-9145-6b188223cd53", "ephemeral_id": "bfeba8e3-418d-4b6d-9664-6100ad624665", "type": "metricbeat", "version": "8.9.0" }, "@timestamp": "2023-08-24T14:58:06.491Z", "ecs": { "version": "8.0.0" }, "service": { "type": "prometheus" }, "data_stream": { "namespace": "default", "type": "metrics", "dataset": "prometheus.remote_write" }, "elastic_agent": { "id": "f0909073-8c8a-4d85-9145-6b188223cd53", "version": "8.9.0", "snapshot": false }, "host": { "hostname": "test-worker", "os": { "kernel": "5.15.49-linuxkit-pr", "codename": "focal", "name": "Ubuntu", "family": "debian", "type": "linux", "version": "20.04.6 LTS (Focal Fossa)", "platform": "ubuntu" }, "containerized": false, "ip": [ "10.244.2.1", "172.21.0.3", "fc00:f853:ccd:e793::3", "fe80::42:acff:fe15:3", "172.18.0.9" ], "name": "test-worker", "id": "b3859717435c463f999f01f0c5f6fd7b", "mac": [ "02-42-AC-12-00-09", "02-42-AC-15-00-03", "9A-DC-74-DB-AA-60" ], "architecture": "x86_64" }, "metricset": { "name": "remote_write" }, "prometheus": { "container_memory_cache": { "value": 57344 }, "container_memory_max_usage_bytes": { "value": 0 }, "container_memory_failcnt": { "value": 0 }, "container_memory_mapped_file": { "value": 0 }, "labels": { "image": "registry.k8s.io/pause:3.7", "instance": "test-control-plane", "pod": "etcd-test-control-plane", "kubernetes_io_arch": "amd64", "beta_kubernetes_io_os": "linux", "kubernetes_io_hostname": "test-control-plane", "name": "5187a1daefe573656f04ab113bcc784772045d234bb4de0bb8e1ad29b4c578d8", "namespace": "kube-system", "beta_kubernetes_io_arch": "amd64", "id": "/kubelet.slice/kubelet-kubepods.slice/kubelet-kubepods-burstable.slice/kubelet-kubepods-burstable-podb38d0166c295df08424057d4e86c6811.slice/cri-containerd-5187a1daefe573656f04ab113bcc784772045d234bb4de0bb8e1ad29b4c578d8.scope", "kubernetes_io_os": "linux", "job": "kubernetes-nodes-cadvisor" }, "labels_id": "+prrYyr9fsKNohYfFT37AeXard0=" }, "event": { "agent_id_status": "verified", "ingested": "2023-08-24T14:58:16Z", "module": "prometheus", "dataset": "prometheus.remote_write" } }, "fields": { "prometheus.container_memory_cache.value": [ 57344 ], "elastic_agent.version": [ "8.9.0" ], "host.os.name.text": [ "Ubuntu" ], "prometheus.labels.namespace": [ "kube-system" ], "host.hostname": [ "test-worker" ], "prometheus.container_memory_failcnt.value": [ 0 ], "host.mac": [ "02-42-AC-12-00-09", "02-42-AC-15-00-03", "9A-DC-74-DB-AA-60" ], "service.type": [ "prometheus" ], "host.os.version": [ "20.04.6 LTS (Focal Fossa)" ], "host.os.name": [ "Ubuntu" ], "agent.name": [ "test-worker" ], "host.name": [ "test-worker" ], "event.agent_id_status": [ "verified" ], "prometheus.labels.id": [ "/kubelet.slice/kubelet-kubepods.slice/kubelet-kubepods-burstable.slice/kubelet-kubepods-burstable-podb38d0166c295df08424057d4e86c6811.slice/cri-containerd-5187a1daefe573656f04ab113bcc784772045d234bb4de0bb8e1ad29b4c578d8.scope" ], "host.os.type": [ "linux" ], "data_stream.type": [ "metrics" ], "prometheus.labels_id": [ "+prrYyr9fsKNohYfFT37AeXard0=" ], "host.architecture": [ "x86_64" ], "agent.id": [ "f0909073-8c8a-4d85-9145-6b188223cd53" ], "ecs.version": [ "8.0.0" ], "host.containerized": [ false ], "agent.version": [ "8.9.0" ], "host.os.family": [ "debian" ], "prometheus.labels.kubernetes_io_arch": [ "amd64" ], "prometheus.container_memory_max_usage_bytes.value": [ 0 
], "prometheus.labels.beta_kubernetes_io_os": [ "linux" ], "host.ip": [ "10.244.2.1", "172.21.0.3", "fc00:f853:ccd:e793::3", "fe80::42:acff:fe15:3", "172.18.0.9" ], "prometheus.labels.name": [ "5187a1daefe573656f04ab113bcc784772045d234bb4de0bb8e1ad29b4c578d8" ], "agent.type": [ "metricbeat" ], "event.module": [ "prometheus" ], "prometheus.labels.kubernetes_io_hostname": [ "test-control-plane" ], "host.os.kernel": [ "5.15.49-linuxkit-pr" ], "prometheus.labels.beta_kubernetes_io_arch": [ "amd64" ], "elastic_agent.snapshot": [ false ], "prometheus.labels.image": [ "registry.k8s.io/pause:3.7" ], "host.id": [ "b3859717435c463f999f01f0c5f6fd7b" ], "elastic_agent.id": [ "f0909073-8c8a-4d85-9145-6b188223cd53" ], "data_stream.namespace": [ "default" ], "host.os.codename": [ "focal" ], "prometheus.labels.kubernetes_io_os": [ "linux" ], "metricset.name": [ "remote_write" ], "prometheus.labels.instance": [ "test-control-plane" ], "prometheus.labels.pod": [ "etcd-test-control-plane" ], "event.ingested": [ "2023-08-24T14:58:16.000Z" ], "@timestamp": [ "2023-08-24T14:58:06.491Z" ], "prometheus.container_memory_mapped_file.value": [ 0 ], "host.os.platform": [ "ubuntu" ], "data_stream.dataset": [ "prometheus.remote_write" ], "agent.ephemeral_id": [ "bfeba8e3-418d-4b6d-9664-6100ad624665" ], "prometheus.labels.job": [ "kubernetes-nodes-cadvisor" ], "event.dataset": [ "prometheus.remote_write" ] } } ```

Document sample 2:

``` { "_index": ".ds-metrics-prometheus.remote_write-default-2023.08.24-000001", "_id": "ogINKIoBmUHtrs2qn6XR", "_version": 1, "_score": 0, "_source": { "agent": { "name": "test-worker", "id": "f0909073-8c8a-4d85-9145-6b188223cd53", "type": "metricbeat", "ephemeral_id": "bfeba8e3-418d-4b6d-9664-6100ad624665", "version": "8.9.0" }, "@timestamp": "2023-08-24T14:58:06.491Z", "ecs": { "version": "8.0.0" }, "service": { "type": "prometheus" }, "data_stream": { "namespace": "default", "type": "metrics", "dataset": "prometheus.remote_write" }, "elastic_agent": { "id": "f0909073-8c8a-4d85-9145-6b188223cd53", "version": "8.9.0", "snapshot": false }, "host": { "hostname": "test-worker", "os": { "kernel": "5.15.49-linuxkit-pr", "codename": "focal", "name": "Ubuntu", "family": "debian", "type": "linux", "version": "20.04.6 LTS (Focal Fossa)", "platform": "ubuntu" }, "containerized": false, "ip": [ "10.244.2.1", "172.21.0.3", "fc00:f853:ccd:e793::3", "fe80::42:acff:fe15:3", "172.18.0.9" ], "name": "test-worker", "id": "b3859717435c463f999f01f0c5f6fd7b", "mac": [ "02-42-AC-12-00-09", "02-42-AC-15-00-03", "9A-DC-74-DB-AA-60" ], "architecture": "x86_64" }, "metricset": { "name": "remote_write" }, "prometheus": { "container_memory_swap": { "value": 0 }, "container_memory_rss": { "value": 45056 }, "container_memory_working_set_bytes": { "value": 249856 }, "container_memory_usage_bytes": { "value": 262144 }, "labels": { "image": "registry.k8s.io/pause:3.7", "instance": "test-control-plane", "pod": "etcd-test-control-plane", "kubernetes_io_arch": "amd64", "namespace": "kube-system", "kubernetes_io_hostname": "test-control-plane", "beta_kubernetes_io_os": "linux", "name": "5187a1daefe573656f04ab113bcc784772045d234bb4de0bb8e1ad29b4c578d8", "kubernetes_io_os": "linux", "id": "/kubelet.slice/kubelet-kubepods.slice/kubelet-kubepods-burstable.slice/kubelet-kubepods-burstable-podb38d0166c295df08424057d4e86c6811.slice/cri-containerd-5187a1daefe573656f04ab113bcc784772045d234bb4de0bb8e1ad29b4c578d8.scope", "beta_kubernetes_io_arch": "amd64", "job": "kubernetes-nodes-cadvisor" }, "labels_id": "+prrYyr9fsKNohYfFT37AeXard0=" }, "event": { "agent_id_status": "verified", "ingested": "2023-08-24T14:58:16Z", "module": "prometheus", "dataset": "prometheus.remote_write" } }, "fields": { "elastic_agent.version": [ "8.9.0" ], "host.os.name.text": [ "Ubuntu" ], "prometheus.labels.namespace": [ "kube-system" ], "prometheus.container_memory_swap.value": [ 0 ], "host.hostname": [ "test-worker" ], "host.mac": [ "02-42-AC-12-00-09", "02-42-AC-15-00-03", "9A-DC-74-DB-AA-60" ], "service.type": [ "prometheus" ], "host.os.version": [ "20.04.6 LTS (Focal Fossa)" ], "host.os.name": [ "Ubuntu" ], "agent.name": [ "test-worker" ], "host.name": [ "test-worker" ], "event.agent_id_status": [ "verified" ], "prometheus.labels.id": [ "/kubelet.slice/kubelet-kubepods.slice/kubelet-kubepods-burstable.slice/kubelet-kubepods-burstable-podb38d0166c295df08424057d4e86c6811.slice/cri-containerd-5187a1daefe573656f04ab113bcc784772045d234bb4de0bb8e1ad29b4c578d8.scope" ], "host.os.type": [ "linux" ], "data_stream.type": [ "metrics" ], "prometheus.labels_id": [ "+prrYyr9fsKNohYfFT37AeXard0=" ], "host.architecture": [ "x86_64" ], "agent.id": [ "f0909073-8c8a-4d85-9145-6b188223cd53" ], "ecs.version": [ "8.0.0" ], "host.containerized": [ false ], "prometheus.container_memory_rss.value": [ 45056 ], "agent.version": [ "8.9.0" ], "host.os.family": [ "debian" ], "prometheus.labels.kubernetes_io_arch": [ "amd64" ], "prometheus.container_memory_working_set_bytes.value": [ 
249856 ], "prometheus.labels.beta_kubernetes_io_os": [ "linux" ], "host.ip": [ "10.244.2.1", "172.21.0.3", "fc00:f853:ccd:e793::3", "fe80::42:acff:fe15:3", "172.18.0.9" ], "prometheus.labels.name": [ "5187a1daefe573656f04ab113bcc784772045d234bb4de0bb8e1ad29b4c578d8" ], "agent.type": [ "metricbeat" ], "event.module": [ "prometheus" ], "prometheus.labels.kubernetes_io_hostname": [ "test-control-plane" ], "host.os.kernel": [ "5.15.49-linuxkit-pr" ], "prometheus.labels.beta_kubernetes_io_arch": [ "amd64" ], "elastic_agent.snapshot": [ false ], "prometheus.labels.image": [ "registry.k8s.io/pause:3.7" ], "host.id": [ "b3859717435c463f999f01f0c5f6fd7b" ], "elastic_agent.id": [ "f0909073-8c8a-4d85-9145-6b188223cd53" ], "data_stream.namespace": [ "default" ], "host.os.codename": [ "focal" ], "prometheus.labels.kubernetes_io_os": [ "linux" ], "metricset.name": [ "remote_write" ], "prometheus.labels.instance": [ "test-control-plane" ], "prometheus.labels.pod": [ "etcd-test-control-plane" ], "event.ingested": [ "2023-08-24T14:58:16.000Z" ], "@timestamp": [ "2023-08-24T14:58:06.491Z" ], "host.os.platform": [ "ubuntu" ], "data_stream.dataset": [ "prometheus.remote_write" ], "agent.ephemeral_id": [ "bfeba8e3-418d-4b6d-9664-6100ad624665" ], "prometheus.labels.job": [ "kubernetes-nodes-cadvisor" ], "prometheus.container_memory_usage_bytes.value": [ 262144 ], "event.dataset": [ "prometheus.remote_write" ] } } ```
felixbarny commented 1 year ago

Note that for the same timestamp, Aug 24, 2023 @ 16:58:06.491, there are 5 documents with the same set of labels (prometheus.labels_id is a fingerprint of prometheus.labels); for some of them the ingestion time differs, but mostly even event.ingested is the same:

What's the reason for the seemingly duplicated metrics? Are they actual duplicates? They seem to come from the same pod ID at the same time; how does that happen? Or is a dimension missing? If so, which one? What would be the impact of dropping duplicate metrics of that sort?

felixbarny commented 1 year ago

Are the metrics coming from the same prometheus instance? If so, may the remote write configuration be faulty so that it writes multiple times, for example, because the same remote write endpoint is configured multiple times?

tetianakravchenko commented 1 year ago

Note that for the same timestamp, Aug 24, 2023 @ 16:58:06.491, there are 5 documents with the same set of labels (prometheus.labels_id is a fingerprint of prometheus.labels); for some of them the ingestion time differs, but mostly even event.ingested is the same:

What's the reason for the seemingly duplicated metrics?

I think it depends on max_samples_per_send (default: 500) and the fact that Prometheus does not send all metrics in one batch; Metricbeat, in turn, performs the grouping per batch. I've run this test with a Prometheus configuration that scrapes only the Prometheus server metrics:

scrape_configs:
      - job_name: prometheus
        static_configs:
          - targets:
            - localhost:9090

Total number of metrics - 630:

root@test-worker2:/usr/share/elastic-agent# curl -s prometheus-server-server.default:80/metrics | grep -v ^# | wc -l
630

When running the TSDB-migration-test-kit, I see that some documents overlap when using prometheus.labels_id as the main dimension that is supposed to distinguish documents (similar to the collector datastream).

if setting:

remoteWrite:
    - url: http://elastic-agent.kube-system:9201/write
      queue_config:
        max_samples_per_send: 1000 # higher than 630

labels_id can be used as a dimension - there are no overlaps.
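
For reference, a fingerprint field is marked as a time series dimension in the package field definitions roughly like this (a sketch assuming the package-spec `dimension` flag on a keyword field; the description text is illustrative):

```
- name: prometheus.labels_id
  type: keyword
  # a dimension becomes part of the _tsid, so documents that differ in this
  # fingerprint are treated as separate time series rather than duplicates
  dimension: true
  description: Fingerprint of the prometheus.labels object.
```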

Are they actual duplicates?

No, it is not the correct name for this issue; I've renamed the issue to Metrics are not grouped by labels.

They seem to come from the same pod id and the same time, how does that happen? Or is that missing a dimension? But which dimension is missing, if any?

The first question, I believe, is covered by the test above. Yes, we are missing a dimension in this case.

I am trying to investigate this approach:

  1. Try to extract the list of metric names and add it under prometheus.labels.
  2. Use fingerprint to calculate prometheus.labels_id so that it includes prometheus.labels.metric_names (see the processor sketch after this list):

     prometheus: {
         "name1": {
             "counter": <val>
         },
         "name2": {
             "value": <val>
         },
         "labels": {
             "metric_names": [
                 "name1",
                 "name2"
             ]
         },
         "labels_id": <fingerprint>
     }
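
A rough sketch of steps 1 and 2 as ingest processors (a Painless script is assumed for collecting the metric names; the field names are illustrative, not the final implementation):

```
processors:
  - script:
      lang: painless
      # hypothetical step: collect the metric field names under `prometheus`
      # (everything except labels/labels_id), sort them for a stable order,
      # and add them under prometheus.labels
      source: |
        def names = new ArrayList();
        for (entry in ctx.prometheus.entrySet()) {
          def key = entry.getKey();
          if (!key.equals('labels') && !key.equals('labels_id')) {
            names.add(key);
          }
        }
        Collections.sort(names);
        ctx.prometheus.labels.metric_names = names;
      ignore_failure: true
  - fingerprint:
      fields: ["prometheus.labels"]
      target_field: "prometheus.labels_id"
      ignore_failure: true
      ignore_missing: true
```

Sorting the collected names keeps the fingerprint stable regardless of the order in which metrics arrive within a batch.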

It might not be a perfect solution, as the fingerprint might change when the number of scraping endpoints/metrics changes.

Another option would be to update the already published document with the specific timestamp that has the same set of labels/labels_id fingerprint.

What would be the impact of dropping duplicate metrics of that sort?

It would imply data loss.

Are the metrics coming from the same prometheus instance? If so, may the remote write configuration be faulty so that it writes multiple times, for example, because the same remote write endpoint is configured multiple times?

Yes, data is coming from the same prometheus. Configuration:

prometheus.yml: |
    global:
      evaluation_interval: 1m
      scrape_interval: 1m
      scrape_timeout: 10s
    remote_write:
      url: http://elastic-agent.kube-system:9201/write
    rule_files:
    - /etc/config/recording_rules.yml
    - /etc/config/alerting_rules.yml
    - /etc/config/rules
    - /etc/config/alerts
    scrape_configs:
    - job_name: prometheus
      static_configs:
      - targets:
        - localhost:9090
felixbarny commented 1 year ago

I got it now, thanks for the explanations and the detailed analysis!

The underlying issue here is that metric names are not part of the _tsid. I thought this was mostly a non-issue as all metrics for the same _tsid are usually in the same document. You've found a good example where this isn't the case.

use fingerprint to calculate prometheus.labels_id that includes prometheus.labels.metric_names

I think this is a good short-term workaround. But maybe I'd slightly change the approach. Instead of adding metric_names to labels, this could be top-level and marked as a dimension instead of being part of the labels fingerprint. Ultimately, it doesn't matter too much.
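
One way to read that suggestion, as a sketch only (the prometheus.metric_names source field and the target field name are assumptions, not an agreed implementation):

```
processors:
  - fingerprint:
      # hypothetical: fingerprint a top-level list of metric names instead of
      # folding it into the labels object and its fingerprint
      fields: ["prometheus.metric_names"]
      target_field: "prometheus.metric_names_id"
      ignore_failure: true
      ignore_missing: true
```

prometheus.metric_names_id would then be marked as an additional dimension alongside prometheus.labels_id.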

The more interesting discussion is that I think TSDB should add metric names to the _tsid, because the name of a metric is part of its identity. See also the definition of a time series according to the OpenTelemetry metrics data model: https://opentelemetry.io/docs/specs/otel/metrics/data-model/#timeseries-model

cc @martijnvg

tetianakravchenko commented 1 year ago

Instead of adding metric_names to labels, this could be top-level and marked as a dimension instead of being part of the labels fingerprint.

In the end both fingerprints are needed - the labels fingerprint and the metric name fingerprint must be defined as dimensions. My motivation was

Could you please explain why it should be added on the top-level, instead of being part of the labels fingerprint?

I am also not planning to store metrics_names, as it is redundant information and can impact document size. I am planning to use something like https://github.com/elastic/integrations/pull/7565/files#diff-03b3cb0809132fbdf6119d02478854a135b678fb0e2db1d689cf6b44804daba1R2-R19 (not sure yet about 2 vs 1 fingerprints).

felixbarny commented 1 year ago

+1 on not storing metric_names, just the fingerprint.

Could you please explain why it should be added on the top-level, instead of being part of the labels fingerprint?

I don't have a strong opinion here and I don't think it matters too much. It was just my first instinct not to store the fingerprint under labels.* to avoid field suggestions for that fingerprint field when someone types in labels.*. Again, not something to worry about too much. We should consider these fields an implementation detail that we can always change without it being a breaking change.

tetianakravchenko commented 1 year ago

Question: if we have the same metric_names content, [<a>, <b>, <c>] vs [<c>, <b>, <a>], will the fingerprint be the same?

felixbarny commented 1 year ago

No, it won't. To ensure that, you'll need to sort the array first. The same applies to the labels fingerprint, by the way.
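
As an illustration, the array could be made order-independent with the ingest sort processor before fingerprinting (a sketch; the metric_names field follows the proposal above and is an assumption):

```
processors:
  - sort:
      # sort the metric names so the same set always yields the same fingerprint
      field: "prometheus.labels.metric_names"
      order: asc
      ignore_failure: true
  - fingerprint:
      fields: ["prometheus.labels"]
      target_field: "prometheus.labels_id"
      ignore_failure: true
      ignore_missing: true
```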

tetianakravchenko commented 1 year ago

No, it won't. To ensure that, you'll need to sort the array first. The same applies to the labels fingerprint, by the way.

Thank you for the reply! I am not sure that it applies to labels. If I understand the fingerprint implementation correctly (https://github.com/elastic/elasticsearch/blob/main/modules/ingest-common/src/main/java/org/elasticsearch/ingest/common/FingerprintProcessor.java#L110-L122), the map should be sorted and processed in a consistent order.

The labels object looks like this:

prometheus: {
    "name1": {
        "counter": <val>
    },
    "name2": {
        "value": <val>
    },
    "labels": {
        "metric_names": [
            "name1",
            "name2"
        ],
        "key1": "value1"
    },
    "labels_fingerprint": <fingerprint>
}
felixbarny commented 1 year ago

Ah, I didn't know that the fingerprint processor is already sorting the values! Looks good then.