Closed rogercoll closed 1 month ago
My initial thought on this: EIMP is not meant to be used with the exporter in OTel mode. The whole point of this processor is to be used together with the exporter's ecs mode to do the required translations.
I will go through the rest of the issue in detail.
@rogercoll I've tested the use of elasticinframetrics with mode: otel:
exporters:
  elasticsearch:
    mode: otel
processors:
  elasticinframetrics:
    add_system_metrics: true
    add_k8s_metrics: true
pipelines:
  metrics:
    receivers: [kubectl, hostmetrics]
    processors: [elasticinframetrics]
    exporters: [elasticsearch]
(The hostmetrics part is also included, since system metrics are needed for the Inventory UI as well.)
Outcome:
The datastream names have *.otel at the end
-> The Inventory UI does not work because of that: it expects datastream names without .otel at the end.
As an example, I checked the kubernetes.pod.otel datastream: it contains only the transformed metrics, but with the metrics.* prefix.
The collector logs contain lots of "failed to index document" errors for the otel k8s (and system) metrics:
2024-09-19T15:23:39.791Z error elasticsearchexporter@v0.108.0/bulkindexer.go:332 failed to index document {"kind": "exporter", "data_type": "metrics", "name": "elasticsearch", "index": "metrics-generic.otel-default", "error.type": "document_parsing_exception", "error.reason": "[1:164] Can't find dynamic template for dynamic template name [gauge_long] of field [metrics.k8s.container.restarts]"}
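The naming mismatch described above can be sketched in a few lines of Python. This is only an illustration of the names observed in this thread (metrics-generic.otel-default, kubernetes.pod.otel), not the exporter's actual routing code; data_stream_name and inventory_can_read are hypothetical helpers.

```python
# Illustration only: reproduces the datastream names seen in this thread.
# Neither function exists in the elasticsearch exporter; both are hypothetical.

def data_stream_name(ds_type, dataset, namespace, mode):
    """Build '<type>-<dataset>-<namespace>'; in otel mode the dataset gets a '.otel' suffix."""
    if mode == "otel":
        dataset = dataset + ".otel"
    return "-".join([ds_type, dataset, namespace])


def inventory_can_read(existing_streams, dataset, namespace="default"):
    """Assumption: the Inventory UI looks for the datastream name without the '.otel' suffix."""
    return data_stream_name("metrics", dataset, namespace, "ecs") in existing_streams


# With mode: otel, the remapped pod metrics end up in a '.otel' datastream ...
written = {data_stream_name("metrics", "kubernetes.pod", "default", "otel")}
# ... so a lookup for the plain 'kubernetes.pod' datastream finds nothing.
print(written, inventory_can_read(written, "kubernetes.pod"))
```

With mode: ecs the same helper yields metrics-kubernetes.pod-default, which is the name the lookup expects.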
Tested with config:
processors:
  elasticinframetrics:
    add_system_metrics: true
    add_k8s_metrics: true
exporters:
  elasticsearch/otel:
    mapping:
      mode: otel
  elasticsearch/ecs:
    mapping:
      mode: ecs
pipelines:
  metrics/ecs:
    receivers: [kubectl, hostmetrics]
    processors: [elasticinframetrics]
    exporters: [elasticsearch/ecs]
  metrics/otel:
    receivers: [kubectl, hostmetrics]
    processors: []
    exporters: [elasticsearch/otel]
Note: we need to split pipelines only for the daemonset; for the deployment we can use elasticsearch/otel only - https://github.com/rogercoll/opentelemetry/compare/add_onboarding_operator_values...tetianakravchenko:opentelemetry:split-otel-and-ecs-mode?expand=1
Outcome:
The assets for the system and k8s integrations need to be installed; this should be included in the onboarding process.
The Inventory page relies on metrics stored in 9 datastreams: kubernetes.pod, system.process, system.network, system.filesystem, system.diskio, system.cpu, system.load, system.memory, system.process.summary
With that in place, the Inventory page works.
Looking closer at kubernetes.pod:
-> it includes only kubernetes.* metrics and the relevant metadata (like kubernetes.pod.name). Doc sample:
{
"_index": ".ds-metrics-kubernetes.pod-default-2024.09.24-000001",
"_id": "5S-RrHwEkml4PxvZAAABkiOoBmQ",
"_version": 1,
"_score": 0,
"_source": {
"@timestamp": "2024-09-24T10:51:07.236Z",
"data_stream": {
"dataset": "kubernetes.pod",
"namespace": "default",
"type": "metrics"
},
"event": {
"agent_id_status": "missing",
"dataset": "kubernetes.pod",
"ingested": "2024-09-24T10:51:15Z"
},
"host": {
"architecture": "amd64",
"cpu": {
"cache": {
"l2": {
"size": 16384
}
},
"family": "6",
"model": {
"id": "158",
"name": "Intel(R) Core(TM) i9-9980HK CPU @ 2.40GHz"
},
"stepping": "13",
"vendor": {
"id": "GenuineIntel"
}
},
"hostname": "kind-control-plane",
"ip": [
"10.244.0.1",
"172.18.0.4",
"172.21.0.2",
"fc00:f853:ccd:e793::2",
"fe80::42:acff:fe15:2"
],
"mac": [
"02-42-AC-12-00-04",
"02-42-AC-15-00-02",
"02-C2-11-D3-4E-B2",
"0A-60-45-60-6D-9C",
"6A-D0-26-B4-D0-DC",
"D2-10-9B-E6-3C-08",
"D6-10-C1-90-B9-53",
"D6-69-21-E4-EF-11"
],
"name": "kind-control-plane",
"os": {
"full": "Ubuntu 20.04.6 LTS (Focal Fossa) (Linux kind-control-plane 6.6.12-linuxkit #1 SMP PREEMPT_DYNAMIC Fri Jan 19 12:50:23 UTC 2024 x86_64)",
"platform": "linux"
}
},
"kubernetes": {
"namespace": "kube-system",
"pod": {
"cpu": {
"usage": {
"limit": {
"pct": 0
},
"node": {
"pct": 0.002
}
}
},
"memory": {
"usage": {
"limit": {
"pct": 0
},
"node": {
"pct": 0.013
}
}
},
"name": "etcd-kind-control-plane",
"network": {
"rx": {
"bytes": 991060529
},
"tx": {
"bytes": 23413379
}
},
"uid": "2772f6e21146f2e8a331b1cc7d319cf1"
}
},
"otel_remapped": true,
"service": {
"type": "kubernetes"
}
},
"fields": {
"host.os.full.text": [
"Ubuntu 20.04.6 LTS (Focal Fossa) (Linux kind-control-plane 6.6.12-linuxkit #1 SMP PREEMPT_DYNAMIC Fri Jan 19 12:50:23 UTC 2024 x86_64)"
],
"host.os.full": [
"Ubuntu 20.04.6 LTS (Focal Fossa) (Linux kind-control-plane 6.6.12-linuxkit #1 SMP PREEMPT_DYNAMIC Fri Jan 19 12:50:23 UTC 2024 x86_64)"
],
"host.cpu.family": [
"6"
],
"kubernetes.pod.cpu.usage.limit.pct": [
0
],
"host.hostname": [
"kind-control-plane"
],
"kubernetes.pod.uid": [
"2772f6e21146f2e8a331b1cc7d319cf1"
],
"host.cpu.model.name": [
"Intel(R) Core(TM) i9-9980HK CPU @ 2.40GHz"
],
"host.mac": [
"02-42-AC-12-00-04",
"02-42-AC-15-00-02",
"02-C2-11-D3-4E-B2",
"0A-60-45-60-6D-9C",
"6A-D0-26-B4-D0-DC",
"D2-10-9B-E6-3C-08",
"D6-10-C1-90-B9-53",
"D6-69-21-E4-EF-11"
],
"service.type": [
"kubernetes"
],
"host.ip": [
"10.244.0.1",
"172.18.0.4",
"172.21.0.2",
"fc00:f853:ccd:e793::2",
"fe80::42:acff:fe15:2"
],
"host.cpu.model.name.text": [
"Intel(R) Core(TM) i9-9980HK CPU @ 2.40GHz"
],
"kubernetes.namespace": [
"kube-system"
],
"kubernetes.pod.network.rx.bytes": [
991060529
],
"kubernetes.pod.network.tx.bytes": [
23413379
],
"kubernetes.pod.name": [
"etcd-kind-control-plane"
],
"host.name": [
"kind-control-plane"
],
"event.agent_id_status": [
"missing"
],
"host.cpu.model.id": [
"158"
],
"host.cpu.cache.l2.size": [
16384
],
"data_stream.namespace": [
"default"
],
"host.cpu.stepping": [
"13"
],
"kubernetes.pod.memory.usage.node.pct": [
0.013
],
"otel_remapped": [
true
],
"data_stream.type": [
"metrics"
],
"host.cpu.vendor.id": [
"GenuineIntel"
],
"host.architecture": [
"amd64"
],
"kubernetes.pod.cpu.usage.node.pct": [
0.002
],
"event.ingested": [
"2024-09-24T10:51:15.000Z"
],
"@timestamp": [
"2024-09-24T10:51:07.236Z"
],
"host.os.platform": [
"linux"
],
"data_stream.dataset": [
"kubernetes.pod"
],
"event.dataset": [
"kubernetes.pod"
],
"kubernetes.pod.memory.usage.limit.pct": [
0
]
}
}
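To check a document like the sample above without reading it field by field, a small Python sketch can flatten `_source` into dotted field names and look for any remaining OTel-style names. The flatten helper and the trimmed excerpt are illustrative; unlike the "fields" section of the search response, this sketch does not wrap values in arrays.

```python
def flatten(doc, prefix=""):
    """Flatten nested dicts into dotted keys, e.g. {'a': {'b': 1}} -> {'a.b': 1}."""
    flat = {}
    for key, value in doc.items():
        path = prefix + "." + key if prefix else key
        if isinstance(value, dict):
            flat.update(flatten(value, path))
        else:
            flat[path] = value
    return flat


# Trimmed excerpt of the "_source" from the sample document above.
source = {
    "data_stream": {"dataset": "kubernetes.pod", "namespace": "default", "type": "metrics"},
    "kubernetes": {
        "namespace": "kube-system",
        "pod": {
            "cpu": {"usage": {"limit": {"pct": 0}, "node": {"pct": 0.002}}},
            "name": "etcd-kind-control-plane",
        },
    },
    "otel_remapped": True,
}

fields = flatten(source)
# Field names that would indicate untransformed OTel metrics in this document.
otel_style = [name for name in fields if name.startswith(("k8s.", "metrics."))]
print(sorted(fields))
print(otel_style)
```

For this excerpt the otel_style list is empty, which matches the observation that the kubernetes.pod datastream holds only kubernetes.* metrics plus metadata.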
Metrics coming from mode: ecs are stored in the generic datastream
-> it includes only k8s.* metrics (not metrics.k8s.*) and transformed metadata.
Metrics coming from mode: otel are stored in the generic.otel datastream. The data in generic and generic.otel does not overlap.
cc @AlexanderWert
Just to double down on the previous comment: I was testing in code today, and I see that the remapper just creates another document with the transformed kubernetes.* metrics and nothing else (see the test here, where k8s.pod.test won't be available in the final document).
The generic datastream can be removed/dropped! It contains the kubelet-related metrics that come from the mode: ecs pipeline. The same copy of the metrics is present in generic.otel.
Note: I think the only available option to implement dropping the rest of the metrics is not to return the mb object here. The remapping would still have taken place in the lines above.
we need to split pipelines only for daemonset
My main concern is metrics duplication: at the moment, if we configure kubeletstats + elasticinframetrics, we end up with the same metrics under different names (no matter the elasticsearch exporter mode): k8s.* and kubernetes.*.
These are the metrics that will be ingested with the following configuration:
pipelines:
  metrics/ecs:
    receivers: [kubectl, hostmetrics]
    processors: [elasticinframetrics]  # ---> `k8s.*`, `kubernetes.*`, `system.*` and `system in ecs format`
    exporters: [elasticsearch/ecs]
  metrics/otel:
    receivers: [kubectl, hostmetrics]
    processors: []  # ---> `k8s.*`, `system.*`
    exporters: [elasticsearch/otel]
Note that the k8s.* and the system.* metrics will be duplicated but exported with different modes. @tetianakravchenko @gizas is this the expected behavior? Which metrics do we need for the inventory?
If we only need ECS metrics, I think the elasticinframetrics should drop the otel metrics and just produce the ECS ones:
pipelines:
  metrics/ecs:
    receivers: [kubectl, hostmetrics]
    processors: [elasticinframetrics]  # ---> `kubernetes.*` and `system in ecs format`
    exporters: [elasticsearch/ecs]
  metrics/otel:
    receivers: [kubectl, hostmetrics]
    processors: []  # ---> `k8s.*`, `system.*`
    exporters: [elasticsearch/otel]
@rogercoll Just one minor correction: since we don't have any OTel-native system assets yet, we don't need to include the hostmetrics receiver in the metrics/otel pipeline, right?
Note that the k8s.* and the system.* metrics will be duplicated but exported with different modes. @tetianakravchenko @gizas is this the expected behavior? Which metrics do we need for the inventory?
The elasticinframetrics processor does a remapping and creates kubernetes.pod, system.process, system.network, system.filesystem, system.diskio, system.cpu, system.load, system.memory, system.process.summary. Those are additional datastreams that the inventory relies on. Tania explains this here.
If we only need ECS metrics, I think the elasticinframetrics should drop the otel metrics and just produce the ECS ones:
To be more precise on this: the processor also keeps the k8s.* metrics and adds the new ones, so it should stop adding the k8s.* metrics. I am trying to build my image locally to test the "dropping" (as per the note here: https://github.com/elastic/opentelemetry-lib/issues/97#issuecomment-2371193078).
Is the main focus of this PR to drop the OTel-native metrics with an override?
Is the main focus of this PR to drop the OTel-native metrics with an override?
yes
The current remappers do not override the processed metrics; they insert new ones. As a result, we end up with duplicated metric values under different names, for example k8s.pod.cpu_limit_utilization vs kubernetes.pod.cpu.usage.limit.pct.
This hasn’t been an issue so far because our primary focus has been on the ecs format, with metrics being sent by the Elasticsearch exporter configured in ecs mode. However, as we begin transitioning to the native OTel mode, we now face the challenge of having to support both metrics formats in Kibana:
The problem with not overriding the current metrics is that both metrics formats will be forwarded to the same elasticsearch exporter and formatted according to its configuration: https://github.com/open-telemetry/opentelemetry-collector-contrib/tree/main/exporter/elasticsearchexporter#elasticsearch-document-mapping
Let's take this configuration:
As the elasticsearch exporter is configured with ecs mode, all metrics (native and added ones) processed by the EIMP processor will be formatted in that mode.
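To make the insert-versus-override distinction concrete, here is a minimal Python sketch. It is not the opentelemetry-lib remapper code: remap and its override flag are hypothetical names, the mapping table contains only the one name pair mentioned in this thread, and the value is passed through unchanged purely for illustration.

```python
# Hypothetical mapping table; only this pair is taken from the thread.
OTEL_TO_ECS = {
    "k8s.pod.cpu_limit_utilization": "kubernetes.pod.cpu.usage.limit.pct",
}


def remap(metrics, override=False):
    """Sketch of a remapper over a {metric name: value} dict.

    override=False: keep the incoming OTel metrics and add the ECS-named copies
                    (the inserting behavior described in this thread).
    override=True:  emit only the ECS-named metrics and drop the OTel originals
                    (the behavior being asked for; a hypothetical option).
    """
    remapped = {
        OTEL_TO_ECS[name]: value
        for name, value in metrics.items()
        if name in OTEL_TO_ECS
    }
    if override:
        return remapped
    merged = dict(metrics)
    merged.update(remapped)
    return merged


incoming = {"k8s.pod.cpu_limit_utilization": 0.25}
print(remap(incoming))                  # both names carry the same value
print(remap(incoming, override=True))   # only the ECS name remains
```

With override=False, a single exporter in ecs mode receives both names, which is the duplication described above; with override=True, the metrics/ecs pipeline would carry only the kubernetes.* names.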
It would be great to have an option in the remappers, so processors override the metrics instead of inserting them. This is the pipeline we have in mind: