dfsdevops closed this issue 3 months ago.
Hey there, @dfsdevops! I'm here to assist you with any questions or concerns you have. Let's work together to resolve this issue.
To ensure that Kubernetes attributes such as k8s.node.name are associated with the metrics collected by the hostmetrics receiver, you need to make sure that the k8sattributes processor is correctly configured and included in the metrics pipeline.
Here are a few things to check and adjust in your configuration:
1. Ensure the k8sattributes processor is configured correctly: it must be enabled and configured to extract the necessary attributes (see the fragment below).
2. Include the k8sattributes processor in the metrics pipeline: it has to be part of the pipeline so that it can process the metrics collected by the hostmetrics receiver.
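In isolation, the two pieces that checklist refers to are the processor definition and its entry in the metrics pipeline. A minimal fragment of the chart's config section (matching the full values file below) looks roughly like this:
config:
  processors:
    k8sattributes:
      passthrough: false
      extract:
        metadata:
          - k8s.node.name
  service:
    pipelines:
      metrics:
        receivers: [hostmetrics, kubeletstats]
        processors: [memory_limiter, batch, filter/ottl, k8sattributes]
        exporters: [otlp/k8s-metrics]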
Here is an updated version of your configuration with these adjustments:
mode: daemonset
fullnameOverride: otel-collector-agent
image:
repository: "otel/opentelemetry-collector-k8s"
# Required to use the kubeletstats cpu/memory utilization metrics
clusterRole:
create: true
rules:
- apiGroups:
- ""
resources:
- nodes/proxy
verbs:
- get
extraEnvs:
- name: HONEYCOMB_API_KEY
valueFrom:
secretKeyRef:
name: honeycomb
key: api-key
presets:
hostMetrics:
enabled: true
kubernetesAttributes:
enabled: true
extractAllPodLabels: true
extractAllPodAnnotations: true
kubeletMetrics:
enabled: true
logsCollection:
enabled: true
config:
receivers:
jaeger: null
zipkin: null
hostmetrics:
collection_interval: 30s
scrapers:
load:
cpu_average: true
kubeletstats:
collection_interval: 30s
metric_groups:
- pod
- node
- volume
extra_metadata_labels:
- k8s.volume.type
metrics:
k8s.node.uptime:
enabled: true
k8s.pod.uptime:
enabled: true
k8s.pod.cpu_limit_utilization:
enabled: true
k8s.pod.cpu_request_utilization:
enabled: true
k8s.pod.memory_limit_utilization:
enabled: true
k8s.pod.memory_request_utilization:
enabled: true
prometheus:
config:
scrape_configs:
- job_name: opentelemetry-collector
scrape_interval: 300s
static_configs:
- targets:
- ${env:MY_POD_IP}:8888
exporters:
debug:
verbosity: detailed
sampling_initial: 5
sampling_thereafter: 200
otlp:
endpoint: "api.honeycomb.io:443"
headers:
"x-honeycomb-team": "${env:HONEYCOMB_API_KEY}"
otlp/k8s-metrics:
endpoint: "api.honeycomb.io:443"
headers:
"x-honeycomb-team": "${env:HONEYCOMB_API_KEY}"
"x-honeycomb-dataset": "k8s-metrics"
otlp/k8s-logs:
endpoint: "api.honeycomb.io:443"
headers:
"x-honeycomb-team": "${env:HONEYCOMB_API_KEY}"
"x-honeycomb-dataset": "k8s-logs"
processors:
filter/ottl:
error_mode: ignore
metrics:
datapoint:
- 'resource.attributes["k8s.volume.type"] != nil and resource.attributes["k8s.volume.type"] != "persistentVolumeClaim"'
k8sattributes:
passthrough: false
extract:
metadata:
- k8s.node.name
service:
pipelines:
traces:
receivers: [otlp]
exporters: [otlp]
metrics:
receivers: [hostmetrics, kubeletstats]
processors: [memory_limiter, batch, filter/ottl, k8sattributes]
exporters: [otlp/k8s-metrics]
logs:
receivers: [otlp]
processors: [batch]
exporters: [otlp/k8s-logs]
ports:
jaeger-compact:
enabled: false
jaeger-thrift:
enabled: false
jaeger-grpc:
enabled: false
zipkin:
enabled: false
This configuration ensures that the k8sattributes processor is included in the metrics pipeline and is set to extract the k8s.node.name attribute. This should help in associating the node name with the metrics collected by the hostmetrics receiver.
@dosu that's a good thought, but the chart already generates a configuration that includes the k8sattributes processor in the metrics pipeline:
# Source: opentelemetry-collector/templates/configmap-agent.yaml
apiVersion: v1
kind: ConfigMap
metadata:
name: otel-collector-agent-agent
namespace: otel
labels:
helm.sh/chart: opentelemetry-collector-0.102.0
app.kubernetes.io/name: opentelemetry-collector
app.kubernetes.io/instance: release-name
app.kubernetes.io/version: "0.107.0"
app.kubernetes.io/managed-by: Helm
data:
relay: |
exporters:
debug:
sampling_initial: 5
sampling_thereafter: 200
verbosity: detailed
otlp:
endpoint: api.honeycomb.io:443
headers:
x-honeycomb-team: ${env:HONEYCOMB_API_KEY}
otlp/k8s-logs:
endpoint: api.honeycomb.io:443
headers:
x-honeycomb-dataset: k8s-logs
x-honeycomb-team: ${env:HONEYCOMB_API_KEY}
otlp/k8s-metrics:
endpoint: api.honeycomb.io:443
headers:
x-honeycomb-dataset: k8s-metrics
x-honeycomb-team: ${env:HONEYCOMB_API_KEY}
extensions:
health_check:
endpoint: ${env:MY_POD_IP}:13133
processors:
batch: {}
filter/ottl:
error_mode: ignore
metrics:
datapoint:
- resource.attributes["k8s.volume.type"] != nil and resource.attributes["k8s.volume.type"]
!= "persistentVolumeClaim"
k8sattributes:
extract:
annotations:
- from: pod
key_regex: (.*)
tag_name: $$1
labels:
- from: pod
key_regex: (.*)
tag_name: $$1
metadata:
- k8s.namespace.name
- k8s.deployment.name
- k8s.statefulset.name
- k8s.daemonset.name
- k8s.cronjob.name
- k8s.job.name
- k8s.node.name
- k8s.pod.name
- k8s.pod.uid
- k8s.pod.start_time
filter:
node_from_env_var: K8S_NODE_NAME
passthrough: false
pod_association:
- sources:
- from: resource_attribute
name: k8s.pod.ip
- sources:
- from: resource_attribute
name: k8s.pod.uid
- sources:
- from: connection
memory_limiter:
check_interval: 5s
limit_percentage: 80
spike_limit_percentage: 25
receivers:
filelog:
exclude:
- /var/log/pods/otel_otel-collector-agent*_*/opentelemetry-collector/*.log
include:
- /var/log/pods/*/*/*.log
include_file_name: false
include_file_path: true
operators:
- id: container-parser
max_log_size: 102400
type: container
retry_on_failure:
enabled: true
start_at: end
hostmetrics:
collection_interval: 30s
root_path: /hostfs
scrapers:
cpu: null
disk: null
filesystem:
exclude_fs_types:
fs_types:
- autofs
- binfmt_misc
- bpf
- cgroup2
- configfs
- debugfs
- devpts
- devtmpfs
- fusectl
- hugetlbfs
- iso9660
- mqueue
- nsfs
- overlay
- proc
- procfs
- pstore
- rpc_pipefs
- securityfs
- selinuxfs
- squashfs
- sysfs
- tracefs
match_type: strict
exclude_mount_points:
match_type: regexp
mount_points:
- /dev/*
- /proc/*
- /sys/*
- /run/k3s/containerd/*
- /var/lib/docker/*
- /var/lib/kubelet/*
- /snap/*
load:
cpu_average: true
memory: null
network: null
kubeletstats:
auth_type: serviceAccount
collection_interval: 30s
endpoint: ${env:K8S_NODE_NAME}:10250
extra_metadata_labels:
- k8s.volume.type
metric_groups:
- pod
- node
- volume
metrics:
k8s.node.uptime:
enabled: true
k8s.pod.cpu_limit_utilization:
enabled: true
k8s.pod.cpu_request_utilization:
enabled: true
k8s.pod.memory_limit_utilization:
enabled: true
k8s.pod.memory_request_utilization:
enabled: true
k8s.pod.uptime:
enabled: true
otlp:
protocols:
grpc:
endpoint: ${env:MY_POD_IP}:4317
http:
endpoint: ${env:MY_POD_IP}:4318
prometheus:
config:
scrape_configs:
- job_name: opentelemetry-collector
scrape_interval: 300s
static_configs:
- targets:
- ${env:MY_POD_IP}:8888
service:
extensions:
- health_check
pipelines:
logs:
exporters:
- otlp/k8s-logs
processors:
- k8sattributes
- memory_limiter
- batch
receivers:
- otlp
- filelog
metrics:
exporters:
- otlp/k8s-metrics
processors:
- memory_limiter
- batch
- filter/ottl
- k8sattributes
receivers:
- otlp
- prometheus
- hostmetrics
- kubeletstats
traces:
exporters:
- otlp
processors:
- k8sattributes
- memory_limiter
- batch
receivers:
- otlp
telemetry:
metrics:
address: ${env:MY_POD_IP}:8888
To ensure that Kubernetes attributes such as k8s.node.name are associated with the metrics collected by the hostmetrics receiver, you need to verify that the environment variables are correctly set in your deployment configuration. Specifically, the K8S_NODE_NAME environment variable should be set to provide the necessary context for the k8sattributes processor.
Here is an example of how to set the environment variables in your deployment configuration:
extraEnvs:
- name: HONEYCOMB_API_KEY
valueFrom:
secretKeyRef:
name: honeycomb
key: api-key
- name: K8S_NODE_NAME
valueFrom:
fieldRef:
fieldPath: spec.nodeName
- name: K8S_NODE_IP
valueFrom:
fieldRef:
fieldPath: status.hostIP
- name: K8S_NAMESPACE
valueFrom:
fieldRef:
fieldPath: metadata.namespace
- name: K8S_POD_NAME
valueFrom:
fieldRef:
fieldPath: metadata.name
- name: K8S_POD_IP
valueFrom:
fieldRef:
fieldPath: status.podIP
- name: OTEL_RESOURCE_ATTRIBUTES
value: "k8s.cluster.name=demo"
Ensure that these environment variables are included in your deployment configuration to provide the necessary context for the k8sattributes processor. This should help in associating the node name and other Kubernetes attributes with the metrics collected by the hostmetrics receiver [1][2][3][4][5].
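For reference, the generated config pasted above consumes that variable in the k8sattributes processor's filter block, which restricts the processor to pods running on the local node; roughly:
processors:
  k8sattributes:
    filter:
      node_from_env_var: K8S_NODE_NAME   # populated from spec.nodeName via the Downward API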
@dfsdevops the k8sattributes processor works by associating incoming telemetry, specifically the Resource or the request context IP, with the Kubernetes pod metadata it is remembering.
In the case of hostmetrics telemetry there isn't anything in the data to associate it with the Kubernetes metadata. It has been a while since I ran the hostmetrics receiver on k8s (kubeletstats has always covered my needs), but I don't believe it is able to get per-pod metrics - all the metrics it collects are for the entire node.
Is there a specific attribute you'd like to set for this data? Maybe the resourcedetection processor would better suit your needs if you're trying to add stuff like k8s.node.name.
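A rough sketch of that resourcedetection approach, using the env detector (which reads OTEL_RESOURCE_ATTRIBUTES); the pipeline wiring shown is illustrative rather than the chart's defaults:
extraEnvs:
  # K8S_NODE_NAME may already be injected by the chart (the kubeletstats preset references it);
  # declaring it here guarantees it is defined before it is referenced below
  - name: K8S_NODE_NAME
    valueFrom:
      fieldRef:
        fieldPath: spec.nodeName
  - name: OTEL_RESOURCE_ATTRIBUTES
    value: "k8s.node.name=$(K8S_NODE_NAME)"
config:
  processors:
    resourcedetection:
      detectors: [env]   # the env detector copies OTEL_RESOURCE_ATTRIBUTES onto the resource
  service:
    pipelines:
      metrics:
        receivers: [hostmetrics]           # illustrative wiring; adapt to your own pipelines
        processors: [resourcedetection, batch]
        exporters: [otlp]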
@TylerHelmuth thank you for clarifying, this could be a misunderstanding on my part. I thought it was able to associate a bare minimum of metadata, the same way kubeletstats does on the daemonset instances. k8s.node.name is the most important one for me: nearly everything else in my metrics has it, so being able to cross-reference would be useful. Right now there is nothing in a hostmetrics event that distinguishes one node from another, and I'm surprised there isn't at least something like a hostname attribute baked into the hostmetrics preset. I will try messing around with resourcedetection or some other way to manually associate the node name with the data (maybe via the env var?).
Here's an example of what one of the metrics looks like as an event in Honeycomb currently: only four columns, with no way for me to tie it back to a specific node. Ideally everything in this metric would have a common attribute I could use to cross-reference, and I'd merge everything I possibly can into a single Honeycomb event, since high cardinality is encouraged by the platform. I know that's more of a Honeycomb-specific thing, but I thought it might help provide context.
I found a configuration I am happy with.
Explanation: first, I removed the hostMetrics preset because I don't want everything it sends over and it wasn't clear how to disable individual scrapers from within the preset; for now I'm only enabling load.
I created a separate pipeline that adds the k8s.node.name attribute via an env var source in a resourcedetection processor, then used the forward connector to send it into the main metrics pipeline so it's otherwise unified with those processors/exporters.
mode: daemonset
fullnameOverride: otel-collector-agent
image:
repository: "otel/opentelemetry-collector-k8s"
# Required to use the kubeletstats cpu/memory utilization metrics
clusterRole:
create: true
rules:
- apiGroups:
- ""
resources:
- nodes/proxy
verbs:
- get
extraEnvs:
- name: HONEYCOMB_API_KEY
valueFrom:
secretKeyRef:
name: honeycomb
key: api-key
- name: OTEL_RESOURCE_ATTRIBUTES
value: "k8s.node.name=$(K8S_NODE_NAME)"
extraVolumes:
- name: hostfs
hostPath:
path: /
extraVolumeMounts:
- name: hostfs
mountPath: /hostfs
readOnly: true
mountPropagation: HostToContainer
presets:
# enables the k8sattributesprocessor and adds it to the traces, metrics, and logs pipelines
kubernetesAttributes:
enabled: true
extractAllPodLabels: true
extractAllPodAnnotations: true
# enables the kubeletstatsreceiver and adds it to the metrics pipelines
kubeletMetrics:
enabled: true
logsCollection:
enabled: true
config:
connectors:
forward:
receivers:
jaeger: null
zipkin: null
hostmetrics:
collection_interval: 30s
root_path: /hostfs
scrapers:
load:
cpu_average: true # divide by number of cores, better for generalized figures to alert on
kubeletstats:
collection_interval: 30s
metric_groups:
- pod
- node
- volume
extra_metadata_labels:
- k8s.volume.type
metrics:
k8s.node.uptime:
enabled: true
k8s.pod.uptime:
enabled: true
k8s.pod.cpu_limit_utilization:
enabled: true
k8s.pod.cpu_request_utilization:
enabled: true
k8s.pod.memory_limit_utilization:
enabled: true
k8s.pod.memory_request_utilization:
enabled: true
prometheus:
config:
scrape_configs:
- job_name: opentelemetry-collector # self metrics
scrape_interval: 300s
static_configs:
- targets:
- ${env:MY_POD_IP}:8888
exporters:
debug:
verbosity: detailed
sampling_initial: 5
sampling_thereafter: 200
otlp:
endpoint: "api.honeycomb.io:443"
headers:
"x-honeycomb-team": "${env:HONEYCOMB_API_KEY}"
otlp/k8s-metrics:
endpoint: "api.honeycomb.io:443"
headers:
"x-honeycomb-team": "${env:HONEYCOMB_API_KEY}"
"x-honeycomb-dataset": "k8s-metrics"
otlp/k8s-logs:
endpoint: "api.honeycomb.io:443"
headers:
"x-honeycomb-team": "${env:HONEYCOMB_API_KEY}"
"x-honeycomb-dataset": "k8s-logs"
processors:
resourcedetection:
detectors:
- env
filter/ottl:
error_mode: ignore
metrics:
datapoint:
- 'resource.attributes["k8s.volume.type"] != nil and resource.attributes["k8s.volume.type"] != "persistentVolumeClaim"'
service:
# telemetry:
# logs:
# level: debug
pipelines:
traces:
receivers: [otlp]
exporters: [otlp]
metrics/hostmetrics:
receivers: [hostmetrics]
processors:
- resourcedetection
exporters: [forward]
metrics:
receivers:
- otlp
- prometheus
- forward
- kubeletstats
exporters:
- otlp/k8s-metrics
# uncomment the following line to enable debug logging for metrics
# - debug
processors:
- memory_limiter
- batch
- filter/ottl
- k8sattributes
logs:
exporters: [otlp/k8s-logs]
ports:
jaeger-compact:
enabled: false
jaeger-thrift:
enabled: false
jaeger-grpc:
enabled: false
zipkin:
enabled: false
After adding the hostmetrics receiver, I do not see any attributes that associate the new metrics with a particular node. I verified this by visualizing the events in Honeycomb. Is there a configuration that allows associating the node name with hostmetrics? My use case here is to cover some stats that are not covered by kubeletstats (CPU load, etc.).
My values file is as follows: