open-telemetry / opentelemetry-collector-contrib

Contrib repository for the OpenTelemetry Collector
https://opentelemetry.io
Apache License 2.0

k8sattributes processor evaluating pod identifier result empty field #29630

Open vnStrawHat opened 11 months ago

vnStrawHat commented 11 months ago

Component(s)

processor/k8sattributes

Describe the issue you're reporting

Hello everyone,

I'm trying to use the k8sattributes processor on a Rancher RKE cluster, but the debug log shows all fields empty:

2023-12-04T08:06:52.407Z    debug   k8sattributesprocessor@v0.89.0/processor.go:102 evaluating pod identifier   {"kind": "processor", "name": "k8sattributes", "pipeline": "metrics", "value": [{"Source":{"From":"","Name":""},"Value":""},{"Source":{"From":"","Name":""},"Value":""},{"Source":{"From":"","Name":""},"Value":""},{"Source":{"From":"","Name":""},"Value":""}]}

K8s environment:

The OTel Collector pod is created by the OpenTelemetry Operator with the config below:

Collector.yaml

apiVersion: opentelemetry.io/v1alpha1
kind: OpenTelemetryCollector
metadata:
  name: otel-collector
spec:
  image: opentelemetry-collector-contrib:0.89.0
  resources:
    limits:
      memory: 3Gi
  imagePullPolicy: IfNotPresent
  replicas: 3
  config: |
    receivers:
      otlp:
        protocols:
          grpc:
          http:
    processors:
      batch:
      k8sattributes:
        auth_type: "serviceAccount"
        passthrough: false
        extract:
          metadata:
          - k8s.namespace.name
          - k8s.deployment.name
          - k8s.statefulset.name
          - k8s.daemonset.name
          - k8s.cronjob.name
          - k8s.job.name
          - k8s.node.name
          - k8s.pod.name
          - k8s.pod.uid
          - k8s.pod.start_time
        pod_association:
        - sources:
          - from: resource_attribute
            name: k8s.namespace.name
          - from: resource_attribute
            name: k8s.pod.name
    connectors:
      spanmetrics:
        exemplars:
          enabled: true
        metrics_flush_interval: 10s
        dimensions:
          - name: http.route
          - name: http.method
          - name: http.host
          - name: http.status_code
          - name: k8s.pod.name
          - name: k8s.node.name
          - name: k8s.deployment.name
          - name: k8s.namespace.name
          - name: db.system
          - name: db.name
    exporters:
      otlp:
        endpoint: https://jaegercollector:32443
        tls:
          insecure: true
          insecure_skip_verify: true
      otlp/signoz:
        endpoint: https://signoz:32443
        tls:
          insecure: true
          insecure_skip_verify: true          
      prometheusremotewrite:
        endpoint: https://vminsert:32443/insert/0/prometheus/api/v1/write
        tls:
          insecure_skip_verify: true
        external_labels:
          datacenter: K8s-TEST
        target_info:
          enabled: true
    service:
      pipelines:
        traces:
          receivers: [otlp]
          processors: [batch, k8sattributes]
          exporters: [otlp, otlp/signoz, spanmetrics]
        metrics:
          receivers: [otlp, spanmetrics]
          processors: [batch, k8sattributes]
          exporters: [prometheusremotewrite]
      telemetry:
        metrics:
          address: ":8888"
        logs:
          level: "debug"

services_account.yaml

apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: otel-collector
rules:
- apiGroups: ["*"]
  resources: ["pods", "namespaces", "nodes"]
  verbs: ["get", "watch", "list"]
- apiGroups: ["apps"]
  resources: ["replicasets"]
  verbs: ["get", "list", "watch"]
- apiGroups: ["extensions"]
  resources: ["replicasets"]
  verbs: ["get", "list", "watch"]
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: otel-collector
subjects:
- kind: ServiceAccount
  name: otel-collector-collector
  namespace: monitor
roleRef:
  kind: ClusterRole
  name: otel-collector
  apiGroup: rbac.authorization.k8s.io

I tried changing pod_association to

- sources:
  - from: resource_attribute
    name: k8s.pod.name
  - from: resource_attribute
    name: k8s.namespace.name
- sources:
  - from: resource_attribute
    name: k8s.pod.ip
- sources:
  - from: resource_attribute
    name: k8s.pod.uid
- sources:
  - from: connection

but the result is the same.

The same config works as expected on a native k8s cluster (without Rancher).

The debug log does not show much information that can be used to debug the k8sattributes processor.

Pod log:

2023-12-04T08:27:02.228Z    info    service@v0.89.0/telemetry.go:85 Setting up own telemetry...
2023-12-04T08:27:02.229Z    info    service@v0.89.0/telemetry.go:202    Serving Prometheus metrics  {"address": ":8888", "level": "Basic"}
2023-12-04T08:27:02.229Z    debug   exporter@v0.89.0/exporter.go:273    Beta component. May change in the future.   {"kind": "exporter", "data_type": "metrics", "name": "prometheusremotewrite"}
2023-12-04T08:27:02.229Z    debug   processor@v0.89.0/processor.go:287  Beta component. May change in the future.   {"kind": "processor", "name": "k8sattributes", "pipeline": "metrics"}
2023-12-04T08:27:02.230Z    info    kube/client.go:113  k8s filtering   {"kind": "processor", "name": "k8sattributes", "pipeline": "metrics", "labelSelector": "", "fieldSelector": ""}
2023-12-04T08:27:02.230Z    debug   processor@v0.89.0/processor.go:287  Stable component.   {"kind": "processor", "name": "batch", "pipeline": "metrics"}
2023-12-04T08:27:02.230Z    debug   processor@v0.89.0/processor.go:287  Alpha component. May change in the future.  {"kind": "processor", "name": "transform", "pipeline": "metrics"}
2023-12-04T08:27:02.230Z    debug   processor@v0.89.0/processor.go:287  Alpha component. May change in the future.  {"kind": "processor", "name": "filter/ottl", "pipeline": "metrics"}
2023-12-04T08:27:02.231Z    debug   receiver@v0.89.0/receiver.go:294    Stable component.   {"kind": "receiver", "name": "otlp", "data_type": "metrics"}
2023-12-04T08:27:02.231Z    debug   exporter@v0.89.0/exporter.go:273    Stable component.   {"kind": "exporter", "data_type": "traces", "name": "otlp"}
2023-12-04T08:27:02.231Z    debug   exporter@v0.89.0/exporter.go:273    Stable component.   {"kind": "exporter", "data_type": "traces", "name": "otlp/signoz"}
2023-12-04T08:27:02.231Z    debug   connector@v0.89.0/connector.go:634  Alpha component. May change in the future.  {"kind": "connector", "name": "spanmetrics", "exporter_in_pipeline": "traces", "receiver_in_pipeline": "metrics"}
2023-12-04T08:27:02.231Z    info    spanmetricsconnector@v0.89.0/connector.go:105   Building spanmetrics connector  {"kind": "connector", "name": "spanmetrics", "exporter_in_pipeline": "traces", "receiver_in_pipeline": "metrics"}
2023-12-04T08:27:02.231Z    debug   processor@v0.89.0/processor.go:287  Beta component. May change in the future.   {"kind": "processor", "name": "k8sattributes", "pipeline": "traces"}
2023-12-04T08:27:02.231Z    info    kube/client.go:113  k8s filtering   {"kind": "processor", "name": "k8sattributes", "pipeline": "traces", "labelSelector": "", "fieldSelector": ""}
2023-12-04T08:27:02.231Z    debug   processor@v0.89.0/processor.go:287  Stable component.   {"kind": "processor", "name": "batch", "pipeline": "traces"}
2023-12-04T08:27:02.231Z    debug   receiver@v0.89.0/receiver.go:294    Stable component.   {"kind": "receiver", "name": "otlp", "data_type": "traces"}
2023-12-04T08:27:02.234Z    info    service@v0.89.0/service.go:143  Starting otelcol-contrib... {"Version": "0.89.0", "NumCPU": 10}
2023-12-04T08:27:02.234Z    info    extensions/extensions.go:34 Starting extensions...
2023-12-04T08:27:02.234Z    warn    internal@v0.89.0/warning.go:40  Using the 0.0.0.0 address exposes this server to every network interface, which may facilitate Denial of Service attacks    {"kind": "receiver", "name": "otlp", "data_type": "metrics", "documentation": "https://github.com/open-telemetry/opentelemetry-collector/blob/main/docs/security-best-practices.md#safeguards-against-denial-of-service-attacks"}
2023-12-04T08:27:02.235Z    info    zapgrpc/zapgrpc.go:178  [core] [Server #1] Server created   {"grpc_log": true}
2023-12-04T08:27:02.235Z    info    otlpreceiver@v0.89.0/otlp.go:83 Starting GRPC server    {"kind": "receiver", "name": "otlp", "data_type": "metrics", "endpoint": "0.0.0.0:4317"}
2023-12-04T08:27:02.240Z    warn    internal@v0.89.0/warning.go:40  Using the 0.0.0.0 address exposes this server to every network interface, which may facilitate Denial of Service attacks    {"kind": "receiver", "name": "otlp", "data_type": "metrics", "documentation": "https://github.com/open-telemetry/opentelemetry-collector/blob/main/docs/security-best-practices.md#safeguards-against-denial-of-service-attacks"}
2023-12-04T08:27:02.240Z    info    otlpreceiver@v0.89.0/otlp.go:101    Starting HTTP server    {"kind": "receiver", "name": "otlp", "data_type": "metrics", "endpoint": "0.0.0.0:4318"}
2023-12-04T08:27:02.240Z    info    zapgrpc/zapgrpc.go:178  [core] [Channel #3] Channel created {"grpc_log": true}
2023-12-04T08:27:02.240Z    info    zapgrpc/zapgrpc.go:178  [core] [Server #1 ListenSocket #2] ListenSocket created {"grpc_log": true}
2023-12-04T08:27:02.240Z    info    zapgrpc/zapgrpc.go:178  [core] [Channel #3] original dial target is: "jaegercollector-ha.xxx.xxx.xxx:32443" {"grpc_log": true}
2023-12-04T08:27:02.240Z    info    zapgrpc/zapgrpc.go:178  [core] [Channel #3] parsed dial target is: {URL:{Scheme:jaegercollector-ha.xxx.xxx.xxx Opaque:32443 User: Host: Path: RawPath: OmitHost:false ForceQuery:false RawQuery: Fragment: RawFragment:}}   {"grpc_log": true}
2023-12-04T08:27:02.240Z    info    zapgrpc/zapgrpc.go:178  [core] [Channel #3] fallback to scheme "passthrough"    {"grpc_log": true}
2023-12-04T08:27:02.240Z    info    zapgrpc/zapgrpc.go:178  [core] [Channel #3] parsed dial target is: {URL:{Scheme:passthrough Opaque: User: Host: Path:/jaegercollector-ha.xxx.xxx.xxx:32443 RawPath: OmitHost:false ForceQuery:false RawQuery: Fragment: RawFragment:}}  {"grpc_log": true}
2023-12-04T08:27:02.240Z    info    zapgrpc/zapgrpc.go:178  [core] [Channel #3] Channel authority set to "jaegercollector-ha.xxx.xxx.xxx:32443" {"grpc_log": true}
2023-12-04T08:27:02.241Z    info    zapgrpc/zapgrpc.go:178  [core] [Channel #3] Resolver state updated: {
  "Addresses": [
    {
      "Addr": "jaegercollector-ha.xxx.xxx.xxx:32443",
      "ServerName": "",
      "Attributes": null,
      "BalancerAttributes": null,
      "Metadata": null
    }
  ],
  "Endpoints": [
    {
      "Addresses": [
        {
          "Addr": "jaegercollector-ha.xxx.xxx.xxx:32443",
          "ServerName": "",
          "Attributes": null,
          "BalancerAttributes": null,
          "Metadata": null
        }
      ],
      "Attributes": null
    }
  ],
  "ServiceConfig": null,
  "Attributes": null
} (resolver returned new addresses) {"grpc_log": true}
2023-12-04T08:27:02.241Z    info    zapgrpc/zapgrpc.go:178  [core] [Channel #3] Channel switches to new LB policy "pick_first"  {"grpc_log": true}
2023-12-04T08:27:02.241Z    info    zapgrpc/zapgrpc.go:178  [core] [pick-first-lb 0xc002c0e6f0] Received new config {
  "shuffleAddressList": false
}, resolver state {
  "Addresses": [
    {
      "Addr": "jaegercollector-ha.xxx.xxx.xxx:32443",
      "ServerName": "",
      "Attributes": null,
      "BalancerAttributes": null,
      "Metadata": null
    }
  ],
  "Endpoints": [
    {
      "Addresses": [
        {
          "Addr": "jaegercollector-ha.xxx.xxx.xxx:32443",
          "ServerName": "",
          "Attributes": null,
          "BalancerAttributes": null,
          "Metadata": null
        }
      ],
      "Attributes": null
    }
  ],
  "ServiceConfig": null,
  "Attributes": null
}   {"grpc_log": true}
2023-12-04T08:27:02.241Z    info    zapgrpc/zapgrpc.go:178  [core] [Channel #3 SubChannel #4] Subchannel created    {"grpc_log": true}
2023-12-04T08:27:02.241Z    info    zapgrpc/zapgrpc.go:178  [core] [Channel #3] Channel Connectivity change to CONNECTING   {"grpc_log": true}
2023-12-04T08:27:02.241Z    info    zapgrpc/zapgrpc.go:178  [core] [Channel #3 SubChannel #4] Subchannel Connectivity change to CONNECTING  {"grpc_log": true}
2023-12-04T08:27:02.241Z    info    zapgrpc/zapgrpc.go:178  [core] [Channel #3 SubChannel #4] Subchannel picks a new address "jaegercollector-ha.xxx.xxx.xxx:32443" to connect  {"grpc_log": true}
2023-12-04T08:27:02.241Z    info    zapgrpc/zapgrpc.go:178  [core] [Channel #5] Channel created {"grpc_log": true}
2023-12-04T08:27:02.241Z    info    zapgrpc/zapgrpc.go:178  [core] [Channel #5] original dial target is: "sentry-ha.xxx.xxx.xxx:32443"  {"grpc_log": true}
2023-12-04T08:27:02.241Z    info    zapgrpc/zapgrpc.go:178  [core] [Channel #5] parsed dial target is: {URL:{Scheme:sentry-ha.xxx.xxx.xxx Opaque:32443 User: Host: Path: RawPath: OmitHost:false ForceQuery:false RawQuery: Fragment: RawFragment:}}    {"grpc_log": true}
2023-12-04T08:27:02.241Z    info    zapgrpc/zapgrpc.go:178  [core] [Channel #5] fallback to scheme "passthrough"    {"grpc_log": true}
2023-12-04T08:27:02.241Z    info    zapgrpc/zapgrpc.go:178  [core] [Channel #5] parsed dial target is: {URL:{Scheme:passthrough Opaque: User: Host: Path:/sentry-ha.xxx.xxx.xxx:32443 RawPath: OmitHost:false ForceQuery:false RawQuery: Fragment: RawFragment:}}   {"grpc_log": true}
2023-12-04T08:27:02.241Z    info    zapgrpc/zapgrpc.go:178  [core] [Channel #5] Channel authority set to "sentry-ha.xxx.xxx.xxx:32443"  {"grpc_log": true}
2023-12-04T08:27:02.241Z    info    zapgrpc/zapgrpc.go:178  [core] [pick-first-lb 0xc002c0e6f0] Received SubConn state update: 0xc002c0e8d0, {ConnectivityState:CONNECTING ConnectionError:<nil>}   {"grpc_log": true}
2023-12-04T08:27:02.241Z    info    zapgrpc/zapgrpc.go:178  [core] [Channel #5] Resolver state updated: {
  "Addresses": [
    {
      "Addr": "sentry-ha.xxx.xxx.xxx:32443",
      "ServerName": "",
      "Attributes": null,
      "BalancerAttributes": null,
      "Metadata": null
    }
  ],
  "Endpoints": [
    {
      "Addresses": [
        {
          "Addr": "sentry-ha.xxx.xxx.xxx:32443",
          "ServerName": "",
          "Attributes": null,
          "BalancerAttributes": null,
          "Metadata": null
        }
      ],
      "Attributes": null
    }
  ],
  "ServiceConfig": null,
  "Attributes": null
} (resolver returned new addresses) {"grpc_log": true}
2023-12-04T08:27:02.241Z    info    zapgrpc/zapgrpc.go:178  [core] [Channel #5] Channel switches to new LB policy "pick_first"  {"grpc_log": true}
2023-12-04T08:27:02.241Z    info    zapgrpc/zapgrpc.go:178  [core] [pick-first-lb 0xc002e025d0] Received new config {
  "shuffleAddressList": false
}, resolver state {
  "Addresses": [
    {
      "Addr": "sentry-ha.xxx.xxx.xxx:32443",
      "ServerName": "",
      "Attributes": null,
      "BalancerAttributes": null,
      "Metadata": null
    }
  ],
  "Endpoints": [
    {
      "Addresses": [
        {
          "Addr": "sentry-ha.xxx.xxx.xxx:32443",
          "ServerName": "",
          "Attributes": null,
          "BalancerAttributes": null,
          "Metadata": null
        }
      ],
      "Attributes": null
    }
  ],
  "ServiceConfig": null,
  "Attributes": null
}   {"grpc_log": true}
2023-12-04T08:27:02.241Z    info    zapgrpc/zapgrpc.go:178  [core] [Channel #5 SubChannel #6] Subchannel created    {"grpc_log": true}
2023-12-04T08:27:02.241Z    info    zapgrpc/zapgrpc.go:178  [core] [Channel #5] Channel Connectivity change to CONNECTING   {"grpc_log": true}
2023-12-04T08:27:02.241Z    info    zapgrpc/zapgrpc.go:178  [core] [Channel #5 SubChannel #6] Subchannel Connectivity change to CONNECTING  {"grpc_log": true}
2023-12-04T08:27:02.241Z    info    zapgrpc/zapgrpc.go:178  [core] [Channel #5 SubChannel #6] Subchannel picks a new address "sentry-ha.xxx.xxx.xxx:32443" to connect   {"grpc_log": true}
2023-12-04T08:27:02.242Z    info    spanmetricsconnector@v0.89.0/connector.go:176   Starting spanmetrics connector  {"kind": "connector", "name": "spanmetrics", "exporter_in_pipeline": "traces", "receiver_in_pipeline": "metrics"}
2023-12-04T08:27:02.242Z    info    zapgrpc/zapgrpc.go:178  [core] [pick-first-lb 0xc002e025d0] Received SubConn state update: 0xc002e02690, {ConnectivityState:CONNECTING ConnectionError:<nil>}   {"grpc_log": true}
2023-12-04T08:27:02.242Z    info    service@v0.89.0/service.go:169  Everything is ready. Begin running and processing data.
2023-12-04T08:27:02.269Z    info    zapgrpc/zapgrpc.go:178  [core] [Channel #3 SubChannel #4] Subchannel Connectivity change to READY   {"grpc_log": true}
2023-12-04T08:27:02.269Z    info    zapgrpc/zapgrpc.go:178  [core] [pick-first-lb 0xc002c0e6f0] Received SubConn state update: 0xc002c0e8d0, {ConnectivityState:READY ConnectionError:<nil>}    {"grpc_log": true}
2023-12-04T08:27:02.269Z    info    zapgrpc/zapgrpc.go:178  [core] [Channel #3] Channel Connectivity change to READY    {"grpc_log": true}
2023-12-04T08:27:02.280Z    info    zapgrpc/zapgrpc.go:178  [core] [Channel #5 SubChannel #6] Subchannel Connectivity change to READY   {"grpc_log": true}
2023-12-04T08:27:02.280Z    info    zapgrpc/zapgrpc.go:178  [core] [pick-first-lb 0xc002e025d0] Received SubConn state update: 0xc002e02690, {ConnectivityState:READY ConnectionError:<nil>}    {"grpc_log": true}
2023-12-04T08:27:02.280Z    info    zapgrpc/zapgrpc.go:178  [core] [Channel #5] Channel Connectivity change to READY    {"grpc_log": true}
2023-12-04T08:27:04.036Z    debug   k8sattributesprocessor@v0.89.0/processor.go:102 evaluating pod identifier   {"kind": "processor", "name": "k8sattributes", "pipeline": "traces", "value": [{"Source":{"From":"","Name":""},"Value":""},{"Source":{"From":"","Name":""},"Value":""},{"Source":{"From":"","Name":""},"Value":""},{"Source":{"From":"","Name":""},"Value":""}]}
2023-12-04T08:27:09.048Z    debug   k8sattributesprocessor@v0.89.0/processor.go:102 evaluating pod identifier   {"kind": "processor", "name": "k8sattributes", "pipeline": "traces", "value": [{"Source":{"From":"","Name":""},"Value":""},{"Source":{"From":"","Name":""},"Value":""},{"Source":{"From":"","Name":""},"Value":""},{"Source":{"From":"","Name":""},"Value":""}]}
2023-12-04T08:27:12.257Z    debug   k8sattributesprocessor@v0.89.0/processor.go:102 evaluating pod identifier   {"kind": "processor", "name": "k8sattributes", "pipeline": "metrics", "value": [{"Source":{"From":"","Name":""},"Value":""},{"Source":{"From":"","Name":""},"Value":""},{"Source":{"From":"","Name":""},"Value":""},{"Source":{"From":"","Name":""},"Value":""}]}
2023-12-04T08:27:12.257Z    debug   k8sattributesprocessor@v0.89.0/processor.go:102 evaluating pod identifier   {"kind": "processor", "name": "k8sattributes", "pipeline": "metrics", "value": [{"Source":{"From":"","Name":""},"Value":""},{"Source":{"From":"","Name":""},"Value":""},{"Source":{"From":"","Name":""},"Value":""},{"Source":{"From":"","Name":""},"Value":""}]}
2023-12-04T08:27:14.061Z    debug   k8sattributesprocessor@v0.89.0/processor.go:102 evaluating pod identifier   {"kind": "processor", "name": "k8sattributes", "pipeline": "traces", "value": [{"Source":{"From":"","Name":""},"Value":""},{"Source":{"From":"","Name":""},"Value":""},{"Source":{"From":"","Name":""},"Value":""},{"Source":{"From":"","Name":""},"Value":""}]}
2023-12-04T08:27:19.073Z    debug   k8sattributesprocessor@v0.89.0/processor.go:102 evaluating pod identifier   {"kind": "processor", "name": "k8sattributes", "pipeline": "traces", "value": [{"Source":{"From":"","Name":""},"Value":""},{"Source":{"From":"","Name":""},"Value":""},{"Source":{"From":"","Name":""},"Value":""},{"Source":{"From":"","Name":""},"Value":""}]}
2023-12-04T08:27:19.073Z    debug   k8sattributesprocessor@v0.89.0/processor.go:102 evaluating pod identifier   {"kind": "processor", "name": "k8sattributes", "pipeline": "traces", "value": [{"Source":{"From":"","Name":""},"Value":""},{"Source":{"From":"","Name":""},"Value":""},{"Source":{"From":"","Name":""},"Value":""},{"Source":{"From":"","Name":""},"Value":""}]}

Is my config correct? Is there any option to get more logs from the k8sattributes processor for debugging? Has anyone successfully used k8sattributes on Rancher RKE?

github-actions[bot] commented 11 months ago

Pinging code owners:

See Adding Labels via Comments if you do not have permissions to add labels yourself.

fculpo commented 11 months ago

Hi,

Same issue here on a k3s cluster (Rancher-like).

fculpo commented 11 months ago

This can be mitigated by adding the needed env vars to the pods directly:

env:
    - name: OTEL_SERVICE_NAME
      valueFrom:
        fieldRef:
          apiVersion: v1
          fieldPath: "metadata.labels['app.kubernetes.io/component']"
    - name: OTEL_K8S_NAMESPACE
      valueFrom:
        fieldRef:
          apiVersion: v1
          fieldPath: metadata.namespace
    - name: OTEL_K8S_NODE_NAME
      valueFrom:
        fieldRef:
          apiVersion: v1
          fieldPath: spec.nodeName
    - name: OTEL_K8S_POD_NAME
      valueFrom:
        fieldRef:
          apiVersion: v1
          fieldPath: metadata.name
    - name: OTEL_K8S_POD_UID
      valueFrom:
        fieldRef:
          apiVersion: v1
          fieldPath: metadata.uid
    - name: OTEL_RESOURCE_ATTRIBUTES
      value: service.name=$(OTEL_SERVICE_NAME),service.instance.id=$(OTEL_K8S_POD_UID),service.namespace=$(OTEL_K8S_NAMESPACE),k8s.namespace.name=$(OTEL_K8S_NAMESPACE),k8s.node.name=$(OTEL_K8S_NODE_NAME),k8s.pod.name=$(OTEL_K8S_POD_NAME)
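
With these env vars in place, the SDK ends up exporting resource attributes roughly like the following (a sketch with placeholder values; the actual values come from each pod), which is exactly what a resource_attribute-based pod_association such as the one in the original config looks up:

k8s.namespace.name: my-namespace
k8s.node.name: node-1
k8s.pod.name: my-app-5d8f7c9b6-abcde
service.name: my-app
service.instance.id: 1f2e3d4c-0000-0000-0000-000000000000
service.namespace: my-namespace
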
vnStrawHat commented 11 months ago

Thanks @fculpo,

Is there any way to get the "deployment name"?

atoulme commented 11 months ago

Do you have a way to reproduce this with a simple setup?

fculpo commented 10 months ago

Hi, spawning a simple k3s cluster with the k8sattributes processor should reproduce it: spans are not enriched with metadata there. On a "standard" k8s cluster you should get that metadata.

fculpo commented 10 months ago

Thanks @fculpo,

Is there any way to get the "deployment name"?

I haven't tested that yet; I focused on configuring the processor on our standard clusters for now.

TylerHelmuth commented 9 months ago

@joegoldman2 what kind of k8s cluster are you using? I have still not been able to reproduce this issue.

TylerHelmuth commented 9 months ago

The debug message is coming from here: https://github.com/open-telemetry/opentelemetry-collector-contrib/blob/2f55fec09dfca011191fb892bf6b496a167ab957/processor/k8sattributesprocessor/processor.go#L122.

The preset's pod associations are:

pod_association:
    - sources:
      - from: resource_attribute
        name: k8s.pod.ip
    - sources:
      - from: resource_attribute
        name: k8s.pod.uid
    - sources:
      - from: connection

@vnStrawHat in your config you've removed the connection source. You won't get any associated telemetry unless the incoming telemetry has k8s.pod.name or k8s.namespace.name in its resource attributes.

@joegoldman2 for you it looks like your data doesn't contain k8s.pod.ip or k8s.pod.uid, and the processor isn't able to retrieve the incoming request's IP.
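
For reference, a sketch of the pod_association from the original config with the connection fallback restored (assuming the collector receives telemetry directly from the instrumented pods rather than through a gateway that rewrites the source IP):

pod_association:
- sources:
  - from: resource_attribute
    name: k8s.namespace.name
  - from: resource_attribute
    name: k8s.pod.name
- sources:
  - from: connection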

TylerHelmuth commented 9 months ago

I was also unable to reproduce with the latest minikube and collector.

fculpo commented 9 months ago

Can you reproduce on Rancher-based k8s (i.e. k3s)? I could not get any metadata, while GKE and AKS clusters were fine.


TylerHelmuth commented 9 months ago

AKS clusters were fine

To clarify, you were able to get AKS to populate correctly?

fculpo commented 9 months ago

I had Grafana Tempo on AKS instrumenting itself and sending to Grafana Agent (using the k8sattributes processor), which was working, and was surprised that k3s did not work even after trying a lot of processor configurations.


TylerHelmuth commented 9 months ago

Ok cool, this was my suspicion. I have no idea why the direct pod connection isn't working as expected; it could be something unique to the AKS setup. @jinja2 any ideas?

jinja2 commented 9 months ago

I haven't looked at the IPAM/network setup in AKS specifically, but I would guess the pod IP might be getting SNAT'd to the node's primary IP address, possibly because the pod CIDR is not routable in the Azure subnet. I would suggest looking at your cluster's networking setup to understand why this is happening and whether AKS provides a CNI option to preserve the pod IP.

TylerHelmuth commented 9 months ago

@jinja2 I am so glad you are part of this project because I don't know any of the networking/infra stuff you just said lol

github-actions[bot] commented 7 months ago

This issue has been inactive for 60 days. It will be closed in 60 days if there is no activity. To ping code owners by adding a component label, see Adding Labels via Comments, or if you are unsure of which component this issue relates to, please ping @open-telemetry/collector-contrib-triagers. If this issue is still relevant, please ping the code owners or leave a comment explaining why it is still relevant. Otherwise, please close it.

Pinging code owners:

See Adding Labels via Comments if you do not have permissions to add labels yourself.

github-actions[bot] commented 5 months ago

This issue has been inactive for 60 days. It will be closed in 60 days if there is no activity. To ping code owners by adding a component label, see Adding Labels via Comments, or if you are unsure of which component this issue relates to, please ping @open-telemetry/collector-contrib-triagers. If this issue is still relevant, please ping the code owners or leave a comment explaining why it is still relevant. Otherwise, please close it.

Pinging code owners:

See Adding Labels via Comments if you do not have permissions to add labels yourself.

jseiser commented 5 months ago

Has anyone found a way to make this work in EKS?

When doing the below, it almost works, except that all of the traces from the service mesh are then no longer associated.

This can be mitigated by adding the needed env vars to the pods directly:

env:
    - name: OTEL_SERVICE_NAME
      valueFrom:
        fieldRef:
          apiVersion: v1
          fieldPath: "metadata.labels['app.kubernetes.io/component']"
    - name: OTEL_K8S_NAMESPACE
      valueFrom:
        fieldRef:
          apiVersion: v1
          fieldPath: metadata.namespace
    - name: OTEL_K8S_NODE_NAME
      valueFrom:
        fieldRef:
          apiVersion: v1
          fieldPath: spec.nodeName
    - name: OTEL_K8S_POD_NAME
      valueFrom:
        fieldRef:
          apiVersion: v1
          fieldPath: metadata.name
    - name: OTEL_K8S_POD_UID
      valueFrom:
        fieldRef:
          apiVersion: v1
          fieldPath: metadata.uid
    - name: OTEL_RESOURCE_ATTRIBUTES
      value: service.name=$(OTEL_SERVICE_NAME),service.instance.id=$(OTEL_K8S_POD_UID),service.namespace=$(OTEL_K8S_NAMESPACE),k8s.namespace.name=$(OTEL_K8S_NAMESPACE),k8s.node.name=$(OTEL_K8S_NODE_NAME),k8s.pod.name=$(OTEL_K8S_POD_NAME)
omri-shilton commented 4 months ago

We are using EKS with the collector in deployment mode and in daemonset mode, and we are also experiencing this issue.

freefood89 commented 2 months ago

Hey, I think I figured it out. You need to make sure that your k8sattributes processor runs first [source]:

connection: Takes the IP attribute from connection context (if available). In this case the processor must appear before any batching or tail sampling, which remove this information.

So for your case, swap batch and k8sattributes:

    service:
      pipelines:
        traces:
          receivers: [ ... ]
          processors: [k8sattributes, batch]
          exporters: [ ... ]
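
Applied to the config from the original report, the pipelines would then look roughly like this (a sketch; everything else stays unchanged):

    service:
      pipelines:
        traces:
          receivers: [otlp]
          processors: [k8sattributes, batch]
          exporters: [otlp, otlp/signoz, spanmetrics]
        metrics:
          receivers: [otlp, spanmetrics]
          processors: [k8sattributes, batch]
          exporters: [prometheusremotewrite]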

I also had a problem with an Istio sidecar confusing the collector with the sidecar loopback IP 127.0.0.6. I'm not sure whether pod_association handles the X-Forwarded-For header, so I just disabled Istio on the collector pod.

jseiser commented 1 month ago

@freefood89

That did not fix anything on our end.

freefood89 commented 1 month ago

@jseiser sorry to hear that. My answer was intended for the original question rather than for your use case in EKS. However, the processor order needs to be correct in EKS too.

Maybe you can set the log level to INFO and share it. That's how I figured out what was happening.
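
In the OpenTelemetryCollector config that is the service.telemetry.logs.level setting, e.g. (a sketch; the config at the top of this issue already has it at "debug", which is even more verbose):

    service:
      telemetry:
        logs:
          level: "info"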