signalfx / splunk-otel-collector-chart

Splunk OpenTelemetry Collector for Kubernetes
Apache License 2.0
Pod labels for splunk source and sourcetype overwrite not being respected #862

Closed saiharshitachava closed 1 year ago

saiharshitachava commented 1 year ago

What happened?

We installed the latest chart version, 0.81.0. Custom pod labels that overwrite the Splunk source and sourcetype aren't working, but the index override is.

I know the docs say we need to switch to annotations.

But is there a back door (maybe not via Helm) to overwrite source and sourcetype from custom pod labels?

Chart version

0.81.0

Environment information

EKS 1.25

Chart configuration

No response

Log output

No response

Additional context

No response

rmfitzpatrick commented 1 year ago

Can you please provide any relevant values configuration and (redacted) examples of the observed change in behavior?

saiharshitachava commented 1 year ago

@rmfitzpatrick thanks for looking into this. I have made some changes directly in the ConfigMap to accommodate customizations; everything else uses the default settings. I was able to override the source by setting the splunk.com/source pod annotation, which wasn't working with the default config, using the ConfigMap below.

The second customization: I need the pod-level labels splunk.abc.com/source and splunk.abc.com/sourcetype to overwrite the pod's default source and sourcetype. That isn't working, but splunk.abc.com/splunk-index works perfectly fine. That customization is also in the ConfigMap below.
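For readers following along, a pod carrying the labels and annotation described above might look like the sketch below. The pod name, image, and all label and annotation values are hypothetical; only the label and annotation keys come from this thread. Label values are kept to characters Kubernetes allows in labels (alphanumerics, `-`, `_`, `.`), so a colon-separated sourcetype would not be valid here.

```yaml
# Hypothetical pod used only to illustrate the keys discussed in this issue.
apiVersion: v1
kind: Pod
metadata:
  name: example-app                               # hypothetical name
  labels:
    splunk.abc.com/splunk-index: test_main        # reported as working
    splunk.abc.com/source: example-app            # reported as ignored
    splunk.abc.com/sourcetype: example-app-logs   # reported as ignored
  annotations:
    splunk.com/source: example-app-annotated      # annotation-based override
spec:
  containers:
    - name: app
      image: busybox:1.36                         # hypothetical image
      command: ["sh", "-c", "while true; do echo hello; sleep 5; done"]
```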

Here is the ConfigMap:

data:
  relay: |
    exporters:
      splunk_hec/platform_logs:
        disable_compression: true
        endpoint: ${SPLUNK_PLATFORM_HEC}
        idle_conn_timeout: 10s
        index: test_main
        max_connections: 200
        profiling_data_enabled: false
        retry_on_failure:
          enabled: true
          initial_interval: 5s
          max_elapsed_time: 300s
          max_interval: 30s
        sending_queue:
          enabled: false
          num_consumers: 10
          queue_size: 5000
        source: kubernetes
        splunk_app_name: splunk-otel-collector
        splunk_app_version: 0.81.0
        timeout: 10s
        tls:
          insecure_skip_verify: true
        token: ${SPLUNK_PLATFORM_HEC_TOKEN}
    extensions:
      file_storage:
        directory: /var/addon/splunk/otel_pos
      health_check: null
      k8s_observer:
        auth_type: serviceAccount
        node: ${K8S_NODE_NAME}
      memory_ballast:
        size_mib: ${SPLUNK_BALLAST_SIZE_MIB}
      zpages: null
    processors:
      batch: null
      filter/logs:
        logs:
          exclude:
            match_type: strict
            resource_attributes:
            - key: splunk.com/exclude
              value: "true"
      k8sattributes:
        extract:
          annotations:
          - from: pod
            key: splunk.com/source
          - from: pod
            key: splunk.com/sourcetype
          - from: namespace
            key: splunk.com/exclude
            tag_name: splunk.com/exclude
          - from: pod
            key: splunk.com/exclude
            tag_name: splunk.com/exclude
          - from: namespace
            key: splunk.com/index
            tag_name: com.splunk.index
          - from: pod
            key: splunk.com/index
            tag_name: com.splunk.index
          - from: pod
            key: splunk.abc.com/splunk-index
            tag_name: com.splunk.index
          - from: pod
            key: splunk.abc.com/source
          - from: pod
            key: splunk.abc.com/sourcetype
          labels:
          - key: app
            tag_name: app
          - key: abc.com/dc
            tag_name: dc
          - key: helm.sh/chart
            tag_name: chart
          - key: abc.com/env-type
            tag_name: envtype
          - key: app.kubernetes.io/instance
            tag_name: instance
          - key: abc.com/env
            tag_name: env
          - key: app.kubernetes.io/name
            tag_name: name
          - key: abc.com/project
            tag_name: project
          - key: app.kubernetes.io/version
            tag_name: version
          - from: pod
            key: splunk.abc.com/splunk-index
            tag_name: com.splunk.index
          - from: pod
            key: splunk.abc.com/source
            tag_name: com.splunk.source
          - from: pod
            key: splunk.abc.com/sourcetype
            tag_name: com.splunk.sourcetype
          - from: namespace
            key: splunk.abc.com/splunk-index
            tag_name: com.splunk.index
          metadata:
          - k8s.namespace.name
          - k8s.node.name
          - k8s.pod.name
          - k8s.pod.uid
          - container.id
          - container.image.name
          - container.image.tag
        filter:
          node_from_env_var: K8S_NODE_NAME
        pod_association:
        - sources:
          - from: resource_attribute
            name: k8s.pod.uid
        - sources:
          - from: resource_attribute
            name: k8s.pod.ip
        - sources:
          - from: resource_attribute
            name: ip
        - sources:
          - from: connection
        - sources:
          - from: resource_attribute
            name: host.name
      memory_limiter:
        check_interval: 2s
        limit_mib: ${SPLUNK_MEMORY_LIMIT_MIB}
      resource:
        attributes:
        - action: insert
          key: k8s.node.name
          value: ${K8S_NODE_NAME}
        - action: upsert
          key: k8s.cluster.name
          value: ${K8S_CLUSTER_NAME}
        - action: upsert
          from_attribute: k8s.cluster.name
          key: cluster_name
        - action: delete
          key: k8s.cluster.name
      resource/add_agent_k8s:
        attributes:
        - action: insert
          key: k8s.pod.name
          value: ${K8S_POD_NAME}
        - action: insert
          key: k8s.pod.uid
          value: ${K8S_POD_UID}
        - action: insert
          key: k8s.namespace.name
          value: ${K8S_NAMESPACE}
      resource/logs:
        attributes:
        - action: upsert
          from_attribute: k8s.pod.annotations.splunk.com/sourcetype
          key: com.splunk.sourcetype
        - action: upsert
          from_attribute: k8s.pod.annotations.splunk.com/source
          key: com.splunk.source
        - action: upsert
          from_attribute: k8s.pod.labels.splunk.abc.com/sourcetype
          key: com.splunk.sourcetype
        - action: upsert
          from_attribute: k8s.pod.labels.splunk.abc.com/source
          key: com.splunk.source
        - action: delete
          key: k8s.pod.annotations.splunk.com/sourcetype
        - action: delete
          key: splunk.com/exclude
        - action: upsert
          from_attribute: k8s.container.name
          key: container_name
        - action: upsert
          from_attribute: container.id
          key: container_id
        - action: upsert
          from_attribute: k8s.pod.name
          key: pod
        - action: upsert
          from_attribute: k8s.pod.uid
          key: pod_uid
        - action: upsert
          from_attribute: k8s.namespace.name
          key: namespace
        - action: upsert
          from_attribute: k8s.pod.labels.app
          key: label_app
        - action: delete
          key: k8s.container.name
        - action: delete
          key: container.id
        - action: delete
          key: k8s.pod.name
        - action: delete
          key: k8s.pod.uid
        - action: delete
          key: k8s.namespace.name
        - action: delete
          key: k8s.pod.labels.app
      resourcedetection:
        detectors:
        - env
        - eks
        - ec2
        - system
        override: true
        timeout: 10s
      transform/logs:
        log_statements:
        - context: log
          statements:
          - set(resource.attributes["container_image"], Concat([resource.attributes["container.image.name"],
            resource.attributes["container.image.tag"]], ":"))
    receivers:
      filelog:
        encoding: utf-8
        exclude:
        - /var/log/pods/otel_splunk-otel-collector*_*/otel-collector/*.log
        fingerprint_size: 1kb
        force_flush_period: "0"
        include:
        - /var/log/pods/*/*/*.log
        include_file_name: false
        include_file_path: true
        max_concurrent_files: 1024
        max_log_size: 1MiB
        operators:
        - id: parser-containerd
          regex: ^(?P<time>[^ ^Z]+Z) (?P<stream>stdout|stderr) (?P<logtag>[^ ]*) ?(?P<log>.*)$
          timestamp:
            layout: '%Y-%m-%dT%H:%M:%S.%LZ'
            parse_from: attributes.time
          type: regex_parser
        - combine_field: attributes.log
          combine_with: ""
          id: containerd-recombine
          is_last_entry: attributes.logtag == 'F'
          max_log_size: 0
          output: handle_empty_log
          source_identifier: attributes["log.file.path"]
          type: recombine
        - field: attributes.log
          id: handle_empty_log
          if: attributes.log == nil
          type: add
          value: ""
        - parse_from: attributes["log.file.path"]
          regex: ^\/var\/log\/pods\/(?P<namespace>[^_]+)_(?P<pod_name>[^_]+)_(?P<uid>[^\/]+)\/(?P<container_name>[^\._]+)\/(?P<restart_count>\d+)\.log$
          type: regex_parser
        - from: attributes.uid
          to: resource["k8s.pod.uid"]
          type: move
        - from: attributes.restart_count
          to: resource["k8s.container.restart_count"]
          type: move
        - from: attributes.container_name
          to: resource["k8s.container.name"]
          type: move
        - from: attributes.namespace
          to: resource["k8s.namespace.name"]
          type: move
        - from: attributes.pod_name
          to: resource["k8s.pod.name"]
          type: move
        - field: resource["com.splunk.sourcetype"]
          type: add
          value: EXPR("kube:container:"+resource["k8s.container.name"])
        - from: attributes.stream
          to: attributes["log.iostream"]
          type: move
        - value: EXPR(attributes["log.file.path"])
          field: resource["com.splunk.source"]
          type: add
        - default: clean-up-log-record
          routes:
          - expr: (resource["k8s.namespace.name"]) matches ".*" && (resource["k8s.pod.name"])
              matches ".*"
            output: .*_.*
          type: router
        - combine_field: attributes.log
          id: .*_.*
          is_first_entry: (attributes.log) matches "^[^\\s].*"
          max_log_size: 0
          output: clean-up-log-record
          source_identifier: resource["com.splunk.source"]
          type: recombine
        - from: attributes.log
          id: clean-up-log-record
          to: body
          type: move
        poll_interval: 200ms
        retry_on_failure:
          enabled: true
        start_at: beginning
        storage: file_storage
      fluentforward:
        endpoint: 0.0.0.0:8006
      journald/containerd:
        directory: /var/log/journal
        operators:
        - field: resource["com.splunk.source"]
          type: add
          value: /var/log/journal
        - field: resource["com.splunk.sourcetype"]
          type: add
          value: kube:containerd
        - field: resource["com.splunk.index"]
          type: add
          value: docker_main
        - field: resource["host.name"]
          type: add
          value: EXPR(env("K8S_NODE_NAME"))
        - field: resource["journald.priority.number"]
          type: add
          value: EXPR(body.PRIORITY)
        - field: resource["journald.unit.name"]
          type: add
          value: EXPR(body._SYSTEMD_UNIT)
        - from: body.MESSAGE
          id: set-body
          to: body
          type: move
        priority: info
        storage: file_storage
        units:
        - containerd
      journald/kubelet:
        directory: /var/log/journal
        operators:
        - field: resource["com.splunk.source"]
          type: add
          value: /var/log/journal
        - field: resource["com.splunk.sourcetype"]
          type: add
          value: kube:kubelet
        - field: resource["com.splunk.index"]
          type: add
          value: docker_main
        - field: resource["host.name"]
          type: add
          value: EXPR(env("K8S_NODE_NAME"))
        - field: resource["journald.priority.number"]
          type: add
          value: EXPR(body.PRIORITY)
        - field: resource["journald.unit.name"]
          type: add
          value: EXPR(body._SYSTEMD_UNIT)
        - from: body.MESSAGE
          id: set-body
          to: body
          type: move
        priority: info
        storage: file_storage
        units:
        - kubelet
      otlp:
        protocols:
          grpc:
            endpoint: 0.0.0.0:4317
          http:
            endpoint: 0.0.0.0:4318
      prometheus/agent:
        config:
          scrape_configs:
          - job_name: otel-agent
            scrape_interval: 10s
            static_configs:
            - targets:
              - ${K8S_POD_IP}:8889
    service:
      extensions:
      - file_storage
      - health_check
      - k8s_observer
      - memory_ballast
      - zpages
      pipelines:
        logs:
          exporters:
          - splunk_hec/platform_logs
          processors:
          - memory_limiter
          - k8sattributes
          - filter/logs
          - batch
          - resourcedetection
          - resource
          - transform/logs
          - resource/logs
          receivers:
          - filelog
          - fluentforward
          - otlp
        logs/host:
          exporters:
          - splunk_hec/platform_logs
          processors:
          - memory_limiter
          - batch
          - resource
          receivers:
          - journald/kubelet
          - journald/containerd
      telemetry:
        metrics:
          address: 0.0.0.0:8889

saiharshitachava commented 1 year ago

@rmfitzpatrick any luck here?

omrozowicz-splunk commented 1 year ago

I took a look, and it looks like com.splunk.source and com.splunk.sourcetype only work if we first remove them with filelog extraOperators. My config:

logsCollection:
  containers:
    extraOperators:
      - type: remove
        field: resource["com.splunk.source"]
      - type: remove
        field: resource["com.splunk.sourcetype"]
extraAttributes:
  fromLabels:
    - key: splunk.abc.com/splunk-index
      tag_name: com.splunk.index
      from: pod
    - key: splunk.abc.com/splunk-source
      tag_name: com.splunk.source
      from: pod
    - key: splunk.abc.com/splunk-sourcetype
      tag_name: com.splunk.sourcetype
      from: pod

Note that remove needs an additional if condition so that it applies only to the pods you're interested in; otherwise it removes the source and sourcetype from all data.

I'll try to see if there's a better way and/or why it works differently for index, but this is the best I have so far.

saiharshitachava commented 1 year ago

@omrozowicz-splunk Is there an example of this condition?

omrozowicz-splunk commented 1 year ago

> @omrozowicz-splunk Is there an example of this condition?

See https://github.com/open-telemetry/opentelemetry-collector-contrib/blob/main/pkg/stanza/docs/operators/remove.md for the if field. An example of such a condition: if: 'resource["k8s.pod.name"] contains "mongo"'
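Combining the remove operators suggested earlier with that condition could look like the sketch below. It assumes extraOperators run after the chart's own operators have populated resource["k8s.pod.name"], and "mongo" is only the illustrative match from the comment above.

```yaml
logsCollection:
  containers:
    extraOperators:
      # Drop the default source only for pods whose name contains "mongo",
      # so a later stage can set it from the pod label.
      - type: remove
        field: resource["com.splunk.source"]
        if: 'resource["k8s.pod.name"] contains "mongo"'
      # Same condition for the default sourcetype.
      - type: remove
        field: resource["com.splunk.sourcetype"]
        if: 'resource["k8s.pod.name"] contains "mongo"'
```

Pods that do not match the condition keep the default source and sourcetype set by the filelog receiver.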

saiharshitachava commented 1 year ago

Thanks @omrozowicz-splunk

One more quick question: I extracted these custom labels from the pod. How do I access the value of a label key?

Say splunk.abc.com/splunk-sourcetype is extracted; how do I refer to its value in the operators section?

I tried resource and attributes along with EXPR, but nothing seems to work.

I tried these combinations: EXPR(resource[" k8s.pod.labels.splunk.abc.com/sourcetype"]) and EXPR(attributes. k8s.pod.labels.splunk.abc.com/sourcetype)

omrozowicz-splunk commented 1 year ago

I am not sure right now, but I think it is either resource["k8s.pod.labels.splunk.abc.com/sourcetype"] or attributes["k8s.pod.labels.splunk.abc.com/sourcetype"]. What exactly do you want to achieve? Will it be used in an if condition or as a field in extraOperators?

saiharshitachava commented 1 year ago

I'm thinking of removing the source if the custom label field is not nil, and assigning the label's value to source.

I tried both resource["k8s.pod.labels.splunk.abc.com/sourcetype"] and attributes[" k8s.pod.labels.splunk.abc.com/sourcetype"] as the value for the field resource["com.splunk.source"] with the add operator, but each was treated as a literal string rather than resolved to its value, so I couldn't implement the logic above.

saiharshitachava commented 1 year ago

Anything on this? If we at least know which version doesn't have this bug, I can switch to that for now.

omrozowicz-splunk commented 1 year ago

I think that to check whether the custom labels are present at all, it's better to use something like "k8s.pod.labels.splunk.abc.com/sourcetype" in attributes or "k8s.pod.labels.splunk.abc.com/sourcetype" in resource. The if expression language follows this specification: https://github.com/antonmedv/expr/blob/master/docs/Language-Definition.md

Can you tell me which version is the last that worked for you?

saiharshitachava commented 1 year ago

0.43.2 is the last version of the Helm chart that worked for us.

omrozowicz-splunk commented 1 year ago

@saiharshitachava Ok, actually I have a solution that is much easier to implement.

Can you try:

extraAttributes:
  fromLabels:
    - key: splunk.abc.com/splunk-index
      tag_name: com.splunk.index
      from: pod
    - key: splunk.abc.com/splunk-source
      from: pod
    - key: splunk.abc.com/splunk-sourcetype
      from: pod
agent:
  config:
    processors:
      resource/logs:
        attributes:
        - action: upsert
          from_attribute: k8s.pod.labels.splunk.abc.com/splunk-sourcetype
          key: com.splunk.sourcetype
        - action: upsert
          from_attribute: k8s.pod.labels.splunk.abc.com/splunk-source
          key: com.splunk.source
        - action: delete
          key: k8s.pod.annotations.splunk.com/sourcetype
        - action: delete
          key: k8s.pod.labels.splunk.abc.com/splunk-sourcetype
        - action: delete
          key: k8s.pod.labels.splunk.abc.com/splunk-source
        - action: delete
          key: splunk.com/exclude

Adding these resource/logs upserts worked for me.

saiharshitachava commented 1 year ago

That doesn't work for me; I tried it. The requirement is for the splunk.com annotations to work and also for custom labels to work for index, source, and sourcetype.

omrozowicz-splunk commented 1 year ago

Can you send your custom values.yaml? The example above does use a custom label. Did you change:

          - from: pod
            key: splunk.abc.com/splunk-index
            tag_name: com.splunk.index
          - from: pod
            key: splunk.abc.com/source
            tag_name: com.splunk.source
          - from: pod
            key: splunk.abc.com/sourcetype
            tag_name: com.splunk.sourcetype
          - from: namespace
            key: splunk.abc.com/splunk-index
            tag_name: com.splunk.index

to

          - from: pod
            key: splunk.abc.com/splunk-index
            tag_name: com.splunk.index
          - from: pod
            key: splunk.abc.com/source
          - from: pod
            key: splunk.abc.com/sourcetype
          - from: namespace
            key: splunk.abc.com/splunk-index
            tag_name: com.splunk.index

You need to get rid of tag_name for source and sourcetype, and then upsert from k8s.pod.labels.splunk.abc.com/splunk-sourcetype instead of k8s.pod.annotations.splunk.abc.com/splunk-sourcetype (I'm basing this on the previous config you sent).

saiharshitachava commented 1 year ago

The config mentioned above works only with custom labels :)

The advanced config described in this doc breaks if I do this:

https://github.com/signalfx/splunk-otel-collector-chart/blob/main/docs/advanced-configuration.md#managing-log-ingestion-by-using-annotations

The idea is to support the annotations in the doc above and also my custom labels.

omrozowicz-splunk commented 1 year ago

OK, so if I understand this right (finally 😅), you want to use annotations to set index/source/sourcetype, but whenever a pod carries a specific label, overwrite those settings with the label's value?

The only thing you need to do is add more upserts to the config:

    processors:
      resource/logs:
        attributes:
        - action: upsert
          from_attribute: k8s.pod.annotations.splunk.com/sourcetype
          key: com.splunk.sourcetype
        - action: upsert
          from_attribute: k8s.pod.labels.splunk.abc.com/splunk-sourcetype
          key: com.splunk.sourcetype
        - action: upsert
          from_attribute: k8s.pod.annotations.splunk.com/source
          key: com.splunk.source
        - action: upsert
          from_attribute: k8s.pod.labels.splunk.abc.com/splunk-source
          key: com.splunk.source

The last upsert wins, so in the example above the custom label always overwrites the value from the annotation. For namespace annotations, the attribute would be k8s.namespace.annotations.splunk.com/VALUE instead.

omrozowicz-splunk commented 1 year ago

I see you already have this logic in your config file. The only problem is that k8s.pod.labels.splunk.abc.com/sourcetype is lost when you use:

          - from: pod
            key: splunk.abc.com/sourcetype
            tag_name: com.splunk.sourcetype

instead of

          - from: pod
            key: splunk.abc.com/sourcetype

Also, do you need:

          - from: pod
            key: splunk.abc.com/splunk-index
            tag_name: com.splunk.index
          - from: pod
            key: splunk.abc.com/source
          - from: pod
            key: splunk.abc.com/sourcetype

in annotations?

saiharshitachava commented 1 year ago

The use case seems to be working now. Thanks @omrozowicz-splunk for your timely help on this issue. Much appreciated.
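Pulling the suggestions from this thread together, the working setup might look like the values.yaml sketch below. It is an illustration under assumptions, not the reporter's actual file. The label keys follow the reporter's naming (splunk.abc.com/source and splunk.abc.com/sourcetype), the splunk.com/source annotation is assumed to be extracted elsewhere, and a list supplied under agent.config is assumed to replace the chart's default resource/logs attribute list, so any default entries you still need would have to be repeated.

```yaml
extraAttributes:
  fromLabels:
    # The thread reports that mapping the index label directly works.
    - key: splunk.abc.com/splunk-index
      tag_name: com.splunk.index
      from: pod
    # No tag_name for source and sourcetype, so the values stay available
    # as k8s.pod.labels.<key> for the upserts below.
    - key: splunk.abc.com/source
      from: pod
    - key: splunk.abc.com/sourcetype
      from: pod
agent:
  config:
    processors:
      resource/logs:
        attributes:
          # Annotation values are applied first.
          - action: upsert
            from_attribute: k8s.pod.annotations.splunk.com/sourcetype
            key: com.splunk.sourcetype
          - action: upsert
            from_attribute: k8s.pod.annotations.splunk.com/source
            key: com.splunk.source
          # Label values come last, so they win when both are present.
          - action: upsert
            from_attribute: k8s.pod.labels.splunk.abc.com/sourcetype
            key: com.splunk.sourcetype
          - action: upsert
            from_attribute: k8s.pod.labels.splunk.abc.com/source
            key: com.splunk.source
          # Clean up helper attributes, as in the earlier suggestion.
          - action: delete
            key: k8s.pod.annotations.splunk.com/sourcetype
          - action: delete
            key: k8s.pod.labels.splunk.abc.com/sourcetype
          - action: delete
            key: k8s.pod.labels.splunk.abc.com/source
          - action: delete
            key: splunk.com/exclude
```

Swapping the order of the annotation and label upserts would make annotations take precedence instead.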

atoulme commented 1 year ago

Closed as complete. Thanks.