open-telemetry / opentelemetry-collector-contrib

Contrib repository for the OpenTelemetry Collector
https://opentelemetry.io
Apache License 2.0

Add dockerattributesprocessor to enrich docker logs collected with filelog receiver #8982

Closed matthewmodestino closed 1 year ago

matthewmodestino commented 2 years ago

Is your feature request related to a problem? Please describe.

As an OTel-native logging user, I would like to collect docker logs and have them enriched the same way we enrich k8s logs.

Describe the solution you'd like

I would like docker logs to be enriched in the OTel logging pipeline with docker metadata from the docker api, including but not limited to:

  "docker": {
    "id": "df14e0d5ae4c07284fa636d739c8fc2e6b52bc344658de7d3f08c36a2e804115",
    "name": "k8s_fabric8-console-container.efbd6e64_fabric8-console-controller-9knhj_default_8ae2f621-f360-11e4-8d12-54ee7527188d_7ec9aa3e",
    "container_hostname": "fabric8-console-controller-9knhj",
    "image": "fabric8/hawtio-kubernetes:latest",
    "image_id": "b2bd1a24a68356b2f30128e6e28e672c1ef92df0d9ec01ec0c7faea5d77d2303",
    "labels": {}
  }
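
For illustration only, mapped onto the OpenTelemetry container.* resource semantic conventions (my suggested mapping, not something an existing processor defines today), that metadata might land on the resource roughly like:

  "resource": {
    "container.id": "df14e0d5ae4c07284fa636d739c8fc2e6b52bc344658de7d3f08c36a2e804115",
    "container.name": "k8s_fabric8-console-container.efbd6e64_fabric8-console-controller-9knhj_default_8ae2f621-f360-11e4-8d12-54ee7527188d_7ec9aa3e",
    "container.image.name": "fabric8/hawtio-kubernetes",
    "container.image.tag": "latest"
  }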

Describe alternatives you've considered

Today the community relies on fluentd/fluent-bit community plugins, which are not widely maintained, or on docker logging plugins. Users generally prefer to avoid touching the docker engine directly, and logging plugins can adversely impact the docker daemon. Picking the files up from disk provides a reliable buffer and is easy for users to adopt.

Additional context

While k8s is moving away from docker, we still see docker-based deployments a lot in the community, and enriching their logs is a common request from users looking to adopt more OTel log collection.

This solution could also be made flexible or generic enough to collect/enrich from containerd/cri-o sockets; maybe call it "containerattributeprocessor" if it can be made that flexible.

matthewmodestino commented 2 years ago

I may have found a way forward by using the metadata that the docker json-file driver will embed, then just using existing filelog operators/processors to add the docker meta to resource / attributes. Will try it now...

https://docs.docker.com/config/containers/logging/json-file/#options
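
For context, each json-file log line with a tag set looks roughly like this (made-up values, just to show where the tag lands):

{"log":"hello from my app\n","stream":"stdout","attrs":{"tag":"myservice|myorg/myimage:latest|df14e0d5ae4c"},"time":"2022-04-01T12:34:56.789Z"}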

djaglowski commented 2 years ago

Did it work?

matthewmodestino commented 2 years ago

yep!

I used the Docker daemon config (daemon.json) to set the log tag the way I wanted it (pipe-separated for easy regex parsing).
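
The tag itself is set via the json-file driver's tag option; a minimal daemon.json sketch that would produce the pipe-separated name|image|id value my regex below expects might look like this (the exact template fields here are my assumption, not copied from my actual setup):

{
  "log-driver": "json-file",
  "log-opts": {
    "tag": "{{.Name}}|{{.ImageName}}|{{.FullID}}"
  }
}

With that in place, I then did something like this with the filelog receiver: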

receivers:
  filelog:
    include:
    - /var/lib/docker/containers/*/*-json.log
    encoding: utf-8
    fingerprint_size: 1kb
    force_flush_period: "0"
    include_file_name: false
    include_file_path: true
    max_concurrent_files: 1024
    max_log_size: 1MiB
    operators:
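    # Parse each docker json-file line as JSON and read the timestamp from its "time" field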
    - id: parser-docker
      timestamp:
        layout: '%Y-%m-%dT%H:%M:%S.%LZ'
        parse_from: time
      type: json_parser
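    # Record the log file path on the resource as the Splunk source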
    - id: filename
      resource:
        com.splunk.source: EXPR($$attributes["file.path"])
      type: metadata
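    # Split the pipe-separated docker tag (name|image|id) set via the json-file driver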
    - id: extract_metadata_from_docker_tag
      parse_from: $$.attrs.tag
      regex: ^(?P<name>[^\|]+)\|(?P<image_name>[^\|]+)\|(?P<id>[^$]+)$
      type: regex_parser
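    # Promote the parsed docker fields to resource attributes and keep the stdout/stderr stream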
    - attributes:
        log.iostream: EXPR($$.stream)
      resource:
        com.splunk.sourcetype: EXPR("docker:container:"+$$.name)
        docker.container.name: EXPR($$.name)
        docker.image.name: EXPR($$.image_name)
        docker.container.id: EXPR($$.id)
      type: metadata
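    # Replace the log record body with the raw container log line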
    - id: clean-up-log-record
      ops:
      - move:
          from: $$.log
          to: $$
      type: restructure
    poll_interval: 200ms
    start_at: beginning

exporters:
  splunk_hec/logs:
    # Splunk HTTP Event Collector token.
    token: "00000000-0000-0000-0000-000000000000"
    # URL to a Splunk instance to send data to.
    endpoint: "https://foo.splunkit.io:8088/services/collector"
    # Optional Splunk source: https://docs.splunk.com/Splexicon:Source
    source: "docker"
    # Splunk index, optional name of the Splunk index targeted.
    index: "main"
    # Maximum HTTP connections to use simultaneously when sending data. Defaults to 100.
    max_connections: 20
    # Whether to disable gzip compression over HTTP. Defaults to false.
    disable_compression: true 
    # HTTP timeout when sending data. Defaults to 10s.
    timeout: 10s
    # Whether to skip checking the certificate of the HEC endpoint when sending data over HTTPS. Defaults to false.
    # For this demo, we use a self-signed certificate on the Splunk docker instance, so this flag is set to true.
    tls:
      insecure_skip_verify: true
  # Debug
  #logging:
  #  loglevel: debug

processors:
  batch:
  resourcedetection:
    detectors:
    - env
    - system
    timeout: 10s
    override: true

extensions:
  health_check:
    endpoint: 0.0.0.0:13133
  zpages:
    endpoint: :55679

service:
  extensions: 
  - zpages
  - health_check
  pipelines:
    logs:
      receivers:
      - filelog
      processors:
      - batch
      - resourcedetection
      exporters:
      - splunk_hec/logs
      #- logging

and deployed the collector container with docker compose like this:

version: "3"
services:
  otelcollector:
    image: "quay.io/signalfx/splunk-otel-collector:latest"
    user: "root"
    privileged: true
    container_name: otelcollector
    command: ["--config=/etc/otel-collector-config.yml", "--set=service.telemetry.logs.level=debug"]
    volumes:
      - ./otel-collector-config.yml:/etc/otel-collector-config.yml
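      # docker's json-file logs mounted read-only so the filelog receiver can tail them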
      - /var/lib/docker/containers:/var/lib/docker/containers:ro

Keep in mind this was a while back and there have been some breaking changes to the filelog operator syntax since, so my use of $$ and the EXPR syntax needs slight updates accordingly (see the rough sketch after the changelog link below)...

https://github.com/open-telemetry/opentelemetry-log-collection/blob/v0.29.0/CHANGELOG.md#upgrading-to-v0290
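
For anyone applying this today, here is a rough sketch of how the operators section might translate to the newer field syntax (untested on my side and trimmed to the docker metadata handling; attributes.attrs.tag assumes the same daemon tag setup as above):

operators:
- id: parser-docker
  type: json_parser
  timestamp:
    parse_from: attributes.time
    layout: '%Y-%m-%dT%H:%M:%S.%LZ'
- id: extract_metadata_from_docker_tag
  type: regex_parser
  parse_from: attributes.attrs.tag
  regex: ^(?P<name>[^\|]+)\|(?P<image_name>[^\|]+)\|(?P<id>[^$]+)$
- type: move
  from: attributes.name
  to: resource["docker.container.name"]
- type: move
  from: attributes.image_name
  to: resource["docker.image.name"]
- type: move
  from: attributes.id
  to: resource["docker.container.id"]
- type: move
  from: attributes.stream
  to: attributes["log.iostream"]
- type: move
  from: attributes.log
  to: body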

github-actions[bot] commented 1 year ago

This issue has been inactive for 60 days. It will be closed in 60 days if there is no activity. To ping code owners by adding a component label, see Adding Labels via Comments, or if you are unsure of which component this issue relates to, please ping @open-telemetry/collector-contrib-triagers. If this issue is still relevant, please ping the code owners or leave a comment explaining why it is still relevant. Otherwise, please close it.

fatsheep9146 commented 1 year ago

I think the problem is solved. The issue could be closed.

matthewmodestino commented 1 year ago

Technically it's not solved; I had to use a very cumbersome workaround to get the metadata, which included customizing the docker daemon config and is not very user friendly...

Still think we should be able to enrich straight docker logs like we do in k8s

I understand docker is not long for this world, but we still see straight docker environments and likely will for some time...