Closed matthewmodestino closed 1 year ago
I may have found a way forward by using the metadata that the docker json-file driver will embed, then just using existing filelog operators/processors to add the docker meta to resource/attributes. Will try it now...
https://docs.docker.com/config/containers/logging/json-file/#options
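For context, it's the json-file driver's `tag` log-opt (documented at the link above) that embeds the extra metadata. A sketch of what that might look like in `/etc/docker/daemon.json`, producing a pipe-separated tag; the exact template fields here are an assumption, not taken from this thread:

```json
{
  "log-driver": "json-file",
  "log-opts": {
    "tag": "{{.Name}}|{{.ImageName}}|{{.ID}}"
  }
}
```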
Did it work?
yep!
I used the docker.json to set the tags the way I wanted them (pipe separated for easy regex), then did something like this:
```yaml
receivers:
  filelog:
    include:
      - /var/lib/docker/containers/*/*-json.log
    encoding: utf-8
    fingerprint_size: 1kb
    force_flush_period: "0"
    include_file_name: false
    include_file_path: true
    max_concurrent_files: 1024
    max_log_size: 1MiB
    operators:
      - id: parser-docker
        timestamp:
          layout: '%Y-%m-%dT%H:%M:%S.%LZ'
          parse_from: time
        type: json_parser
      - id: filename
        resource:
          com.splunk.source: EXPR($$attributes["file.path"])
        type: metadata
      - id: extract_metadata_from_docker_tag
        parse_from: $$.attrs.tag
        regex: ^(?P<name>[^\|]+)\|(?P<image_name>[^\|]+)\|(?P<id>[^$]+)$
        type: regex_parser
      - attributes:
          log.iostream: EXPR($$.stream)
        resource:
          com.splunk.sourcetype: EXPR("docker:container:"+$$.name)
          docker.container.name: EXPR($$.name)
          docker.image.name: EXPR($$.image_name)
          docker.container.id: EXPR($$.id)
        type: metadata
      - id: clean-up-log-record
        ops:
          - move:
              from: $$.log
              to: $$
        type: restructure
    poll_interval: 200ms
    start_at: beginning

exporters:
  splunk_hec/logs:
    # Splunk HTTP Event Collector token.
    token: "00000000-0000-0000-0000-000000000000"
    # URL to a Splunk instance to send data to.
    endpoint: "https://foo.splunkit.io:8088/services/collector"
    # Optional Splunk source: https://docs.splunk.com/Splexicon:Source
    source: "docker"
    # Splunk index, optional name of the Splunk index targeted.
    index: "main"
    # Maximum HTTP connections to use simultaneously when sending data. Defaults to 100.
    max_connections: 20
    # Whether to disable gzip compression over HTTP. Defaults to false.
    disable_compression: true
    # HTTP timeout when sending data. Defaults to 10s.
    timeout: 10s
    # Whether to skip checking the certificate of the HEC endpoint when sending data over HTTPS. Defaults to false.
    # For this demo, we use a self-signed certificate on the Splunk docker instance, so this flag is set to true.
    tls:
      insecure_skip_verify: true
  # Debug
  #logging:
  #  loglevel: debug

processors:
  batch:
  resourcedetection:
    detectors:
      - env
      - system
    timeout: 10s
    override: true

extensions:
  health_check:
    endpoint: 0.0.0.0:13133
  zpages:
    endpoint: :55679

service:
  extensions:
    - zpages
    - health_check
  pipelines:
    logs:
      receivers:
        - filelog
      processors:
        - batch
        - resourcedetection
      exporters:
        - splunk_hec/logs
        #- logging
```
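To make the operator chain above concrete, here is a small Python sketch (my own illustration, not part of the collector) of what the `json_parser`, `regex_parser`, and `restructure` steps each do to a single json-file line. The sample log line and tag value are hypothetical:

```python
import json
import re

# A hypothetical line as Docker's json-file driver writes it when a
# pipe-separated "tag" log-opt is configured.
raw = ('{"log":"GET / 200\\n","stream":"stdout",'
       '"time":"2023-01-01T00:00:00.000Z",'
       '"attrs":{"tag":"web|nginx:latest|abc123"}}')

# json_parser: each line is one JSON object
record = json.loads(raw)

# regex_parser: split the tag into name / image_name / id
m = re.match(r'^(?P<name>[^|]+)\|(?P<image_name>[^|]+)\|(?P<id>[^$]+)$',
             record["attrs"]["tag"])
resource = {
    "docker.container.name": m.group("name"),
    "docker.image.name": m.group("image_name"),
    "docker.container.id": m.group("id"),
    "com.splunk.sourcetype": "docker:container:" + m.group("name"),
}

# restructure: move $$.log up to become the log record body
body = record.pop("log")
print(resource["docker.container.name"], body.strip())
```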
and deployed the docker image like this:
```yaml
version: "3"
services:
  otelcollector:
    image: "quay.io/signalfx/splunk-otel-collector:latest"
    user: "root"
    privileged: true
    container_name: otelcollector
    command: ["--config=/etc/otel-collector-config.yml", "--set=service.telemetry.logs.level=debug"]
    volumes:
      - ./otel-collector-config.yml:/etc/otel-collector-config.yml
      - /var/lib/docker/containers:/var/lib/docker/containers:ro
```
Keep in mind this was a while back and there have been some breaking changes in the filelog operator syntax since, so my use of `$$` and the `EXPR` syntax needs slight updates accordingly...
I think the problem is solved. The issue could be closed.
Technically it's not solved. I had to use a very cumbersome workaround to get the metadata, which included customizing the docker daemon and is not very user friendly...

Still think we should be able to enrich straight docker logs like we do in k8s. I understand docker is not long for this world, but we still see straight docker environments and likely will for some time...
**Is your feature request related to a problem? Please describe.**
As an OTel native logging user, I would like to collect docker logs and have them enriched the same way we do for k8s logs
**Describe the solution you'd like**
I would like docker logs to be enriched in the OTel logging pipeline with docker metadata from the docker api, including but not limited to the container name, image name, and container id.
**Describe alternatives you've considered**
Today the community relies on fluentd/fluent-bit community plugins, which are not widely maintained, or docker logging plugins. Users generally like to avoid needing to touch the docker engine directly, and logging plugins can cause adverse impact to the docker daemon. Picking up the files from disk provides a reliable buffer and is easy for users to adopt.
**Additional context**
While k8s is moving away from docker, we still see docker-based deployments a lot in the community, and this is a common request from users adopting OTel log collection.
This solution could also be made flexible or generic enough to collect/enrich from containerd/cri-o sockets as well; maybe call it a "containerattributeprocessor" if it can be made that flexible.
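As a rough illustration of what such a processor could do, this hypothetical helper maps one container entry (field names as returned by the Docker Engine API's `/containers/json` endpoint) to resource attributes matching those used in the workaround above. The function name, the mapping, and the sample values are my own sketch:

```python
# Hypothetical sketch: turn one Docker Engine API container entry into
# OTel-style resource attributes, as a "containerattributeprocessor" might.
def docker_meta_to_resource(container: dict) -> dict:
    return {
        "docker.container.id": container["Id"],
        # The API prefixes container names with "/", e.g. "/web"
        "docker.container.name": container["Names"][0].lstrip("/"),
        "docker.image.name": container["Image"],
    }

# Sample entry shaped like a /containers/json response item (values made up)
sample = {"Id": "abc123", "Names": ["/web"], "Image": "nginx:latest"}
print(docker_meta_to_resource(sample))
```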