open-telemetry / opentelemetry-collector-contrib

Contrib repository for the OpenTelemetry Collector
https://opentelemetry.io
Apache License 2.0
2.9k stars 2.27k forks

Rotation causing null characters at the beginning of most files #34832

Open zekai-rai opened 3 weeks ago

zekai-rai commented 3 weeks ago

Component(s)

exporter/file, receiver/otlpjsonfile

What happened?

Description

We have a multi-part system that requires us to export telemetry to the file system, where it is picked up later by another pipeline. One pipeline uses the file exporter to export logs, and another pipeline uses the filelog receiver to read them.

(This description removes some irrelevant setup and might not truly reflect what we are doing.)

We noticed that with rotation enabled in the file exporter, some hidden null characters (\u0000) are added to the beginning of the file. We are unsure whether the issue lies in the receiver or the exporter.

On a related note, we observed similar behavior in another pipeline where logs are rotated by logrotate, outside of the OTel Collector pipeline. In that case, we understand that the issue was caused by our process actively writing to the file while logrotate was truncating it. However, we do not understand why the same thing happens with the filelog receiver and file exporter setup.
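For reference, the logrotate case described above can be reproduced in isolation. This is only a minimal sketch, assuming a writer that keeps its file handle open without append mode on a POSIX file system. The function name `simulate_copytruncate` is made up for illustration, and this is not a claim about how the file exporter itself rotates files.

```python
import os
import tempfile


def simulate_copytruncate(first: bytes, second: bytes) -> bytes:
    """Write, truncate the file externally, write again, return the file content."""
    fd, path = tempfile.mkstemp()
    os.close(fd)
    try:
        # Unbuffered, non-append writer: it tracks its own file offset.
        with open(path, "wb", buffering=0) as writer:
            writer.write(first)    # writer offset is now len(first)
            os.truncate(path, 0)   # another party truncates the file to 0 bytes
            writer.write(second)   # lands at the old offset; the gap reads back as NULs
        with open(path, "rb") as reader:
            return reader.read()
    finally:
        os.remove(path)


data = simulate_copytruncate(b"old line\n", b'{"content":"new"}\n')
print(data)
```

The second write is preceded by as many null bytes as the writer had written before the truncation, which matches the symptom we see at the start of the file.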

Steps to Reproduce

Pipeline 1: configure the file exporter to export with rotation. Pipeline 2 (run elsewhere): configure the filelog receiver to read from the rotated files. Check the output of pipeline 2.

Expected Result

Normal logs as being originally produced.

Actual Result

\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000{"content":"my valid json. just for demo, not real structure."} {"content":"my valid json. just for demo, not real structure."} {"content":"my valid json. just for demo, not real structure."} {"content":"my valid json. just for demo, not real structure."}
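To check an affected line by hand, a small helper like the one below can count the leading null characters and confirm that the remainder is still valid JSON. It is only a sketch: `split_leading_nuls` is a hypothetical name, and the sample string mirrors the demo content above rather than our real log structure.

```python
import json


def split_leading_nuls(line: str) -> tuple[int, str]:
    """Return (number of leading NUL characters, the line without them)."""
    stripped = line.lstrip("\x00")
    return len(line) - len(stripped), stripped


sample = "\x00" * 11 + '{"content":"my valid json. just for demo, not real structure."}'
nul_count, payload = split_leading_nuls(sample)
record = json.loads(payload)  # parses once the NUL prefix is removed
print(nul_count, record["content"])
```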

Collector version

0.98.0

Environment information

Environment

OS: (e.g., "Ubuntu 20.04") Compiler(if manually compiled): (e.g., "go 14.2")

OpenTelemetry Collector configuration

Pipeline 1:

exporters:
  file:
    path: /mount/shared-stage/clientlogs.json
    rotation:
      max_megabytes: 1
      max_backups: 10000
      max_days: 30
      localtime: true

Pipeline 2:

receivers:
  filelog/unsafe: # reading the unsafe logs from stdout/stderr
    include: [/mount/shared-stage/clientlogs.json]
    # delete_after_read: true
    # start_at: beginning
    operators:
      - type: json_parser
        parse_to: body
      - type: time_parser
        parse_from: body.timestamp
        layout: '%Y-%m-%dT%H:%M:%S.%L'
Log output

No response

Additional context

No response

github-actions[bot] commented 3 weeks ago

Pinging code owners:

zekai-rai commented 3 weeks ago

/label receiver/filelog help-wanted -receiver/otlpjsonfile

github-actions[bot] commented 3 weeks ago

Pinging code owners for receiver/filelog: @djaglowski. See Adding Labels via Comments if you do not have permissions to add labels yourself.