open-telemetry / opentelemetry-collector

OpenTelemetry Collector
https://opentelemetry.io
Apache License 2.0

Batching settings not respected #11487

Open nishaprabhakar opened 1 month ago

nishaprabhakar commented 1 month ago

Describe the bug

I have deployed opentelemetry-collector with the following settings:

batch/host:
     send_batch_max_size: 4096
     send_batch_size: 2048
     timeout: 15s

This is the official description of these parameters in the documentation here:

However, looking at the output files from the pipeline that uses this processor, I'm seeing a few anomalous things:

Steps to reproduce

Create a high-volume logs pipeline that passes through a batch processor configured with the settings above.

What did you expect to see?

  1. All files should have <= 4096 records since that is the maximum size of batches.
  2. Most files should be much larger than 80 records/file because of the high values of timeout and send_batch_size.
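
To make these expectations concrete, here is a minimal Python sketch of the batching behaviour as I understand it from the documentation. It is a simplified model, not the collector's actual Go implementation: the class name `BatchModel` and its methods are hypothetical, items are represented only by their count, and the timeout is modelled as an explicit `on_timeout()` call rather than a real timer.

```python
class BatchModel:
    """Simplified, hypothetical model of size-based batching (not collector code)."""

    def __init__(self, send_batch_size, send_batch_max_size):
        self.send_batch_size = send_batch_size          # size that triggers a send
        self.send_batch_max_size = send_batch_max_size  # upper bound per batch; 0 = no bound
        self.pending = 0                                # items buffered so far

    def _take(self):
        # Remove one batch from the buffer, capped at send_batch_max_size if set.
        limit = self.send_batch_max_size
        size = min(self.pending, limit) if limit > 0 else self.pending
        self.pending -= size
        return size

    def add(self, n):
        # Buffer n incoming items and emit batches while the trigger size is reached.
        self.pending += n
        emitted = []
        while self.pending > 0 and self.pending >= self.send_batch_size:
            emitted.append(self._take())
        return emitted

    def on_timeout(self):
        # Assumed timeout behaviour: flush whatever is buffered, still capped.
        return [self._take()] if self.pending > 0 else []


model = BatchModel(send_batch_size=2048, send_batch_max_size=4096)
print(model.add(80))      # [] because 80 is below send_batch_size
print(model.add(10000))   # [4096, 4096] because each batch is capped at the maximum
print(model.on_timeout()) # [1888], the remainder flushed when the timeout fires
```

Under this model no emitted batch exceeds 4096 items, and a batch of only 80 items is emitted only if nothing else arrives within the 15s timeout.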

What did you see instead?

  1. A small number of extremely large files, each with > 4096 records.
  2. Most files contain only 80 records, far fewer than send_batch_size.
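
The per-file record counts can be checked with a short script along these lines. This is a sketch under two assumptions that are not stated in the report: that the exported files contain OTLP JSON, and that each line of a file holds one complete JSON document. The function name `count_log_records` is hypothetical.

```python
import json


def count_log_records(otlp_json_text):
    """Count log records in OTLP JSON text, assuming one JSON document per line."""
    total = 0
    for line in otlp_json_text.splitlines():
        line = line.strip()
        if not line:
            continue  # skip blank lines
        doc = json.loads(line)
        # OTLP logs nest as resourceLogs -> scopeLogs -> logRecords.
        for resource_logs in doc.get("resourceLogs", []):
            for scope_logs in resource_logs.get("scopeLogs", []):
                total += len(scope_logs.get("logRecords", []))
    return total
```

Reading each downloaded file and passing its text to `count_log_records` gives the number of records per file, which can then be compared against send_batch_size and send_batch_max_size.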

What version did you use?

v0.104.0

What config did you use?

apiVersion: v1
data:
  relay: |
    exporters:
      awss3/host:
        s3uploader:
          region: eu-west-1
          s3_bucket: [BUCKET NAME]
          s3_partition: minute
          s3_prefix: host-logs
      debug: {}
      logging:
        loglevel: debug
    extensions:
      health_check:
        endpoint: ${env:MY_POD_IP}:13133
    processors:
      batch/host:
        send_batch_max_size: 4096
        send_batch_size: 2048
        timeout: 15s
      memory_limiter:
        check_interval: 5s
        limit_percentage: 80
        spike_limit_percentage: 25
    receivers:
      filelog/host:
        force_flush_period: 0
        include:
        - /var/log/messages
        - /var/log/dmesg
        - /var/log/audit/audit.log
        include_file_name: false
        include_file_path: true
        retry_on_failure:
          enabled: true
        start_at: beginning
    service:
      extensions:
      - health_check
      pipelines:
        logs/host:
          exporters:
          - awss3/host
          processors:
          - batch/host
          - memory_limiter
          receivers:
          - filelog/host
kind: ConfigMap
metadata:
  labels:
    app.kubernetes.io/instance: opentelemetry-collector
    app.kubernetes.io/managed-by: Helm
    app.kubernetes.io/name: opentelemetry-collector
    app.kubernetes.io/version: 0.101.0
    helm.sh/chart: opentelemetry-collector-0.92.0
  name: opentelemetry-collector-agent
  namespace: opentelemetry

VihasMakwana commented 1 month ago

Thanks for reporting this. I'll check this out and get back to you.

nishaprabhakar commented 6 days ago

Thanks so much! Any updates here? @VihasMakwana