open-telemetry / opentelemetry-collector-contrib

Contrib repository for the OpenTelemetry Collector
https://opentelemetry.io
Apache License 2.0

Compaction on rebound occurring before needed threshold is reached #14701

Closed jeremyaouad closed 1 year ago

jeremyaouad commented 2 years ago

Description

My application emits metrics and logs to a collector, which forwards them to a visualizer. To test compaction on rebound, I suspend the visualizer with kill -STOP for 10 minutes and then resume it with kill -CONT. I expect compaction to happen only after the visualizer comes back online, not before. I configured compaction as follows:

extensions:
  file_storage:
    directory: ${WORK_DIR}
    compaction:
      on_start: true
      on_rebound: true
      directory: ${WORK_DIR}
      max_transaction_size: 0
      rebound_needed_threshold_mib: 4000
      rebound_trigger_threshold_mib: 500
      check_interval: 60s
service:
  extensions: [file_storage]
  pipelines:
    metrics:
      receivers: [otlp]
      processors: [batch/metrics]
      exporters: [otlp/mocked_metrics_viz] # kill -STOP for 10 min, then kill -CONT
    logs:
      receivers: [otlp]
      processors: []
      exporters: [otlp/mocked_logs_viz] # kill -STOP for 10 min, then kill -CONT
    metrics/splunk:
      receivers: [otlp]
      processors: [batch/metrics]
      exporters: [splunk_hec/metrics] # Never stopped, so that we can get the below graphs
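
For reference, the behavior expected from the two thresholds above can be sketched as a small state machine. This is a minimal model of the expectation described in this report, not the file_storage extension's actual code. The type `reboundState`, the method `check`, the sample sizes, and the strict comparisons are all hypothetical and chosen for illustration.

```go
package main

import "fmt"

const mib = 1024 * 1024

// reboundState is a hypothetical model of the "compaction needed" flag.
// It is NOT taken from the file_storage extension source.
type reboundState struct {
	neededThreshold  int64 // bytes, models rebound_needed_threshold_mib
	triggerThreshold int64 // bytes, models rebound_trigger_threshold_mib
	needed           bool  // set once the allocated size has exceeded neededThreshold
}

// check models one check_interval tick. It reports whether compaction
// would be expected to start on this tick.
func (s *reboundState) check(allocated, used int64) bool {
	if allocated > s.neededThreshold {
		s.needed = true
	}
	if s.needed && used < s.triggerThreshold {
		s.needed = false
		return true
	}
	return false
}

func main() {
	// Thresholds from the configuration in this report.
	s := &reboundState{neededThreshold: 4000 * mib, triggerThreshold: 500 * mib}

	// Made-up samples (allocated MiB, used MiB), one per check interval.
	samples := [][2]int64{
		{100, 100},   // normal operation
		{2000, 2000}, // backlog growing while the exporter is stalled
		{4100, 4100}, // backlog exceeds the needed threshold
		{4100, 3000}, // exporter resumed, queue draining
		{4100, 400},  // used data dropped below the trigger threshold
		{400, 400},   // after compaction
	}
	for i, m := range samples {
		compact := s.check(m[0]*mib, m[1]*mib)
		fmt.Printf("tick %d: allocated=%d MiB used=%d MiB compact=%v\n", i, m[0], m[1], compact)
	}
}
```

Under this model, no compaction is expected on any tick before the allocated size has exceeded 4000 MiB, which is the expectation the test in this report is checking.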

Since bbolt memory-maps its database file, I am monitoring the collector process's memory and CPU usage.
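
The report does not say how memory was sampled. One possible approach on Linux, sketched here as an assumption, is to read the `VmRSS` line from `/proc/<pid>/status`, since resident pages of a memory-mapped file count toward the resident set size. The helper name `parseVmRSS` is hypothetical, and the example reads its own process rather than the collector's.

```go
package main

import (
	"bufio"
	"fmt"
	"os"
	"strconv"
	"strings"
)

// parseVmRSS extracts the VmRSS value, in KiB, from the text of a
// Linux /proc/<pid>/status file. Hypothetical helper for illustration.
func parseVmRSS(status string) (int64, error) {
	sc := bufio.NewScanner(strings.NewReader(status))
	for sc.Scan() {
		// A matching line looks like: "VmRSS:     4096 kB"
		fields := strings.Fields(sc.Text())
		if len(fields) >= 2 && fields[0] == "VmRSS:" {
			return strconv.ParseInt(fields[1], 10, 64)
		}
	}
	if err := sc.Err(); err != nil {
		return 0, err
	}
	return 0, fmt.Errorf("VmRSS line not found")
}

func main() {
	// Reads this process's own status; substitute the collector's PID
	// in the path to sample the collector instead.
	data, err := os.ReadFile("/proc/self/status")
	if err != nil {
		fmt.Println("cannot read status file:", err)
		return
	}
	kib, err := parseVmRSS(string(data))
	if err != nil {
		fmt.Println("cannot parse status file:", err)
		return
	}
	fmt.Printf("resident set size: %d KiB\n", kib)
}
```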

Actual Result

Compaction on rebound: [image: compaction_on_rebound]

Same scenario without compaction: [image: no_compaction] Memory rises above 4 GB after the collector resumes sending to the resumed visualizers. This is where I expected compaction on rebound to happen.

Collector version

0.54.0

Environment information

Environment

OS: Red Hat Enterprise Linux release 8.5 (Ootpa)
Compiled on: rhel8 - gcc8 cpp17

OpenTelemetry Collector configuration

No response

Log output

No response

Additional context

No response

github-actions[bot] commented 2 years ago

Pinging code owners: @djaglowski. See Adding Labels via Comments if you do not have permissions to add labels yourself.

github-actions[bot] commented 1 year ago

This issue has been inactive for 60 days. It will be closed in 60 days if there is no activity. To ping code owners by adding a component label, see Adding Labels via Comments, or if you are unsure of which component this issue relates to, please ping @open-telemetry/collector-contrib-triagers. If this issue is still relevant, please ping the code owners or leave a comment explaining why it is still relevant. Otherwise, please close it.

github-actions[bot] commented 1 year ago

This issue has been closed as inactive because it has been stale for 120 days with no activity.