grafana / loki

Like Prometheus, but for logs.
https://grafana.com/loki
GNU Affero General Public License v3.0

Too many small chunk files #7851

Open vsxen opened 1 year ago

vsxen commented 1 year ago

Using the complete-local-config.yaml config file, there are too many small files in the /tmp/loki/chunks/ZmFrZS.xxxxx directory. Can I set a default chunk file size?

auth_enabled: false

server:
  http_listen_port: 3100

ingester:
  lifecycler:
    address: 127.0.0.1
    ring:
      kvstore:
        store: inmemory
      replication_factor: 1
    final_sleep: 0s
  chunk_idle_period: 5m
  chunk_retain_period: 30s

schema_config:
  configs:
  - from: 2020-05-15
    store: boltdb
    object_store: filesystem
    schema: v11
    index:
      prefix: index_
      period: 168h

storage_config:
  boltdb:
    directory: /tmp/loki/index

  filesystem:
    directory: /tmp/loki/chunks

limits_config:
  enforce_metric_name: false
  reject_old_samples: true
  reject_old_samples_max_age: 168h

MichelHollands commented 1 year ago

The chunk_idle_period setting controls how long log chunks are kept in the ingester's memory before being flushed to disk. The default value is 30 minutes; 1 hour can work as well. Additionally, chunk_target_size can be increased from its default value.
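
For illustration, a minimal sketch of those ingester settings (the values here are examples, not recommendations; defaults are noted in the comments):

ingester:
  chunk_idle_period: 1h        # flush a chunk if its stream receives no logs for this long (default 30m)
  max_chunk_age: 2h            # force a flush once a chunk reaches this age, regardless of size
  chunk_target_size: 3145728   # target compressed chunk size in bytes, here ~3MB (default 1572864, i.e. 1.5MB)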

vsxen commented 1 year ago

I've tried the chunk_idle_period parameter before; it doesn't seem to work.

The /tmp/loki/chunks/ZmFrZS.xxxxx files are all 4K, even though the chunk_target_size parameter defaults to 1.5MB.
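
For reference, one quick way to check the actual size distribution on disk (a sketch using GNU find; note that du rounds up to the filesystem block size, typically 4 KiB, so byte-accurate sizes are more telling):

# count chunk files grouped by exact size in bytes
find /tmp/loki/chunks -type f -printf '%s\n' | sort -n | uniq -c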

nvollmar commented 1 year ago

I have a similar issue, though most of my chunk files are between 700 and 900 bytes.

ingester:
  max_chunk_age: 2h
  chunk_idle_period: 2h
  chunk_retain_period: 30s
  chunk_target_size: 1572864
  flush_check_period: 1m
  lifecycler:
    ring:
      kvstore:
        store: inmemory
      replication_factor: 1
  wal:
    dir: /data/loki/wal

Update: In my case those small files were caused by too many streams (I had created labels with many distinct values, which caused this).
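
For anyone else hitting this, one way to check for high-cardinality labels is logcli's series analysis (assuming a reasonably recent logcli; the matcher here is only an example):

# summarize how many unique values each label has across matching streams
logcli series '{job="myapp"}' --analyze-labels

Each stream (unique label combination) gets its own chunks, so a label with many distinct values multiplies the number of streams and keeps each chunk small.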

barddes commented 10 months ago

I'm experiencing a similar issue, but I'm still struggling to understand why it's happening. My goal is to have chunks of around 1.5MB. However, when I check the MinIO console, I can see that some files are uploaded at around 1KB to 100KB, and many others are under 1KB.

Here's my current configuration:

ingester:
  lifecycler: {...}
  concurrent_flushes: 32
  flush_check_period: 30s
  flush_op_timeout: 10m0s
  chunk_retain_period: 6m0s
  chunk_idle_period: 2h0m0s
  chunk_block_size: 536870912    # 512MB
  chunk_target_size: 1572864     # 1.5MB
  chunk_encoding: snappy
  max_chunk_age: 3h0m0s
  autoforget_unhealthy: false
  sync_period: 0s
  sync_min_utilization: 0
  max_returned_stream_errors: 10
  query_store_max_look_back_period: 0s
  wal: {...}
  index_shards: 32
  max_dropped_streams: 10
  shutdown_marker_path: /var/loki

What I can't understand is that when I change chunk_target_size from 1.5MB to 15MB, the chunk sizes grow from a few bytes to approximately 1.5MB, sometimes less (~800KB), sometimes more (~2MB), and a few chunks are still uploaded with only a few bytes.

I suspect the root cause might be related to my labels, as they are not all static. But with this configuration, shouldn't each stream be allowed to grow until it reaches around 1.5MB (compressed) or the 512MB block size before being flushed? Is there a way to log why my streams are being flushed?
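
On the last question: the ingester exposes a counter of flushed chunks broken down by flush reason (if I remember the metric correctly, loki_ingester_chunks_flushed_total with a reason label such as full, idle, or max_age). A query like the following against Loki's own metrics should show what is triggering the flushes:

# flush rate per reason over the last 5 minutes
sum by (reason) (rate(loki_ingester_chunks_flushed_total[5m]))

If most flushes show up as idle or max_age rather than full, the streams are too sparse to ever reach chunk_target_size, which again points at label cardinality.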