vectordotdev / vector

A high-performance observability data pipeline.
https://vector.dev
Mozilla Public License 2.0

S3 sink batch size stuck at 2.4 MB sized files #21696

Open ElementTech opened 1 day ago

ElementTech commented 1 day ago

Problem

I have Vector installed in Kubernetes on AWS. I am using SQS as a source and S3 as a sink. No matter how high I set the batching and buffer parameters, at maximum event-ingestion load my S3 bucket receives the data in files of exactly 2.4 MB. When an event spike ends, the remaining events are released in smaller files until the backlog is drained.

[Screenshot 2024-11-04 at 15:35:56 (attached image)]

Configuration

api:
  address: 127.0.0.1:8686
  enabled: true
  playground: true
data_dir: /var/lib/vector
expire_metrics_secs: 60
log_schema:
  timestamp_key: date
sinks:
  dropped:
    acknowledgements:
      enabled: true
    batch:
      max_bytes: 10000000
      max_events: 10000000
      timeout_secs: 60
    bucket: xxx-vector-xxx-s3
    buffer:
      max_size: 1073741824
      type: disk
    compression: none
    encoding:
      codec: json
      timestamp_format: unix_ms
    filename_append_uuid: true
    filename_extension: json
    filename_time_format: '%s'
    framing:
      method: newline_delimited
    healthcheck:
      enabled: false
    inputs:
    - remap_sources.dropped
    key_prefix: vector_dropped/year=%Y/month=%m/day=%d/
    region: us-east-1
    server_side_encryption: AES256
    storage_class: ONEZONE_IA
    type: aws_s3
  prometheus_exporter:
    address: 0.0.0.0:9598
    default_namespace: service
    inputs:
    - vector_metrics
    type: prometheus_exporter
  s3_export:
    acknowledgements:
      enabled: true
    batch:
      max_bytes: 10000000
      max_events: 10000000
      timeout_secs: 60
    bucket: dev-vector-analytics-s3
    buffer:
      max_size: 1073741824
      type: disk
    compression: none
    encoding:
      codec: json
      timestamp_format: unix_ms
    filename_append_uuid: true
    filename_extension: json
    filename_time_format: '%s'
    framing:
      method: newline_delimited
    healthcheck:
      enabled: false
    inputs:
    - route_events.passed_events
    key_prefix: playable_events_valid/year=%Y/month=%m/day=%d/
    region: us-east-1
    server_side_encryption: AES256
    storage_class: STANDARD
    type: aws_s3
  s3_export_failed:
    acknowledgements:
      enabled: true
    batch:
      max_bytes: 10000000
      max_events: 10000000
      timeout_secs: 60
    bucket: xxx-vector-xxx-s3
    buffer:
      max_size: 1073741824
      type: disk
    compression: none
    encoding:
      codec: json
      timestamp_format: unix_ms
    filename_append_uuid: true
    filename_extension: json
    filename_time_format: '%s'
    framing:
      method: newline_delimited
    healthcheck:
      enabled: false
    inputs:
    - route_events.failed_events
    key_prefix: playable_events_failed/year=%Y/month=%m/day=%d/
    region: us-east-1
    server_side_encryption: AES256
    storage_class: STANDARD_IA
    type: aws_s3
sources:
  dlq_data:
    queue_url: https://sqs.us-east-1.amazonaws.com/xxxx/dev-analytics-ingestion-dlq
    region: us-east-1
    type: aws_sqs
  offline_data:
    acknowledgements:
      enabled: true
    framing:
      method: newline_delimited
    region: us-east-1
    sqs:
      queue_url: https://sqs.us-east-1.amazonaws.com/xxxx/dev-vector-s3-offline-source
    type: aws_s3
  realtime_data:
    queue_url: https://sqs.us-east-1.amazonaws.com/xxxx/dev-vector-realtime-source
    region: us-east-1
    type: aws_sqs
  vector_logs:
    type: internal_logs
  vector_metrics:
    type: internal_metrics
transforms:
  add_date:
    inputs:
    - ignore_heartbeat
    source: .date = from_unix_timestamp!(.timestamp, "milliseconds")
    type: remap
  ignore_heartbeat:
    condition:
      source: .event_type != "heartbeat"
      type: vrl
    inputs:
    - remap_sources
    type: filter
  logs:
    condition:
      source: '!includes(["INFO", "DEBUG"], .metadata.level)'
      type: vrl
    inputs:
    - vector_logs
    type: filter
  remap_sources:
    drop_on_abort: true
    drop_on_error: true
    inputs:
    - offline_data
    - realtime_data
    - dlq_data
    reroute_dropped: true
    source: . = parse_json!(.message)
    type: remap
  route_events:
    inputs:
    - add_date
    reroute_unmatched: false
    route:
      failed_events: .status != 200
      passed_events: .status == 200
    type: route

Version

0.42.0-distroless-libc

Debug Output

No response

Example Data

No response

Additional Context

I have two environments, and the only difference between them is the batch.timeout_secs parameter. In my dev environment it is set to 60, and in production to 1800. The exact same issue (2.4 MB sized files) happens in both.
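For reference, a minimal sketch of the only block that differs between the two environments (the 1800 value is the production setting described above; everything else matches the configuration shown earlier):

batch:
  max_bytes: 10000000
  max_events: 10000000
  timeout_secs: 60      # dev
  # timeout_secs: 1800  # production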

References

No response

jszwedko commented 1 day ago

Hi @ElementTech! I think this is the case, but just to confirm: do I take it to mean that you tried increasing batch.max_bytes and batch.max_events but saw no difference?

Note that batch.max_bytes may not always map to the resulting object size, because the way event sizes are calculated can differ from the batch's serialized size (see https://github.com/vectordotdev/vector/issues/10020).
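As a purely illustrative sketch of that point (the ratio below is an assumption for illustration, not a measured value), the observed object size could line up with the configured limit if Vector's size estimate runs a few times larger than the serialized output:

batch:
  # max_bytes is compared against Vector's estimated in-memory event size,
  # which can be larger than the serialized JSON written to S3 (see #10020 above).
  # Illustrative arithmetic, assuming an estimated-to-serialized ratio of roughly 4x:
  #   10,000,000 bytes (estimated) / ~4.2 ≈ 2.4 MB of serialized JSON per object.
  max_bytes: 10000000
  max_events: 10000000
  timeout_secs: 60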

ElementTech commented 1 day ago

Hey @jszwedko, yes, I've played around with those values both up and down. Just for the sake of testing, as you can see, I've set all the numbers extremely high, but there is still no difference in behavior.

I should also note that each of those .json files contains around ~3,500 events (single-level dictionaries), but not exactly 3,500; the count can deviate by a few hundred. I can only assume that whatever decides to flush a batch stops at a certain size rather than at a certain event count. Also, of course, not all events are evenly sized.

I might be wrong, but I'm also using a disk buffer instead of memory when collecting the events, and even if it were a memory buffer, should the resulting batch size still be this comparatively small?

Thanks!