ElementTech opened this issue 1 day ago
Hi @ElementTech! I think this is the case, but do I take it to mean that you tried increasing `batch.max_bytes` and `batch.max_events` but saw no difference? Note that `batch.max_bytes` may not always map to the object size, because the way that event sizes are calculated can differ from the batch's serialized size (see https://github.com/vectordotdev/vector/issues/10020).
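For reference, both of those options live under the sink's `batch` table; a minimal sketch with a placeholder sink name and placeholder values:

```yaml
sinks:
  s3_out:                    # placeholder sink name
    type: aws_s3
    # ...
    batch:
      max_bytes: 104857600   # compared against Vector's internal event-size estimate,
                             # not against the size of the serialized object written to S3
      max_events: 500000
```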
Hey @jszwedko, yes, I've played around with those values, both up and down. Just for the sake of testing, as you can see, I've set all the numbers extremely high, but there is still no difference in behavior.
I should also note that each of those `.json` files has around ~3500 events (single-level dictionaries), but not exactly 3500; it can deviate by a few hundred. I can assume that whatever decides to write the file stops at a certain size rather than at a certain event count. Also, of course, not all events are evenly sized.
I might be wrong, but I'm also using a `disk` buffer instead of `memory` when collecting the events, and even if it were `memory`, should the resulting batch size still be this comparatively small?
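For clarity, the buffer is declared on the sink roughly like this (placeholder sink name and values, not my exact config):

```yaml
sinks:
  s3_out:                     # placeholder sink name
    type: aws_s3
    # ...
    buffer:
      type: disk              # events are buffered on disk rather than in memory
      max_size: 1073741824    # placeholder: 1 GiB
```

As far as I understand, the `buffer` table and the `batch` table are configured independently on the sink.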
Thanks!
Problem
I have Vector installed in Kubernetes on AWS. I am using SQS as a source and S3 as a sink. No matter how high I set the batching and buffer parameters, at maximum event-ingestion load my S3 bucket receives objects in batches of exactly 2.4 MB. When an event spike ends, the remaining events are released in smaller files until the backlog is finished.
Configuration
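A minimal sketch of the kind of setup described above, with placeholder names and values rather than the exact configuration:

```yaml
# Sketch only: the region, queue URL, bucket, and numeric values are placeholders.
sources:
  sqs_in:
    type: aws_sqs
    region: us-east-1
    queue_url: https://sqs.us-east-1.amazonaws.com/123456789012/example-queue

sinks:
  s3_out:
    type: aws_s3
    inputs:
      - sqs_in
    region: us-east-1
    bucket: example-bucket
    encoding:
      codec: json
    batch:
      max_bytes: 104857600    # set very high for testing
      max_events: 500000      # set very high for testing
      timeout_secs: 60        # 60 in dev, 1800 in production
    buffer:
      type: disk
      max_size: 1073741824
```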
Version
0.42.0-distroless-libc
Debug Output
No response
Example Data
No response
Additional Context
I have two environments. The only difference between them is the `batch.timeout_secs` parameter: in my dev environment it is set to 60, and in production to 1800. The exact same issue (2.4 MB files) happens in both.
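In other words, the only delta between the two environments is (placeholder sink name):

```yaml
# dev
sinks:
  s3_out:
    batch:
      timeout_secs: 60

# production
sinks:
  s3_out:
    batch:
      timeout_secs: 1800
```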
References
No response