Open bobelev opened 3 months ago
I understood from the docs that Vector should wait for free space in the buffer and only then try to read new events.
This should be true. It seems possible that it is librdkafka that is fetching and buffering events before Vector sees them.
Thanks for this report though. It does seem like we may want to use different librdkafka defaults.
I can confirm this behavior. We have a fairly big cluster and I am trying to use Vector instead of Logstash. With a lag of 200-300 million events across two topics, 8 GB of RAM is not enough and Vector crashes with an OOM. With `queued.min.messages = 10000` we keep RAM usage under 2 GB.
I can also confirm this behavior, and we have a use case quite similar to @psychonaut's. Our input lag was ~100 million when Vector was OOMKilled by Kubernetes. We haven't tried the settings mentioned here; when we run our next load tests we will try the new configuration.
Problem
When Vector reads a topic from the beginning, it consumes a lot of memory.
Part of the problem is librdkafka's aggressive defaults: it simply tries to prefetch as much data as possible.
This can be mitigated by tuning librdkafka's consumer prefetch settings (for example `queued.min.messages`).
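As a point of reference, a minimal sketch of how that librdkafka override can be passed through a Vector kafka source via `librdkafka_options` (the source name, brokers, group, and topic below are placeholders; `queued.min.messages = 10000` is the value reported to help elsewhere in this thread):

```toml
# Hypothetical source name and connection details.
[sources.kafka_in]
type              = "kafka"
bootstrap_servers = "kafka-1:9092"
group_id          = "vector-consumer"
topics            = ["logs"]

# Cap librdkafka's internal prefetch queue; the librdkafka default
# (100000 messages) is what balloons memory on a large consumer lag.
[sources.kafka_in.librdkafka_options]
"queued.min.messages" = "10000"
```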
Another strange thing is that even with
`acknowledgements.enabled = true`
and a disk buffer configured to use only 256 MB and to block events, the kafka source still continues to read messages. I understood from the docs that Vector should wait for free space in the buffer and only then try to read new events.

Heaptrack dump: heaptrack.vector.2818638.zst.zip
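For context, the setup described above looks roughly like this (a sketch with a placeholder sink name; the exact byte value for the 256 MB buffer is an assumption, not copied from the reporter's config):

```toml
# End-to-end acknowledgements, so the kafka source should only commit
# offsets once events have been durably buffered or delivered.
[acknowledgements]
enabled = true

# Hypothetical sink; only the buffer settings matter here.
[sinks.out.buffer]
type      = "disk"
max_size  = 268435488   # ~256 MB
when_full = "block"     # apply back-pressure instead of dropping events
```

With `when_full = "block"`, the expectation is that a full buffer stalls the source; the report above is that the kafka source keeps fetching regardless, presumably because librdkafka prefetches into its own queues before Vector sees the events.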
If this behaviour is normal, then maybe Vector should provide saner defaults for small deployments (especially in Kubernetes with strict limits).
A bit off topic: I couldn't get Vector to run under Valgrind, neither the release version nor the latest master.
Configuration
No response
Version
vector 0.39.0 (x86_64-unknown-linux-gnu 73da9bb 2024-06-17 16:00:23.791735272)
Debug Output
No response
Example Data
No response
Additional Context
No response
References
https://github.com/vectordotdev/vector/issues/20553