vectordotdev / vector

A high-performance observability data pipeline.
https://vector.dev
Mozilla Public License 2.0

kafka source memory usage on topic with large lag #20939

Open bobelev opened 3 months ago

bobelev commented 3 months ago


Problem

When Vector reads a topic from the beginning, it consumes a lot of memory.

Part of the problem is librdkafka's defaults, which are not very sane: it simply tries to prefetch as much data as possible.

This can be mitigated with:

```yaml
librdkafka_options:
  queued.min.messages: "100"
  fetch.message.max.bytes: "52400"
```
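For context, this is roughly how that looks in a full source definition (the broker, topic, and consumer-group names below are placeholders, not my actual setup):

```yaml
sources:
  kafka_in:
    type: kafka
    bootstrap_servers: "kafka-01:9092"   # placeholder broker
    group_id: "vector-demo"              # placeholder consumer group
    topics: ["some-topic"]               # placeholder topic
    librdkafka_options:
      queued.min.messages: "100"         # librdkafka default is 100000 per topic+partition
      fetch.message.max.bytes: "52400"   # librdkafka default is 1048576
```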

Another strange thing is that even with acknowledgements.enabled = true and a disk buffer configured to use only 256 MB and to block when full, the kafka source still keeps reading messages. I understood from the docs that Vector should wait for empty space in the buffer and only then try to read new events.
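The sink side, sketched with a placeholder sink; the buffer and acknowledgement settings are the ones described above:

```yaml
sinks:
  out:
    type: console            # placeholder sink type
    inputs: ["kafka_in"]
    encoding:
      codec: json
    acknowledgements:
      enabled: true          # end-to-end acks back to the kafka source
    buffer:
      type: disk
      max_size: 268435488    # ~256 MB
      when_full: block       # should backpressure the source once the buffer is full
```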

Heaptrack dump heaptrack.vector.2818638.zst.zip

If this behaviour is normal, then maybe Vector should provide sane defaults for small deployments (especially in k8s with strict limits).

A bit off topic: I couldn't get Vector to run under valgrind, neither for the repo version nor for the latest master.

Ubuntu 22.04.4 LTS
Linux demo-01a 6.5.0-1020-aws #20~22.04.1-Ubuntu SMP Wed May 1 16:10:50 UTC 2024 x86_64 x86_64 x86_64 GNU/Linux

```
==2820656== Memcheck, a memory error detector
==2820656== Copyright (C) 2002-2024, and GNU GPL'd, by Julian Seward et al.
==2820656== Using Valgrind-3.23.0 and LibVEX; rerun with -h for copyright info
==2820656== Command: vector -C /root/vector/one/
==2820656==
==2820656== Invalid read of size 4
==2820656==    at 0x4CF4D14: _rjem_je_tcache_bin_flush_small (in /usr/bin/vector)
==2820656==    by 0x4CB4212: _rjem_je_sdallocx_default (in /usr/bin/vector)
==2820656==    by 0x3EC638F: core::ptr::drop_in_place> (in /usr/bin/vector)
==2820656==    by 0x3EC6200: core::ptr::drop_in_place (in /usr/bin/vector)
==2820656==    by 0x3EC4111: vector::cli::Opts::get_matches (in /usr/bin/vector)
==2820656==    by 0x7F24DD: vector::main (in /usr/bin/vector)
==2820656==    by 0x7F2092: std::sys_common::backtrace::__rust_begin_short_backtrace (in /usr/bin/vector)
==2820656==    by 0x7F4AEE: main (in /usr/bin/vector)
==2820656==  Address 0x0 is not stack'd, malloc'd or (recently) free'd
==2820656==
==2820656==
==2820656== Process terminating with default action of signal 11 (SIGSEGV)
==2820656==  Access not within mapped region at address 0x0
==2820656==    at 0x4CF4D14: _rjem_je_tcache_bin_flush_small (in /usr/bin/vector)
==2820656==    by 0x4CB4212: _rjem_je_sdallocx_default (in /usr/bin/vector)
==2820656==    by 0x3EC638F: core::ptr::drop_in_place> (in /usr/bin/vector)
==2820656==    by 0x3EC6200: core::ptr::drop_in_place (in /usr/bin/vector)
==2820656==    by 0x3EC4111: vector::cli::Opts::get_matches (in /usr/bin/vector)
==2820656==    by 0x7F24DD: vector::main (in /usr/bin/vector)
==2820656==    by 0x7F2092: std::sys_common::backtrace::__rust_begin_short_backtrace (in /usr/bin/vector)
==2820656==    by 0x7F4AEE: main (in /usr/bin/vector)
==2820656==  If you believe this happened as a result of a stack
==2820656==  overflow in your program's main thread (unlikely but
==2820656==  possible), you can try to increase the size of the
==2820656==  main thread stack using the --main-stacksize= flag.
==2820656==  The main thread stack size used in this run was 8388608.
==2820656==
==2820656== HEAP SUMMARY:
==2820656==     in use at exit: 166,111 bytes in 505 blocks
==2820656==   total heap usage: 512 allocs, 7 frees, 168,183 bytes allocated
==2820656==
==2820656== LEAK SUMMARY:
==2820656==    definitely lost: 0 bytes in 0 blocks
==2820656==    indirectly lost: 0 bytes in 0 blocks
==2820656==      possibly lost: 0 bytes in 0 blocks
==2820656==    still reachable: 166,111 bytes in 505 blocks
==2820656==         suppressed: 0 bytes in 0 blocks
==2820656== Rerun with --leak-check=full to see details of leaked memory
==2820656==
==2820656== For lists of detected and suppressed errors, rerun with: -s
==2820656== ERROR SUMMARY: 2 errors from 1 contexts (suppressed: 0 from 0)
Segmentation fault (core dumped)
```

and massif:

```
==2820659== Massif, a heap profiler
==2820659== Copyright (C) 2003-2024, and GNU GPL'd, by Nicholas Nethercote et al.
==2820659== Using Valgrind-3.23.0 and LibVEX; rerun with -h for copyright info
==2820659== Command: vector -C /root/vector/one
==2820659==
==2820659==
==2820659== Process terminating with default action of signal 11 (SIGSEGV)
==2820659==  Access not within mapped region at address 0x0
==2820659==    at 0x4CF4CBD: _rjem_je_tcache_bin_flush_small (in /usr/bin/vector)
==2820659==    by 0x4CB4212: _rjem_je_sdallocx_default (in /usr/bin/vector)
==2820659==    by 0x3EC638F: core::ptr::drop_in_place> (in /usr/bin/vector)
==2820659==    by 0x3EC6200: core::ptr::drop_in_place (in /usr/bin/vector)
==2820659==    by 0x3EC4111: vector::cli::Opts::get_matches (in /usr/bin/vector)
==2820659==    by 0x7F24DD: vector::main (in /usr/bin/vector)
==2820659==    by 0x7F2092: std::sys_common::backtrace::__rust_begin_short_backtrace (in /usr/bin/vector)
==2820659==    by 0x7F4AEE: main (in /usr/bin/vector)
==2820659==  If you believe this happened as a result of a stack
==2820659==  overflow in your program's main thread (unlikely but
==2820659==  possible), you can try to increase the size of the
==2820659==  main thread stack using the --main-stacksize= flag.
==2820659==  The main thread stack size used in this run was 8388608.
==2820659==
Segmentation fault (core dumped)
```

Configuration

No response

Version

vector 0.39.0 (x86_64-unknown-linux-gnu 73da9bb 2024-06-17 16:00:23.791735272)

Debug Output

No response

Example Data

No response

Additional Context

No response

References

https://github.com/vectordotdev/vector/issues/20553

jszwedko commented 3 months ago

> I understood from the docs that Vector should wait for empty space in the buffer and only then try to read new events.

This should be true. It seems possible that it is librdkafka that is fetching and buffering events before Vector sees them.

Thanks for this report though. It does seem like we may want to use different librdkafka defaults.
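For anyone hitting this before any defaults change, the librdkafka consumer properties that bound client-side prefetching can be passed through `librdkafka_options`; the values below are purely illustrative, not recommendations:

```yaml
librdkafka_options:
  queued.min.messages: "1000"          # messages librdkafka tries to keep pre-fetched per topic+partition (default 100000)
  queued.max.messages.kbytes: "16384"  # cap, in kilobytes, on the local pre-fetch queue (default 65536)
  fetch.message.max.bytes: "524288"    # max bytes per topic+partition per fetch request (default 1048576)
```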

psychonaut commented 1 month ago

I can confirm that behavior. We have a fairly big cluster and I am trying to use Vector instead of Logstash. With a lag of 200-300 million events across two topics, 8 GB of RAM is not enough and Vector crashes with OOM. With queued.min.messages = 10000, RAM usage stays under 2 GB.

timfehr commented 6 days ago

I can also confirm that behavior, and we have quite a similar use case to @psychonaut's. Our input lag was ~100 million when Vector was OOMKilled by Kubernetes. We haven't tried the settings mentioned here yet; when we do our next load tests, we will try the new configuration.