Open AndrewChubatiuk opened 1 year ago
We're also experiencing constant increases in memory consumption in production workloads using the AWS CloudWatch Logs sink.
Any news on this one, folks? Thanks
Cc @jszwedko
Unfortunately not yet. If someone else was interested in pushing this forward, I think the next step would be to run Vector under a memory profiler like valgrind to see if a leak can be identified.
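For anyone interested in picking this up, a rough sketch of what such a run could look like (the config path is an assumption for your deployment, and massif is just one of valgrind's heap-profiling tools):

```sh
# Profile Vector's heap usage with valgrind/massif, then inspect the snapshots.
valgrind --tool=massif --massif-out-file=vector.massif \
  vector --config /etc/vector/vector.yaml
ms_print vector.massif | less
```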
Ok, in our case it seems like adding the `expire_metrics_secs` global option has helped to mitigate the increase in memory consumption.
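For reference, a minimal sketch of what that looks like, assuming the Helm chart's `customConfig` key is used to pass the Vector configuration (the 300-second value is only illustrative):

```yaml
# values.yaml for the Vector Helm chart (sketch)
customConfig:
  # Expire internal metrics that have not been updated for 300 seconds,
  # so stale high-cardinality series are dropped instead of accumulating.
  expire_metrics_secs: 300
```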
Thanks for the update though!
@AndrewChubatiuk a couple of questions:
- Are you observing the memory increase via the `vector_component_allocated_bytes` metric?
- Did setting `expire_metrics_secs` resolve the increase?
- increase reported by metric
- yes
Can you verify that you are actually seeing RSS usage increase? That metric is experimental so I wouldn't be surprised if it was inaccurate.
One suspicion is that this memory growth in the sink can be attributed to the fact that the sink creates one client per (group, stream) pair (code ref) and that client remains in memory for the remaining lifetime of the sink. In this case, the `stream_name` is the Kubernetes pod name, which is an unbounded set over time, meaning that the number of clients will continue to increase over time.
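As a purely illustrative example (the group name, region, and input below are made up), a configuration that hits this pattern looks roughly like:

```yaml
sinks:
  cloudwatch:
    type: aws_cloudwatch_logs
    inputs: ["kubernetes_logs"]
    region: "us-east-1"
    group_name: "/eks/my-cluster/apps"
    # Templating the stream on the pod name means every new pod creates a new
    # (group, stream) pair, and therefore a new cached client that is never dropped.
    stream_name: "{{ kubernetes.pod_name }}"
    encoding:
      codec: json
```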
https://github.com/vectordotdev/vector/issues/19345 may also be relevant here.
Problem
Using the Vector Helm chart, app version 0.29.0, on AWS EKS. On the component memory allocation graph you can see that only the CloudWatch Logs component's memory allocation is constantly growing.
Configuration
Version
0.29.0
Debug Output
No response
Example Data
No response
Additional Context
No response
References
No response