Open WilliamEricCheung opened 1 year ago
Since no one has commented on my post, we modified our scripts to keep only about 100 log files (roughly the last 4 hours). Vector now runs smoothly, but this is not the solution we want for our business.
I see somewhat similar behavior with a simple file -> s3 pipeline as well. Does anybody from the Vector team have advice here?
Hi @WilliamEricCheung, and my apologies for not replying sooner. Since this seems like a complex issue, we will need to dive deeper to determine whether there is an actual bottleneck (a bug that can be fixed) or whether this is a case where scaling up is the solution.
As an easy first step, I suggest adding an internal metrics source and sharing the metrics here.
A note for the community
Problem
Background: Hello, we are migrating service logs using a vector (1 host) -- kafka (3 hosts) -- logstash (3 hosts) -- opensearch (1 kibana host) pipeline. Our services write about 25 gzip files per hour to a certain directory, each about 900M in size, and we have a script that removes logs older than one day from this directory to reduce storage pressure. We started vector on Nov 16, 2022 at 9:00:00, and the throughput was steady until 12:00:00. During these 3 hours we got about 450M index hits per hour in kibana. However, the hit rate dropped rapidly after 12 PM: we got just 100M at 3 PM, 80M at 6 PM, 70M at 9 PM, 60M at 0 AM on Nov 17, 2022, 50M at 3 AM and 40M at 6 AM.
Our thinking: Every hour our services write 25 files, so over one whole day we have 24 * 25 = 600 files. Even with a script that keeps only one day of logs, the total size is very large: 600 * 900M = 540G.
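The arithmetic above can be checked with a short sketch. The constant names are illustrative only, and the figures are the approximate ones from this post (25 files per hour, ~900M each, one day of retention). The last value is an extra derived figure, not something measured: the rate of compressed data vector would need to sustain just to keep pace with new files, assuming decimal units.

```python
# Approximate figures taken from the post; names are hypothetical.
FILES_PER_HOUR = 25
HOURS_RETAINED = 24
FILE_SIZE_MB = 900  # compressed (gzip) size per file

# Number of files sitting in the directory with one day of retention.
files_retained = FILES_PER_HOUR * HOURS_RETAINED

# Total compressed size on disk, in decimal GB (1 GB = 1000 MB).
total_gb = files_retained * FILE_SIZE_MB / 1000

# Compressed data that arrives per second and must be consumed to keep up.
required_mb_per_s = FILES_PER_HOUR * FILE_SIZE_MB / 3600

print(files_retained, total_gb, required_mb_per_s)
```

Note that these are compressed sizes. The decompressed volume that vector actually has to process is larger and depends on the compression ratio, which is not stated here.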
Question
Basic Information
Source format: gzip files
Files directory: /local/vectorLog
File name format: /local/vectorLog/requests.log.yyyy-MM-dd-HH.gz
File size: ~900M
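For anyone scripting retention against this directory, here is a minimal sketch of parsing the hour out of a file name. It assumes the names follow the requests.log.yyyy-MM-dd-HH.gz pattern exactly. The helper name `parse_log_hour` is hypothetical and is not part of vector or of our scripts.

```python
import re
from datetime import datetime

# Matches names like requests.log.2022-11-16-09.gz (assumed pattern).
LOG_NAME = re.compile(r"^requests\.log\.(\d{4}-\d{2}-\d{2}-\d{2})\.gz$")

def parse_log_hour(filename):
    """Return the hour encoded in a log file name, or None if it does not match."""
    match = LOG_NAME.match(filename)
    if match is None:
        return None
    return datetime.strptime(match.group(1), "%Y-%m-%d-%H")

print(parse_log_hour("requests.log.2022-11-16-09.gz"))
```

A cleanup script could compare the parsed hour against the current time to decide which files fall outside the retention window.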
Configuration
Version
vector 0.23.3 (x86_64-unknown-linux-gnu af8c9e1 2022-08-10)
Debug Output
Example Data
No response
Additional Context
No response
References
No response