Closed — saraik closed this issue 3 years ago
Some quick math tells me that you are indexing around 8400 flow records per second. That is about the most that can be realistically expected from a single node, especially if all of the components (ES, Logstash and Kibana) are on that node. It would also leave very little resources free for things like queries.
Given the load on the node I would also assume that you are dropping a lot of UDP packets.
Can you share more details about the hardware you are using? Are those 24 real cores, or 12 cores with hyperthreading? How much memory? HDD or SSD? RAID?
At the rates you are seeing, you should be considering a multi-node cluster. Exactly how large that cluster needs to be will depend on both the peak and average flows per second you need to handle.
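The "quick math" above can be sanity-checked against the ~400 GB/day index size mentioned later in the thread. This sketch assumes an average of roughly 0.55 KB per indexed flow record, which is an assumption for illustration, not a measured figure:

```python
# Back-of-the-envelope sizing check.
# bytes_per_record is an ASSUMED average for an indexed flow document.
records_per_sec = 8400
seconds_per_day = 86_400
records_per_day = records_per_sec * seconds_per_day  # ~726 million/day
bytes_per_record = 550  # assumption, varies with mappings and enrichment
daily_index_gb = records_per_day * bytes_per_record / 1e9

print(f"{records_per_day:,} records/day ≈ {daily_index_gb:.0f} GB/day")
```

Under that assumed record size, 8400 records/s works out to roughly 400 GB of index per day, consistent with the numbers reported in this issue.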
Hi,
We use a VM on ESXi with 24 CPU cores, 128 GB RAM and a 2.5 TB disk. It looks like the CPU works pretty hard; the load average is about 21-23 when no one uses it, and when I run a query the load goes up to 26. Another thing I can see is that only 45 GB of the 128 GB of RAM are being used. As I recall this is SSD, and I think we use some kind of RAID. [htop screenshot attached] - This is the htop output :(
What are your best-practice recommendations for system sizing and load?
Many thanks, Sarai
I've been thinking about splitting my elastiflow index as well.
I'm thinking about changing the Logstash output to send data to separate indices using a variable from a dictionary. It should be straightforward. I'm thinking of having three types: Cisco, F5 and VPN.
Then I'll create multiple Kibana index patterns for elastiflow-%{type}-%{+YYYY.MM.dd}.
I'll probably need to change the index template to match elastiflow-*.
If for some reason we want to search the entire data set as the default ElastiFlow view, there would be another Kibana index pattern for elastiflow-* that matches everything.
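The routing described above could be sketched in Logstash roughly like this. The exporter IP addresses and type names are hypothetical placeholders; this assumes the ElastiFlow field `[node][ipaddr]` identifies the exporter:

```
filter {
  # Map exporter IPs to a type. These addresses are examples only.
  # Newer versions of the translate filter use source/target
  # instead of field/destination.
  translate {
    field       => "[node][ipaddr]"
    destination => "[type]"
    dictionary  => {
      "10.0.0.1" => "cisco"
      "10.0.0.2" => "f5"
      "10.0.0.3" => "vpn"
    }
    fallback => "other"
  }
}

output {
  elasticsearch {
    hosts => ["http://127.0.0.1:9200"]
    index => "elastiflow-%{[type]}-%{+YYYY.MM.dd}"
  }
}
```

With a `fallback` type, records from unmapped exporters still land in a predictable index rather than being dropped.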
@leonardochen what you mention is how I handle data types when combining various log sources: firewalls, IDS, as well as flow. The indices are hierarchically named, and the top level is log-*. Below that there are nearly 80 different index patterns.
Hi, I've started to run an experimental node of ElastiFlow, which so far collects NetFlow from several devices. When I run visualizations I get timeouts or no data. The thing is that my indices grow to more than 400 GB each (daily), and I think that might be the cause. I thought of setting up a dedicated index for each device group and decreasing the index size that way. Would that be helpful? (And if so, how do I do it?)
Is there a way to analyze the processing work for a request and reveal where the bottlenecks are?
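For reference, Elasticsearch's search Profile API can break down where query time is spent. A minimal request sketch, assuming Elasticsearch is reachable on localhost:9200 and the standard ElastiFlow index naming:

```
curl -s -H 'Content-Type: application/json' \
  'http://localhost:9200/elastiflow-*/_search' -d '
{
  "profile": true,
  "size": 0,
  "query": { "range": { "@timestamp": { "gte": "now-15m" } } }
}'
```

The `profile` section of the response shows per-shard timings for query and aggregation phases, which can help distinguish slow shards from slow aggregations.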
Is there a way to configure the application to use all 24 CPU cores?
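For what it's worth, Elasticsearch uses all available cores automatically, while Logstash parallelism is controlled in logstash.yml. A sketch with values to tune, not defaults:

```yaml
# logstash.yml - values here are assumptions to experiment with
pipeline.workers: 24      # typically set to the number of CPU cores
pipeline.batch.size: 512  # larger batches amortize output overhead
```

Worker count and batch size interact with JVM heap, so changes like these are usually tuned while watching load and drop rates.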
I would be glad to receive suggestions on how to check and improve performance.