radiumfu / enterprise-log-search-and-archive

Automatically exported from code.google.com/p/enterprise-log-search-and-archive

'Config Calc' to help provide config/tuning values for ELSA & logging #65


GoogleCodeExporter commented 9 years ago
An Admin page in ELSA for a 'Config Calculator' or 'Tuning Calculator' or the like, which would provide suggested values for elsa_node.conf based on input from the user and could auto-populate elsa_node.conf for a given node with the suggested values. The calculator could offer a couple of modes that produce different outputs from different inputs. For example:

Method 1)

Inputs:

- Auto-read avg logs/sec and avg size of uncompressed logs/sec from the node(s) (averaged over at least one day for more accurate recommendations, or warn the user that the results may not be accurate)
- User provides desired # of days of indexed logs
- User provides desired # of days of archived logs

Outputs:

- Required storage to allocate to ELSA (with a warning that this is an estimate)
- % to allocate to index vs. archive
- Recommended settings for index files so indexes don't roll prematurely under low utilization

Method 2)

Inputs:

- Auto-read avg logs/sec and avg size of uncompressed logs/sec from the node(s) (averaged over at least one day for more accurate recommendations, or warn the user that the results may not be accurate)
- User provides storage to allocate to ELSA
- User provides % to allocate to index
- User provides % to allocate to archive

Outputs:

- # of days of indexed logs (with a warning that this is an estimate)
- # of days of archived logs (with a warning that this is an estimate)
- Recommended settings for index files so indexes don't roll prematurely under low utilization
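
As a rough sketch of the arithmetic both methods imply (Python; the function names, the storage model, and the 10:1 archive-compression ratio are illustrative assumptions, not anything ELSA ships):

SECONDS_PER_DAY = 86400

def method1(bytes_per_sec, index_days, archive_days, archive_ratio=10.0):
    # Estimate storage from desired retention; archive rows are compressed.
    index_bytes = bytes_per_sec * SECONDS_PER_DAY * index_days
    archive_bytes = bytes_per_sec * SECONDS_PER_DAY * archive_days / archive_ratio
    total = index_bytes + archive_bytes
    return {"total_bytes": total,  # estimate only
            "index_pct": 100 * index_bytes / total,
            "archive_pct": 100 * archive_bytes / total}

def method2(bytes_per_sec, total_bytes, index_pct, archive_pct, archive_ratio=10.0):
    # Estimate retention from a fixed storage budget and an index/archive split.
    daily = bytes_per_sec * SECONDS_PER_DAY
    return {"index_days": total_bytes * index_pct / 100 / daily,  # estimate only
            "archive_days": total_bytes * archive_pct / 100 * archive_ratio / daily}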

In either case, an option to 'commit' the results would be nice, but that would require SSH credentials for the remote node in order to push the updated elsa_node.conf.

Original issue reported on code.google.com by jeffrey....@gmail.com on 28 Aug 2012 at 4:50

GoogleCodeExporter commented 9 years ago
This is a good idea, but I think I want to put this behind the per-host stats 
interface, as that's what will tell the admin where all the logs are going.

Original comment by mchol...@gmail.com on 28 Aug 2012 at 4:00

GoogleCodeExporter commented 9 years ago
How can I manually calculate the settings? Just the manual formula would help a lot.

Ex.

Daily logs: 3000000
Disk space for logging: 200 GB
Requirements: lightning-fast search across all data logged to disk; when the logging disk space is full, discard the oldest logs.

HW:
8-core 3 GHz Intel 64-bit
8x 72 GB 15K disks in RAID 10
32 GB RAM

Original comment by jacobrav...@gmail.com on 5 Mar 2013 at 7:53

GoogleCodeExporter commented 9 years ago
Hi there, I was planning on posting a similar question, so here is what I would do based on my (little) experience with ELSA:

- Given your "lightning speed search" requirement, I assume you don't want ELSA to go through the archived logs at all (these are compressed, non-indexed MySQL tables, so it takes a lot of processing power to get data out of them).
- Also, considering the "when disk is full discard oldest logs" requirement, I assume you don't mind losing the logs.

Therefore, the first thing I would do is set the 'archive/percentage' setting to 0 (so logs are either in an index or nowhere).

- Then, based on your specs (200 GB) and my ELSA instances' DB averages (about 300 bytes per table row, which includes the original message, all parsed 'fields', and MySQL index space), I would say you can keep around 710M records online (about 240 days' worth of logs).

So, I would play with a config like this (still not sure about the # of indexes):

log_size_limit: 200000000
num_indexes: 400 (I would try this just to help prevent temp index consolidation by time and to force Sphinx to use more RAM for temp indexes, but this point in particular needs expert advice)
index_interval: 120 seconds (I believe this will 'extend' the life of the temp indexes in RAM and therefore help speed up searching)
allowed_mem_percent: 80 (you still keep ~7 GB free for buffers/other services)
perm_index_size: 10000000 (based on num_indexes, this value should be enough to consume the available storage before overwriting indexes)
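
Two quick sanity checks on those numbers (a back-of-the-envelope Python sketch using the figures from this thread; note also that if log_size_limit is in bytes, as in the stock elsa_node.conf, 200000000 is 200 MB, so a 200 GB budget would need three more zeros):

bytes_per_row = 300                  # avg row: message + parsed fields + MySQL index
disk_bytes = 200 * 2**30             # the 200 GB logging disk
rows = disk_bytes // bytes_per_row   # ~716M records, matching the ~710M estimate above
days_online = rows / 3_000_000       # ~238 days at 3M logs/day, i.e. the ~240 above
row_capacity = 400 * 10_000_000      # num_indexes * perm_index_size = 4B rows
print(rows, round(days_online), row_capacity > rows)  # capacity comfortably exceeds need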

No idea how you could take more advantage of the multiple cores...

But hey, I was about to post a similar question, so this is just 'work in 
progress' :-)

Good luck.

Original comment by rhatu...@gmail.com on 5 Mar 2013 at 6:39

GoogleCodeExporter commented 9 years ago
Yes, those are excellent tuning settings for your workload.  They should
work nicely and provide around 180 days of logs.  One other setting you
should set would be allowed_temp_percent, which I would raise to 80.  This
allows 80% of 400 temp indexes to be used before they are consolidated into
one permanent index.  Otherwise, you would get the default of 40% of 400,
which is a lot less time overall.
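
For concreteness, the arithmetic behind that (Python; this assumes, as a simplification, roughly one temp index per index_interval):

num_indexes = 400
index_interval = 120                                        # seconds, per the config above
default_window = int(num_indexes * 0.40) * index_interval   # 160 temp indexes -> 19200 s
raised_window = int(num_indexes * 0.80) * index_interval    # 320 temp indexes -> 38400 s
print(default_window / 3600, raised_window / 3600)          # ~5.3 vs ~10.7 hours in temp indexes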


Original comment by mchol...@gmail.com on 5 Mar 2013 at 10:17

GoogleCodeExporter commented 9 years ago
Excellent! Thanks a lot for the advice, Martin.

One more on this thread: it's still not 100% clear to me whether ELSA consolidates multiple (small) permanent indexes into one single permanent index. Is this the case?

For example, I find this useful for coping with days when there is very little logging, like a weekend, and for preventing those underutilized permanent indexes from forcing the rotation of other, better-used perm indexes.

Original comment by rhatu...@gmail.com on 6 Mar 2013 at 8:02

GoogleCodeExporter commented 9 years ago
Indexes will get consolidated whenever a table has more rows than
"perm_index_size" and consists of more than a single index, so your weekend
perm indexes should get further consolidated as necessary.
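
Restated as a hypothetical predicate (Python for illustration; ELSA itself is Perl, so this is not its actual code):

def needs_consolidation(table_rows, index_count, perm_index_size=10_000_000):
    # A permanent table is consolidated once it outgrows perm_index_size
    # AND still consists of more than one index.
    return table_rows > perm_index_size and index_count > 1

print(needs_consolidation(table_rows=2_000_000, index_count=3))   # False: weekend table still small
print(needs_consolidation(table_rows=12_000_000, index_count=3))  # True: gets merged as it grows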


Original comment by mchol...@gmail.com on 6 Mar 2013 at 2:56

GoogleCodeExporter commented 9 years ago
That's perfect; it actually clears up some doubts I had about that area of the code.
Thanks for the response, man.

Original comment by rhatu...@gmail.com on 6 Mar 2013 at 3:01

GoogleCodeExporter commented 9 years ago
[deleted comment]
GoogleCodeExporter commented 9 years ago
How do I set the conf values for the following spec?

Daily logs: 24000000 (24 million logs per day)
Disk space for logging: 12 TB

Original comment by jacobrav...@gmail.com on 2 Apr 2013 at 10:54

GoogleCodeExporter commented 9 years ago
Figuring an average log size of 300 bytes, 24 million logs per day will take about 18 GB of disk space per day, so 12 TB affords a very long retention time (about two years). You shouldn't need to change the number of indexes anymore, because ELSA now continually consolidates indexes as needed. However, you should raise sphinx/perm_index_size from the default of 10 million to 80 million so that each of the 200 possible indexes stores enough to reach the roughly 15 billion logs that 12 TB should hold.
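
Working that arithmetic through (Python; values taken from the comment above; the jump from 300 bytes per log to ~18 GB/day implies roughly 750 bytes on disk per log, presumably counting MySQL and Sphinx index overhead):

logs_per_day = 24_000_000
daily_bytes = 18 * 10**9                      # ~18 GB/day, per the comment
disk_bytes = 12 * 10**12                      # 12 TB
retention_days = disk_bytes / daily_bytes     # ~667 days, i.e. roughly two years
total_logs = logs_per_day * retention_days    # ~16B logs
index_capacity = 200 * 80_000_000             # 200 indexes x 80M rows = 16B, covering ~15B
print(round(retention_days), total_logs, index_capacity)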

Original comment by mchol...@gmail.com on 3 Apr 2013 at 4:32