voxpupuli / puppet-elasticsearch

Elasticsearch Puppet module
Apache License 2.0

Extremely slow puppet run on machines with lots of indices #532

Closed mvintila closed 8 years ago

mvintila commented 8 years ago

I am running the Puppet module on a machine with roughly 600 GB of Logstash indices, and every run takes upwards of 200 seconds. Using strace, I traced this to Puppet recursing through the entire data directory and performing a stat on every file.

The solution is to remove "recurse => true" from the $instance_datadir file resource:

    file { $instance_datadir:
      ensure  => 'directory',
      owner   => $elasticsearch::elasticsearch_user,
      group   => undef,
      mode    => '0644',
      #recurse => true,
      require => [ Exec["mkdir_datadir_elasticsearch_${name}"], Class['elasticsearch::package'] ],
      before  => Elasticsearch::Service[$name],
    }

electrical commented 8 years ago

Hi,

I had to add it in case someone decides to change the user. But I think it would be safe to assume that once it's set up, the user will never change? If so, we can indeed remove the recursive part.

gservat commented 8 years ago

Thanks for reporting this! Not only is it slower, but it also creates massive YAML report files for the Elasticsearch nodes! We have quite a few, and each report is now 4-5 MB or more (as it recurses into every file in $instance_datadir), and there's a Puppet run every 10 minutes, so disk space fills up quickly. Definitely +1 to remove the recurse line.

gservat commented 8 years ago

With recurse => true:

-rw-r-----  1 puppet puppet 5643252 Dec 22 11:08 201512220007.yaml

... with recurse => false:

-rw-r-----  1 puppet puppet  186455 Dec 22 11:10 201512220010.yaml

electrical commented 8 years ago

I'll do some tests to make sure it doesn't impact anything, but I think it should indeed be safe to remove.

psychonaut commented 8 years ago

Also, with the recent 0.10.1 version, it fails with errors like these:

Error: /Stage[main]/Profiles::Elasticsearch/Elasticsearch::Instance[log-es03]/File[/data/elasticsearch]: Failed to generate additional resources using 'eval_generate': No such file or directory - /data/elasticsearch/kibana-logs/nodes/0/indices/heka-2015.12.30/2/index/_jjy.fdx

This is an Elasticsearch instance for logs, so files are frequently created, deleted, and edited.

psychonaut commented 8 years ago

I'd say: make it configurable, parameterize recurse with a default of true. Everybody's happy!
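
For illustration, a minimal sketch of what a configurable recurse could look like. The defined type `example::datadir`, its `$owner` and `$recurse` parameters, and the `elasticsearch` user name are hypothetical and are not the module's actual interface; only the `/data/elasticsearch` path comes from the error message above.

```puppet
# Hypothetical sketch: example::datadir and its parameters are illustrative
# names, not part of the elasticsearch module.
define example::datadir (
  $owner,
  $recurse = true,   # default keeps the recursive behaviour
) {
  file { $title:
    ensure  => 'directory',
    owner   => $owner,
    mode    => '0644',
    recurse => $recurse,
  }
}

# On nodes with many index files, opt out of the recursive walk.
example::datadir { '/data/elasticsearch':
  owner   => 'elasticsearch',   # assumed user name
  recurse => false,
}
```

With `recurse => false`, Puppet manages only the directory itself, so it no longer stats every index file or generates a resource for each one.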

electrical commented 8 years ago

Created a PR to solve this. The only reason the recurse option was added was that some tests failed. Should have solved the actual issue :-)

mfrancka commented 8 years ago

+1

electrical commented 8 years ago

Merged #544, which removes the recurse option. The only reason it was there was to modify the owner/group if someone changed the user/group after the initial creation, but I doubt that will ever happen :-)
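
If the owner ever does change after the initial creation, one possible workaround is a guarded one-off recursive chown instead of a permanently recursive file resource. This is only a sketch and not part of the module: the resource title, the `elasticsearch` user name, and the variable names are assumptions, and the path is the one from the error message earlier in the thread.

```puppet
# Hypothetical sketch, not part of the elasticsearch module.
$es_user = 'elasticsearch'        # assumed user name
$datadir = '/data/elasticsearch'  # path from the error message in this thread

exec { 'chown_es_datadir':
  command  => "chown -R ${es_user} ${datadir}",
  # Run only when the top-level directory is not owned by $es_user:
  # find prints the path in that case, and grep -q . then exits 0.
  onlyif   => "find ${datadir} -maxdepth 0 ! -user ${es_user} | grep -q .",
  path     => ['/bin', '/usr/bin'],
  provider => shell,              # needed for the pipe in onlyif
}
```

The check looks at the top-level directory only, so a normal run costs a single stat. If a file resource also manages that directory's owner, this exec would have to be ordered before it, otherwise the check would never find a mismatch.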

ctrlaltdel commented 8 years ago

This is great news, thanks @electrical for the fix!

Before:

Finished catalog run in 693.73 seconds

After upgrading the elasticsearch module and deleting /var/lib/puppet/state.yml:

Notice: Finished catalog run in 12.29 seconds