logstash-plugins / logstash-output-file

Apache License 2.0
23 stars 53 forks source link

Use `concurrency :shared` and sync codecs to optimize performance #46

Closed andrewvc closed 8 years ago

andrewvc commented 8 years ago

This is a WIP. This commit has added locking, but I may have missed some locks in a few spots, so it needs another pass before its really ready.

This provides a nice boost:

Before:

time bin/logstash -e "input { generator { count => 3000000} } filter {  } output { file { path => '/tmp/newfileout'} }"
Settings: Default pipeline workers: 8
Pipeline main started
Pipeline main has been shutdown
stopping pipeline {:id=>"main"}
      139.95 real       223.61 user        28.93 sys

After

rm /tmp/newfileout; time bin/logstash -e "input { generator { count => 3000000} } filter {  } output { file { codec => json_lines path => '/tmp/newfileout'} }" ; ls -lh /tmp/newfileout
Settings: Default pipeline workers: 8
Pipeline main started
Pipeline main has been shutdown
stopping pipeline {:id=>"main"}
       56.12 real       192.99 user        17.38 sys
jsvd commented 8 years ago

my first results:

With logstash 5.0.0-alpha6:

/tmp/logstash-5.0.0-alpha6 % time bin/logstash -e "input { generator { count => 3000000} } filter {  } output { file { path => '/tmp/newfileout'} }" && wc -l /tmp/newfileout
Pipeline main started
Pipeline main has been shutdown
stopping pipeline {:id=>"main", :level=>:warn}
bin/logstash -e   165.91s user 26.94s system 230% cpu 1:23.64 total
 3000000 /tmp/newfileout

Logstash master + logstash-output-file w/ this PR:

~/projects/logstash (git)-[master] % time bin/logstash -e "input { generator { count => 3000000} } filter {  } output { file { path => '/tmp/newfileout_master'} }" && wc -l /tmp/newfileout_master
Pipeline main started
Pipeline main has been shutdown
stopping pipeline {:id=>"main", :level=>:warn}
bin/logstash -e   150.86s user 17.39s system 273% cpu 1:01.57 total
 2999898 /tmp/newfileout_master

I've tried this again:

2999899 /tmp/newfileout_master

jsvd commented 8 years ago

Another error which I only got once I got more than once:

~/projects/logstash (git)-[master] % time bin/logstash -b 20 -e "input { generator { count => 3000000} } filter {  } output { file { path => '/tmp/newfileout_master'} }" && wc -l /tmp/newfileout_master
--- jar coordinate com.fasterxml.jackson.core:jackson-annotations already loaded with version 2.7.3 - omit version 2.7.0
Pipeline main started
An unexpected error occurred! {:error=>#<NameError: undefined method `write' for class `NilClass'>, :backtrace=>["org/jruby/RubyKernel.java:2148:in `method'", "/Users/joaoduarte/new_plugins/logstash-output-file/lib/logstash/outputs/file.rb:142:in `multi_receive_encoded'", "org/jruby/RubyHash.java:1342:in `each'", "/Users/joaoduarte/new_plugins/logstash-output-file/lib/logstash/outputs/file.rb:140:in `multi_receive_encoded'", "/Users/joaoduarte/projects/logstash/logstash-core/lib/logstash/outputs/base.rb:89:in `multi_receive'", "/Users/joaoduarte/projects/logstash/logstash-core/lib/logstash/output_delegator_strategies/shared.rb:11:in `multi_receive'", "/Users/joaoduarte/projects/logstash/logstash-core/lib/logstash/output_delegator.rb:50:in `multi_receive'", "/Users/joaoduarte/projects/logstash/logstash-core/lib/logstash/pipeline.rb:302:in `output_batch'", "org/jruby/RubyHash.java:1342:in `each'", "/Users/joaoduarte/projects/logstash/logstash-core/lib/logstash/pipeline.rb:301:in `output_batch'", "/Users/joaoduarte/projects/logstash/logstash-core/lib/logstash/pipeline.rb:252:in `worker_loop'", "/Users/joaoduarte/projects/logstash/logstash-core/lib/logstash/pipeline.rb:225:in `start_workers'"], :level=>:fatal}
2016-08-24 14:29:01 +0100: Listen loop error: #<Errno::EBADF: Bad file descriptor - Bad file descriptor>
org/jruby/RubyIO.java:3705:in `select'
/Users/joaoduarte/projects/logstash/vendor/bundle/jruby/1.9/gems/puma-2.16.0-java/lib/puma/server.rb:322:in `handle_servers'
/Users/joaoduarte/projects/logstash/vendor/bundle/jruby/1.9/gems/puma-2.16.0-java/lib/puma/server.rb:296:in `run'
2016-08-24 14:29:01 +0100: Listen loop error: #<Errno::EBADF: Bad file descriptor - Bad file descriptor>
org/jruby/RubyIO.java:3705:in `select'
/Users/joaoduarte/projects/logstash/vendor/bundle/jruby/1.9/gems/puma-2.16.0-java/lib/puma/server.rb:322:in `handle_servers'
/Users/joaoduarte/projects/logstash/vendor/bundle/jruby/1.9/gems/puma-2.16.0-java/lib/puma/server.rb:296:in `run'
2016-08-24 14:29:01 +0100: Listen loop error: #<Errno::EBADF: Bad file descriptor - Bad file descriptor>
org/jruby/RubyIO.java:3705:in `select'
/Users/joaoduarte/projects/logstash/vendor/bundle/jruby/1.9/gems/puma-2.16.0-java/lib/puma/server.rb:322:in `handle_servers'
/Users/joaoduarte/projects/logstash/vendor/bundle/jruby/1.9/gems/puma-2.16.0-java/lib/puma/server.rb:296:in `run'
2016-08-24 14:29:01 +0100: Listen loop error: #<Errno::EBADF: Bad file descriptor - Bad file descriptor>
org/jruby/RubyIO.java:3705:in `select'
/Users/joaoduarte/projects/logstash/vendor/bundle/jruby/1.9/gems/puma-2.16.0-java/lib/puma/server.rb:322:in `handle_servers'
/Users/joaoduarte/projects/logstash/vendor/bundle/jruby/1.9/gems/puma-2.16.0-java/lib/puma/server.rb:296:in `run'
2016-08-24 14:29:01 +0100: Listen loop error: #<Errno::EBADF: Bad file descriptor - Bad file descriptor>
org/jruby/RubyIO.java:3705:in `select'
/Users/joaoduarte/projects/logstash/vendor/bundle/jruby/1.9/gems/puma-2.16.0-java/lib/puma/server.rb:322:in `handle_servers'
/Users/joaoduarte/projects/logstash/vendor/bundle/jruby/1.9/gems/puma-2.16.0-java/lib/puma/server.rb:296:in `run'
2016-08-24 14:29:01 +0100: Listen loop error: #<Errno::EBADF: Bad file descriptor - Bad file descriptor>
org/jruby/RubyIO.java:3705:in `select'
/Users/joaoduarte/projects/logstash/vendor/bundle/jruby/1.9/gems/puma-2.16.0-java/lib/puma/server.rb:322:in `handle_servers'
/Users/joaoduarte/projects/logstash/vendor/bundle/jruby/1.9/gems/puma-2.16.0-java/lib/puma/server.rb:296:in `run'
2016-08-24 14:29:01 +0100: Listen loop error: #<Errno::EBADF: Bad file descriptor - Bad file descriptor>
org/jruby/RubyIO.java:3705:in `select'
/Users/joaoduarte/projects/logstash/vendor/bundle/jruby/1.9/gems/puma-2.16.0-java/lib/puma/server.rb:322:in `handle_servers'
/Users/joaoduarte/projects/logstash/vendor/bundle/jruby/1.9/gems/puma-2.16.0-java/lib/puma/server.rb:296:in `run'
2016-08-24 14:29:01 +0100: Listen loop error: #<Errno::EBADF: Bad file descriptor - Bad file descriptor>
org/jruby/RubyIO.java:3705:in `select'
/Users/joaoduarte/projects/logstash/vendor/bundle/jruby/1.9/gems/puma-2.16.0-java/lib/puma/server.rb:322:in `handle_servers'
/Users/joaoduarte/projects/logstash/vendor/bundle/jruby/1.9/gems/puma-2.16.0-java/lib/puma/server.rb:296:in `run'
2016-08-24 14:29:01 +0100: Listen loop error: #<Errno::EBADF: Bad file descriptor - Bad file descriptor>
org/jruby/RubyIO.java:3705:in `select'
/Users/joaoduarte/projects/logstash/vendor/bundle/jruby/1.9/gems/puma-2.16.0-java/lib/puma/server.rb:322:in `handle_servers'
/Users/joaoduarte/projects/logstash/vendor/bundle/jruby/1.9/gems/puma-2.16.0-java/lib/puma/server.rb:296:in `run'
2016-08-24 14:29:01 +0100: Listen loop error: #<Errno::EBADF: Bad file descriptor - Bad file descriptor>
org/jruby/RubyIO.java:3705:in `select'
/Users/joaoduarte/projects/logstash/vendor/bundle/jruby/1.9/gems/puma-2.16.0-java/lib/puma/server.rb:322:in `handle_servers'
/Users/joaoduarte/projects/logstash/vendor/bundle/jruby/1.9/gems/puma-2.16.0-java/lib/puma/server.rb:296:in `run'
2016-08-24 14:29:01 +0100: Listen loop error: #<Errno::EBADF: Bad file descriptor - Bad file descriptor>
org/jruby/RubyIO.java:3705:in `select'
/Users/joaoduarte/projects/logstash/vendor/bundle/jruby/1.9/gems/puma-2.16.0-java/lib/puma/server.rb:322:in `handle_servers'
/Users/joaoduarte/projects/logstash/vendor/bundle/jruby/1.9/gems/puma-2.16.0-java/lib/puma/server.rb:296:in `run'
2016-08-24 14:29:01 +0100: Listen loop error: #<Errno::EBADF: Bad file descriptor - Bad file descriptor>
org/jruby/RubyIO.java:3705:in `select'
/Users/joaoduarte/projects/logstash/vendor/bundle/jruby/1.9/gems/puma-2.16.0-java/lib/puma/server.rb:322:in `handle_servers'
/Users/joaoduarte/projects/logstash/vendor/bundle/jruby/1.9/gems/puma-2.16.0-java/lib/puma/server.rb:296:in `run'bin/logstash -b 20 -e   57.31s user 3.50s system 301% cpu 20.166 total
ph commented 8 years ago

@jsvd the error your reporting is strange to say the least, the first part of the message originate from the file output but the remaining part are from puma.

jsvd commented 8 years ago

@ph I just wanted to post the whole error stack, but the puma one is happening frequently on logstash exit, I'm going to open an issue

ph commented 8 years ago

@jsvd I am gonna take a look, we rewrote the puma integration not too long ago, I think we might be missing a rescue on close.

jsvd commented 8 years ago

I created an issue for the puma errors in https://github.com/elastic/logstash/issues/5822

andrewvc commented 8 years ago

@jsvd I've fixed all the issues here as well as made the locking a bit more coarse. I was also able to simplify the code around a simple Mutex rather than a complex ReentrantReadWriteLock

andrewvc commented 8 years ago

@jsvd I've bumped the plugin API so it will require LS 5.0+ to properly handle the obsolescence.

jsvd commented 8 years ago

Tests pass, manual tests also good, nice work LGTM

andrewvc commented 8 years ago

Thanks @jsvd !

elasticsearch-bot commented 8 years ago

Andrew Cholakian merged this into the following branches!

Branch Commits
master 4e8ca378117eae0edec4c2d4623b69590a026af6