elastic / logstash

Logstash - transport and process your logs, events, or other data
https://www.elastic.co/products/logstash
Other
68 stars 3.5k forks source link

Improve logstash 5.1.1 robustness. #6496

Open cnico opened 7 years ago

cnico commented 7 years ago

Hello,

I tried a migration from my elasticstack to 5.1.1 version (I use 2.3.1 logstash version actually). I had tested globally everything following the migration guide and breaking changes and tested it successfully with representative data for all my input file types. So I tried to migrate my stack on my production server.

Unfortunately, I fall on a mystery problem with this stack trace in logstash after a few data processing during 5 to 20 minutes : [2017-01-04T11:00:20,792][ERROR][logstash.pipeline ] .runtime.Block.yield(Block.java:142)", "org.jruby.RubyArray.eachCommon(RubyArray.java:1606)", "org.jruby.RubyArray.each(RubyArray.java:1613)", "org.jruby.RubyArray$INVOKER$i$0$0$each.call(RubyArray$INVOKER$i$0$0$each.gen)", "org.jruby.runtime.callsite.CachingCallSite.callBlock(CachingCallSite.java:143)", "org.jruby.runtime.callsite.CachingCallSite.callIter(CachingCallSite.java:154)", "rubyjit.LogStash::Filters::Base$$multi_filter_6c0dea8219a042f89f5a5b41e60697f8088a7c451846274136.__file__(/home/admtdoc/proto-elk-nca/logstash-5.1.1/logstash-core/lib/logstash/filters/base.rb:161)", "rubyjit.LogStash::Filters::Base$$multi_filter_6c0dea8219a042f89f5a5b41e60697f8088a7c451846274136.__file__(/home/admtdoc/proto-elk-nca/logstash-5.1.1/logstash-core/lib/logstash/filters/base.rb)", "org.jruby.internal.runtime.methods.JittedMethod.call(JittedMethod.java:181)", "org.jruby.runtime.callsite.CachingCallSite.cacheAndCall(CachingCallSite.java:326)", "org.jruby.runtime.callsite.CachingCallSite.call(CachingCallSite.java:170)", "rubyjit.LogStash::FilterDelegator$$multi_filter_87d312cb29252318f5ba9dfc74f2695ff3ca582a1846274136.__file__(/home/admtdoc/proto-elk-nca/logstash-5.1.1/logstash-core/lib/logstash/filter_delegator.rb:41)", "rubyjit.LogStash::FilterDelegator$$multi_filter_87d312cb29252318f5ba9dfc74f2695ff3ca582a1846274136.__file__(/home/admtdoc/proto-elk-nca/logstash-5.1.1/logstash-core/lib/logstash/filter_delegator.rb)", "org.jruby.internal.runtime.methods.JittedMethod.call(JittedMethod.java:181)", "org.jruby.runtime.callsite.CachingCallSite.call(CachingCallSite.java:168)", "org.jruby.ast.CallOneArgNode.interpret(CallOneArgNode.java:57)", "org.jruby.ast.DAsgnNode.interpret(DAsgnNode.java:110)", "org.jruby.ast.NewlineNode.interpret(NewlineNode.java:105)", "org.jruby.ast.BlockNode.interpret(BlockNode.java:71)", "org.jruby.ast.IfNode.interpret(IfNode.java:116)", "org.jruby.ast.IfNode.interpret(IfNode.java:118)", "org.jruby.ast.IfNode.interpret(IfNode.java:118)", "org.jruby.ast.IfNode.interpret(IfNode.java:118)", "org.jruby.ast.IfNode.interpret(IfNode.java:118)", "org.jruby.ast.NewlineNode.interpret(NewlineNode.java:105)", "org.jruby.ast.BlockNode.interpret(BlockNode.java:71)", "org.jruby.evaluator.ASTInterpreter.INTERPRET_BLOCK(ASTInterpreter.java:112)", "org.jruby.runtime.Interpreted19Block.evalBlockBody(Interpreted19Block.java:206)", "org.jruby.runtime.Interpreted19Block.yield(Interpreted19Block.java:157)", "org.jruby.runtime.Block.yield(Block.java:142)", "org.jruby.RubyArray.eachCommon(RubyArray.java:1606)", "org.jruby.RubyArray.each(RubyArray.java:1613)", "org.jruby.RubyArray$INVOKER$i$0$0$each.call(RubyArray$INVOKER$i$0$0$each.gen)", "org.jruby.runtime.callsite.CachingCallSite.callBlock(CachingCallSite.java:143)", "org.jruby.runtime.callsite.CachingCallSite.callIter(CachingCallSite.java:154)", "org.jruby.ast.CallNoArgBlockNode.interpret(CallNoArgBlockNode.java:64)", "org.jruby.ast.NewlineNode.interpret(NewlineNode.java:105)", "org.jruby.ast.BlockNode.interpret(BlockNode.java:71)", "org.jruby.evaluator.ASTInterpreter.INTERPRET_BLOCK(ASTInterpreter.java:112)", "org.jruby.runtime.Interpreted19Block.evalBlockBody(Interpreted19Block.java:206)", "org.jruby.runtime.Interpreted19Block.yield(Interpreted19Block.java:194)", "org.jruby.runtime.Interpreted19Block.call(Interpreted19Block.java:125)", "org.jruby.runtime.Block.call(Block.java:101)", "org.jruby.RubyProc.call(RubyProc.java:300)", "org.jruby.RubyProc.call19(RubyProc.java:281)", "org.jruby.RubyProc$INVOKER$i$0$0$call19.call(RubyProc$INVOKER$i$0$0$call19.gen)", "org.jruby.internal.runtime.methods.DynamicMethod.call(DynamicMethod.java:210)", "org.jruby.internal.runtime.methods.DynamicMethod.call(DynamicMethod.java:206)", "org.jruby.runtime.callsite.CachingCallSite.call(CachingCallSite.java:168)", "org.jruby.ast.CallOneArgNode.interpret(CallOneArgNode.java:57)", "org.jruby.ast.DAsgnNode.interpret(DAsgnNode.java:110)", "org.jruby.ast.NewlineNode.interpret(NewlineNode.java:105)", "org.jruby.ast.BlockNode.interpret(BlockNode.java:71)", "org.jruby.evaluator.ASTInterpreter.INTERPRET_BLOCK(ASTInterpreter.java:112)", "org.jruby.runtime.Interpreted19Block.evalBlockBody(Interpreted19Block.java:206)", "org.jruby.runtime.Interpreted19Block.yield(Interpreted19Block.java:194)", "org.jruby.runtime.Interpreted19Block.call(Interpreted19Block.java:125)", "org.jruby.runtime.Block.call(Block.java:101)", "org.jruby.RubyProc.call(RubyProc.java:300)", "org.jruby.internal.runtime.methods.ProcMethod.call(ProcMethod.java:64)", "org.jruby.internal.runtime.methods.DynamicMethod.call(DynamicMethod.java:210)", "org.jruby.internal.runtime.methods.DynamicMethod.call(DynamicMethod.java:206)", "org.jruby.runtime.callsite.CachingCallSite.call(CachingCallSite.java:168)", "rubyjit.LogStash::Pipeline$$filter_batch_390eee8140cc612b1771fb72e9904822f536cabf1846274136.block_0$RUBY$__file__(/home/admtdoc/proto-elk-nca/logstash-5.1.1/logstash-core/lib/logstash/pipeline.rb:295)", "rubyjit$LogStash::Pipeline$$filter_batch_390eee8140cc612b1771fb72e9904822f536cabf1846274136$block_0$RUBY$__file__.call(rubyjit$LogStash::Pipeline$$filter_batch_390eee8140cc612b1771fb72e9904822f536cabf1846274136$block_0$RUBY$__file__)", "org.jruby.runtime.CompiledBlock19.yield(CompiledBlock19.java:159)", "org.jruby.runtime.CompiledBlock19.call(CompiledBlock19.java:87)", "org.jruby.runtime.Block.call(Block.java:101)", "org.jruby.RubyProc.call(RubyProc.java:300)", "org.jruby.RubyProc.call19(RubyProc.java:281)", "org.jruby.RubyProc$INVOKER$i$0$0$call19.call(RubyProc$INVOKER$i$0$0$call19.gen)", "org.jruby.internal.runtime.methods.DynamicMethod.call(DynamicMethod.java:210)", "org.jruby.internal.runtime.methods.DynamicMethod.call(DynamicMethod.java:206)", "org.jruby.runtime.callsite.CachingCallSite.call(CachingCallSite.java:168)", "rubyjit.LogStash::Util::WrappedSynchronousQueue::ReadBatch$$each_8ee46fd8c62d1155253b7a084f2e67ff25953d7e1846274136.block_0$RUBY$__file__(/home/admtdoc/proto-elk-nca/logstash-5.1.1/logstash-core/lib/logstash/util/wrapped_synchronous_queue.rb:192)", "rubyjit$LogStash::Util::WrappedSynchronousQueue::ReadBatch$$each_8ee46fd8c62d1155253b7a084f2e67ff25953d7e1846274136$block_0$RUBY$__file__.call(rubyjit$LogStash::Util::WrappedSynchronousQueue::ReadBatch$$each_8ee46fd8c62d1155253b7a084f2e67ff25953d7e1846274136$block_0$RUBY$__file__)", "org.jruby.runtime.CompiledBlock19.yield(CompiledBlock19.java:135)", "org.jruby.runtime.Block.yield(Block.java:142)", "org.jruby.RubyHash$13.visit(RubyHash.java:1355)", "org.jruby.RubyHash.visitLimited(RubyHash.java:648)", "org.jruby.RubyHash.visitAll(RubyHash.java:634)", "org.jruby.RubyHash.iteratorVisitAll(RubyHash.java:1306)", "org.jruby.RubyHash.each_pairCommon(RubyHash.java:1351)", "org.jruby.RubyHash.each19(RubyHash.java:1342)", "org.jruby.RubyHash$INVOKER$i$0$0$each19.call(RubyHash$INVOKER$i$0$0$each19.gen)", "org.jruby.runtime.callsite.CachingCallSite.callBlock(CachingCallSite.java:143)", "org.jruby.runtime.callsite.CachingCallSite.callIter(CachingCallSite.java:154)", "rubyjit.LogStash::Util::WrappedSynchronousQueue::ReadBatch$$each_8ee46fd8c62d1155253b7a084f2e67ff25953d7e1846274136.__file__(/home/admtdoc/proto-elk-nca/logstash-5.1.1/logstash-core/lib/logstash/util/wrapped_synchronous_queue.rb:191)", "rubyjit.LogStash::Util::WrappedSynchronousQueue::ReadBatch$$each_8ee46fd8c62d1155253b7a084f2e67ff25953d7e1846274136.__file__(/home/admtdoc/proto-elk-nca/logstash-5.1.1/logstash-core/lib/logstash/util/wrapped_synchronous_queue.rb)", "org.jruby.internal.runtime.methods.JittedMethod.call(JittedMethod.java:161)", "org.jruby.runtime.callsite.CachingCallSite.callBlock(CachingCallSite.java:143)", "org.jruby.runtime.callsite.CachingCallSite.callIter(CachingCallSite.java:154)", "rubyjit.LogStash::Pipeline$$filter_batch_390eee8140cc612b1771fb72e9904822f536cabf1846274136.chained_0_rescue_1$RUBY$SYNTHETIC__file__(/home/admtdoc/proto-elk-nca/logstash-5.1.1/logstash-core/lib/logstash/pipeline.rb:294)", "rubyjit.LogStash::Pipeline$$filter_batch_390eee8140cc612b1771fb72e9904822f536cabf1846274136.__file__(/home/admtdoc/proto-elk-nca/logstash-5.1.1/logstash-core/lib/logstash/pipeline.rb)", "rubyjit.LogStash::Pipeline$$filter_batch_390eee8140cc612b1771fb72e9904822f536cabf1846274136.__file__(/home/admtdoc/proto-elk-nca/logstash-5.1.1/logstash-core/lib/logstash/pipeline.rb)", "org.jruby.internal.runtime.methods.JittedMethod.call(JittedMethod.java:181)", "org.jruby.runtime.callsite.CachingCallSite.call(CachingCallSite.java:168)", "org.jruby.ast.FCallOneArgNode.interpret(FCallOneArgNode.java:36)", "org.jruby.ast.NewlineNode.interpret(NewlineNode.java:105)", "org.jruby.ast.BlockNode.interpret(BlockNode.java:71)", "org.jruby.ast.WhileNode.interpret(WhileNode.java:131)", "org.jruby.ast.NewlineNode.interpret(NewlineNode.java:105)", "org.jruby.ast.BlockNode.interpret(BlockNode.java:71)", "org.jruby.evaluator.ASTInterpreter.INTERPRET_METHOD(ASTInterpreter.java:74)", "org.jruby.internal.runtime.methods.InterpretedMethod.call(InterpretedMethod.java:225)", "org.jruby.internal.runtime.methods.DefaultMethod.call(DefaultMethod.java:219)", "org.jruby.runtime.callsite.CachingCallSite.cacheAndCall(CachingCallSite.java:346)", "org.jruby.runtime.callsite.CachingCallSite.call(CachingCallSite.java:204)", "org.jruby.ast.FCallTwoArgNode.interpret(FCallTwoArgNode.java:38)", "org.jruby.ast.NewlineNode.interpret(NewlineNode.java:105)", "org.jruby.ast.BlockNode.interpret(BlockNode.java:71)", "org.jruby.evaluator.ASTInterpreter.INTERPRET_BLOCK(ASTInterpreter.java:112)", "org.jruby.runtime.Interpreted19Block.evalBlockBody(Interpreted19Block.java:206)", "org.jruby.runtime.Interpreted19Block.yield(Interpreted19Block.java:194)", "org.jruby.runtime.Interpreted19Block.call(Interpreted19Block.java:125)", "org.jruby.runtime.Block.call(Block.java:101)", "org.jruby.RubyProc.call(RubyProc.java:300)", "org.jruby.RubyProc.call(RubyProc.java:230)", "org.jruby.internal.runtime.RubyRunnable.run(RubyRunnable.java:99)", "java.lang.Thread.run(Thread.java:745)"]}

Once this trace is present in the log, my filebeat starts tracing errors of timeout like in this page : https://discuss.elastic.co/t/filebeat-error-err-failed-to-publish-events-caused-by-read-tcp-i-o-timeout/68023 and https://discuss.elastic.co/t/info-error-publishing-events-retrying-read-tcp/68742/13 I understand that my problem is that "the logstash pipeline is blocked" but I don't know why and the log message does not help.

Another problem and strange error message I encountered is the following : [2017-01-04T11:54:43,525][ERROR][logstash.outputs.elasticsearch] Encountered an unexpected error submitting a bulk request! Will retry. {:error_message=>"undefined methodresponse' for #", :class=>"NoMethodError", :backtrace=>["/home/admtdoc/proto-elk-nca/logstash-5.1.1/vendor/bundle/jruby/1.9/gems/logstash-output-elasticsearch-5.4.0-java/lib/logstash/outputs/elasticsearch/common.rb:223:in safe_bulk'", "/home/admtdoc/proto-elk-nca/logstash-5.1.1/vendor/bundle/jruby/1.9/gems/logstash-output-elasticsearch-5.4.0-java/lib/logstash/outputs/elasticsearch/common.rb:187:insafe_bulk'", "/home/admtdoc/proto-elk-nca/logstash-5.1.1/vendor/bundle/jruby/1.9/gems/logstash-output-elasticsearch-5.4.0-java/lib/logstash/outputs/elasticsearch/common.rb:109:in submit'", "/home/admtdoc/proto-elk-nca/logstash-5.1.1/vendor/bundle/jruby/1.9/gems/logstash-output-elasticsearch-5.4.0-java/lib/logstash/outputs/elasticsearch/common.rb:76:inretrying_submit'", "/home/admtdoc/proto-elk-nca/logstash-5.1.1/vendor/bundle/jruby/1.9/gems/logstash-output-elasticsearch-5.4.0-java/lib/logstash/outputs/elasticsearch/common.rb:27:in multi_receive'", "/home/admtdoc/proto-elk-nca/logstash-5.1.1/logstash-core/lib/logstash/output_delegator_strategies/shared.rb:12:inmulti_receive'", "/home/admtdoc/proto-elk-nca/logstash-5.1.1/logstash-core/lib/logstash/output_delegator.rb:42:in multi_receive'", "/home/admtdoc/proto-elk-nca/logstash-5.1.1/logstash-core/lib/logstash/pipeline.rb:331:inoutput_batch'", "org/jruby/RubyHash.java:1342:in each'", "/home/admtdoc/proto-elk-nca/logstash-5.1.1/logstash-core/lib/logstash/pipeline.rb:330:inoutput_batch'", "/home/admtdoc/proto-elk-nca/logstash-5.1.1/logstash-core/lib/logstash/pipeline.rb:288:in worker_loop'", "/home/admtdoc/proto-elk-nca/logstash-5.1.1/logstash-core/lib/logstash/pipeline.rb:258:instart_workers'"]`

With logstash 2.3.1, there were sometimes error handling data in my ruby code or in filter plugins execution (because of malformated data - e.g : invalid date format or numeric format) but it never blocked the entire pipeline as in this case.

Because of this behaviour, the only solution I found to finish my migration is to revert logstash version back to 2.3.1. My migration actually works with filebeat 5.1.1, logstash 2.3.1 and elasticsearch 5.1.1

So please improve robustness and make logstash pipeline never block because otherwise, there is no possibility of upgrade in my case...

Regards,

suyograo commented 7 years ago

@cnico is this the same config you used in 2.3.1? Can you paste the config here? Also, is everything ok with your ES cluster?

jsvd commented 7 years ago

@cnico for the last error (on the es output), please upgrade your logstash-output-elasticsearch plugin, that issue has been fixed in 6.x:

bin/logstash-plugin install --version "6.2.0" logstash-output-elasticsearch

cnico commented 7 years ago

@suyograo, here are my files once converted to logstash 5 : 6496.zip

As you will see, the logstash configuration is quite complex (because of the data structure of my data) and it has been tested successfully on my windows laptop with logstash 5.1.1 with a few files that are representative of the my data.

On my production linux machine, i do not activate the debug informations nor the stdout codec and I have only the error stack trace I sent during data processing. It would be certainly great that for such an error, the event that logstash is currently processing is traced with the error.

@jsvd, thank you for the information. I assume it would be in the next future global release of logstash.

cnico commented 7 years ago

I forgot to answer : yes it works for months on logstash 2.3.1 with the correct syntax on event[] migrated to get and set methods.

My ES server is OK and has no errors.

primeroz commented 7 years ago

i think i just had the very same issue about the logstash pipeline getting stuck.

[2017-02-20T01:03:49,384][ERROR][logstash.pipeline ] (CachingCallSite.java:168)", "org.jruby.ast.CallOneArgNode.interpret(CallOneArgNode.java:57)", "org.jruby.ast.DAsgnNode.interpret(DAsgnNode.java:110)", "org.jruby.ast.NewlineNode.interpret(NewlineNode.java:105)", "org.jruby.ast.BlockNode.interpret(BlockNode.java:71)", "org.jruby.ast.IfNode.interpret(IfNode.java:116)", "org.jruby.ast.NewlineNode.interpret(NewlineNode.java:105)", "org.jruby.ast.BlockNode.interpret(BlockNode.java:71)", "org.jruby.evaluator.ASTInterpreter.INTERPRET_BLOCK(ASTInterpreter.java:112)", "org.jruby.runtime.Interpreted19Block.evalBlockBody(Interpreted19Block.java:206)", "org.jruby.runtime.Interpreted19Block.yield(Interpreted19Block.java:157)", "org.jruby.runtime.Block.yield(Block.java:142)", "org.jruby.RubyArray.eachCommon(RubyArray.java:1606)", "org.jruby.RubyArray.each(RubyArray.java:1613)", "org.jruby.RubyArray$INVOKER$i$0$0$each.call(RubyArray$INVOKER$i$0$0$each.gen)", "org.jruby.runtime.callsite.CachingCallSite.callBlock(CachingCallSite.java:143)", "org.jruby.runtime.callsite.CachingCallSite.callIter(CachingCallSite.java:154)", "org.jruby.ast.CallNoArgBlockNode.interpret(CallNoArgBlockNode.java:64)", "org.jruby.ast.NewlineNode.interpret(NewlineNode.java:105)", "org.jruby.ast.BlockNode.interpret(BlockNode.java:71)", "org.jruby.evaluator.ASTInterpreter.INTERPRET_BLOCK(ASTInterpreter.java:112)", "org.jruby.runtime.Interpreted19Block.evalBlockBody(Interpreted19Block.java:206)", "org.jruby.runtime.Interpreted19Block.yield(Interpreted19Block.java:194)", "org.jruby.runtime.Interpreted19Block.call(Interpreted19Block.java:125)", "org.jruby.runtime.Block.call(Block.java:101)", "org.jruby.RubyProc.call(RubyProc.java:300)", "org.jruby.RubyProc.call19(RubyProc.java:281)", "org.jruby.RubyProc$INVOKER$i$0$0$call19.call(RubyProc$INVOKER$i$0$0$call19.gen)", "org.jruby.internal.runtime.methods.DynamicMethod.call(DynamicMethod.java:210)", "org.jruby.internal.runtime.methods.DynamicMethod.call(DynamicMethod.java:206)", "org.jruby.runtime.callsite.CachingCallSite.call(CachingCallSite.java:168)", "org.jruby.ast.CallOneArgNode.interpret(CallOneArgNode.java:57)", "org.jruby.ast.DAsgnNode.interpret(DAsgnNode.java:110)", "org.jruby.ast.NewlineNode.interpret(NewlineNode.java:105)", "org.jruby.ast.BlockNode.interpret(BlockNode.java:71)", "org.jruby.ast.IfNode.interpret(IfNode.java:116)", "org.jruby.ast.NewlineNode.interpret(NewlineNode.java:105)", "org.jruby.ast.BlockNode.interpret(BlockNode.java:71)", "org.jruby.evaluator.ASTInterpreter.INTERPRET_BLOCK(ASTInterpreter.java:112)", "org.jruby.runtime.Interpreted19Block.evalBlockBody(Interpreted19Block.java:206)", "org.jruby.runtime.Interpreted19Block.yield(Interpreted19Block.java:157)", "org.jruby.runtime.Block.yield(Block.java:142)", "org.jruby.RubyArray.eachCommon(RubyArray.java:1606)", "org.jruby.RubyArray.each(RubyArray.java:1613)", "org.jruby.RubyArray$INVOKER$i$0$0$each.call(RubyArray$INVOKER$i$0$0$each.gen)", "org.jruby.runtime.callsite.CachingCallSite.callBlock(CachingCallSite.java:143)", "org.jruby.runtime.callsite.CachingCallSite.callIter(CachingCallSite.java:154)", "org.jruby.ast.CallNoArgBlockNode.interpret(CallNoArgBlockNode.java:64)", "org.jruby.ast.NewlineNode.interpret(NewlineNode.java:105)", "org.jruby.ast.BlockNode.interpret(BlockNode.java:71)", "org.jruby.evaluator.ASTInterpreter.INTERPRET_BLOCK(ASTInterpreter.java:112)", "org.jruby.runtime.Interpreted19Block.evalBlockBody(Interpreted19Block.java:206)", "org.jruby.runtime.Interpreted19Block.yield(Interpreted19Block.java:194)", "org.jruby.runtime.Interpreted19Block.call(Interpreted19Block.java:125)", "org.jruby.runtime.Block.call(Block.java:101)", "org.jruby.RubyProc.call(RubyProc.java:300)", "org.jruby.RubyProc.call19(RubyProc.java:281)", "org.jruby.RubyProc$INVOKER$i$0$0$call19.call(RubyProc$INVOKER$i$0$0$call19.gen)", "org.jruby.internal.runtime.methods.DynamicMethod.call(DynamicMethod.java:210)", "org.jruby.internal.runtime.methods.DynamicMethod.call(DynamicMethod.java:206)", "org.jruby.runtime.callsite.CachingCallSite.call(CachingCallSite.java:168)", "org.jruby.ast.CallOneArgNode.interpret(CallOneArgNode.java:57)", "org.jruby.ast.DAsgnNode.interpret(DAsgnNode.java:110)", "org.jruby.ast.NewlineNode.interpret(NewlineNode.java:105)", "org.jruby.ast.BlockNode.interpret(BlockNode.java:71)", "org.jruby.evaluator.ASTInterpreter.INTERPRET_BLOCK(ASTInterpreter.java:112)", "org.jruby.runtime.Interpreted19Block.evalBlockBody(Interpreted19Block.java:206)", "org.jruby.runtime.Interpreted19Block.yield(Interpreted19Block.java:194)", "org.jruby.runtime.Interpreted19Block.call(Interpreted19Block.java:125)", "org.jruby.runtime.Block.call(Block.java:101)", "org.jruby.RubyProc.call(RubyProc.java:300)", "org.jruby.internal.runtime.methods.ProcMethod.call(ProcMethod.java:64)", "org.jruby.internal.runtime.methods.DynamicMethod.call(DynamicMethod.java:210)", "org.jruby.internal.runtime.methods.DynamicMethod.call(DynamicMethod.java:206)", "org.jruby.runtime.callsite.CachingCallSite.call(CachingCallSite.java:168)", "rubyjit.LogStash::Pipeline$$filter_batch_390eee8140cc612b1771fb72e9904822f536cabf1442407170.block_0$RUBY$file(/usr/share/logstash/logstash-core/lib/logstash/pipeline.rb:295)", "rubyjit$LogStash::Pipeline$$filter_batch_390eee8140cc612b1771fb72e9904822f536cabf1442407170$block_0$RUBY$file.call(rubyjit$LogStash::Pipeline$$filter_batch_390eee8140cc612b1771fb72e9904822f536cabf1442407170$block_0$RUBY$file)", "org.jruby.runtime.CompiledBlock19.yield(CompiledBlock19.java:159)", "org.jruby.runtime.CompiledBlock19.call(CompiledBlock19.java:87)", "org.jruby.runtime.Block.call(Block.java:101)", "org.jruby.RubyProc.call(RubyProc.java:300)", "org.jruby.RubyProc.call19(RubyProc.java:281)", "org.jruby.RubyProc$INVOKER$i$0$0$call19.call(RubyProc$INVOKER$i$0$0$call19.gen)", "org.jruby.internal.runtime.methods.DynamicMethod.call(DynamicMethod.java:210)", "org.jruby.internal.runtime.methods.DynamicMethod.call(DynamicMethod.java:206)", "org.jruby.runtime.callsite.CachingCallSite.call(CachingCallSite.java:168)", "rubyjit.LogStash::Util::WrappedSynchronousQueue::ReadBatch$$each_8ee46fd8c62d1155253b7a084f2e67ff25953d7e1442407170.block_0$RUBY$file(/usr/share/logstash/logstash-core/lib/logstash/util/wrapped_synchronous_queue.rb:192)", "rubyjit$LogStash::Util::WrappedSynchronousQueue::ReadBatch$$each_8ee46fd8c62d1155253b7a084f2e67ff25953d7e1442407170$block_0$RUBY$file.call(rubyjit$LogStash::Util::WrappedSynchronousQueue::ReadBatch$$each_8ee46fd8c62d1155253b7a084f2e67ff25953d7e1442407170$block_0$RUBY$file)", "org.jruby.runtime.CompiledBlock19.yield(CompiledBlock19.java:135)", "org.jruby.runtime.Block.yield(Block.java:142)", "org.jruby.RubyHash$13.visit(RubyHash.java:1355)", "org.jruby.RubyHash.visitLimited(RubyHash.java:648)", "org.jruby.RubyHash.visitAll(RubyHash.java:634)", "org.jruby.RubyHash.iteratorVisitAll(RubyHash.java:1306)", "org.jruby.RubyHash.each_pairCommon(RubyHash.java:1351)", "org.jruby.RubyHash.each19(RubyHash.java:1342)", "org.jruby.RubyHash$INVOKER$i$0$0$each19.call(RubyHash$INVOKER$i$0$0$each19.gen)", "org.jruby.runtime.callsite.CachingCallSite.callBlock(CachingCallSite.java:143)", "org.jruby.runtime.callsite.CachingCallSite.callIter(CachingCallSite.java:154)", "rubyjit.LogStash::Util::WrappedSynchronousQueue::ReadBatch$$each_8ee46fd8c62d1155253b7a084f2e67ff25953d7e1442407170.file(/usr/share/logstash/logstash-core/lib/logstash/util/wrapped_synchronous_queue.rb:191)", "rubyjit.LogStash::Util::WrappedSynchronousQueue::ReadBatch$$each_8ee46fd8c62d1155253b7a084f2e67ff25953d7e1442407170.file(/usr/share/logstash/logstash-core/lib/logstash/util/wrapped_synchronous_queue.rb)", "org.jruby.internal.runtime.methods.JittedMethod.call(JittedMethod.java:161)", "org.jruby.runtime.callsite.CachingCallSite.callBlock(CachingCallSite.java:143)", "org.jruby.runtime.callsite.CachingCallSite.callIter(CachingCallSite.java:154)", "rubyjit.LogStash::Pipeline$$filter_batch_390eee8140cc612b1771fb72e9904822f536cabf1442407170.chained_0_rescue_1$RUBY$SYNTHETICfile(/usr/share/logstash/logstash-core/lib/logstash/pipeline.rb:294)", "rubyjit.LogStash::Pipeline$$filter_batch_390eee8140cc612b1771fb72e9904822f536cabf1442407170.file(/usr/share/logstash/logstash-core/lib/logstash/pipeline.rb)", "rubyjit.LogStash::Pipeline$$filter_batch_390eee8140cc612b1771fb72e9904822f536cabf1442407170.file(/usr/share/logstash/logstash-core/lib/logstash/pipeline.rb)", "org.jruby.internal.runtime.methods.JittedMethod.call(JittedMethod.java:181)", "org.jruby.runtime.callsite.CachingCallSite.call(CachingCallSite.java:168)", "org.jruby.ast.FCallOneArgNode.interpret(FCallOneArgNode.java:36)", "org.jruby.ast.NewlineNode.interpret(NewlineNode.java:105)", "org.jruby.ast.BlockNode.interpret(BlockNode.java:71)", "org.jruby.ast.WhileNode.interpret(WhileNode.java:131)", "org.jruby.ast.NewlineNode.interpret(NewlineNode.java:105)", "org.jruby.ast.BlockNode.interpret(BlockNode.java:71)", "org.jruby.evaluator.ASTInterpreter.INTERPRET_METHOD(ASTInterpreter.java:74)", "org.jruby.internal.runtime.methods.InterpretedMethod.call(InterpretedMethod.java:225)", "org.jruby.internal.runtime.methods.DefaultMethod.call(DefaultMethod.java:219)", "org.jruby.runtime.callsite.CachingCallSite.call(CachingCallSite.java:202)", "org.jruby.ast.FCallTwoArgNode.interpret(FCallTwoArgNode.java:38)", "org.jruby.ast.NewlineNode.interpret(NewlineNode.java:105)", "org.jruby.ast.BlockNode.interpret(BlockNode.java:71)", "org.jruby.evaluator.ASTInterpreter.INTERPRET_BLOCK(ASTInterpreter.java:112)", "org.jruby.runtime.Interpreted19Block.evalBlockBody(Interpreted19Block.java:206)", "org.jruby.runtime.Interpreted19Block.yield(Interpreted19Block.java:194)", "org.jruby.runtime.Interpreted19Block.call(Interpreted19Block.java:125)", "org.jruby.runtime.Block.call(Block.java:101)", "org.jruby.RubyProc.call(RubyProc.java:300)", "org.jruby.RubyProc.call(RubyProc.java:230)", "org.jruby.internal.runtime.RubyRunnable.run(RubyRunnable.java:99)", "java.lang.Thread.run(Thread.java:745)"]}

I have 2 set of logstash instances pulling events from 2 redis queues which contain a copy of the same event. I use one for production and one for testing so effectively the 2 set of logstash instances parse the very same events.

each set has 3 logstash instances , the test one was 5.2.0 (upgraded this morning to 5.2.1) and the production one is 5.1.2

Interestingly enough i had 2 instances in prod and 1 in test effectively all get stuck at the same time

prod-1: [2017-02-20T01:03:49,384][ERROR][logstash.pipeline        ]
prod-2: [2017-02-20T01:03:51,862][ERROR][logstash.pipeline        ]
test-1:  [2017-02-20T01:03:51,699][ERROR][logstash.pipeline        ]

Since the parse the same events i would think that there must have been some documents that caused the instances to fail , and they must have been pulled at the same time ... but i have no way to find them because the instances got stuck , and saved no messages in the file for failures , and the amount of input data is way too big to go and look for it at the source.

This setup have been working for months, so i doubt is something very simple to replicate ... is there anything in the jruby stack trace that could suggest where to look for the issue ? is quite unreadable to me

Looking at the stack trace in logstash-stderr (way too long to post here so in a pastebin) seems like my KV filter bombed out ? LOGS here : https://paste.debian.net/915853/