Closed original-brownbear closed 6 years ago
This is impacting our production environment, causing periodic data loss from our application error logs (nothing critical yet), and could also impact our SQL activity data stream collection (also using UDP input) and our compliance status.
We're presently being alerted when data stops being ingested for that index pattern, and restarting Logstash - not a solution - and will be considering alternate options (filebeat or custom UDP proxy creation).
@jordansissel @andrewvc fyi (in case you want to schedule this in some form :))
Yes, we should fix this bug, would you have the spare cycles to tackle this next week @original-brownbear ?
@andrewvc sure thing, top of the pile then :)
@original-brownbear Happy New Year! I just want to follow up on this bug. Have we started tackling this? Restarting Logstash has become a bit of an annoyance .
@andrewvc https://github.com/logstash-plugins/logstash-codec-multiline/pull/60 should do it I hope :)
@jkelastic ^^ I think I got this fixed in the above :)
@original-brownbear thanks, I don't see a version label. I'm guessing it is fixed in ES 5.6 and above.
@jkelastic sorry still waiting for review there, will push on this tomorrow :)
@original-brownbear no problem, thanks for update : )
Likely not the same underlying problem, but the behavior is the same - single port (UDP, multiline) stops accepting data, if using persisted queue, queue must be deleted before data will be processed again on that port. Can reproduce on-demand if needed, and just appended issue to existing Elastic ticket opened with @jkelastic (if different issue will open new ticket\bug).
(top portion of stack trace)
[2018-01-10T11:06:05,199][DEBUG][logstash.util.decorators ] inputs/LogStash::Inputs::Udp: adding value to field {"field"=>"[@metadata][inputtype]", "value"=>["truxetrace"]}
Exception in thread "[main]>worker2" org.logstash.ackedqueue.QueueRuntimeException: deserialize invocation error
at org.logstash.ackedqueue.Queue.deserialize(Queue.java:638)
at org.logstash.ackedqueue.Page.readBatch(Page.java:60)
at org.logstash.ackedqueue.Queue._readPageBatch(Queue.java:505)
at org.logstash.ackedqueue.Queue.readBatch(Queue.java:496)
at org.logstash.ackedqueue.ext.JrubyAckedQueueExtLibrary$RubyAckedQueue.ruby_read_batch(JrubyAckedQueueExtLibrary.java:162)
at org.logstash.ackedqueue.ext.JrubyAckedQueueExtLibrary$RubyAckedQueue$INVOKER$i$2$0$ruby_read_batch.call(JrubyAckedQueueExtLibrary$RubyAckedQueue$INVOKER$i$2$0$ruby_read_batch.gen)
at org.jruby.runtime.callsite.CachingCallSite.call(CachingCallSite.java:202)
at rubyjit.LogStash::Util::WrappedAckedQueue$$read_batch_d49228611d045c1563109c40e125e11f223ebb3d1808253012.__file__(E:/Logstash/logstash-core/lib/logstash/util/wrapped_acked_queue.rb:86)
at rubyjit.LogStash::Util::WrappedAckedQueue$$read_batch_d49228611d045c1563109c40e125e11f223ebb3d1808253012.__file__(E:/Logstash/logstash-core/lib/logstash/util/wrapped_acked_queue.rb)
at org.jruby.internal.runtime.methods.JittedMethod.call(JittedMethod.java:221)
at org.jruby.runtime.callsite.CachingCallSite.call(CachingCallSite.java:202)
at rubyjit.LogStash::Util::WrappedAckedQueue::ReadBatch$$read_next_b8fddaedce1c7efd76622fff6ba60df85ed33b8d1808253012.__file__(E:/Logstash/logstash-core/lib/logstash/util/wrapped_acked_queue.rb:275)
at rubyjit.LogStash::Util::WrappedAckedQueue::ReadBatch$$read_next_b8fddaedce1c7efd76622fff6ba60df85ed33b8d1808253012.__file__(E:/Logstash/logstash-core/lib/logstash/util/wrapped_acked_queue.rb)
at org.jruby.internal.runtime.methods.JittedMethod.call(JittedMethod.java:141)
at org.jruby.runtime.callsite.CachingCallSite.call(CachingCallSite.java:134)
at rubyjit.LogStash::Util::WrappedAckedQueue::ReadClient$$read_batch_fcf4821efe51d29d2ff1b09c121b6d1cbbd684311808253012.chained_0_ensure_1$RUBY$__ensure__(E:/Logstash/logstash-core/lib/logstash/util/wrapped_acked_queue.rb:186)
at rubyjit.LogStash::Util::WrappedAckedQueue::ReadClient$$read_batch_fcf4821efe51d29d2ff1b09c121b6d1cbbd684311808253012.__file__(E:/Logstash/logstash-core/lib/logstash/util/wrapped_acked_queue.rb:185)
at rubyjit.LogStash::Util::WrappedAckedQueue::ReadClient$$read_batch_fcf4821efe51d29d2ff1b09c121b6d1cbbd684311808253012.__file__(E:/Logstash/logstash-core/lib/logstash/util/wrapped_acked_queue.rb)
at org.jruby.internal.runtime.methods.JittedMethod.call(JittedMethod.java:141)
at org.jruby.runtime.callsite.CachingCallSite.call(CachingCallSite.java:134)
@original-brownbear @akatakatoo Here are the test results. There is good and bad news. The good news is the error and problems don't seem to affect LS 6.0 we aren't ready to upgrade to ES 6.0 with SSL.
In ES 5.5 , here are the problems the we're having.
After #1 error below happens, then a logstash restart it will start throwing different errors, see error #2 below
ERROR #1
[2018-01-30T07:23:02,624][DEBUG][logstash.util.decorators ] inputs/LogStash::Inputs::Udp: adding value to field {"field"=>"[@metadata][inputtype]", "value"=>["truxetrace"]}
Exception in thread "[main]>worker0" org.logstash.ackedqueue.QueueRuntimeException: deserialize invocation error
at org.logstash.ackedqueue.Queue.deserialize(Queue.java:638)
at org.logstash.ackedqueue.Page.readBatch(Page.java:60)
at org.logstash.ackedqueue.Queue._readPageBatch(Queue.java:505)
at org.logstash.ackedqueue.Queue.readBatch(Queue.java:496)
at org.logstash.ackedqueue.ext.JrubyAckedQueueExtLibrary$RubyAckedQueue.ruby_read_batch(JrubyAckedQueueExtLibrary.java:162)
at org.logstash.ackedqueue.ext.JrubyAckedQueueExtLibrary$RubyAckedQueue$INVOKER$i$2$0$ruby_read_batch.call(JrubyAckedQueueExtLibrary$RubyAckedQueue$INVOKER$i$2$0$ruby_read_batch.gen)
at org.jruby.runtime.callsite.CachingCallSite.call(CachingCallSite.java:202)
at rubyjit.LogStash::Util::WrappedAckedQueue$$read_batch_d49228611d045c1563109c40e125e11f223ebb3d1808253012.__file__(E:/Logstash/logstash-core/lib/logstash/util/wrapped_acked_queue.rb:86)
at rubyjit.LogStash::Util::WrappedAckedQueue$$read_batch_d49228611d045c1563109c40e125e11f223ebb3d1808253012.__file__(E:/Logstash/logstash-core/lib/logstash/util/wrapped_acked_queue.rb)
at org.jruby.internal.runtime.methods.JittedMethod.call(JittedMethod.java:221)
at org.jruby.runtime.callsite.CachingCallSite.call(CachingCallSite.java:202)
at rubyjit.LogStash::Util::WrappedAckedQueue::ReadBatch$$read_next_b8fddaedce1c7efd76622fff6ba60df85ed33b8d1808253012.__file__(E:/Logstash/logstash-core/lib/logstash/util/wrapped_acked_queue.rb:275)
at rubyjit.LogStash::Util::WrappedAckedQueue::ReadBatch$$read_next_b8fddaedce1c7efd76622fff6ba60df85ed33b8d1808253012.__file__(E:/Logstash/logstash-core/lib/logstash/util/wrapped_acked_queue.rb)
at org.jruby.internal.runtime.methods.JittedMethod.call(JittedMethod.java:141)
at org.jruby.runtime.callsite.CachingCallSite.call(CachingCallSite.java:134)
at rubyjit.LogStash::Util::WrappedAckedQueue::ReadClient$$read_batch_fcf4821efe51d29d2ff1b09c121b6d1cbbd684311808253012.chained_0_ensure_1$RUBY$__ensure__(E:/Logstash/logstash-core/lib/logstash/util/wrapped_acked_queue.rb:186)
at rubyjit.LogStash::Util::WrappedAckedQueue::ReadClient$$read_batch_fcf4821efe51d29d2ff1b09c121b6d1cbbd684311808253012.__file__(E:/Logstash/logstash-core/lib/logstash/util/wrapped_acked_queue.rb:185)
at rubyjit.LogStash::Util::WrappedAckedQueue::ReadClient$$read_batch_fcf4821efe51d29d2ff1b09c121b6d1cbbd684311808253012.__file__(E:/Logstash/logstash-core/lib/logstash/util/wrapped_acked_queue.rb)
...
ERROR #2
[2018-01-30T07:39:02,728][DEBUG][logstash.outputs.file ] 0 stale files found {:inactive_files=>{}}
Exception in thread "[main]>worker1" org.logstash.ackedqueue.QueueRuntimeException: deserialize invocation error
at org.logstash.ackedqueue.Queue.deserialize(Queue.java:638)
at org.logstash.ackedqueue.Page.readBatch(Page.java:60)
at org.logstash.ackedqueue.Queue._readPageBatch(Queue.java:505)
at org.logstash.ackedqueue.Queue.readBatch(Queue.java:496)
at org.logstash.ackedqueue.ext.JrubyAckedQueueExtLibrary$RubyAckedQueue.ruby_read_batch(JrubyAckedQueueExtLibrary.java:162)
Attached Config, logs, console log (where the actual full error resides - logstash_console_log.txt), logstash binary directory, output folder (just writing to a file\console, no ES) and persisted queue folder (which needs to be deleted between failures to obtain original error) and multiline gem file used.
Note logstash is being run as follows:
E:\Logstash\bin\logstash.bat -f E:\elkconfig\logstash\config.json --path.settings E:/elkconfig/logstash -l "E:\elklogs\Logstash"
@jkelastic I think this is an unrelated PQ bug, I've opened a new issue for it here: https://github.com/elastic/logstash/issues/9104
@jkelastic Could you please provide complete environment information in elastic/logstash#9104 Exact LS version and OS version?
@colinsurprenant
The error below just happened last night on one of our production Logstash servers and stopped accepting traffic on one UDP port - only restarting Logstash allowed it to recover (logstash 5.6.5).
Version:
logstash 5.6.5
jruby 1.7.27 (1.9.3p551) 2017-05-11 8cdb01a on Java HotSpot(TM) 64-Bit Server VM 1.8.0_131-b11 +jit [Windows Server 2012 R2-amd64]
java 1.8.0_131 (Oracle Corporation)
jvm Java HotSpot(TM) 64-Bit Server VM / 25.131-b11
logstash-codec-multiline-3.0.9
Error:
[2018-03-07T18:57:14,513][ERROR][logstash.inputs.udp ] Exception in inputworker {"exception"=>#<<'", "org/jruby/RubyArray.java:1168:in
push'", "E:/Logstash/vendor/bundle/jruby/1.9/gems/logstash-codec-multiline-3.0.5/lib/logstash/codecs/multiline.rb:211:in buffer'", "E:/Logstash/vendor/bundle/jruby/1.9/gems/logstash-codec-multiline-3.0.5/lib/logstash/codecs/multiline.rb:269:in
do_previous'", "org/jruby/RubyMethod.java:124:in call'", "E:/Logstash/vendor/bundle/jruby/1.9/gems/logstash-codec-multiline-3.0.5/lib/logstash/codecs/multiline.rb:205:in
decode'", "org/jruby/RubyArray.java:1613:in each'", "E:/Logstash/vendor/bundle/jruby/1.9/gems/logstash-codec-multiline-3.0.5/lib/logstash/codecs/multiline.rb:198:in
decode'", "E:/Logstash/vendor/bundle/jruby/1.9/gems/logstash-input-udp-3.1.1/lib/logstash/inputs/udp.rb:118:in inputworker'", "E:/Logstash/vendor/bundle/jruby/1.9/gems/logstash-input-udp-3.1.1/lib/logstash/inputs/udp.rb:89:in
udp_listener'"]}
[2018-03-07T18:57:14,513][ERROR][logstash.inputs.udp ] Exception in inputworker {"exception"=>#<<'", "org/jruby/RubyArray.java:1168:in
push'", "E:/Logstash/vendor/bundle/jruby/1.9/gems/logstash-codec-multiline-3.0.5/lib/logstash/codecs/multiline.rb:211:in buffer'", "E:/Logstash/vendor/bundle/jruby/1.9/gems/logstash-codec-multiline-3.0.5/lib/logstash/codecs/multiline.rb:269:in
do_previous'", "org/jruby/RubyMethod.java:124:in call'", "E:/Logstash/vendor/bundle/jruby/1.9/gems/logstash-codec-multiline-3.0.5/lib/logstash/codecs/multiline.rb:205:in
decode'", "org/jruby/RubyArray.java:1613:in each'", "E:/Logstash/vendor/bundle/jruby/1.9/gems/logstash-codec-multiline-3.0.5/lib/logstash/codecs/multiline.rb:198:in
decode'", "E:/Logstash/vendor/bundle/jruby/1.9/gems/logstash-input-udp-3.1.1/lib/logstash/inputs/udp.rb:118:in inputworker'", "E:/Logstash/vendor/bundle/jruby/1.9/gems/logstash-input-udp-3.1.1/lib/logstash/inputs/udp.rb:89:in
udp_listener'"]}
@jkelastic you are reporting using the v3.0.9 of the multiline codec which should have fixed this issue but your logs tells differently:
"E:/Logstash/vendor/bundle/jruby/1.9/gems/logstash-codec-multiline-3.0.5/lib/logstash/codecs/multiline.rb:211:in buffer'",
It seems you are in fact running v3.0.5. Can you upgrade and reassess?
Happened again after the multiline upgrade - folder shows 3.0.5 but update does nothing (as though it is running the latest, 3.0.9):
[2018-03-08T17:46:39,649][ERROR][logstash.inputs.udp ] Exception in inputworker {"exception"=>#<<'", "org/jruby/RubyArray.java:1168:in
push'", "E:/Logstash/vendor/bundle/jruby/1.9/gems/logstash-codec-multiline-3.0.5/lib/logstash/codecs/multiline.rb:211:in buffer'", "E:/Logstash/vendor/bundle/jruby/1.9/gems/logstash-codec-multiline-3.0.5/lib/logstash/codecs/multiline.rb:269:in
do_previous'", "org/jruby/RubyMethod.java:124:in call'", "E:/Logstash/vendor/bundle/jruby/1.9/gems/logstash-codec-multiline-3.0.5/lib/logstash/codecs/multiline.rb:205:in
decode'", "org/jruby/RubyArray.java:1613:in each'", "E:/Logstash/vendor/bundle/jruby/1.9/gems/logstash-codec-multiline-3.0.5/lib/logstash/codecs/multiline.rb:198:in
decode'", "E:/Logstash/vendor/bundle/jruby/1.9/gems/logstash-input-udp-3.1.1/lib/logstash/inputs/udp.rb:118:in inputworker'", "E:/Logstash/vendor/bundle/jruby/1.9/gems/logstash-input-udp-3.1.1/lib/logstash/inputs/udp.rb:89:in
udp_listener'"]}
[2018-03-08T17:46:39,649][ERROR][logstash.inputs.udp ] Exception in inputworker {"exception"=>#<<'", "org/jruby/RubyArray.java:1168:in
push'", "E:/Logstash/vendor/bundle/jruby/1.9/gems/logstash-codec-multiline-3.0.5/lib/logstash/codecs/multiline.rb:211:in buffer'", "E:/Logstash/vendor/bundle/jruby/1.9/gems/logstash-codec-multiline-3.0.5/lib/logstash/codecs/multiline.rb:269:in
do_previous'", "org/jruby/RubyMethod.java:124:in call'", "E:/Logstash/vendor/bundle/jruby/1.9/gems/logstash-codec-multiline-3.0.5/lib/logstash/codecs/multiline.rb:205:in
decode'", "org/jruby/RubyArray.java:1613:in each'", "E:/Logstash/vendor/bundle/jruby/1.9/gems/logstash-codec-multiline-3.0.5/lib/logstash/codecs/multiline.rb:198:in
decode'", "E:/Logstash/vendor/bundle/jruby/1.9/gems/logstash-input-udp-3.1.1/lib/logstash/inputs/udp.rb:118:in inputworker'", "E:/Logstash/vendor/bundle/jruby/1.9/gems/logstash-input-udp-3.1.1/lib/logstash/inputs/udp.rb:89:in
udp_listener'"]}
E:\elkbinaries\logstash\bin>logstash --version Sending Logstash's logs to E:/elkbinaries/logstash/logs which is now configured via log4j2.properties [2018-03-08T18:07:31,881][INFO ][logstash.modules.scaffold] Initializing module {:module_name=>"fb_apache", :directory=>"E:/elkbinaries/logstash/modules/fb_apache/configuration"} [2018-03-08T18:07:31,888][INFO ][logstash.modules.scaffold] Initializing module {:module_name=>"netflow", :directory=>"E:/elkbinaries/logstash/modules/netflow/configuration"} [2018-03-08T18:07:31,905][INFO ][logstash.setting.writabledirectory] Creating directory {:setting=>"path.queue", :path=>"E:/elkbinaries/logstash/data/queue"} [2018-03-08T18:07:31,910][INFO ][logstash.setting.writabledirectory] Creating directory {:setting=>"path.dead_letter_queue", :path=>"E:/elkbinaries/logstash/data/dead_letter_queue"} logstash 5.6.5 jruby 1.7.27 (1.9.3p551) 2017-05-11 8cdb01a on Java HotSpot(TM) 64-Bit Server VM 1.8.0_131-b11 +jit [Windows Server 2012 R2-amd64] java 1.8.0_131 (Oracle Corporation) jvm Java HotSpot(TM) 64-Bit Server VM / 25.131-b11
E:\elkbinaries\logstash\bin>logstash-plugin update logstash-codec-multiline Updating logstash-codec-multiline No plugin updated
@akatakatoo can you try uninstalling and re-installing logstash-codec-multiline
to see if it will get it to v3.0.9?
Just removed and installed it (with logstash stopped) and although it indicated it could not remove it as it had dependencies, the correct version of files appear to be in place. Now to wait and see if it happens again. Thanks.
E:\elkbinaries\logstash\bin>logstash-plugin remove logstash-codec-multiline Failed to remove "logstash-codec-multiline" because the following plugins or libraries depend on it:
E:\elkbinaries\logstash\bin>logstash-plugin install logstash-codec-multiline Validating logstash-codec-multiline Installing logstash-codec-multiline Installation successful
Directory of E:\elkbinaries\logstash\vendor\bundle\jruby\1.9\cache
03/08/2018 02:33 PM 22,016 logstash-codec-multiline-3.0.9.gem 1 File(s) 22,016 bytes
Directory of E:\elkbinaries\logstash\vendor\bundle\jruby\1.9\gems
12/11/2017 02:16 PM
Directory of E:\elkbinaries\logstash\vendor\bundle\jruby\1.9\gems\logstash-codec-multiline-3.0.8
12/11/2017 02:16 PM 1,303 logstash-codec-multiline.gemspec 1 File(s) 1,303 bytes
Directory of E:\elkbinaries\logstash\vendor\bundle\jruby\1.9\gems\logstash-codec-multiline-3.0.8\lib\logstash\codecs
12/11/2017 02:16 PM 9,166 multiline.rb 1 File(s) 9,166 bytes
Directory of E:\elkbinaries\logstash\vendor\bundle\jruby\1.9\gems\logstash-codec-multiline-3.0.9
03/08/2018 02:33 PM 1,303 logstash-codec-multiline.gemspec 1 File(s) 1,303 bytes
Directory of E:\elkbinaries\logstash\vendor\bundle\jruby\1.9\gems\logstash-codec-multiline-3.0.9\lib\logstash\codecs
03/08/2018 02:33 PM 9,226 multiline.rb 1 File(s) 9,226 bytes
Directory of E:\elkbinaries\logstash\vendor\bundle\jruby\1.9\gems\logstash-codec-multiline-3.0.9\spec\codecs
03/08/2018 02:33 PM 11,500 multiline_spec.rb 1 File(s) 11,500 bytes
Directory of E:\elkbinaries\logstash\vendor\bundle\jruby\1.9\specifications
12/11/2017 02:17 PM 1,923 logstash-codec-multiline-3.0.8.gemspec 03/08/2018 02:33 PM 1,965 logstash-codec-multiline-3.0.9.gemspec 2 File(s) 3,888 bytes
@akatakatoo did that solve your problem?
The problem has not occurred since the 3.0.9 update, which is a great start. There can be weeks or months between the error occurrences, so I'll continue to monitor for a bit. Thanks.
@akatakatoo thanks for the followup. I will go ahead and close this issue. Please feel free to re-open if you think it is still not solved at some point.
We got a report of this:
It seems like this is coming from the (as far as I can see) asynchronous
flush
functionality somehow accessing the@buffer
field from 2 threads (I verified that the udp input does correctlyclone
the codec, so it must be coming from the codec itself here). Maybe related to the random failures here #46