Open ferdose7 opened 2 years ago
Please refer to the comments inline.
Can you define what you mean by "syslog/lumberjack server"? Is it a single host running both a Syslog server that you send data to with the Syslog output and a Logstash server that you send data to with the Lumberjack output?
Syslog and lumberjack are two different output plugins that we use to send log events to an external server. We have seen the issue in two different clusters: one uses the lumberjack and elasticsearch outputs, and the other uses the syslog and elasticsearch outputs. Example: Lumberjack configuration from the first cluster:
output {
  lumberjack {
    id => "lumberjack1"
    hosts => ["host.com"]
    codec => json
    port => 8888
    ssl_certificate => "/run/secrets/lumberjackOutput-certs/tls.crt"
  }
}
Syslog configuration from second cluster:
output {
  syslog {
    host => "host.com"
    port => 8080
    protocol => "ssl-tcp"
    rfc => rfc5424
    use_labels => false
    appname => "%{appname}"
    priority => "%{priority}"
    message => "%{message}"
    sourcehost => "ccrc"
    procid => "%{[metadata][proc_id]}"
    msgid => "%{[metadata][category]}"
    ssl_cert => "/run/secrets/syslogOutput-certs/tls.crt"
    ssl_key => "/run/secrets/syslogOutput-certs/tls.key"
    ssl_cacert => ["/run/secrets/syslogOutput-cacerts/trustedcert"]
    ssl_verify => true
  }
}
For the first cluster, the issue was seen when the lumberjack server was down. For the second cluster, the issue was seen when the syslog server was down.
Can you share the general shape of those pipelines (what inputs, rough quantity and kinds of filters, what quantity and kind of outputs are in each)?
Please refer to the attached sample config used in logstash.
Can you tell me whether the pipelines are related to each other (for example, using pipeline-to-pipeline, or any other way of sending the events from one pipeline to another), and if so, how?
We are not using pipeline-to-pipeline. Please refer to attached sample configuration used in Logstash.
We are using Logstash version 7.15.2. The Logstash pipelines are configured as follows.
pipelines.yml:
Logstash is configured with a syslog or lumberjack output pipeline and an elasticsearch output pipeline.
[2022-10-28T13:31:53.053Z][INFO ][logstash.agent ] Pipelines running {:count=>3, :running_pipelines=>[:syslog, :elasticsearch, :logstash], :non_running_pipelines=>[]}
The following steps may trigger the issue:
Deploy Logstash with the syslog/lumberjack and elasticsearch output pipelines.
Take the syslog/lumberjack server down.
Logstash keeps trying to connect to the syslog/lumberjack server, but every attempt fails because the server is down:
[2022-11-04T15:39:15.015Z][WARN ][logstash.outputs.syslog ] syslog ssl-tcp output exception: closing, reconnecting and resending event {:host=>"host.com", :port=>8080, :exception=>#<SocketError: initialize: name or service not known>, :backtrace=>["org/jruby/ext/socket/RubyTCPSocket.java:141:in `initialize'", "org/jruby/RubyIO.java:876:in `new'", "/opt/logstash/vendor/bundle/jruby/2.5.0/gems/logstash-output-syslog-3.0.5.E001/lib/logstash/outputs/syslog.rb:219:in `connect'", "/opt/logstash/vendor/bundle/jruby/2.5.0/gems/logstash-output-syslog-3.0.5.E001/lib/logstash/outputs/syslog.rb:187:in `publish'", "/opt/logstash/vendor/bundle/jruby/2.5.0/gems/logstash-codec-plain-3.1.0/lib/logstash/codecs/plain.rb:59:in `encode'", "/opt/logstash/logstash-core/lib/logstash/codecs/delegator.rb:48:in `block in encode'", "org/logstash/instrument/metrics/AbstractSimpleMetricExt.java:65:in `time'", "org/logstash/instrument/metrics/AbstractNamespacedMetricExt.java:64:in `time'", "/opt/logstash/logstash-core/lib/logstash/codecs/delegator.rb:47:in `encode'", "/opt/logstash/vendor/bundle/jruby/2.5.0/gems/logstash-output-syslog-3.0.5.E001/lib/logstash/outputs/syslog.rb:147:in `receive'", "/opt/logstash/logstash-core/lib/logstash/outputs/base.rb:105:in `block in multi_receive'", "org/jruby/RubyArray.java:1820:in `each'", "/opt/logstash/logstash-core/lib/logstash/outputs/base.rb:105:in `multi_receive'", "org/logstash/config/ir/compiler/OutputStrategyExt.java:143:in `multi_receive'", "org/logstash/config/ir/compiler/AbstractOutputDelegatorExt.java:121:in `multi_receive'", "/opt/logstash/logstash-core/lib/logstash/java_pipeline.rb:295:in `block in start_workers'"], :event=>#<LogStash::Event:0x7407a15a>}
Since the syslog/lumberjack output is blocked, log events keep accumulating in the persistent queue until the syslog pipeline's persistent queue becomes full.
After some time, Logstash becomes stuck and we no longer see it trying to connect to the syslog/lumberjack server.
Bring the syslog/lumberjack server back up.
Even after the syslog/lumberjack server is up and running again, Logstash does not try to reconnect to it and remains stuck.
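To tell from the outside whether a pipeline is in this stuck state, one option is to compare two snapshots of the Logstash node stats API (`GET /_node/stats/pipelines`, port 9600 by default). The following is only a minimal sketch, not something we run in production: the helper names `fetch_node_stats` and `stalled_pipelines` are hypothetical, and the queue field name (`events_count`, with `events` as a fallback) is an assumption that may differ between Logstash versions.

```python
import json
import urllib.request
from typing import Dict, List


def fetch_node_stats(base_url: str = "http://localhost:9600") -> Dict:
    """Fetch per-pipeline stats from the Logstash monitoring API.

    The base URL is an assumption (default API port); adjust as needed.
    """
    url = base_url + "/_node/stats/pipelines"
    with urllib.request.urlopen(url, timeout=10) as resp:
        return json.load(resp)


def stalled_pipelines(before: Dict, after: Dict) -> List[str]:
    """Return pipelines whose 'out' counter did not advance between two
    snapshots even though their queue still holds events."""
    stalled = []
    for name, stats in after.get("pipelines", {}).items():
        prev = before.get("pipelines", {}).get(name)
        if prev is None:
            continue  # pipeline not present in the first snapshot
        out_before = prev.get("events", {}).get("out", 0)
        out_after = stats.get("events", {}).get("out", 0)
        queue = stats.get("queue", {})
        # Assumption: the queued-event count is exposed as "events_count"
        # (or "events" in some versions).
        queued = queue.get("events_count", queue.get("events", 0))
        if out_after == out_before and queued > 0:
            stalled.append(name)
    return sorted(stalled)
```

Taking one snapshot with `fetch_node_stats()`, waiting a minute or so, taking a second one and passing both to `stalled_pipelines` lists the candidates. Note that an output that is blocked but still retrying looks the same in these counters, so this only indicates the stuck state when the reconnect warnings have also stopped appearing in the log.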
The issue is intermittent. eric-log-transformer-795d7cf554-c8jmw.txt
Please find the thread dump and logs attached. Could you please look into the issue? Please let us know if any further details are required.