giraffi / fluent-plugin-amqp

Use AMQP broker to send or receive messages via FluentD
MIT License

RMQ Failover doesn't appear to be working #51

Open warmfusion opened 7 years ago

warmfusion commented 7 years ago

Symptom

  1. The first node of the cluster is offline.
  2. Fluentd connects to the first node, times out, and indicates that the second will be used.

Problem

Log events show IO timeout messages, which suggests there may be a problem with failover: events never get sent out, and the buffers fill with messages.
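
The failover expected here amounts to walking the configured address list until one host accepts a connection. The following is a minimal Ruby sketch of that idea only. It is not how Bunny or fluent-plugin-amqp implement recovery, and `tcp_reachable?` and `first_reachable` are hypothetical names introduced for illustration.

```ruby
require "socket"

# Hypothetical probe: true if a TCP connection to host:port can be opened
# within `timeout` seconds, false on refusal, timeout, or DNS failure.
def tcp_reachable?(host, port, timeout: 2)
  Socket.tcp(host, port, connect_timeout: timeout) { |_sock| true }
rescue SystemCallError, SocketError
  false
end

# Hypothetical helper: walk the address list in order and return the first
# "host:port" entry the probe accepts, or nil if every entry fails.
# The probe is injectable so the logic can be exercised without a broker.
def first_reachable(addresses, probe: ->(host, port) { tcp_reachable?(host, port) })
  addresses.find do |address|
    host, port = address.split(":")
    probe.call(host, Integer(port))
  end
end
```

Calling `first_reachable` with the three `rmq0N.brk.example.tld:5672` addresses from the logs below would be expected to skip an offline rmq01 and return the rmq02 entry, assuming rmq02 accepts TCP connections. A nil result would mean no broker in the list is reachable at all.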

Logging


Jul 20 13:51:16 webproxyprod02 fluentd[6072]: E, [2017-07-20T13:51:16.609199 #6162] ERROR -- #<Bunny::Session:0x26df838 fluent.writer@rmq03.brk.example.tld:5672, vhost=fluent, addresses=[rmq01.brk.example.tld:5672,rmq02.brk.example.tld:5672,rmq03.brk.example.tld:5672]>: Got an exception when sending data: IO timeout when writing to socket (Timeout::Error)
Jul 20 13:51:16 webproxyprod02 fluentd[6072]: W, [2017-07-20T13:51:16.609299 #6162]  WARN -- #<Bunny::Session:0x26df838 fluent.writer@rmq03.brk.example.tld:5672, vhost=fluent, addresses=[rmq01.brk.example.tld:5672,rmq02.brk.example.tld:5672,rmq03.brk.example.tld:5672]>: Will recover from a network failure (no retry limit)...
Jul 20 13:51:16 webproxyprod02 fluentd[6072]: W, [2017-07-20T13:51:16.630369 #6162]  WARN -- #<Bunny::Session:0x26df838 fluent.writer@rmq01.brk.example.tld:5672, vhost=fluent, addresses=[rmq01.brk.example.tld:5672,rmq02.brk.example.tld:5672,rmq03.brk.example.tld:5672]>: Retrying connection on next host in line: rmq01.brk.example.tld:5672
Jul 20 13:51:16 webproxyprod02 fluentd[6072]: W, [2017-07-20T13:51:16.632868 #6162]  WARN -- #<Bunny::Session:0x26df838 fluent.writer@rmq01.brk.example.tld:5672, vhost=fluent, addresses=[rmq01.brk.example.tld:5672,rmq02.brk.example.tld:5672,rmq03.brk.example.tld:5672]>: Could not establish TCP connection to rmq01.brk.example.tld:5672: Connection refused - connect(2) for 172.20.4.4:5672
Jul 20 13:51:16 webproxyprod02 fluentd[6072]: W, [2017-07-20T13:51:16.632941 #6162]  WARN -- #<Bunny::Session:0x26df838 fluent.writer@rmq02.brk.example.tld:5672, vhost=fluent, addresses=[rmq01.brk.example.tld:5672,rmq02.brk.example.tld:5672,rmq03.brk.example.tld:5672]>: Will try to connect to the next endpoint in line: rmq02.brk.example.tld:5672
Jul 20 13:51:16 webproxyprod02 fluentd[6072]: 2017-07-20 13:51:16 +0000 [warn]: #0 buffer flush took longer time than slow_flush_log_threshold: elapsed_time=30.261541206855327 slow_flush_log_threshold=20.0 plugin_id="object:12c2440"

maxpain commented 5 years ago

Same problem

maxpain commented 5 years ago

@warmfusion did you solve the problem?

warmfusion commented 5 years ago

It's not something I've noticed recently after upgrading to newer versions of most of the components, to be honest. And we have issues with message brokers that mean we'd expect to see this issue a lot.

I'd go with "No?", but as you're seeing the issue too, can you give me more details on the error and the versions involved?

maxpain commented 5 years ago

@warmfusion

Version of FluentD: v1.7.4

Logs from when the connection between Fluentd and RabbitMQ was down:

E, [2019-12-05T07:59:41.662072 #16] ERROR -- #<Bunny::Session:0x7f40e1bf9a40 gameflare@white-guppy.rmq.cloudamqp.com:5672, vhost=gameflare, addresses=[white-guppy.rmq.cloudamqp.com:5672]>: Got an exception when receiving data: Connection reset by peer (Errno::ECONNRESET)
W, [2019-12-05T07:59:41.662346 #16]  WARN -- #<Bunny::Session:0x7f40e1bf9a40 gameflare@white-guppy.rmq.cloudamqp.com:5672, vhost=gameflare, addresses=[white-guppy.rmq.cloudamqp.com:5672]>: Exception in the reader loop: Errno::ECONNRESET: Connection reset by peer
W, [2019-12-05T07:59:41.662381 #16]  WARN -- #<Bunny::Session:0x7f40e1bf9a40 gameflare@white-guppy.rmq.cloudamqp.com:5672, vhost=gameflare, addresses=[white-guppy.rmq.cloudamqp.com:5672]>: Backtrace:
W, [2019-12-05T07:59:41.662407 #16]  WARN -- #<Bunny::Session:0x7f40e1bf9a40 gameflare@white-guppy.rmq.cloudamqp.com:5672, vhost=gameflare, addresses=[white-guppy.rmq.cloudamqp.com:5672]>:    /usr/local/lib/ruby/2.6.0/socket.rb:452:in `__read_nonblock'
W, [2019-12-05T07:59:41.662429 #16]  WARN -- #<Bunny::Session:0x7f40e1bf9a40 gameflare@white-guppy.rmq.cloudamqp.com:5672, vhost=gameflare, addresses=[white-guppy.rmq.cloudamqp.com:5672]>:    /usr/local/lib/ruby/2.6.0/socket.rb:452:in `read_nonblock'
W, [2019-12-05T07:59:41.662452 #16]  WARN -- #<Bunny::Session:0x7f40e1bf9a40 gameflare@white-guppy.rmq.cloudamqp.com:5672, vhost=gameflare, addresses=[white-guppy.rmq.cloudamqp.com:5672]>:    /usr/local/lib/ruby/gems/2.6.0/gems/bunny-2.14.3/lib/bunny/cruby/socket.rb:55:in `block in read_fully'
W, [2019-12-05T07:59:41.662477 #16]  WARN -- #<Bunny::Session:0x7f40e1bf9a40 gameflare@white-guppy.rmq.cloudamqp.com:5672, vhost=gameflare, addresses=[white-guppy.rmq.cloudamqp.com:5672]>:    /usr/local/lib/ruby/gems/2.6.0/gems/bunny-2.14.3/lib/bunny/cruby/socket.rb:54:in `loop'
W, [2019-12-05T07:59:41.662501 #16]  WARN -- #<Bunny::Session:0x7f40e1bf9a40 gameflare@white-guppy.rmq.cloudamqp.com:5672, vhost=gameflare, addresses=[white-guppy.rmq.cloudamqp.com:5672]>:    /usr/local/lib/ruby/gems/2.6.0/gems/bunny-2.14.3/lib/bunny/cruby/socket.rb:54:in `read_fully'
W, [2019-12-05T07:59:41.662523 #16]  WARN -- #<Bunny::Session:0x7f40e1bf9a40 gameflare@white-guppy.rmq.cloudamqp.com:5672, vhost=gameflare, addresses=[white-guppy.rmq.cloudamqp.com:5672]>:    /usr/local/lib/ruby/gems/2.6.0/gems/bunny-2.14.3/lib/bunny/transport.rb:239:in `read_fully'
W, [2019-12-05T07:59:41.662544 #16]  WARN -- #<Bunny::Session:0x7f40e1bf9a40 gameflare@white-guppy.rmq.cloudamqp.com:5672, vhost=gameflare, addresses=[white-guppy.rmq.cloudamqp.com:5672]>:    /usr/local/lib/ruby/gems/2.6.0/gems/bunny-2.14.3/lib/bunny/transport.rb:261:in `read_next_frame'
W, [2019-12-05T07:59:41.662573 #16]  WARN -- #<Bunny::Session:0x7f40e1bf9a40 gameflare@white-guppy.rmq.cloudamqp.com:5672, vhost=gameflare, addresses=[white-guppy.rmq.cloudamqp.com:5672]>:    /usr/local/lib/ruby/gems/2.6.0/gems/bunny-2.14.3/lib/bunny/reader_loop.rb:74:in `run_once'
W, [2019-12-05T07:59:41.662596 #16]  WARN -- #<Bunny::Session:0x7f40e1bf9a40 gameflare@white-guppy.rmq.cloudamqp.com:5672, vhost=gameflare, addresses=[white-guppy.rmq.cloudamqp.com:5672]>:    /usr/local/lib/ruby/gems/2.6.0/gems/bunny-2.14.3/lib/bunny/reader_loop.rb:39:in `block in run_loop'
W, [2019-12-05T07:59:41.662620 #16]  WARN -- #<Bunny::Session:0x7f40e1bf9a40 gameflare@white-guppy.rmq.cloudamqp.com:5672, vhost=gameflare, addresses=[white-guppy.rmq.cloudamqp.com:5672]>:    /usr/local/lib/ruby/gems/2.6.0/gems/bunny-2.14.3/lib/bunny/reader_loop.rb:36:in `loop'
W, [2019-12-05T07:59:41.662642 #16]  WARN -- #<Bunny::Session:0x7f40e1bf9a40 gameflare@white-guppy.rmq.cloudamqp.com:5672, vhost=gameflare, addresses=[white-guppy.rmq.cloudamqp.com:5672]>:    /usr/local/lib/ruby/gems/2.6.0/gems/bunny-2.14.3/lib/bunny/reader_loop.rb:36:in `run_loop'
W, [2019-12-05T07:59:41.662683 #16]  WARN -- #<Bunny::Session:0x7f40e1bf9a40 gameflare@white-guppy.rmq.cloudamqp.com:5672, vhost=gameflare, addresses=[white-guppy.rmq.cloudamqp.com:5672]>: Will recover from a network failure (no retry limit)...
W, [2019-12-05T07:59:51.663312 #16]  WARN -- #<Bunny::Session:0x7f40e1bf9a40 gameflare@white-guppy.rmq.cloudamqp.com:5672, vhost=gameflare, addresses=[white-guppy.rmq.cloudamqp.com:5672]>: Retrying connection on next host in line: white-guppy.rmq.cloudamqp.com:5672
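
One detail visible in the log above is that the session lists a single entry in `addresses=[...]`, so the "next host in line" is the same host again. A small Ruby sketch for pulling that list out of a Bunny log line, to check how many failover candidates a session actually has, could look like this. `failover_candidates` is a hypothetical helper, and the regular expression assumes the `addresses=[host:port,...]` format seen in these logs.

```ruby
# Hypothetical helper: extract the "host:port" entries from the
# addresses=[...] part of a Bunny session log line.
# Returns an empty array when the line has no such section.
def failover_candidates(log_line)
  match = log_line.match(/addresses=\[([^\]]*)\]/)
  return [] unless match

  match[1].split(",").map(&:strip)
end
```

A result with only one element means there is no other node for the client to fail over to, so recovery can only retry that same endpoint.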

maxpain commented 4 years ago

@warmfusion please...

maxpain commented 4 years ago

@warmfusion Can you please fix this bug? I can give you some money for this.

maxpain commented 4 years ago

@warmfusion We still have problems with this

maxpain commented 3 years ago

Any news?