fangli / fluent-plugin-influxdb

A buffered output plugin for fluentd and InfluxDB
MIT License

retry parameter not working at all #108

Open singlaive opened 3 years ago

singlaive commented 3 years ago

fluentd: 1.0
influxdb: 1.8
fluent-plugin-influxdb: 2.0
Environment: fluentd and influxdb run in Docker containers

influx task config:

    <match **>
      @type influxdb
      host db
      port 8086
      dbname mydb
      retry 10
      time_precision ns
      tag_keys ["application", "type", "fn", "result", "dataVersion", "axonVersion", "message"]
      time_key time
    </match>

How to reproduce:

  1. Start fluentd and influxdb; both run normally.
  2. Run docker stop influxdb; fluentd starts logging connection failures and tries to reconnect to the database.

Expected: with retry set, the connection is retried only the number of times I specified.

Actual: the retries go on forever.

Log segment:

W, [2021-06-05T06:25:51.666174 #18] WARN -- InfluxDB: Failed to contact host db: #<SocketError: Failed to open TCP connection to db:8086 (getaddrinfo: Name does not resolve)> - retrying in 0.01s.
W, [2021-06-05T06:25:51.864101 #18] WARN -- InfluxDB: Failed to contact host db: #<SocketError: Failed to open TCP connection to db:8086 (getaddrinfo: Name does not resolve)> - retrying in 0.02s.
W, [2021-06-05T06:25:51.894951 #18] WARN -- InfluxDB: Failed to contact host db: #<SocketError: Failed to open TCP connection to db:8086 (getaddrinfo: Name does not resolve)> - retrying in 0.04s.
W, [2021-06-05T06:25:51.945370 #18] WARN -- InfluxDB: Failed to contact host db: #<SocketError: Failed to open TCP connection to db:8086 (getaddrinfo: Name does not resolve)> - retrying in 0.08s.
W, [2021-06-05T06:25:52.036232 #18] WARN -- InfluxDB: Failed to contact host db: #<SocketError: Failed to open TCP connection to db:8086 (getaddrinfo: Name does not resolve)> - retrying in 0.16s.
W, [2021-06-05T06:25:52.206989 #18] WARN -- InfluxDB: Failed to contact host db: #<SocketError: Failed to open TCP connection to db:8086 (getaddrinfo: Name does not resolve)> - retrying in 0.32s.
W, [2021-06-05T06:25:52.538408 #18] WARN -- InfluxDB: Failed to contact host db: #<SocketError: Failed to open TCP connection to db:8086 (getaddrinfo: Name does not resolve)> - retrying in 0.64s.
W, [2021-06-05T06:25:53.190242 #18] WARN -- InfluxDB: Failed to contact host db: #<SocketError: Failed to open TCP connection to db:8086 (getaddrinfo: Name does not resolve)> - retrying in 1.28s.
W, [2021-06-05T06:25:54.481177 #18] WARN -- InfluxDB: Failed to contact host db: #<SocketError: Failed to open TCP connection to db:8086 (getaddrinfo: Name does not resolve)> - retrying in 2.56s.
W, [2021-06-05T06:25:57.052269 #18] WARN -- InfluxDB: Failed to contact host db: #<SocketError: Failed to open TCP connection to db:8086 (getaddrinfo: Name does not resolve)> - retrying in 5.12s.
W, [2021-06-05T06:26:02.279800 #18] WARN -- InfluxDB: Failed to contact host db: #<SocketError: Failed to open TCP connection to db:8086 (getaddrinfo: Name does not resolve)> - retrying in 10.24s.
....

This goes on and on until the delay reaches the maximum period of 30 seconds, and then it keeps retrying at that interval forever.
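To make the pattern in the log concrete: the delay doubles from 0.01s on each attempt and is capped at 30s. Below is a minimal sketch of that schedule. The helper name `backoff_delays` and its keyword arguments are hypothetical and only model what the log shows; this is not the client's actual code.

```ruby
# Hypothetical model of the delays seen in the log: start at 0.01s,
# double after every failed attempt, never exceed 30s.
def backoff_delays(initial_delay: 0.01, max_delay: 30.0)
  Enumerator.new do |yielder|
    delay = initial_delay
    loop do
      yielder << delay
      delay = [delay * 2, max_delay].min
    end
  end
end

# With a limit of 10 retries, only the first 10 delays would ever be used.
limited = backoff_delays.first(10)
puts "retry 10 -> last delay #{limited.last}s, total wait #{limited.sum.round(2)}s"

# Without a limit the schedule never ends; it just sits at the 30s cap.
puts "delays 12 to 14 without a limit: #{backoff_delays.first(14).last(3).inspect}"
```

Under this model a limit of 10 would stop after the 5.12s wait, yet the log already shows an eleventh warning announcing a 10.24s wait, so the limit is apparently not being applied.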

Clearly, the retry setting is not taking effect at all.

Moreover, I could hardly believe what I saw, since from the source code the plugin simply calls the Ruby influxdb library and has very little logic of its own. So instead of taking the retry parameter from the config, I hard-coded it to 10 in the source at https://github.com/fangli/fluent-plugin-influxdb/blob/ef3e5c359dc8152dde249f0d77d1221ac3181064/lib/fluent/plugin/out_influxdb.rb#L88, like this:

    @influxdb ||= InfluxDB::Client.new @dbname,
      hosts: @host.split(','),
      port: @port,
      username: @user,
      password: @password,
      async: false,
      retry: 10,

Then I built the gem, installed it in the fluentd image, and ran it. Guess what? It works.

I have limited knowledge of Ruby and cannot tell what is happening here. Can anyone confirm whether this is a bug?
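For anyone digging into this: the symptom would be consistent with the configured value never reaching the client, for example if the value is nil by the time the client is created. The sketch below only illustrates that hypothesis. `effective_retry` and `describe` are hypothetical helpers, and the rule that a missing value means "retry forever" is an assumption, not something I have verified against the influxdb gem.

```ruby
# Hypothetical stand-in for how a client might interpret its retry option.
# ASSUMPTION: nil or true means "retry forever" (-1), false means "never retry".
def effective_retry(value)
  case value
  when Integer   then value
  when true, nil then -1
  when false     then 0
  else raise ArgumentError, "unsupported retry value: #{value.inspect}"
  end
end

# Turn the normalized value into a human-readable description.
def describe(value)
  limit = effective_retry(value)
  limit.negative? ? "retry forever" : "give up after #{limit} retries"
end

puts "literal 10 (the hard-coded experiment): #{describe(10)}"
puts "nil (value lost before reaching the client): #{describe(nil)}"
```

If that assumption holds, logging `@retry.inspect` just before the client is created would show whether the value from the config actually arrives there.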