fluentd: 1.0
influxdb: 1.8
fluent-plugin-influxdb: 2.0
Environment: fluentd and influxdb run in Docker containers
influx task config:
<match **>
  @type influxdb
  host db
  port 8086
  dbname mydb
  retry 10
  time_precision ns
  tag_keys ["application", "type", "fn", "result", "dataVersion", "axonVersion", "message"]
  time_key time
</match>
How to reproduce:
Start fluentd and influxdb; both run normally.
Run docker stop influxdb; fluentd reports connection errors and starts trying to reconnect to the database.
Expected:
With retry set, I expect the connection to be retried only the number of times I specified.
Actual:
Retries continue forever.
Log segment:
W, [2021-06-05T06:25:51.666174 #18] WARN -- InfluxDB: Failed to contact host db: #<SocketError: Failed to open TCP connection to db:8086 (getaddrinfo: Name does not resolve)> - retrying in 0.01s.
W, [2021-06-05T06:25:51.864101 #18] WARN -- InfluxDB: Failed to contact host db: #<SocketError: Failed to open TCP connection to db:8086 (getaddrinfo: Name does not resolve)> - retrying in 0.02s.
W, [2021-06-05T06:25:51.894951 #18] WARN -- InfluxDB: Failed to contact host db: #<SocketError: Failed to open TCP connection to db:8086 (getaddrinfo: Name does not resolve)> - retrying in 0.04s.
W, [2021-06-05T06:25:51.945370 #18] WARN -- InfluxDB: Failed to contact host db: #<SocketError: Failed to open TCP connection to db:8086 (getaddrinfo: Name does not resolve)> - retrying in 0.08s.
W, [2021-06-05T06:25:52.036232 #18] WARN -- InfluxDB: Failed to contact host db: #<SocketError: Failed to open TCP connection to db:8086 (getaddrinfo: Name does not resolve)> - retrying in 0.16s.
W, [2021-06-05T06:25:52.206989 #18] WARN -- InfluxDB: Failed to contact host db: #<SocketError: Failed to open TCP connection to db:8086 (getaddrinfo: Name does not resolve)> - retrying in 0.32s.
W, [2021-06-05T06:25:52.538408 #18] WARN -- InfluxDB: Failed to contact host db: #<SocketError: Failed to open TCP connection to db:8086 (getaddrinfo: Name does not resolve)> - retrying in 0.64s.
W, [2021-06-05T06:25:53.190242 #18] WARN -- InfluxDB: Failed to contact host db: #<SocketError: Failed to open TCP connection to db:8086 (getaddrinfo: Name does not resolve)> - retrying in 1.28s.
W, [2021-06-05T06:25:54.481177 #18] WARN -- InfluxDB: Failed to contact host db: #<SocketError: Failed to open TCP connection to db:8086 (getaddrinfo: Name does not resolve)> - retrying in 2.56s.
W, [2021-06-05T06:25:57.052269 #18] WARN -- InfluxDB: Failed to contact host db: #<SocketError: Failed to open TCP connection to db:8086 (getaddrinfo: Name does not resolve)> - retrying in 5.12s.
W, [2021-06-05T06:26:02.279800 #18] WARN -- InfluxDB: Failed to contact host db: #<SocketError: Failed to open TCP connection to db:8086 (getaddrinfo: Name does not resolve)> - retrying in 10.24s.
....
And so on, until the delay reaches the maximum period of 30 seconds, after which it keeps retrying at that interval forever.
Evidently the retry setting has no effect at all.
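To make the gap between expected and actual behaviour concrete, here is a minimal Ruby sketch of the backoff pattern visible in the log: the delay starts at 0.01s, doubles after each failure, and is capped at 30s. This is only a model of what the log shows, not the influxdb client's actual code. The method name backoff_delays, its parameters, and the convention that -1 means "retry forever" are all assumptions made for illustration.

```ruby
# Hypothetical model of the retry schedule seen in the log above.
# Not taken from the influxdb client; names and the "-1 means unlimited"
# convention are assumptions for this sketch.
def backoff_delays(max_retries:, initial_delay: 0.01, max_delay: 30.0, attempts_to_show: 15)
  delays = []
  delay = initial_delay
  attempt = 0

  loop do
    attempt += 1
    # Stop once the configured retry limit is exhausted (skipped when unlimited).
    break if max_retries >= 0 && attempt > max_retries
    # Safety cap so the unlimited case still terminates in this sketch.
    break if attempt > attempts_to_show

    delays << delay
    # Double the delay after each failure, but never exceed the maximum.
    delay = [delay * 2, max_delay].min
  end

  delays
end

# With a limit of 10, the schedule ends after the 5.12s delay.
puts backoff_delays(max_retries: 10).inspect

# With no effective limit, the delay keeps growing and then stays at 30s.
puts backoff_delays(max_retries: -1).inspect
```

Under this model, retry 10 would stop after the "retrying in 5.12s" attempt. The log instead continues with "retrying in 10.24s" and beyond, which matches the unlimited case and suggests the configured value never reaches the client.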
Moreover, I could hardly believe what I saw, since judging from the source code the plugin simply calls the influxdb Ruby library and has very little logic of its own. So instead of taking the retry parameter from the config, I hard-coded it in the source at https://github.com/fangli/fluent-plugin-influxdb/blob/ef3e5c359dc8152dde249f0d77d1221ac3181064/lib/fluent/plugin/out_influxdb.rb#L88, setting it to 10 like this:
@influxdb ||= InfluxDB::Client.new @dbname, hosts: @host.split(','),
                                            port: @port,
                                            username: @user,
                                            password: @password,
                                            async: false,
                                            retry: 10,
Then I built the gem, installed it in the fluentd image, and ran it. Guess what? It works.
I have limited knowledge of Ruby and cannot tell what is happening here. Can anyone confirm whether this is a bug?