Open zarbis opened 7 years ago
ping @michaelwilde @outcoldman PTAL
@thaJeztah we were responsible for the Splunk driver.
@outcoldman oh, brainfart, I was looking at a different issue and completely miswired my thinking, LOL
Slightly related discussion https://github.com/moby/moby/issues/32567
ping @tagomoris @cpuguy83 PTAL
@zarbis do you think this can be avoided by removing fluentd-async-connect?
@max-lobur Yes, it works this way :) However, it's not an optimal solution, since those logs will not be buffered if Fluentd crashes, but it's still better than crashing the entire Docker daemon...
I think we faced the same problem once. Ended up running fluentd container with --network=host, listening on 127.0.0.1 and using fluentd-address=tcp://127.0.0.1:24224 for other containers (with async mode too). Never had any issues since then.
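A minimal sketch of the workaround described above, for anyone who wants to try it. The helper names (fluentd_conf, fluentd_log_opts), the ./fluent.conf path, the container name, the fluent/fluentd image and its /fluentd/etc/fluent.conf config location are assumptions for illustration, not something confirmed in this thread. Nothing touches Docker unless APPLY=1 is set.

```shell
#!/bin/sh
# Sketch only: Fluentd on the host network, bound to loopback, with other
# containers logging to tcp://127.0.0.1:24224 in async mode.
# Helper names, paths, and the image are illustrative assumptions.

# Print a minimal Fluentd config whose forward input listens on 127.0.0.1 only.
fluentd_conf() {
  cat <<'EOF'
<source>
  @type forward
  bind 127.0.0.1
  port 24224
</source>
<match **>
  @type stdout
</match>
EOF
}

# Print the logging options a container needs to send to that address.
fluentd_log_opts() {
  printf '%s\n' \
    "--log-driver=fluentd" \
    "--log-opt fluentd-address=tcp://127.0.0.1:24224" \
    "--log-opt fluentd-async-connect=true"
}

# Only run Docker when explicitly requested.
if [ "${APPLY:-0}" = 1 ]; then
  fluentd_conf > ./fluent.conf
  # Assumed image and config path; adjust to the Fluentd image you actually use.
  docker run -d --name fluentd --network=host \
    -v "$PWD/fluent.conf:/fluentd/etc/fluent.conf:ro" fluent/fluentd
  # Unquoted on purpose so the options split into separate arguments.
  docker run -d $(fluentd_log_opts) alpine \
    sh -c 'while true; do date; sleep 1; done'
fi
```

Saved under a name of your choice (for example workaround.sh), running it with APPLY=1 starts both containers; without it the script only defines the helpers.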
Description
Docker daemon doesn't correctly reconnect running containers' log streams to Fluentd after Fluentd is restarted.
Steps to reproduce the issue:
1. Configure daemon with Fluentd driver or run container/service with --log-driver=fluentd option.
2. Start global Fluent-bit service (reproduces with Fluentd too):
   docker service create --name fluent-bit --mode global -p 24224:24224 fluent/fluent-bit:latest
3. Start some test service:
   docker service create --name dater --mode global alpine sh -c 'while true; do date; sleep 1; done'
4. Confirm that fluent-bit receives logs:
   docker service logs -f fluent-bit
5. Remove and immediately re-create fluent-bit service.
6. Inspect fluent-bit service logs like in step 4.
7. Re-create test service.
8. Inspect fluent-bit service logs once more.
Describe the results you received:
After re-creating fluent-bit service it stops receiving logs from other containers until those containers are re-created.
Describe the results you expected:
Docker daemon reconnects to Fluentd output and resumes sending logs of running containers.
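The reproduction steps above can be scripted roughly as follows. This is a sketch, not a verified harness: the run, create_fluent_bit, create_dater and reproduce helpers are hypothetical names, and it assumes the daemon is already configured with the Fluentd log driver (step 1). It runs in dry-run mode by default and only prints the commands. The log inspection in steps 4, 6 and 8 is left manual because docker service logs -f blocks.

```shell
#!/bin/sh
# Dry-run by default: commands are printed, not executed.
# Set DRY_RUN=0 to actually run them against a swarm.
DRY_RUN="${DRY_RUN:-1}"

# Print the command in dry-run mode (quoting is not preserved in the
# printed form), otherwise execute it as given.
run() {
  if [ "$DRY_RUN" = 1 ]; then
    printf '+ %s\n' "$*"
  else
    "$@"
  fi
}

create_fluent_bit() {
  run docker service create --name fluent-bit --mode global \
    -p 24224:24224 fluent/fluent-bit:latest
}

create_dater() {
  run docker service create --name dater --mode global alpine \
    sh -c 'while true; do date; sleep 1; done'
}

reproduce() {
  create_fluent_bit                 # step 2
  create_dater                      # step 3
  # step 4 (manual): docker service logs -f fluent-bit
  run docker service rm fluent-bit  # step 5: remove ...
  create_fluent_bit                 #         ... and immediately re-create
  # step 6 (manual): logs from dater no longer arrive
  run docker service rm dater       # step 7: re-create the test service
  create_dater
  # step 8 (manual): logs arrive again
}
```

After sourcing the file, calling reproduce prints the six commands in order; with DRY_RUN=0 it executes them.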
Additional information you deem important (e.g. issue happens only occasionally):
Relevant parts of daemon logs:
For some reason it gives up faster than default fluentd-max-retries times fluentd-retry-wait (10 x 1 sec).
Relevant connections:
This stayed unchanged the whole time I was writing this report, about 30 minutes.
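To make the retry budget mentioned above concrete (fluentd-max-retries attempts, fluentd-retry-wait apart), here is a small sketch of a bounded retry loop. The retry helper is hypothetical and is not the driver's code; it assumes a fixed wait between attempts, which may not match the driver's actual behaviour. With 10 attempts and a 1 second wait it would keep trying for roughly ten seconds before giving up.

```shell
#!/bin/sh
# retry MAX WAIT CMD...
# Run CMD until it succeeds, at most MAX times, sleeping WAIT seconds
# between failed attempts. Hypothetical helper, not Docker's implementation.
# Sets ATTEMPTS to the number of tries made.
# Returns 0 on success, 1 once the budget is exhausted.
retry() {
  max="$1"
  wait_s="$2"
  shift 2
  ATTEMPTS=0
  while [ "$ATTEMPTS" -lt "$max" ]; do
    ATTEMPTS=$((ATTEMPTS + 1))
    if "$@"; then
      return 0
    fi
    # No need to sleep after the final failed attempt.
    if [ "$ATTEMPTS" -lt "$max" ]; then
      sleep "$wait_s"
    fi
  done
  return 1
}
```

For example, retry 10 1 some_connect_command (where some_connect_command is whatever check you use to reach port 24224) would stop after the first success or after ten failures.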
Output of docker version:
Output of docker info:
Additional environment details (AWS, VirtualBox, physical, etc.): Ubuntu 16.04 on physical server.