Vector Version
Vector Configuration File
We use the standard docker_logs source and communicate with the Docker socket:
# https://vector.dev/docs/reference/sources/docker_logs/
# Collects all docker logs on host through /var/run/docker.sock
[sources.docker]
type = "docker_logs"
exclude_containers = [list of excluded containers]
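As a side note, the same socket can be queried directly to compare what the daemon reports against what Vector is watching; a minimal sketch, assuming the default /var/run/docker.sock path from the comment above:
# Sketch: ask the Docker daemon (over the same socket Vector uses) which containers it knows about.
curl --silent --unix-socket /var/run/docker.sock "http://localhost/containers/json?all=true"
# Roughly equivalent CLI view:
docker ps -a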
Debug Output
Expected Behavior
Vector should not attempt to watch logs of container_ids that no longer exist
Actual Behavior
While Vector is running, a very short-lived pod (one that actually crash loops) starts.
Vector shows that it has started watching that container's logs:
Aug 25 18:25:35.783 INFO vector::internal_events::docker_logs: Stopped watching for container logs. container_id=62a39bfabee39106099e23cc5a5db6df97098cbcd4a9282b73eb51f16046b4a2
Aug 25 18:25:35.783 INFO vector::internal_events::docker_logs: Started watching for container logs. container_id=62a39bfabee39106099e23cc5a5db6df97098cbcd4a9282b73eb51f16046b4a2
Watching docker ps -a, I can see the pod exited N minutes ago. However, once docker ps -a no longer shows any container for 62a39bfabee39106099e23cc5a5db6df97098cbcd4a9282b73eb51f16046b4a2,
we begin seeing communication errors for that container_id:
Aug 25 18:34:58.973 ERROR vector::internal_events::docker_logs: Internal log [Error in communication with Docker daemon.] has been rate limited 4432 times.
Aug 25 18:34:58.973 ERROR vector::internal_events::docker_logs: Error in communication with Docker daemon. error=DockerResponseNotFoundError { message: "{\"message\":\"No such container: 62a39bfabee39106099e23cc5a5db6df97098cbcd4a9282b73eb51f16046b4a2\"}\n" } container_id=Some("62a39bfabee39106099e23cc5a5db6df97098cbcd4a9282b73eb51f16046b4a2") internal_log_rate_secs=10
Aug 25 18:34:58.973 ERROR vector::internal_events::docker_logs: Internal log [Error in communication with Docker daemon.] is being rate limited.
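At that point the daemon itself can be asked about the ID from the errors above to confirm it really is gone; a small sketch, run on the affected node (the ID is the one from the log excerpt):
# Confirm the daemon no longer knows this container while Vector keeps retrying it.
ID=62a39bfabee39106099e23cc5a5db6df97098cbcd4a9282b73eb51f16046b4a2
docker ps -a --filter "id=$ID"   # prints only the header once the container is gone
docker inspect "$ID"             # fails with "No such object", matching the 404 in Vector's error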
You will also see in the Vector logs that it is spamming Started/Stopped watching messages for that container ID (and for other container IDs that have hit the same problem before).
It's as if Vector is not refreshing the list of container IDs it should be monitoring, and these communication errors just pile up:
Aug 25 19:24:53.790 INFO vector::internal_events::docker_logs: Started watching for container logs. container_id=9396ebf07903e56f00377175c5f1253942267e64e259db00144dea18cd545b33
Aug 25 19:24:53.790 INFO vector::internal_events::docker_logs: Stopped watching for container logs. container_id=9729d1b5085421a4f92c7e6ee39cdc83bd2cb64d479047935424fcad862380fa
Aug 25 19:24:53.790 INFO vector::internal_events::docker_logs: Started watching for container logs. container_id=9729d1b5085421a4f92c7e6ee39cdc83bd2cb64d479047935424fcad862380fa
Aug 25 19:24:53.790 INFO vector::internal_events::docker_logs: Stopped watching for container logs. container_id=62a39bfabee39106099e23cc5a5db6df97098cbcd4a9282b73eb51f16046b4a2
Aug 25 19:24:53.790 INFO vector::internal_events::docker_logs: Started watching for container logs. container_id=62a39bfabee39106099e23cc5a5db6df97098cbcd4a9282b73eb51f16046b4a2
Aug 25 19:24:53.790 INFO vector::internal_events::docker_logs: Stopped watching for container logs. container_id=1bc6ac9eab36b841ae694762a29ec90e955dd7363206ea7209e9a6f6960a792e
Aug 25 19:24:53.790 INFO vector::internal_events::docker_logs: Started watching for container logs. container_id=1bc6ac9eab36b841ae694762a29ec90e955dd7363206ea7209e9a6f6960a792e
These communication errors never seem to stop until we restart the Vector container.
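For reference, that restart looks roughly like the sketch below for a DaemonSet deployment; the namespace and DaemonSet name are placeholders, not our actual values:
# Workaround sketch: roll the Vector DaemonSet to get fresh pods. Names are placeholders.
kubectl -n <vector-namespace> rollout restart daemonset/<vector-daemonset>
kubectl -n <vector-namespace> rollout status daemonset/<vector-daemonset>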
Example Data
Additional Context
Vector runs as a DaemonSet in Kubernetes. I have not noticed this issue with any other crash-looping container; perhaps it is related to how short-lived the container is?
The pod itself runs multiple containers, and only one of them is crash looping. In other words, the pod sits at something like a 5/6 ready container count, if that makes a difference.
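For illustration only, the pod status we observe looks roughly like the hypothetical output below (pod name, namespace, restart count, and age are made up):
kubectl -n <app-namespace> get pod <pod-name>
# NAME         READY   STATUS             RESTARTS   AGE
# <pod-name>   5/6     CrashLoopBackOff   12         3h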