Closed: far-blue closed this issue 1 year ago
Issue #12101 is very likely related.
This issue will be fixed in version 7.43.0 with https://github.com/DataDog/datadog-agent/pull/15138.
@vickenty As far as I can tell from release notes, this one didn't make it in. Can you confirm?
@grickit The fix was merged without a release note unfortunately, but it was included in the release.
I am regularly seeing errors during the handling of container removal events, which then lead to no further docker events being processed (e.g. no logs are gathered for new containers) until the dd agent is restarted.
Example logs around the error:
Agent Environment
Version:
Status:
Describe what happened:
I'm seeing this on average a couple of times a day on each of 12 cluster nodes. Through the day, as new services (consisting of one or more containers) are deployed and old versions of services are terminated, logs for the new services stop appearing in Datadog. Existing services continue to log without issue. From the logs, the cause appears to be an error while handling the shutdown of containers, after which the dd agent fails to handle any further docker events until it is restarted.
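To make the symptom concrete, here is a toy model of the behaviour described above. It is not the agent's code: the event shape, the `handle` function and both loops are hypothetical, and the only point is to show how one unhandled error on a removal event can silently end all later event processing.

```python
from typing import Dict, Iterable, List, Set

Event = Dict[str, str]  # hypothetical shape: {"action": ..., "id": ...}


def handle(event: Event, tracked: Set[str]) -> None:
    """Hypothetical handler that fails on a removal for an untracked container."""
    if event["action"] == "start":
        tracked.add(event["id"])
    elif event["action"] == "destroy":
        if event["id"] not in tracked:
            raise KeyError(event["id"])
        tracked.discard(event["id"])


def fragile_loop(events: Iterable[Event]) -> Set[str]:
    """One handler error ends the loop, so every later event is dropped."""
    tracked: Set[str] = set()
    try:
        for event in events:
            handle(event, tracked)
    except KeyError:
        pass  # the loop is already over at this point
    return tracked


def resilient_loop(events: Iterable[Event]) -> Set[str]:
    """A handler error skips only the offending event."""
    tracked: Set[str] = set()
    for event in events:
        try:
            handle(event, tracked)
        except KeyError:
            continue
    return tracked


EVENTS: List[Event] = [
    {"action": "start", "id": "old-service"},
    {"action": "destroy", "id": "unknown"},  # the removal that triggers the error
    {"action": "start", "id": "new-service"},
]
```

In this model `fragile_loop(EVENTS)` never registers `new-service`, which matches what I observe: existing containers keep logging while new ones are ignored.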
Describe what you expected: I'd expect the dd agent to handle the termination of docker containers, and the identification and registration of new containers, correctly and consistently.
Steps to reproduce the issue: While this issue happens frequently across all of our cluster nodes, I have also seen it happen with just datadog-agent and docker on a standard Ubuntu setup, with manually started and stopped containers. I cannot, however, see a pattern of behaviour that reliably triggers the error.
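Since there is no reliable trigger, one way to raise the odds is to churn short-lived containers in a loop. The following is a rough sketch rather than a guaranteed reproduction: the `alpine` image, the `dd-churn-` container names, and the round and pause values are all arbitrary assumptions. It only uses the standard `docker run` and `docker stop` CLI commands.

```python
import subprocess
import time
from typing import Callable, List


def run_cmd(name: str, image: str = "alpine", seconds: int = 30) -> List[str]:
    """Build a `docker run` command for a detached, auto-removed container."""
    return ["docker", "run", "--detach", "--rm", "--name", name,
            image, "sleep", str(seconds)]


def stop_cmd(name: str) -> List[str]:
    """Build a `docker stop` command with a 1 second grace period."""
    return ["docker", "stop", "-t", "1", name]


def churn(rounds: int = 20, pause: float = 1.0,
          runner: Callable = subprocess.run) -> None:
    """Start and stop one container per round to generate removal events."""
    for i in range(rounds):
        name = f"dd-churn-{i}"  # hypothetical naming scheme
        runner(run_cmd(name), check=True)
        time.sleep(pause)
        # --rm means the stop is followed by a container removal event
        runner(stop_cmd(name), check=False)


if __name__ == "__main__":
    churn()
```

While this runs, watch the agent log for the removal error and check whether containers started afterwards still show up in the agent status.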
Additional environment details (Operating System, Cloud provider, etc): I've managed to get this to happen with the latest datadog-agent install and the latest docker-ce install from apt on Ubuntu 20.04 (with the latest updates).
It may be relevant that our docker is running in subuid/subgid mode (with the config value
"userns-remap": "default"
in daemon.json). As datadog-agent doesn't handle standard json-based log processing in this docker mode (because docker writes the container data to a different folder in this mode), we also use journald as our docker log driver. Our daemon.json docker config is: