moby / moby

The Moby Project - a collaborative project for the container ecosystem to assemble container-based systems
https://mobyproject.org/
Apache License 2.0

Docker container stops resolving names #47414

Open fireman777 opened 4 months ago

fireman777 commented 4 months ago

Description

On RHEL 8.9 with Docker v25.0.1, docker compose (v2.24.2) creates a container from this docker-compose.yml:

version: '3.4'
services:
  <service>:
    image: <image>
    restart: always
    privileged: true
    environment:
      HTTPS_PROXY: "<HTTPS_PROXY>"
    volumes:
      - /var/run/docker.sock:/var/run/docker.sock
      - /home/<user>/<folder>/config:/etc/<folder>

After some time, the container stops resolving names:

docker exec -it <container> nslookup google.com
;; connection timed out; no servers could be reached

Reproduce

  1. Run docker compose up with the docker-compose.yml above
  2. (After some time): docker exec -it <container> nslookup google.com
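
Because the failure only appears "after some time", it can help to record exactly when resolution first breaks, so the matching daemon log lines are easy to find. A minimal sketch, not part of the original report: `watch_until_fail` is a hypothetical helper name, and it assumes the probe command (here `nslookup`) exits non-zero when resolution fails. The `-it` flags are dropped because the loop does not run in an interactive terminal.

```shell
#!/bin/sh
# Poll a command until it fails, then print a UTC timestamp of the failure.
# Usage: watch_until_fail <interval_seconds> <command> [args...]
watch_until_fail() {
    interval=$1
    shift
    # Keep looping while the probe command succeeds.
    while "$@" >/dev/null 2>&1; do
        sleep "$interval"
    done
    echo "first failure at $(date -u +%Y-%m-%dT%H:%M:%SZ)"
}

# Example, with the container name left as a placeholder as in the report:
# watch_until_fail 60 docker exec <container> nslookup google.com
```

The printed timestamp can then be compared against the daemon log around the same time.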

Expected behavior

The container should keep resolving names.

docker version

Client: Docker Engine - Community
 Version:           25.0.1
 API version:       1.44
 Go version:        go1.21.6
 Git commit:        29cf629
 Built:             Tue Jan 23 23:10:32 2024
 OS/Arch:           linux/amd64
 Context:           default

Server: Docker Engine - Community
 Engine:
  Version:          25.0.1
  API version:      1.44 (minimum version 1.24)
  Go version:       go1.21.6
  Git commit:       71fa3ab
  Built:            Tue Jan 23 23:09:31 2024
  OS/Arch:          linux/amd64
  Experimental:     false
 containerd:
  Version:          1.6.27
  GitCommit:        a1496014c916f9e62104b33d1bb5bd03b0858e59
 runc:
  Version:          1.1.11
  GitCommit:        v1.1.11-0-g4bccb38
 docker-init:
  Version:          0.19.0
  GitCommit:        de40ad0

docker info

docker info
Client: Docker Engine - Community
 Version:    25.0.1
 Context:    default
 Debug Mode: false
 Plugins:
  buildx: Docker Buildx (Docker Inc.)
    Version:  v0.12.1
    Path:     /usr/libexec/docker/cli-plugins/docker-buildx
  compose: Docker Compose (Docker Inc.)
    Version:  v2.24.2
    Path:     /usr/libexec/docker/cli-plugins/docker-compose

Server:
 Containers: 15
  Running: 1
  Paused: 0
  Stopped: 14
 Images: 12
 Server Version: 25.0.1
 Storage Driver: overlay2
  Backing Filesystem: xfs
  Supports d_type: true
  Using metacopy: false
  Native Overlay Diff: true
  userxattr: false
 Logging Driver: json-file
 Cgroup Driver: cgroupfs
 Cgroup Version: 1
 Plugins:
  Volume: local
  Network: bridge host ipvlan macvlan null overlay
  Log: awslogs fluentd gcplogs gelf journald json-file local splunk syslog
 Swarm: inactive
 Runtimes: io.containerd.runc.v2 runc
 Default Runtime: runc
 Init Binary: docker-init
 containerd version: a1496014c916f9e62104b33d1bb5bd03b0858e59
 runc version: v1.1.11-0-g4bccb38
 init version: de40ad0
 Security Options:
  seccomp
   Profile: builtin
 Kernel Version: 4.18.0-513.11.1.el8_9.x86_64
 Operating System: Red Hat Enterprise Linux 8.9 (Ootpa)
 OSType: linux
 Architecture: x86_64
 CPUs: 16
 Total Memory: 31.22GiB
 Name: <server_name>
 ID: b0678201-93f2-4768-831c-bc35cef26a5b
 Docker Root Dir: /data/docker
 Debug Mode: false
 HTTP Proxy: http://<server_name>:80/
 HTTPS Proxy: http://<server_name>:80/
 Experimental: false
 Insecure Registries:
  127.0.0.0/8
 Live Restore Enabled: false

Additional Info

Docker logs from container:

...
WARNING: Checking for jobs... failed    runner=<runner> status=couldn't execute POST against https://gitlab.dx1.<name>.com/api/v4/jobs/request: Post "https://gitlab.dx1.<name>.com/api/v4/jobs/request": proxyconnect tcp: dial tcp: lookup <server_name> on 127.0.0.11:53: read udp 127.0.0.1:43286->127.0.0.11:53: i/o timeout
WARNING: Checking for jobs... failed    runner=<runner> status=couldn't execute POST against https://gitlab.dx1.<name>.com/api/v4/jobs/request: Post "https://gitlab.dx1.<name>.com/api/v4/jobs/request": proxyconnect tcp: dial tcp: lookup <server_name> on 127.0.0.11:53: read udp 127.0.0.1:58358->127.0.0.11:53: i/o timeout
...

akerouanton commented 4 months ago

Hi @fireman777, thanks for reporting. Can you try turning on the daemon debug logs (see https://docs.docker.com/config/daemon/logs/#enable-debugging) and paste the log lines about DNS stuff written around the time resolution starts failing please?

fireman777 commented 4 months ago

Hi @akerouanton, thanks for the response. The Docker logs are attached. They are not in debug mode, because enabling it requires a restart, and after a restart the issue disappears. docker_logs.txt

thaJeztah commented 4 months ago

The logs seem to indicate that Docker's internal DNS resolver is unable to reach the DNS server(s) configured on the host:

[resolver] failed to query DNS server: 10.115.11.146:53, query: ;webproxy.hzl.mgmt.services.\tIN\t A" error="read udp 172.19.0.2:46361->10.115.11.146:53: i/o timeout
[resolver] failed to query DNS server: 10.44.139.225:53, query: ;webproxy.hzl.mgmt.services.\tIN\t A" error="read udp 172.19.0.2:53991->10.44.139.225:53: i/o timeout
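
To see at a glance which upstream servers are timing out, lines like these can be reduced to a unique list of server addresses. A rough sketch, assuming the log lines keep the `failed to query DNS server: <ip>:<port>,` shape shown above; `failed_dns_servers` is a hypothetical helper name, not a Docker command.

```shell
#!/bin/sh
# Read daemon log lines on stdin and print each upstream DNS server IP
# that appears in a "failed to query DNS server" message, once per server.
failed_dns_servers() {
    sed -n 's/.*failed to query DNS server: \([0-9.]*\):[0-9]*,.*/\1/p' | sort -u
}

# Example, using the attachment name from this thread:
# failed_dns_servers < docker_logs.txt
```

If every configured upstream server shows up in the list, the problem is more likely connectivity from the container network to the outside than a single bad DNS server.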

fireman777 commented 4 months ago

Yes, @thaJeztah, the network problem is what prevents us from resolving names. I can't ping any host from the container (previously I could). After the Docker service restarts, the problem disappears for some time.

akerouanton commented 4 months ago

@fireman777 Can you paste the output of docker network inspect for your network please?

fireman777 commented 4 months ago

Yes, sure (file is attached). network_inspect.txt The label "com.docker.compose.version": "1.29.2" still shows the previous version, but I don't think that's critical, since the problem was the same before the Docker update. We updated Docker and Docker Compose to try to fix this problem, but it didn't help.

fireman777 commented 4 months ago

I've enabled debug logging and restarted Docker. As soon as Docker hits the network connectivity issue again, I'll add the debug logs.