docker / for-win

Bug reports for Docker Desktop for Windows
https://www.docker.com/products/docker#/windows

Windows containers fail to startup randomly (The connection with the virtual machine or container was closed) #13861

Open debug-richard opened 9 months ago

debug-richard commented 9 months ago

Description

I run custom GitLab CI runners that use Docker Windows containers for testing. All hosts run Windows 11 with Docker Desktop installed. Each host launches several hundred containers per day from several images, each about 25 GB with roughly 40 layers. The base image is always mcr.microsoft.com/windows:ltsc2019.

Months of logging across thousands of container launches show that ~5% of all launches fail at random.
docker run prints no error message and simply exits with code -1.
If I have the container create a file (on a volume) when it starts, that step is never executed, so the container probably never starts at all.

The only error messages I can find are in the Windows log:

sending event [container=fa09bb2b554f0b9998d18e563bd133d6b27f91b1e23985ffa90643f6ccc4ef63 event=start event-info={fa09bb2b554f0b9998d18e563bd133d6b27f91b1e23985ffa90643f6ccc4ef63 fa09bb2b554f0b9998d18e563bd133d6b27f91b1e23985ffa90643f6ccc4ef63 2112 0 0001-01-01 00:00:00 +0000 UTC <nil>} module=libcontainerd namespace=moby]

non-zero last wait result [traceID=18f027ad612cb05269dad2b39516c179 spanID=71697c6fdfdb4195 wait-result=-1070137082]

failed to shutdown container, and subsequent terminate also failed [namespace=moby error=container fa09bb2b554f0b9998d18e563bd133d6b27f91b1e23985ffa90643f6ccc4ef63 encountered an error during hcs::System::waitBackground: failure in a Windows system call: The connection with the virtual machine or container was closed. (0xc037010a) container=fa09bb2b554f0b9998d18e563bd133d6b27f91b1e23985ffa90643f6ccc4ef63 module=libcontainerd]
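
For comparison with the hex code in the last message, the signed decimal `wait-result` from the second message can be reinterpreted as an unsigned 32-bit value. This is only a small sketch of that conversion; the helper name `to_hresult_hex` is made up for illustration and is not part of Docker or any Windows API.

```python
def to_hresult_hex(signed_code: int) -> str:
    """Reinterpret a signed 32-bit integer as unsigned and format it as hex."""
    # Masking with 0xFFFFFFFF maps negative values to their two's-complement form.
    return f"0x{signed_code & 0xFFFFFFFF:08x}"


# Value copied from the "non-zero last wait result" log line above.
wait_result = -1070137082
print(to_hresult_hex(wait_result))  # prints 0xc0370106
```

The result, 0xc0370106, is in the same 0xc0370xxx range as the 0xc037010a reported by hcs::System::waitBackground, but it is not the same code, so the two log lines appear to report two different errors.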

Reproduce

docker run
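
Since the failure is intermittent, a single docker run will usually succeed. A rough sketch of how the failure rate could be measured by launching many containers in a loop is shown here. The container command (`cmd /c exit 0`), the `--rm` flag, and the helper names `run_once` and `count_failures` are assumptions for illustration; the report does not state the exact command line used by the runners.

```python
import subprocess
from typing import Callable

# Base image named in the description.
IMAGE = "mcr.microsoft.com/windows:ltsc2019"


def run_once() -> int:
    """Start a throwaway container and return the exit code of docker run."""
    # Assumption: a trivial command that should always exit with 0.
    cmd = ["docker", "run", "--rm", IMAGE, "cmd", "/c", "exit 0"]
    return subprocess.run(cmd).returncode


def count_failures(launch: Callable[[], int], attempts: int) -> int:
    """Call launch() the given number of times and count non-zero exit codes."""
    return sum(1 for _ in range(attempts) if launch() != 0)
```

Calling `count_failures(run_once, 200)` on an affected host would be expected to return a non-zero count if the ~5% failure rate holds. Any non-zero exit code is counted as a failure because the way -1 is reported can differ between shells and APIs.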

Expected behavior

docker run should not exit with -1

docker version

Client:
 Cloud integration: v1.0.35+desktop.5
 Version:           24.0.6
 API version:       1.43
 Go version:        go1.20.7
 Git commit:        ed223bc
 Built:             Mon Sep  4 12:32:48 2023
 OS/Arch:           windows/amd64
 Context:           default

Server: Docker Desktop 4.25.0 (126437)
 Engine:
  Version:          24.0.6
  API version:      1.43 (minimum version 1.24)
  Go version:       go1.20.7
  Git commit:       1a79695
  Built:            Mon Sep  4 12:31:39 2023
  OS/Arch:          windows/amd64
  Experimental:     false

docker info

Client:
 Version:    24.0.6
 Context:    default
 Debug Mode: false
 Plugins:
  buildx: Docker Buildx (Docker Inc.)
    Version:  v0.11.2-desktop.5
    Path:     C:\Program Files\Docker\cli-plugins\docker-buildx.exe
  compose: Docker Compose (Docker Inc.)
    Version:  v2.23.0-desktop.1
    Path:     C:\Program Files\Docker\cli-plugins\docker-compose.exe
  dev: Docker Dev Environments (Docker Inc.)
    Version:  v0.1.0
    Path:     C:\Program Files\Docker\cli-plugins\docker-dev.exe
  extension: Manages Docker extensions (Docker Inc.)
    Version:  v0.2.20
    Path:     C:\Program Files\Docker\cli-plugins\docker-extension.exe
  init: Creates Docker-related starter files for your project (Docker Inc.)
    Version:  v0.1.0-beta.9
    Path:     C:\Program Files\Docker\cli-plugins\docker-init.exe
  sbom: View the packaged-based Software Bill Of Materials (SBOM) for an image (Anchore Inc.)
    Version:  0.6.0
    Path:     C:\Program Files\Docker\cli-plugins\docker-sbom.exe
  scan: Docker Scan (Docker Inc.)
    Version:  v0.26.0
    Path:     C:\Program Files\Docker\cli-plugins\docker-scan.exe
  scout: Docker Scout (Docker Inc.)
    Version:  v1.0.9
    Path:     C:\Program Files\Docker\cli-plugins\docker-scout.exe

Server:
 Containers: 0
  Running: 0
  Paused: 0
  Stopped: 0
 Images: 2
 Server Version: 24.0.6
 Storage Driver: windowsfilter
  Windows:
 Logging Driver: json-file
 Plugins:
  Volume: local
  Network: ics internal l2bridge l2tunnel nat null overlay private transparent
  Log: awslogs etwlogs fluentd gcplogs gelf json-file local logentries splunk syslog
 Swarm: inactive
 Default Isolation: hyperv
 Kernel Version: 10.0 22621 (22621.1.amd64fre.ni_release.220506-1250)
 Operating System: Microsoft Windows Version 22H2 (OS Build 22621.2134)
 OSType: windows
 Architecture: x86_64
 CPUs: 20
 Total Memory: 31.68GiB
 Name: a1-runner
 ID: 86c1d550-9bb9-41d1-b6e1-bfd5663861b3
 Docker Root Dir: C:\ProgramData\Docker
 Debug Mode: false
 Experimental: false
 Insecure Registries:
  127.0.0.0/8
 Live Restore Enabled: false
 Product License: Community Engine

Diagnostics ID

0

Additional Info

No response

debug-richard commented 3 months ago

This still happens with 4.30.0.