docker / compose

Define and run multi-container applications with Docker
https://docs.docker.com/compose/
Apache License 2.0

[BUG] When using DOCKER_HOST=ssh://macos.local get `FIXME: Got a status-code for which error does not match any expected type!!! error="error during connect:...` #10421

Open jamshid opened 1 year ago

jamshid commented 1 year ago

Description

I sometimes get this odd error, even from a simple docker-compose stop, when using an ssh DOCKER_HOST connection to a Mac running Docker Desktop. It normally works, but sometimes it gets into a state like this where every docker-compose operation returns that error.

This has been happening for the past few years, across all releases.

I've tried with and without the .ssh/config suggestions in https://docs.docker.com/engine/security/protect-access/, which do help on Linux but don't seem to help on macOS.

Is there any way to get more information about what is causing the problem? Can the FIXME output from --debug be improved to show more details?

% docker-compose --debug --ansi never stop --timeout 240
DEBU[0000] commandconn: starting ssh with [-- 192.168.1.50 docker system dial-stdio] 
DEBU[0000] commandconn: starting ssh with [-- 192.168.1.50 docker system dial-stdio] 
Container foo-test-1  Stopping
Container foo-socatetcd-1  Stopping
Container foo-socatbarapi-1  Stopping
Container foo-barcentos-1  Stopping
Container foo-socatbarscsp-1  Stopping
Container foo-socatbarconsole-1  Stopping
Container foo-s3ql-1  Stopping
Container foo-https-1  Stopping
DEBU[0000] commandconn: starting ssh with [-- 192.168.1.50 docker system dial-stdio] 
Container foo-grafana-1  Stopping
DEBU[0000] commandconn: starting ssh with [-- 192.168.1.50 docker system dial-stdio] 
Container foo-dnsmasq-1  Stopping
DEBU[0000] commandconn: starting ssh with [-- 192.168.1.50 docker system dial-stdio] 
DEBU[0000] commandconn: starting ssh with [-- 192.168.1.50 docker system dial-stdio] 
DEBU[0000] commandconn: starting ssh with [-- 192.168.1.50 docker system dial-stdio] 
DEBU[0000] commandconn: starting ssh with [-- 192.168.1.50 docker system dial-stdio] 
DEBU[0000] commandconn: starting ssh with [-- 192.168.1.50 docker system dial-stdio] 
Container foo-elasticsearchexporter-1  Stopping
Container foo-prometheus-1  Stopping
DEBU[0000] commandconn: starting ssh with [-- 192.168.1.50 docker system dial-stdio] 
DEBU[0000] commandconn: starting ssh with [-- 192.168.1.50 docker system dial-stdio] 
DEBU[0000] commandconn: starting ssh with [-- 192.168.1.50 docker system dial-stdio] 
DEBU[0000] commandconn: starting ssh with [-- 192.168.1.50 docker system dial-stdio] 
Container foo-socatbarscsp-1  Stopped
Container foo-socatbarconsole-1  Stopped
Container foo-socatbarapi-1  Stopped
Container foo-test-1  Stopped
DEBU[0000] FIXME: Got an status-code for which error does not match any expected type!!!  error="error during connect: Post \"http://docker.example.com/v1.41/containers/a1d7360a7d10d2edeb3c89e5132434711df92e5c67e46a42266e6cf82d56dbe6/stop?t=240\": command [ssh -- 192.168.1.50 docker system dial-stdio] has exited with signal: killed, please make sure the URL is valid, and Docker 18.09 or later is installed on the remote host: stderr=" module=api status_code=-1
Container foo-s3ql-1  Error while Stopping
DEBU[0000] FIXME: Got an status-code for which error does not match any expected type!!!  error="error during connect: Post \"http://docker.example.com/v1.41/containers/af28133c15a27eff2597b3df21bb5caee108171599961d87b983d37ff2567b29/stop?t=240\": command [ssh -- 192.168.1.50 docker system dial-stdio] has exited with signal: killed, please make sure the URL is valid, and Docker 18.09 or later is installed on the remote host: stderr=" module=api status_code=-1
Container foo-elasticsearchexporter-1  Error while Stopping
DEBU[0000] FIXME: Got an status-code for which error does not match any expected type!!!  error="error during connect: Post \"http://docker.example.com/v1.41/containers/b2625a418431cda19410df3205e4b6afe6102317be94d6b4b3fdd3b9e168fb84/stop?t=240\": command [ssh -- 192.168.1.50 docker system dial-stdio] has exited with signal: killed, please make sure the URL is valid, and Docker 18.09 or later is installed on the remote host: stderr=" module=api status_code=-1
Container foo-prometheus-1  Error while Stopping
Container foo-barcentos-1  Stopped
Container foo-grafana-1  Stopped
Container foo-socatetcd-1  Stopped
Container foo-etcd-1  Stopping
Container foo-https-1  Stopped
Container foo-dnsmasq-1  Stopped
DEBU[0000] FIXME: Got an status-code for which error does not match any expected type!!!  error="error during connect: Post \"http://docker.example.com/v1.41/containers/ef706bf48da05ce331791d81d8910ee36828a3c6e7ad93a51050ea3cabdda054/stop?t=240\": command [ssh -- 192.168.1.50 docker system dial-stdio] has exited with signal: killed, please make sure the URL is valid, and Docker 18.09 or later is installed on the remote host: stderr=" module=api status_code=-1
Container foo-etcd-1  Error while Stopping
error during connect: Post "http://docker.example.com/v1.41/containers/a1d7360a7d10d2edeb3c89e5132434711df92e5c67e46a42266e6cf82d56dbe6/stop?t=240": command [ssh -- 192.168.1.50 docker system dial-stdio] has exited with signal: killed, please make sure the URL is valid, and Docker 18.09 or later is installed on the remote host: stderr=
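
As a rough aid for reading debug output like the above, the following is a minimal Python sketch that counts how many ssh processes were started, records the last reported state of each container, and collects the container IDs from the failing stop requests. It assumes the log format shown here (the `commandconn: starting ssh with` marker, `Container <name>  <state>` lines, and `/containers/<id>/stop` URLs in the FIXME lines); the function name `summarize_debug_log` is hypothetical and not part of Docker Compose.

```python
import re
from collections import Counter

# Marker that appears in the debug output each time a new ssh process is spawned.
SSH_START = "commandconn: starting ssh with"
# Progress lines such as "Container foo-test-1  Stopped".
CONTAINER_LINE = re.compile(r"^\s*Container (\S+)\s+(.+?)\s*$")
# Container ID embedded in the failing stop request URL.
STOP_URL = re.compile(r"/containers/([0-9a-f]{64})/stop")


def summarize_debug_log(text):
    """Summarize `docker-compose --debug` output (format assumed from this report)."""
    ssh_starts = 0
    last_state = {}      # container name -> last state seen
    failed_ids = set()   # container IDs named in FIXME error lines
    for line in text.splitlines():
        if SSH_START in line:
            ssh_starts += 1
        if "FIXME:" in line:
            failed_ids.update(STOP_URL.findall(line))
            continue
        match = CONTAINER_LINE.match(line)
        if match:
            last_state[match.group(1)] = match.group(2)
    return {
        "ssh_starts": ssh_starts,
        "final_states": Counter(last_state.values()),
        "failed_container_ids": sorted(failed_ids),
    }


if __name__ == "__main__":
    import sys
    # Usage: docker-compose --debug --ansi never stop 2>&1 | python summarize.py
    print(summarize_debug_log(sys.stdin.read()))
```

Comparing `ssh_starts` with the number of failed IDs across several runs may show whether failures correlate with the number of parallel ssh connections, although that correlation is only a hypothesis here.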

Steps To Reproduce

Sorry, I don't have an easily reproducible scenario. It's a compose file with several services, some of them scaled. I use DOCKER_HOST=ssh:// pointing to my Mac mini (OS and Docker are up to date on both my MacBook and the mini). I'm happy to run with more debug options, or a special build with extra logging, if needed.
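
One way to look for a reproduction outside of Compose is to open several ssh sessions to the remote host at once and tally their exit codes. The Python sketch below does that under stated assumptions: the idea that parallel ssh sessions trigger the failure is an untested hypothesis, not something this report confirms; `docker version` is used as a harmless remote command; the default of 10 parallel sessions is arbitrary; and the helper names (`probe_command`, `run_once`, `probe`) are hypothetical.

```python
import subprocess
from collections import Counter
from concurrent.futures import ThreadPoolExecutor


def probe_command(host):
    # Same "ssh -- <host> docker ..." shape as in the debug log,
    # but running a harmless read-only command (an assumption for this sketch).
    return ["ssh", "--", host, "docker", "version"]


def run_once(cmd, timeout=30):
    """Run one command and return its exit code, or None on timeout.

    On POSIX systems a negative code -N means the local process
    was terminated by signal N (for example -9 for SIGKILL).
    """
    try:
        completed = subprocess.run(
            cmd,
            stdin=subprocess.DEVNULL,
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
            timeout=timeout,
        )
        return completed.returncode
    except subprocess.TimeoutExpired:
        return None


def probe(host, parallel=10, runner=run_once):
    """Start `parallel` sessions at once and count the exit codes."""
    cmd = probe_command(host)
    with ThreadPoolExecutor(max_workers=parallel) as pool:
        codes = list(pool.map(runner, [cmd] * parallel))
    return Counter(codes)


if __name__ == "__main__":
    import sys
    # Usage: python probe.py 192.168.1.50
    print(probe(sys.argv[1]))
```

If every session returns 0 even at higher `parallel` values, the trigger is probably something other than the number of simultaneous connections.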

Compose Version

% docker-compose --version
Docker Compose version v2.15.1

### Docker Environment

```Text
% docker info
Client:
 Context:    default
 Debug Mode: false
 Plugins:
  buildx: Docker Buildx (Docker Inc., v0.10.3)
  compose: Docker Compose (Docker Inc., v2.15.1)
  dev: Docker Dev Environments (Docker Inc., v0.1.0)
  extension: Manages Docker extensions (Docker Inc., v0.2.18)
  sbom: View the packaged-based Software Bill Of Materials (SBOM) for an image (Anchore Inc., 0.6.0)
  scan: Docker Scan (Docker Inc., v0.25.0)
  scout: Command line tool for Docker Scout (Docker Inc., v0.6.0)

Server:
 Containers: 58
  Running: 11
  Paused: 0
  Stopped: 47
 Images: 995
 Server Version: 20.10.23
 Storage Driver: overlay2
  Backing Filesystem: extfs
  Supports d_type: true
  Native Overlay Diff: true
  userxattr: false
 Logging Driver: json-file
 Cgroup Driver: cgroupfs
 Cgroup Version: 1
 Plugins:
  Volume: local
  Network: bridge host ipvlan macvlan null overlay weaveworks/net-plugin:latest_release
  Log: awslogs fluentd gcplogs gelf journald json-file local logentries splunk syslog
 Swarm: active
  NodeID: tc29os4as78dj8equuf69ziv5
  Is Manager: true
  ClusterID: qm7qjt5ky6wf24930sb9slgo7
  Managers: 1
  Nodes: 1
  Default Address Pool: 10.0.0.0/8  
  SubnetSize: 24
  Data Path Port: 4789
  Orchestration:
   Task History Retention Limit: 5
  Raft:
   Snapshot Interval: 10000
   Number of Old Snapshots to Retain: 0
   Heartbeat Tick: 1
   Election Tick: 10
  Dispatcher:
   Heartbeat Period: 5 seconds
  CA Configuration:
   Expiry Duration: 3 months
   Force Rotate: 0
  Autolock Managers: false
  Root Rotation In Progress: false
  Node Address: 192.168.65.3
  Manager Addresses:
   192.168.65.3:2377
 Runtimes: io.containerd.runc.v2 io.containerd.runtime.v1.linux runc
 Default Runtime: runc
 Init Binary: docker-init
 containerd version: 2456e983eb9e37e47538f59ea18f2043c9a73640
 runc version: v1.1.4-0-g5fd4c4d
 init version: de40ad0
 Security Options:
  seccomp
   Profile: default
 Kernel Version: 5.15.49-linuxkit
 Operating System: Docker Desktop
 OSType: linux
 Architecture: x86_64
 CPUs: 8
 Total Memory: 37.22GiB
 Name: docker-desktop
 ID: JAXE:5DQC:INLA:3I6J:I5RZ:ZYL7:VJLQ:GBNO:SRJ4:ARNO:BWHO:IJNT
 Docker Root Dir: /var/lib/docker
 Debug Mode: false
 HTTP Proxy: http.docker.internal:3128
 HTTPS Proxy: http.docker.internal:3128
 No Proxy: hubproxy.docker.internal
 Registry: https://index.docker.io/v1/
 Labels:
 Experimental: true
 Insecure Registries:
  hubproxy.docker.internal:5000
  192.168.1.50:3333
  192.168.1.50:5100
  docker-repo.tx.foo.com
  127.0.0.0/8
 Live Restore Enabled: false
```


### Anything else?

_No response_

ndeloof commented 1 year ago

Seems similar to https://github.com/docker/compose/issues/10117.