docker / compose

Define and run multi-container applications with Docker
https://docs.docker.com/compose/
Apache License 2.0

downing remote context stack works only after 3/4 attempts #8856

Closed Miosame closed 1 year ago

Miosame commented 2 years ago

I have:

ControlMaster     auto
ControlPath       ~/.ssh/control-%C
ControlPersist    yes

That is in root's SSH config (as mentioned in the docs).

docker-compose --context remote up -d

works just fine. However, if I then want to bring down the stack, 2 or 3 services will return an "Error while Removing" or "Error while Stopping", followed by this error at the bottom:

error during connect: Delete "http://docker/v1.41/containers/<hash>?force=1": command [ssh -- remote docker system dial-stdio] has exited with signal: killed, please make sure the URL is valid, and Docker 18.09 or later is installed on the remote host: stderr=

stderr is empty each time. However, if you repeat the down a couple more times, it eventually brings the stack down without issue.
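
The repeat-until-it-works behaviour described above can be scripted. A minimal sketch, assuming a POSIX-like shell with `local` support; `retry` and `RETRY_DELAY` are hypothetical names introduced here, not part of Docker or Compose, and this only automates the manual retries rather than addressing the cause:

```shell
# retry MAX_ATTEMPTS COMMAND [ARGS...]
# Runs COMMAND until it exits 0 or MAX_ATTEMPTS runs have failed.
# RETRY_DELAY (hypothetical, seconds, default 1) is the pause between attempts.
retry() {
  local max_attempts=$1
  local attempt=1
  shift
  while :; do
    # Stop as soon as the command succeeds.
    "$@" && return 0
    if [ "$attempt" -ge "$max_attempts" ]; then
      echo "giving up after $attempt attempts: $*" >&2
      return 1
    fi
    attempt=$((attempt + 1))
    sleep "${RETRY_DELAY:-1}"
  done
}
```

For example, `retry 5 docker-compose --context remote down` would attempt the down up to five times before giving up.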

This seems related too:

I couldn't find a duplicate filed here, so I'm opening a new issue. If I should file this over at moby instead, please tell me.

Host I'm connecting from:

Docker version 20.10.10, build b485636f4b
Docker Compose version 2.0.1

Host I'm connecting to:

Docker version 20.10.10, build b485636
Docker Compose version v2.0.1

ephess commented 2 years ago

I'm having the exact same issue as this. It seems that this only happens in multi-container compose files for me, and each time I call down it brings down one of the containers. Something about bringing down one of the containers seems to close the connection for all the other down tasks in progress/queued.

JackelynOliveira commented 2 years ago

I'm having the same problem. I'm using docker contexts with SSH, and when I attempt to run docker compose -f docker-compose.yml down it fails to remove one of the containers (yes, I have multiple containers specified in the docker-compose file). It happens every time; it has not been random.

It does stop all the containers successfully, but when it attempts to delete the last one it gives me this same error message: command [ssh -- remote docker system dial-stdio] has exited with signal: killed, please make sure the URL is valid[...].

However I was able to work around this problem by running:

docker compose -f docker-compose.yml rm --stop --force

There's a caveat, though: it does not seem to delete the network this way.
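
The two steps (remove the containers, then clean up the network that is left behind) could be combined roughly as follows. This is a sketch only: `down_workaround` is a hypothetical helper, the network name has to be supplied by the caller (Compose typically names the default network `<project>_default`), and it assumes the active Docker context or DOCKER_HOST already points at the remote host.

```shell
# down_workaround COMPOSE_FILE NETWORK_NAME
# Stops and removes the project's containers, then removes the named network.
down_workaround() {
  local compose_file=$1
  local network=$2
  # Same command as the workaround above; bail out if it fails.
  docker compose -f "$compose_file" rm --stop --force || return 1
  # "rm" leaves the network behind, so remove it explicitly.
  docker network rm "$network"
}
```

A call might look like `down_workaround docker-compose.yml myproject_default`, where `myproject_default` is a placeholder for the actual network name.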

CalebMacdonaldBlack commented 2 years ago

I'm also experiencing this issue

alex-kowalczyk commented 1 year ago

Observing same issue while running docker-compose with ssh context:

docker-compose --context <context> pull --ignore-pull-failures

for a docker-compose file with 7 images, pulled from hub.docker.com.

Almost every time, a random image fails with the following error (sample failure for lucaslorentz/caddy-docker-proxy:ci-alpine):

Pulling caddy: error during connect: Post "http://docker.example.com/v1.41/images/create?fromImage=lucaslorentz%2Fcaddy-docker-proxy&tag=ci-alpine": 
command [ssh -l root -- <HOST> docker system dial-stdio] has exited with signal: killed, 
please make sure the URL is valid, and Docker 18.09 or later is installed on the remote host: stderr=

I have Docker 20.10.21 on the remote server, Docker 20.10.12 on localhost, and docker-compose 2.12.2.

Pulling works flawlessly with local context.
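
Since the failures appear when several images are pulled in one command over SSH, one thing to try is pulling the services one at a time. Whether this avoids the error is an untested assumption. `pull_one_by_one` is a hypothetical helper, and the sketch uses the `docker --context <context> compose` form rather than the standalone `docker-compose` binary:

```shell
# pull_one_by_one CONTEXT
# Lists the services of the compose project in the current directory and
# pulls each image in its own command instead of all at once.
pull_one_by_one() {
  local context=$1
  local svc
  docker --context "$context" compose config --services |
    while read -r svc; do
      # stdin is redirected so the pull cannot consume the service list.
      docker --context "$context" compose pull "$svc" < /dev/null || exit 1
    done
}
```

Each pull then runs over its own connection, at the cost of losing parallel downloads.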

Leo843 commented 1 year ago

I have a similar issue. Calling docker compose down on a remote host fails; however, if I change the way services depend on each other, the command succeeds.

My compose.yml file looks like this (with dummy images).

services:

  apple:
    image: nginx:1.23.2
  orange:
    depends_on:
      - apple
    image: nginx:1.23.2
  banana:
    depends_on:
      - orange
    image: nginx:1.23.2
  pear:
    depends_on:
      - orange
    image: nginx:1.23.2

All docker compose commands are executed with DOCKER_HOST set.

Once services have been deployed and docker compose down is called, I get the following error message.

$ DOCKER_HOST='ssh://XXXX@XXXXX:XXXXX' docker compose down
[+] Running 1/2
 ⠿ Container issue-ssh-compose-down-pear-1    Removed                                     0.3s
 ⠿ Container issue-ssh-compose-down-banana-1  Error while Removing                        0.9s
error during connect: Delete "http://docker.example.com/v1.41/containers/d2740313e4bb9c8f3ed36da5608e7924618675f0b0e85ea7b737022b97077724?force=1": command [ssh -l XXXXX -p XXXXX -- XXXXXXX docker system dial-stdio] has exited with signal: killed, please make sure the URL is valid, and Docker 18.09 or later is installed on the remote host: stderr=

However, the result changes if I add a dependency in the compose.yml file (see below).

services:

  apple:
    image: nginx:1.23.2
  orange:
    depends_on:
      - apple
    image: nginx:1.23.2
  banana:
    depends_on:
      - orange
    image: nginx:1.23.2
  pear:
    depends_on:
      - orange
      # service dependency added next line
      - banana
    image: nginx:1.23.2

Once services have been deployed and docker compose down is called, the command succeeds.

$ DOCKER_HOST='ssh://XXXX@XXXXX:XXXXX' docker compose down
[+] Running 5/5
 ⠿ Container issue-ssh-compose-down-pear-1    Removed                                     0.3s
 ⠿ Container issue-ssh-compose-down-banana-1  Removed                                     0.3s
 ⠿ Container issue-ssh-compose-down-orange-1  Removed                                     0.3s
 ⠿ Container issue-ssh-compose-down-apple-1   Removed                                     0.3s
 ⠿ Network issue-ssh-compose-down_default     Removed                                     0.1s

ugal1 commented 1 year ago

@Leo843 I'm in the exact same situation: adding a dependency allows the down command to succeed. Weird thing here.

samwatts98 commented 1 year ago

I'm also hitting this issue, and the suggestion to use docker compose -f docker-compose.yml rm --stop --force doesn't work either.

Client and Remote Host both use version: Docker version 20.10.17, build 100c701

edit: @Leo843's suggestion of adding dependencies worked exactly as they described. Previously, when executing docker -c "remote-context-01" compose down --rmi all, it would fail on 1 or more containers every time, and to get the command to finish successfully I'd need to execute it 3 or 4 times.

However, I tested out the following example compose file:

services:
    one:
        image: mcr.microsoft.com/mssql/server
    two:
        image: mcr.microsoft.com/mssql/server
        depends_on:
            - one
    three:
        image: mcr.microsoft.com/mssql/server
        depends_on:
            - one
    four:
        image: mcr.microsoft.com/mssql/server
        depends_on:
            - one
    five:
        image: mcr.microsoft.com/mssql/server
        depends_on:
            - one

The compose bundle started on the remote host without problems; however, when executing the down command, I got the same error as usual. I then changed the dependencies so that each container depended on the previous one:

services:
    one:
        image: mcr.microsoft.com/mssql/server
    two:
        image: mcr.microsoft.com/mssql/server
        depends_on:
            - one
    three:
        image: mcr.microsoft.com/mssql/server
        depends_on:
            - two
    four:
        image: mcr.microsoft.com/mssql/server
        depends_on:
            - three
    five:
        image: mcr.microsoft.com/mssql/server
        depends_on:
            - four

And it works like a charm every time I test it out: all containers are stopped and removed on the first attempt.
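
The chained layout above can be generated for an arbitrary list of services. The following is a minimal sketch: `chain_depends_on` is a hypothetical helper that only prints a YAML fragment, in the same indentation as the example, for review and manual merging into the compose file. It does not add `image` keys or edit any file.

```shell
# chain_depends_on SERVICE [SERVICE...]
# Prints a "services:" fragment in which every service after the first
# depends on the service listed immediately before it.
chain_depends_on() {
  local prev=""
  local svc
  echo "services:"
  for svc in "$@"; do
    # The first service has no predecessor, so nothing is printed for it.
    if [ -n "$prev" ]; then
      echo "    $svc:"
      echo "        depends_on:"
      echo "            - $prev"
    fi
    prev=$svc
  done
}
```

Running `chain_depends_on one two three four five` prints depends_on entries matching the second compose file above (two on one, three on two, and so on).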

DmitryMurinov commented 1 year ago

On Ubuntu 18 with the latest stable Docker, docker-compose, etc. (as of the date of this message), the issue reproduces. The workaround also works, but it is not applicable if there are 2 or more replicas of some container in docker-compose.yml. Any suggestions?

d4rkmen commented 1 year ago

I have faced the same error. In my case there were multiple items in depends_on. The workaround works so far.

quentinsf commented 1 year ago

Just commenting that this is still an issue. The workaround works, but...

laurazard commented 1 year ago

I believe this was related to https://github.com/docker/compose/issues/9448 (fixed by https://github.com/docker/cli/pull/3900 which by now has been included in Compose) and shouldn't be happening in recent versions anymore.

I'm closing this issue. If anyone here is still running into this and can reproduce it, please open a new issue with the version you're seeing it on, a Compose file that reproduces it, and the commands that trigger it.