docker.example.com is down, so docker compose doesn't work anymore #4221

Open crawlchange opened 1 year ago

crawlchange commented 1 year ago

Description

Doing anything using docker compose on a clean server breaks because "docker.example.com" is down, and, unfortunately, it seems that docker compose does not work without that website (which is already bad).

error during connect: Get "http://docker.example.com/v1.24/version": command [ssh -l root -- ip docker system dial-stdio] has exited with exit status 255, please make sure the URL is valid, and Docker 18.09 or later is installed on the remote host: stderr=ssh: connect to host ip port 22: Connection refused

Steps To Reproduce

- Use the appropriate context to point to the machine

- Run "docker compose build"

Compose Version

Docker Compose version v2.17.2

Docker Environment

Docker version 23.0.4, build f480fb1

Anything else?

No response

crawlchange commented 1 year ago

Update: upon further investigation (I should call it "extreme guessing" instead), I figured that the issue was caused by infrastructure resource limitations.

It is important to note that the error message is completely nonsensical, and does not point at all to the actual issue.

ndeloof commented 1 year ago

This domain name is set by docker/cli as a default host name: https://github.com/docker/cli/blob/master/cli/connhelper/connhelper.go#L66

This is indeed a source of confusion when it eventually gets printed as part of an error message, and it should be replaced by a more generic name, like docker_host.

@thaJeztah could you please transfer this issue to docker/cli? I don't have the required privileges.

thaJeztah commented 1 year ago

I'd have to look up the history, but I guess we used this one because "a" domain is needed for this (but "anything goes"). The example.com domain was probably used to prevent any possibility of hitting an actual domain (and maybe localhost was confusing for this feature, as it would be a remote server).

Looking at https://www.rfc-editor.org/rfc/rfc2606.html#section-2

Perhaps something like docker.localhost or api.localhost could still work (suggestions welcome!)

crawlchange commented 1 year ago

In my particular case, the actual issue generating that error message was a DigitalOcean droplet without enough resources. Presumably, the server shut down connections.

Docker's response should probably be something like "the remote connection was shut down", instead of that message. That would have saved many hours, since that message led me to completely unrelated issues.

So, the problem is not that "example.com" was being used instead of something like "docker.localhost". The problem is that there was absolutely no context to try a connection to a default hostname. An actual hostname was provided, the connection was indeed established, and the build process began on that remote machine. But the build process broke because it used too many resources and the connection was shut down.

laurazard commented 1 year ago

Right, thanks for the context! The last part of that error is actually very relevant: it includes the ssh command that the CLI used to connect to the remote host and the error returned by that command (ssh: connect to host ip port 22: Connection refused), which seems like it would be the most important information for a user trying to debug an issue with their setup. The issue seems to be that the beginning is misleading, as we're not actually making an HTTP GET call to some random host. I'm working on some of the remote connection/ssh stuff, so I'll try to make the error handling a bit better :)