superfly / flyctl

Command line tools for fly.io services
https://fly.io
Apache License 2.0
1.41k stars 235 forks source link

Look for docker socket in ~/.docker/run/docker.sock #1939

Open gwuah opened 1 year ago

gwuah commented 1 year ago

Originally spotted in the community forum https://community.fly.io/t/flyctl-cannot-find-local-docker-instance/11649

Summary New versions of docker-for-mac don't run as root anymore, so they're unable to create symlinks for /var/run/docker.sock. We need to start looking inside ~/.docker/run/docker.sock when we're trying to locate the daemon.

Note This needs some extra triaging since we could probably add this feature to the docker client itself and upgrade our docker client version (if they accept the PR of course). Also it's possible they've fixed it already and all we have to do is to upgrade the client version. Or we could tell our client where to look (if that's possible)

lorenzocattaneo commented 1 year ago

Hi, I'm the one that posted on the community forums. I'm no go expert so I'm not the best suited for a PR, but I've taken the time to at least investigate the issue a bit further.

Docker Engine APIs do not seem to have any context related endpoints, I've checked the docs here: https://docs.docker.com/engine/api/v1.42/.

Searching for context, host or endpoint yields no results.

In the docker Go SDK this is also not handled, here's the code for client creation. The default host is used, which is hardcoded to the default unix socket.

From a terminal it is fairly easy to determine where to look, running docker context inspect inspects the currently active context and shows the socket location. Example result:

[
    {
        "Name": "desktop-linux",
        "Metadata": {},
        "Endpoints": {
            "docker": {
                "Host": "unix:///Users/<username>/.docker/run/docker.sock",
                "SkipTLSVerify": false
            }
        },
        "TLSMaterial": {},
        "Storage": {
            "MetadataPath": "/Users/<username>/.docker/contexts/meta/fe9c6bd7a66301f49ca9b6a70b217107cd1284598bfc254700c989b916da791e",
            "TLSPath": "/Users/<username>/.docker/contexts/tls/fe9c6bd7a66301f49ca9b6a70b217107cd1284598bfc254700c989b916da791e"
        }
    }
]

Again, I never really used go so this is just a very basic example, I'm sorry if it's bad quality. Anyway, one could figure out the value like this:

details, err := exec.Command("docker", "context", "inspect").Output()
if err != nil {
    //manage error
}
var j []DockerContext //omitted the structs but it's trivial to map the above json
err = json.Unmarshal(details, &j)
if err != nil {
    //manage error
}
host := j[0].Endpoints.Docker.Host //here's the current active host

//the SDK allows to define a host when building the client
hostOpt := client.WithHostFromEnv() //check if a host is defined as env variable first
if hostOpt == nil {
    hostOpt = client.WithHost(host) //if not use context host
}
cli, err := client.NewClientWithOpts(hostOpt, client.WithAPIVersionNegotiation())

The relevant point in flyctl is here

imgsrc.NewLocalDockerClient() builds a client with the default host, but it could receive the current host as a parameter and then attempt to Ping. Alternatively the client could be instantiated with different common endpoints and attempt a ping for each, but it's less resilient. With the first solution if the user configures the socket in a different location it would still work.

scriptjs commented 1 year ago

It is important to note that docker-for-mac is an implementation of docker for the mac. Rancher Desktop is another, as an example, that also brings kubernetes based on k3s that many are using. The point being – there are apt to be differences so this should be considered when it comes to changes that could assume docker-for-mac is how docker is implemented on the system.

andig commented 1 year ago

Any chance to get this fixed? Happens again after every docker version bump... Would be happy to do a pr. Even simpler without json handling:

docker context inspect -f '{{ .Endpoints.docker.Host }}'
unix:///Users/andig/.docker/run/docker.sock
andig commented 1 year ago

Unfortunately that doesn't suffice:

../flyctl/bin/flyctl deploy --local-only
==> Verifying app config
Validating /Users/.../htdocs/fly.toml
Platform: nomad
✓ Configuration is valid
--> Verified app config
==> Building image

host unix:///Users/.../.docker/run/docker.sock <----- correct

WARN failed to create build in graphql: input:3: createBuild Could not find App

Error: failed to fetch an image or build from source: no docker daemon available
andig commented 1 year ago

It seems that there is an issue with connecting to user's socket:

Cannot connect to the Docker daemon at unix:

This happens when pinging the new client. No idea why that happens. The host appears to be parsed correctly:

&{scheme: host:unix:///Users/.../.docker/run/docker.sock
 proto:unix addr:/Users/andig/.docker/run/docker.sock
 basePath: client:0x14000925020 version:1.41 customHTTPHeaders:map[] manualOverride:false negotiateVersion:true negotiated:false}
lorenzocattaneo commented 1 year ago

@andig just in case your PR doesn't get fixed or merged, since 4.19 (or 4.20?) in Docker Desktop you can set the default socket location again.

Screenshot 2023-06-08 alle 15 04 58

Anyway, seems to me like the command you are running appends a \n as a last character. You can see in the log you pasted how there's a new line between the host and proto. Something like

clean_host := strings.ReplaceAll(string(host), "\n", "")

could solve your problems in #2413

andig commented 1 year ago

Yikes, fixed! Thank you for noticing. I don't see the Docker change in 4.19, so might really be 4.20.

redjonzaci commented 1 year ago

I wonder if we still need this issue or if your PR solves it @andig?

andig commented 1 year ago

Fixed for me after setting Docker to use the global path again. Still- I think users explicitly need to enable that.

anacrolix commented 11 months ago

Doesn't work persistently for me, Docker 4.21.1.