GoogleContainerTools / skaffold

Easy and Repeatable Kubernetes Development
https://skaffold.dev/
Apache License 2.0

Skaffold does not work when using remote docker over ssh #5703

Closed agorgl closed 1 year ago

agorgl commented 3 years ago

Expected behavior

Using skaffold in an environment where DOCKER_HOST=ssh://somehost is defined should work, as this is a supported docker transport method.

Actual behavior

Skaffold tries to connect to the given host over HTTP, ignoring the 'ssh://' scheme.

Information

Steps to reproduce the behavior

  1. Run docker on a remote host that you can also connect with ssh
  2. Set environment variable DOCKER_HOST=ssh://yourhost
  3. Run skaffold dev on a project

briandealwis commented 3 years ago

I don't see any references to ssh:// within the Docker client code (github.com/docker/docker/client/), and the references I found within Moby indicate that it is part of buildkit (github.com/moby/buildkit/client/llb/exec.go). We need to investigate whether this is supported upstream.

You might be able to get this to work by setting build.local.useDockerCLI: true.

agorgl commented 3 years ago

Using:

build:
  local:
    useDockerCLI: true

results in the same behavior. Note that the regular docker CLI works as it should with DOCKER_HOST set as described above.

Dev-Time commented 3 years ago

I believe I'm getting the same issue. Here's the actual error:

getting imageID for <docker image>: error during connect: Get "http://<username>%40<ssh host>/v1.24/images/<docker image>/json": dial tcp: lookup <username>@<ssh host>: no such host. Docker build ran into internal error. Please retry.
If this keeps happening, please open an issue..

I'm getting this error with all combinations of build.local.useBuildkit and build.local.useDockerCLI.
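
The URL in the error above has the shape you would get if the ssh:// scheme were dropped and the rest of DOCKER_HOST (user@host) were used as an HTTP host name. The sketch below only reproduces that symptom with plain string handling. It is not Skaffold's or Docker's actual code, and all function names are hypothetical.

```shell
# Hypothetical helpers: they mimic the symptom in the error message,
# not the real Skaffold or Docker client logic.

# Print the scheme of a DOCKER_HOST value, e.g. "ssh" for ssh://me@box.
docker_host_scheme() {
    printf '%s\n' "${1%%://*}"
}

# Print everything after "scheme://", e.g. "me@box" for ssh://me@box.
docker_host_address() {
    printf '%s\n' "${1#*://}"
}

# Build the URL a client would request if it ignored the scheme and
# treated the address as an HTTP host ("@" percent-encoded as "%40").
naive_http_url() {
    encoded=$(docker_host_address "$1" | sed 's/@/%40/g')
    printf 'http://%s%s\n' "$encoded" "$2"
}
```

Calling naive_http_url ssh://me@box /v1.24/images/app/json prints http://me%40box/v1.24/images/app/json, which has the same form as the failing request.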

tboevil commented 3 years ago

Add -H tcp://<your-remote-ip> to /usr/lib/systemd/system/docker.service, and use env DOCKER_HOST=tcp://<your-remote-ip>:2375 skaffold build -v debug instead of skaffold build -v debug. It works for me.

It seems to have something to do with this

Maybe need a PR

agorgl commented 3 years ago

This exposes the docker daemon over tcp, which is massively insecure.

tboevil commented 3 years ago

This exposes the docker daemon over tcp, which is massively insecure.

Yes, it is only suitable for a pure intranet environment.

agorgl commented 3 years ago

Still, control over an ssh channel is a much better alternative (for example, for systems behind firewalls that expose only ssh access) and should take priority over support for a tcp daemon.

weltonrodrigo commented 3 years ago

Add -H tcp://<your-remote-ip> to /usr/lib/systemd/system/docker.service, and use env DOCKER_HOST=tcp://<your-remote-ip>:2375 skaffold build -v debug instead of skaffold build -v debug. It works for me.

It seems to have something to do with this

Maybe need a PR

As you may know, running docker with tcp basically GIVES ANYONE WHO CAN REACH THIS PORT ROOT ACCESS TO THE MACHINE. https://docs.docker.com/go/attack-surface/

That being said:

On the remote machine:

Add an override to your systemd docker configuration:

sudo systemctl edit docker

This opens an editor. Edit the file so that the first lines look like:

### Editing /etc/systemd/system/docker.service.d/override.conf
### Anything between here and the comment below will become the new contents of the file

[Service]
ExecStart=
ExecStart=/usr/bin/dockerd -H tcp://0.0.0.0:2375 --containerd=/run/containerd/containerd.sock

### Lines below this comment will be discarded

Restart your docker service on the remote machine:

sudo systemctl restart docker

On your machine:

Let's say the remote machine is 192.168.0.20:

docker context create remote \
   --default-stack-orchestrator=swarm \
   --docker host=tcp://192.168.0.20:2375

Now, when you want to use the remote machine, switch context:

docker context use remote

When you want to use local docker:

docker context use default

Using skaffold with the remote docker:

Unfortunately, skaffold won't recognize the docker context that is in use, so you still have to pass the host in an environment variable.

DOCKER_HOST=tcp://192.168.0.20:2375 skaffold build
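
To avoid typing the address twice, the endpoint of the current docker context can be read back and passed along as DOCKER_HOST. This is a minimal sketch: the function name skaffold_with_context is hypothetical, and it assumes that docker context inspect with no context name reports the active context and that the endpoint is exposed as .Endpoints.docker.Host.

```shell
# Hypothetical wrapper: read the current docker context's endpoint and
# pass it to skaffold as DOCKER_HOST, forwarding all arguments.
skaffold_with_context() {
    # Assumption: with no context name, "docker context inspect"
    # reports the context selected by "docker context use".
    host=$(docker context inspect --format '{{.Endpoints.docker.Host}}') || return 1
    DOCKER_HOST="$host" skaffold "$@"
}
```

With the remote context selected, skaffold_with_context build would run the same command as the line above. If the context points at an ssh:// endpoint, the problem reported in this issue still applies.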

sstubbs commented 3 years ago

I'm having the same issue using docker in an LXD container and connecting to it from the CLI on the host. I can just disable TLS, but that would definitely be a security problem when accessing it remotely.

kubaw commented 2 years ago

You can forward the socket file using ssh with the -L option. For example:

ssh -L $HOME/remote-docker.sock:/var/run/docker.sock user@docker-remote-host.example.com

Where /var/run/docker.sock is the dockerd socket file on remote machine docker-remote-host.example.com

Then you can use this local socket to forward queries to remote dockerd:

DOCKER_HOST=unix://$HOME/remote-docker.sock skaffold dev
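
The two commands above can be combined into one wrapper that opens the tunnel, runs skaffold, and closes the tunnel again. This is only a sketch under a few assumptions: the function name skaffold_over_ssh and the control socket path are hypothetical, and it relies on the OpenSSH options ExitOnForwardFailure and StreamLocalBindUnlink being available in your ssh client.

```shell
# Hypothetical wrapper around the socket-forwarding approach above.
# Usage: skaffold_over_ssh user@docker-remote-host.example.com dev
skaffold_over_ssh() {
    remote=$1
    shift
    sock="$HOME/remote-docker.sock"
    ctl="$HOME/remote-docker.ctl"   # hypothetical control socket path

    # -f -N: go to the background without running a remote command.
    # -M -S: open a control socket so the tunnel can be closed later.
    # StreamLocalBindUnlink removes a stale local socket file first.
    ssh -f -N -M -S "$ctl" \
        -o ExitOnForwardFailure=yes \
        -o StreamLocalBindUnlink=yes \
        -L "$sock:/var/run/docker.sock" "$remote" || return 1

    DOCKER_HOST="unix://$sock" skaffold "$@"
    rc=$?

    # Ask the background ssh process to exit, which closes the tunnel.
    ssh -S "$ctl" -O exit "$remote" 2>/dev/null
    return "$rc"
}
```

The wrapper returns skaffold's exit status, so it can stand in for a direct skaffold call in scripts.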

snickell commented 2 years ago

In case it helps @briandealwis, I've tested the different combinations with skaffold v1.34.0 on a Mac and a remote Linux DOCKER_HOST.

The only combination that didn't work was Skaffold+ssh://, with or without using buildkit in skaffold.yaml.

Skaffold + [Buildkit|Non-Buildkit] + DOCKER_HOST=ssh:: error: cannot connect to the Docker daemon
Skaffold + [Buildkit|Non-Buildkit] + DOCKER_HOST=[unix|tcp]:: works
docker build + [Buildkit|Non-Buildkit] + DOCKER_HOST=[unix|tcp|ssh]:: works
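
The comparison above can be repeated with a small loop that runs skaffold build once per DOCKER_HOST value and reports the outcome. This is a hedged sketch: the function name try_docker_hosts is hypothetical, and success is judged only by skaffold's exit status.

```shell
# Hypothetical harness: run "skaffold build" once per DOCKER_HOST value
# given as an argument and report whether each run succeeded.
try_docker_hosts() {
    for host in "$@"; do
        if DOCKER_HOST="$host" skaffold build >/dev/null 2>&1; then
            printf '%s: works\n' "$host"
        else
            printf '%s: fails\n' "$host"
        fi
    done
}
```

For example, try_docker_hosts unix:///var/run/docker.sock ssh://ahi checks the local socket and the ssh host from the log below in one go.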

When using skaffold+ssh I get this error on skaffold build:

x86_64 ~/src/improc (main*) skaffold build        
Generating tags...
 - gcr.io/ceres-imaging-science/improc-notebook -> gcr.io/ceres-imaging-science/improc-notebook:a08fb2400-dirty
 - gcr.io/ceres-imaging-science/improc-hub -> gcr.io/ceres-imaging-science/improc-hub:a08fb2400
Checking cache...
 - gcr.io/ceres-imaging-science/improc-notebook: Error checking cache.
Build Failed. Cannot connect to the Docker daemon at ssh://ahi. Check if docker is running.

With skaffold build -v=trace I get (env vars elided):

DEBU[0005] Executing template &{envTemplate 0xc000126500 0xc000062e40  } with environment map[ENV MAP ELLIDED]  subtask=-1 task=DevLoop
DEBU[0005] Executing template &{envTemplate 0xc000126700 0xc000063040  } with environment map[ENV MAP ELLIDED]  subtask=-1 task=DevLoop
DEBU[0005] Executing template &{envTemplate 0xc000126800 0xc0008de600  } with environment map[ENV MAP ELLIDED]  subtask=-1 task=DevLoop
DEBU[0005] push value not present in isImageLocal(), defaulting to false because cluster.PushImages is false  subtask=-1 task=DevLoop
DEBU[0005] FIXME: Got an status-code for which error does not match any expected type!!!: -1  module=api status_code=-1
DEBU[0007] Executing template &{envTemplate 0xc000126000 0xc000062e80  } with environment map[ENV MAP ELLIDED]  subtask=-1 task=DevLoop
DEBU[0007] push value not present in isImageLocal(), defaulting to false because cluster.PushImages is false  subtask=-1 task=DevLoop
DEBU[0007] FIXME: Got an status-code for which error does not match any expected type!!!: -1  module=api status_code=-1
 - gcr.io/ceres-imaging-science/improc-notebook: Error checking cache.
TRAC[0007] error building getting imageID for gcr.io/ceres-imaging-science/improc-notebook:a08fb2400-dirty: Cannot connect to the Docker daemon at ssh://ahi. Is the docker daemon running?  subtask=-1 task=DevLoop
DEBU[0007] Running command: [tput colors]                subtask=-1 task=DevLoop
DEBU[0007] Command output: [256
]                        subtask=-1 task=DevLoop
Build Failed. Cannot connect to the Docker daemon at ssh://ahi. Check if docker is running.
DEBU[0007] exporting metrics                             subtask=-1 task=DevLoop
DEBU[0009] metrics uploading complete in 2.093165209s    subtask=-1 task=DevLoop

snickell commented 2 years ago

This line appears to come from docker/engine, in http_helpers.go (https://github.com/docker/engine/blob/8955d8da8951695a98eb7e15bead19d402c6eb27/errdefs/http_helpers.go#L103-L106):

DEBU[0007] FIXME: Got an status-code for which error does not match any expected type!!!: -1 module=api status_code=-1

briandealwis commented 2 years ago

Are you using Minikube as your cluster? Does running with --kube-context='' make any difference?

snickell commented 2 years ago

I've tried with both minikube as my cluster and the built-in Kubernetes of Docker Desktop on Mac, with the same results.

snickell commented 2 years ago

Running skaffold build --kube-context='' -vdebug also doesn't work with DOCKER_HOST="ssh://ahi.local".

Richard87 commented 2 years ago

Hi! Just tried this myself on MacOS 12, M1 :)

useBuildkit: true

Running this: DOCKER_HOST=ssh://my-server skaffold dev builds the container perfectly (on the remote machine).

But Push fails:

could not push image "eu.gcr.io/project-000000/container:dev-37c67436c17fd4f17f1e8b94d5ed4a5dadc48863": Error response from daemon: 404 page not found

But if I copy the tag name, and execute docker push eu.gcr.io/project-000000/container:dev-37c67436c17fd4f17f1e8b94d5ed4a5dadc48863 on the remote machine, it works great :)

useDockerCLI

Running this: DOCKER_HOST=ssh://my-server skaffold dev builds the container perfectly (on the remote machine). (albeit way slower!)

But also failed to push the image:

Step 26/26 : RUN composer dump-autoload
 ---> Running in adc4dd4cb34f
Generating autoload files
composer/package-versions-deprecated: Generating version class...
composer/package-versions-deprecated: ...done generating version class
Generated autoload files
Removing intermediate container adc4dd4cb34f
 ---> c01310814b29
Successfully built c01310814b29
Successfully tagged eu.gcr.io/project-000000/container:dev-37c67436c17fd4f17f1e8b94d5ed4a5dadc48863-dirty

Use 'docker scan' to run Snyk tests against images to find vulnerabilities and learn how to fix them
could not push image "eu.gcr.io/project-000000/container:dev-37c67436c17fd4f17f1e8b94d5ed4a5dadc48863-dirty": Error response from daemon: 404 page not found

But running docker push with DOCKER_HOST set on my local Mac works great again:

DOCKER_HOST=ssh://my-server docker push eu.gcr.io/project-000000/container:dev-37c67436c17fd4f17f1e8b94d5ed4a5dadc48863-dirty
The push refers to repository [eu.gcr.io/project-000000/container]
41a0b3dd07f3: Pushed 
4098f4296bd3: Pushed 
b77dccb13112: Pushed 
8751dbdfc1e4: Pushed 
87ce9e6af300: Pushed 
9b9f4e958e8a: Pushed 
fd4b90fd0d61: Pushed 
98801bd64b2a: Pushed 
29797cb281ac: Pushed 
fdff67aae161: Pushed 
81079bec3008: Layer already exists 
bcaec337a52d: Layer already exists 
383c5282f6e3: Layer already exists 
6d6a94891fe0: Layer already exists 
6b1ac138e862: Layer already exists 
fbd2382570af: Layer already exists 
9005d2f444ff: Layer already exists 
05e34f65223f: Layer already exists 
675532b86318: Layer already exists 
b53f80c04004: Layer already exists 
069904e0a9fc: Layer already exists 
d49465772d36: Layer already exists 
6dabd9bfb6b6: Layer already exists 
86f857e82f5f: Layer already exists 
e81bff2725db: Layer already exists 
dev-37c67436c17fd4f17f1e8b94d5ed4a5dadc48863-dirty: digest: sha256:17b57ed8a93beeb22636be5b58b26e511e54155ca3633c072826b438897f8199 size: 5554

In other words, I'm also watching this issue with interest 👍

Edit

Checking the cache always fails, so manually pushing the image doesn't solve anything: skaffold tries to rebuild again and fails all over again...

DOCKER_HOST=ssh://my-server skaffold dev
Listing files to watch...
 - eu.gcr.io/project-000000/container
WARN[0008] k8s/*.yaml did not match any file             subtask=-1 task=DevLoop
Generating tags...
 - eu.gcr.io/project-000000/container -> eu.gcr.io/project-000000/container:dev-37c67436c17fd4f17f1e8b94d5ed4a5dadc48863-dirty
Checking cache...
 - eu.gcr.io/project-000000/container: Not found. Building
Starting build...

Edit 2:

Server: Docker Engine - Community
 Engine:
  Version:          20.10.12
  API version:      1.41 (minimum version 1.12)
  Go version:       go1.16.12
  Git commit:       459d0df
  Built:            Mon Dec 13 11:43:07 2021
  OS/Arch:          linux/arm64
  Experimental:     true
 containerd:
  Version:          1.4.12
  GitCommit:        7b11cfaabd73bb80907dd23182b9347b4245eb5d
 runc:
  Version:          1.0.2
  GitCommit:        v1.0.2-0-g52b36a2
 docker-init:
  Version:          0.19.0
  GitCommit:        de40ad0

DOCKER_HOST=ssh://my-server docker version
Client:
 Cloud integration: v1.0.22
 Version:           20.10.12
 API version:       1.41
 Go version:        go1.16.12
 Git commit:        e91ed57
 Built:             Mon Dec 13 11:46:56 2021
 OS/Arch:           darwin/arm64
 Context:           default
 Experimental:      true

Server: Docker Engine - Community
 Engine:
  Version:          20.10.12
  API version:      1.41 (minimum version 1.12)
  Go version:       go1.16.12
  Git commit:       459d0df
  Built:            Mon Dec 13 11:43:59 2021
  OS/Arch:          linux/amd64
  Experimental:     false
 containerd:
  Version:          1.4.12
  GitCommit:        7b11cfaabd73bb80907dd23182b9347b4245eb5d
 runc:
  Version:          1.0.2
  GitCommit:        v1.0.2-0-g52b36a2
 docker-init:
  Version:          0.19.0
  GitCommit:        de40ad0

weltonrodrigo commented 2 years ago

Related #7078