moby / buildkit

concurrent, cache-efficient, and Dockerfile-agnostic builder toolkit
https://github.com/moby/moby/issues/34227
Apache License 2.0
8.06k stars 1.13k forks source link

SSH Forwarding Doesn't Release File Descriptors #3840

Closed mark-5 closed 1 year ago

mark-5 commented 1 year ago

#3431 appears to have reintroduced 1146, and buildkit is failing to release SSH agent connections until after sshforward.Copy has finished.

#3506 did not fix this, because as noted, sshprovider.ForwardAgent uses an internal wrapper that doesn't implement CloseWrite.

Environment:

macOS Ventura 13.3.1 (in moby/buildkit:local-rootless build image)

/ $ buildkitd --version
buildkitd github.com/moby/buildkit v0.11.0-rc3-470-g9d357745 9d357745feb0dfb166758b8e6632a7a875919bc3
/ $ buildctl --version
buildctl github.com/moby/buildkit v0.11.0-rc3-470-g9d357745 9d357745feb0dfb166758b8e6632a7a875919bc3
/ $ ssh -V
OpenSSH_9.1p1, OpenSSL 3.0.8 7 Feb 2023
/ $ uname -a
Linux b6ba5493f1fa 5.15.49-linuxkit #1 SMP PREEMPT Tue Sep 13 07:51:32 UTC 2022 aarch64 Linux
/ $ cat /etc/os-release
NAME="Alpine Linux"
ID=alpine
VERSION_ID=3.17.3
PRETTY_NAME="Alpine Linux v3.17"
HOME_URL="https://alpinelinux.org/"
BUG_REPORT_URL="https://gitlab.alpinelinux.org/alpine/aports/-/issues"

To reproduce the issue you can

  1. start an ssh-agent in debug mode
  2. run buildctl build with the Dockerfile from 1146

Here's buildctl build:

~ $ cat Dockerfile
FROM alpine
RUN --mount=type=ssh apk update \
 && apk add openssh-client-default \
 && mkdir -p -m 0600 ~/.ssh \
 && ssh-keyscan github.com >> ~/.ssh/known_hosts \
 && for i in $(seq 1 3); do \
        ssh -T git@github.com; \
    done; \
    sleep 10;
~ $ buildctl build --progress plain --frontend dockerfile.v0 --local context=. -
-local dockerfile=. --output type=tar,dest=/dev/null --ssh default --no-cache
#1 [internal] load build definition from Dockerfile
#1 transferring dockerfile: 292B done
#1 DONE 0.0s

#2 [internal] load metadata for docker.io/library/alpine:latest
#2 DONE 0.9s

#3 [internal] load .dockerignore
#3 transferring context: 2B done
#3 DONE 0.0s

#4 [stage-0 1/2] FROM docker.io/library/alpine@sha256:124c7d2707904eea7431fffe91522a01e5a861a624ee31d03372cc1d138a3126
#4 resolve docker.io/library/alpine@sha256:124c7d2707904eea7431fffe91522a01e5a861a624ee31d03372cc1d138a3126 done
#4 sha256:c41833b44d910632b415cd89a9cdaa4d62c9725dc56c99a7ddadafd6719960f9 0B / 3.26MB 0.2s
#4 sha256:c41833b44d910632b415cd89a9cdaa4d62c9725dc56c99a7ddadafd6719960f9 3.26MB / 3.26MB 0.4s done
#4 extracting sha256:c41833b44d910632b415cd89a9cdaa4d62c9725dc56c99a7ddadafd6719960f9 0.1s done
#4 DONE 0.5s

#5 [stage-0 2/2] RUN --mount=type=ssh apk update  && apk add openssh-client-default  && mkdir -p -m 0600 ~/.ssh  && ssh-keyscan github.com >> ~/.ssh/known_hosts  && for i in $(seq 1 3); do         ssh -T git@github.com;     done;     sleep 10;
#0 0.083 fetch https://dl-cdn.alpinelinux.org/alpine/v3.17/main/aarch64/APKINDEX.tar.gz
#5 0.535 fetch https://dl-cdn.alpinelinux.org/alpine/v3.17/community/aarch64/APKINDEX.tar.gz
#5 0.958 v3.17.3-147-gc4fb7618739 [https://dl-cdn.alpinelinux.org/alpine/v3.17/main]
#5 0.958 v3.17.3-150-g717cba89b75 [https://dl-cdn.alpinelinux.org/alpine/v3.17/community]
#5 0.958 OK: 17691 distinct packages available
#5 1.113 (1/6) Installing openssh-keygen (9.1_p1-r2)
#5 1.360 (2/6) Installing ncurses-terminfo-base (6.3_p20221119-r0)
#5 1.402 (3/6) Installing ncurses-libs (6.3_p20221119-r0)
#5 1.443 (4/6) Installing libedit (20221030.3.1-r0)
#5 1.471 (5/6) Installing openssh-client-common (9.1_p1-r2)
#5 1.556 (6/6) Installing openssh-client-default (9.1_p1-r2)
#5 1.621 Executing busybox-1.35.0-r29.trigger
#5 1.630 OK: 13 MiB in 21 packages
#5 1.786 # github.com:22 SSH-2.0-babeld-aebb503c
#5 1.830 # github.com:22 SSH-2.0-babeld-aebb503c
#5 1.897 # github.com:22 SSH-2.0-babeld-aebb503c
#5 1.962 # github.com:22 SSH-2.0-babeld-aebb503c
#5 1.986 # github.com:22 SSH-2.0-babeld-aebb503c
#5 2.258 git@github.com: Permission denied (publickey).
#5 2.556 git@github.com: Permission denied (publickey).
#5 2.801 git@github.com: Permission denied (publickey).
#5 DONE 12.9s

#6 exporting to client tarball
#6 sending tarball 0.1s done
#6 DONE 0.1s

Here's buildctl build:

~ $ cat Dockerfile
FROM alpine
RUN --mount=type=ssh apk update \
 && apk add openssh-client-default \
 && mkdir -p -m 0600 ~/.ssh \
 && ssh-keyscan github.com >> ~/.ssh/known_hosts \
 && for i in $(seq 1 3); do \
        ssh -T git@github.com; \
    done; \
    sleep 10;
~ $ buildctl build --progress plain --frontend dockerfile.v0 --local context=. -
-local dockerfile=. --output type=tar,dest=/dev/null --ssh default --no-cache
#1 [internal] load build definition from Dockerfile
#1 transferring dockerfile: 292B done
#1 DONE 0.0s

#2 [internal] load metadata for docker.io/library/alpine:latest
#2 DONE 0.9s

#3 [internal] load .dockerignore
#3 transferring context: 2B done
#3 DONE 0.0s

#4 [stage-0 1/2] FROM docker.io/library/alpine@sha256:124c7d2707904eea7431fffe91522a01e5a861a624ee31d03372cc1d138a3126
#4 resolve docker.io/library/alpine@sha256:124c7d2707904eea7431fffe91522a01e5a861a624ee31d03372cc1d138a3126 done
#4 sha256:c41833b44d910632b415cd89a9cdaa4d62c9725dc56c99a7ddadafd6719960f9 0B / 3.26MB 0.2s
#4 sha256:c41833b44d910632b415cd89a9cdaa4d62c9725dc56c99a7ddadafd6719960f9 3.26MB / 3.26MB 0.4s done
#4 extracting sha256:c41833b44d910632b415cd89a9cdaa4d62c9725dc56c99a7ddadafd6719960f9 0.1s done
#4 DONE 0.5s

#5 [stage-0 2/2] RUN --mount=type=ssh apk update  && apk add openssh-client-default  && mkdir -p -m 0600 ~/.ssh  && ssh-keyscan github.com >> ~/.ssh/known_hosts  && for i in $(seq 1 3); do         ssh -T git@github.com;     done;     sleep 10;
#0 0.083 fetch https://dl-cdn.alpinelinux.org/alpine/v3.17/main/aarch64/APKINDEX.tar.gz
#5 0.535 fetch https://dl-cdn.alpinelinux.org/alpine/v3.17/community/aarch64/APKINDEX.tar.gz
#5 0.958 v3.17.3-147-gc4fb7618739 [https://dl-cdn.alpinelinux.org/alpine/v3.17/main]
#5 0.958 v3.17.3-150-g717cba89b75 [https://dl-cdn.alpinelinux.org/alpine/v3.17/community]
#5 0.958 OK: 17691 distinct packages available
#5 1.113 (1/6) Installing openssh-keygen (9.1_p1-r2)
#5 1.360 (2/6) Installing ncurses-terminfo-base (6.3_p20221119-r0)
#5 1.402 (3/6) Installing ncurses-libs (6.3_p20221119-r0)
#5 1.443 (4/6) Installing libedit (20221030.3.1-r0)
#5 1.471 (5/6) Installing openssh-client-common (9.1_p1-r2)
#5 1.556 (6/6) Installing openssh-client-default (9.1_p1-r2)
#5 1.621 Executing busybox-1.35.0-r29.trigger
#5 1.630 OK: 13 MiB in 21 packages
#5 1.786 # github.com:22 SSH-2.0-babeld-aebb503c
#5 1.830 # github.com:22 SSH-2.0-babeld-aebb503c
#5 1.897 # github.com:22 SSH-2.0-babeld-aebb503c
#5 1.962 # github.com:22 SSH-2.0-babeld-aebb503c
#5 1.986 # github.com:22 SSH-2.0-babeld-aebb503c
#5 2.258 git@github.com: Permission denied (publickey).
#5 2.556 git@github.com: Permission denied (publickey).
#5 2.801 git@github.com: Permission denied (publickey).
#5 DONE 12.9s

#6 exporting to client tarball
#6 sending tarball 0.1s done
#6 DONE 0.1s

And here's the ssh-agent debug output, showing the file descriptors leak:

~ $ ssh-agent -s -d -a /tmp/authsock
SSH_AUTH_SOCK=/tmp/authsock; export SSH_AUTH_SOCK;
echo Agent pid 111;
debug1: new_socket: type = SOCKET
debug2: fd 3 setting O_NONBLOCK
debug1: new_socket: type = CONNECTION
debug2: fd 4 setting O_NONBLOCK
debug1: process_message: socket 1 (fd=4) type 11
debug2: process_request_identities: entering
debug2: process_request_identities: replying with 0 allowed of 0 available keys
debug1: new_socket: type = CONNECTION
debug2: fd 5 setting O_NONBLOCK
debug1: process_message: socket 2 (fd=5) type 11
debug2: process_request_identities: entering
debug2: process_request_identities: replying with 0 allowed of 0 available keys
debug1: new_socket: type = CONNECTION
debug2: fd 6 setting O_NONBLOCK
debug1: process_message: socket 3 (fd=6) type 11
debug2: process_request_identities: entering
debug2: process_request_identities: replying with 0 allowed of 0 available keys
tonistiigi commented 1 year ago

cc @sipsma

sipsma commented 1 year ago

Thanks for reporting @mark-5, I will take a look this weekend and include a test with the fix so it doesn't regress again.

sipsma commented 1 year ago

PR here: https://github.com/moby/buildkit/pull/3848