moby / buildkit

concurrent, cache-efficient, and Dockerfile-agnostic builder toolkit
https://github.com/moby/moby/issues/34227
Apache License 2.0

"rpc error" with dockerfile:1.10.0 #5502

Open dmytro-GL opened 4 days ago

dmytro-GL commented 4 days ago

I'm not sure whether this sort of compatibility issue is expected, or whether buildkit is to blame, but my devspace environment broke today after I recreated it from scratch.

Here is a minimal example:

$ cat app/Dockerfile 
# syntax=docker/dockerfile:1.10.0
FROM alpine
$ cat devspace.yaml 
version: v2beta1
name: app
images:
  app:
    image: docker.io/local/app
    dockerfile: ./app/Dockerfile
    context: ./app
    buildKit:
      inCluster: {}
$ devspace build 
info Using namespace 'default'
info Using kube context 'minikube'
build:app Rebuild image docker.io/local/app because tag is missing
build:app Building image 'docker.io/local/app:HlMFbak' with engine 'buildkit'
build:app Execute BuildKit command with: docker buildx build --tag docker.io/local/app:HlMFbak --load --file Dockerfile --builder devspace-default -
build:app Sending build context to Docker daemon  2.048kB
build:app #0 building with "devspace-default" instance using kubernetes driver
build:app 
build:app #1 [internal] load remote build context
build:app #1 CACHED
build:app 
build:app #2 copy /context /
build:app #2 CACHED
build:app 
build:app #3 resolve image config for docker-image://docker.io/docker/dockerfile:1.10.0
build:app #3 DONE 0.7s
build:app 
build:app #4 docker-image://docker.io/docker/dockerfile:1.10.0@sha256:865e5dd094beca432e8c0a1d5e1c465db5f998dca4e439981029b3b81fb39ed5
build:app #4 resolve docker.io/docker/dockerfile:1.10.0@sha256:865e5dd094beca432e8c0a1d5e1c465db5f998dca4e439981029b3b81fb39ed5 0.0s done
build:app #4 CACHED
build:app 
build:app #5 [internal] load remote build context
build:app #5 ERROR: rpc error: code = Unknown desc = no http response from session for sniolg7ur0t5wky34l2awvfcg
build:app ------
build:app  > [internal] load remote build context:
build:app ------
build:app Dockerfile:1
build:app --------------------
build:app    1 | >>> # syntax=docker/dockerfile:1.10.0
build:app    2 |     FROM alpine
build:app    3 |     
build:app --------------------
build:app ERROR: failed to solve: failed to read downloaded context: failed to load cache key: rpc error: code = Unknown desc = no http response from session for sniolg7ur0t5wky34l2awvfcg
build_images: build images: error building image docker.io/local/app:HlMFbak: error executing 'docker buildx build --tag docker.io/local/app:HlMFbak --load --file Dockerfile --builder devspace-default -': 
fatal exit status 1

It works fine if I either switch to buildkit:v0.16.0:

+      inCluster:
+        image: moby/buildkit:v0.16.0

or switch to docker/dockerfile:1.11.0.

I managed to narrow it down to this commit: https://github.com/moby/buildkit/pull/5342/commits/1a3fc0aa15273eb7f262e0d8336702b10c942283

$ cat test.sh 
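# Usage: sh test.sh <commit-ish>
# Resolve the argument (e.g. "<sha>~1") to a full commit hash.
commit="$(git rev-parse "$1")"

# Build a BuildKit image from that commit and create a builder backed by it.
git checkout "$commit"
docker build -t buildkit:"$commit" .
docker buildx create --name buildkit-"$commit" --driver docker-container --driver-opt image=buildkit:"$commit"

# Minimal build context: a Dockerfile with a pinned syntax line.
mkdir -p /tmp/t
cat > /tmp/t/Dockerfile <<EOF
# syntax=docker/dockerfile:1.10.0
FROM alpine
EOF

# Send the context as a tar stream on stdin, as the devspace invocation does.
tar --directory=/tmp/t --create . | docker buildx build --debug --no-cache --builder buildkit-"$commit" --load -

# Remove the temporary builder.
docker builder rm buildkit-"$commit"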
# BAD
sh test.sh 1a3fc0aa15273eb7f262e0d8336702b10c942283
[...]
#6 [internal] load remote build context
#6 ERROR: rpc error: code = Unknown desc = no http response from session for 2788r3fv3cd9t3g6zz1q53te7
------
 > [internal] load remote build context:
------
Dockerfile:1
--------------------
   1 | >>> # syntax=docker/dockerfile:1.10.0
   2 |     FROM alpine
   3 |     
--------------------
ERROR: failed to solve: failed to read downloaded context: failed to load cache key: rpc error: code = Unknown desc = no http response from session for 2788r3fv3cd9t3g6zz1q53te7
7 v0.16.0-rc2-49-g1a3fc0aa1 buildkitd --allow-insecure-entitlement=network.host
github.com/moby/buildkit/frontend/gateway.(*llbBridgeForwarder).Return
        /src/frontend/gateway/gateway.go:1022
github.com/moby/buildkit/control/gateway.(*GatewayForwarder).Return
        /src/control/gateway/gateway.go:146
github.com/moby/buildkit/frontend/gateway/pb._LLBBridge_Return_Handler.func1
        /src/frontend/gateway/pb/gateway_grpc.pb.go:473
google.golang.org/grpc.getChainUnaryHandler.func1
        /src/vendor/google.golang.org/grpc/server.go:1211
main.unaryInterceptor
        /src/cmd/buildkitd/main.go:711
google.golang.org/grpc.NewServer.chainUnaryServerInterceptors.chainUnaryInterceptors.func1
        /src/vendor/google.golang.org/grpc/server.go:1202
github.com/moby/buildkit/frontend/gateway/pb._LLBBridge_Return_Handler
        /src/frontend/gateway/pb/gateway_grpc.pb.go:475
google.golang.org/grpc.(*Server).processUnaryRPC
        /src/vendor/google.golang.org/grpc/server.go:1394
google.golang.org/grpc.(*Server).handleStream
        /src/vendor/google.golang.org/grpc/server.go:1805
google.golang.org/grpc.(*Server).serveStreams.func2.1
        /src/vendor/google.golang.org/grpc/server.go:1029
runtime.goexit
        /usr/local/go/src/runtime/asm_amd64.s:1695
[...]

Full output: https://gist.github.com/dmytro-GL/03010a4e9fa6cf658e1de3d8763af0fc

# GOOD
sh test.sh 1a3fc0aa15273eb7f262e0d8336702b10c942283~1
[...]
#6 [internal] load metadata for docker.io/library/alpine:latest
#6 DONE 1.2s

#7 [1/1] FROM docker.io/library/alpine:latest@sha256:beefdbd8a1da6d2915566fde36db9db0b524eb737fc57cd1367effd16dc0d06d
#7 resolve docker.io/library/alpine:latest@sha256:beefdbd8a1da6d2915566fde36db9db0b524eb737fc57cd1367effd16dc0d06d 0.0s done
#7 DONE 0.0s

#8 exporting to docker image format
#8 exporting layers done
#8 exporting manifest sha256:03a4296887d957128c708726cb56aa0eb268ef8e398705c83cabbb13671c6d71 0.0s done
#8 exporting config sha256:e75ff1d806b99c41ff0f797f6800a85546d5ebf447d69796ec1db166809a41a2 0.0s done
#8 sending tarball 0.0s done
#8 DONE 0.4s

#7 [1/1] FROM docker.io/library/alpine:latest@sha256:beefdbd8a1da6d2915566fde36db9db0b524eb737fc57cd1367effd16dc0d06d
#7 sha256:43c4264eed91be63b206e17d93e75256a6097070ce643c5e8f0379998b44f170 3.62MB / 3.62MB 0.3s done
#7 DONE 0.3s

#9 importing to docker
#9 DONE 0.0s
[...]

Full output: https://gist.github.com/dmytro-GL/d2c8a5f307a1477e7299faf5f5322cc0

Docker information:

$ docker version
Client:
 Version:           27.3.1
 API version:       1.47
 Go version:        go1.23.2
 Git commit:        v27.3.1
 Built:             Thu Jan  1 00:00:00 1970
 OS/Arch:           linux/amd64
 Context:           default

Server:
 Engine:
  Version:          27.3.1
  API version:      1.47 (minimum version 1.24)
  Go version:       go1.23.2
  Git commit:       v27.3.1
  Built:            Tue Jan  1 00:00:00 1980
  OS/Arch:          linux/amd64
  Experimental:     false
 containerd:
  Version:          v1.7.23
  GitCommit:        v1.7.23
 runc:
  Version:          1.1.14
  GitCommit:        
 docker-init:
  Version:          0.19.0
  GitCommit:        
 rootlesskit:
  Version:          2.3.1
  ApiVersion:       1.1.1
  NetworkDriver:    slirp4netns
  PortDriver:       builtin
  StateDir:         /run/user/1000/dockerd-rootless
 slirp4netns:
  Version:          1.3.1
  GitCommit:        e5e368c4f5db6ae75c2fce786e31eef9da6bf236

tonistiigi commented 4 days ago

@jsternberg PTAL

1a3fc0aa15273eb7f262e0d8336702b10c942283 is the first bad commit
commit 1a3fc0aa15273eb7f262e0d8336702b10c942283
Author: Jonathan A. Sternberg <jonathan.sternberg@docker.com>
Date:   Thu Sep 26 11:47:07 2024 -0500

    protobuf: remove gogoproto

    Remove gogoproto in favor of the standard protobuf compiler. This
    removes any nonstandard extensions that were part of gogoproto such as
    the custom types.

    Signed-off-by: Jonathan A. Sternberg <jonathan.sternberg@docker.com>

jsternberg commented 1 day ago

This seems to happen when:

  1. A syntax line is used (image doesn't matter).
  2. The context is sent through stdin.

The Buildx client version doesn't seem to matter, but the BuildKit version does. It looks related to the upload provider used by the session: the older version seems to call Pull only once, while the new one seems to call it multiple times. Still investigating.
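To make the second condition easier to check, here is a rough sketch that puts the stdin form of the build next to the same build with the context passed as a directory. `BUILDER` and `CTX` are placeholder values (assumptions, not names from the report), and the helper functions are hypothetical. The expectation that the directory form succeeds is only an inference from the two conditions listed in this comment and has not been verified here. By default the script only prints the commands; set `RUN=1` to execute them against a real builder.

```shell
#!/bin/sh
# Sketch only: compares the two ways of supplying the build context.
# BUILDER and CTX are placeholders (assumptions), not values from the report.
BUILDER="${BUILDER:-my-builder}"
CTX="${CTX:-/tmp/t}"

# Shape that fails in the report: tar stream on stdin plus a "# syntax=" line
# in the Dockerfile inside the context.
stdin_cmd() {
  printf 'tar --directory=%s --create . | docker buildx build --builder %s --load -\n' "$CTX" "$BUILDER"
}

# Comparison shape: the same context passed as a directory path instead.
dir_cmd() {
  printf 'docker buildx build --builder %s --load %s\n' "$BUILDER" "$CTX"
}

# Print each command; run it only when RUN=1 is set.
for cmd in "$(stdin_cmd)" "$(dir_cmd)"; do
  echo "+ $cmd"
  if [ "${RUN:-0}" = 1 ]; then
    sh -c "$cmd" || echo "exit status: $?"
  fi
done
```

For example, `RUN=1 BUILDER=buildkit-<commit> sh compare.sh` (with `compare.sh` being whatever name the sketch is saved under) would run both forms against a builder created by the `test.sh` script earlier in the thread.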

jsternberg commented 1 day ago

We've also determined that this doesn't happen with syntax version 1.11.0. These are still intended to be compatible and I'm still trying to figure out why this is happening.