submariner-io / submariner

Networking component for interconnecting Pods and Services across Kubernetes clusters.
https://submariner.io
Apache License 2.0

`make clusters` fails #1595

Closed davidohana closed 2 years ago

davidohana commented 3 years ago

What happened: Getting errors when running make clusters or make deploy as part of the getting started guide for kind.

What you expected to happen: Kind clusters should be created.

How to reproduce it (as minimally and precisely as possible): Tried both on CentOS 7.8 and macOS 11.6

Anything else we need to know?:

kubectl, subctl, docker, kind already installed properly.

CentOS output:

$ make clusters
Makefile:126: Makefile.dapper: No such file or directory
Downloading Makefile.dapper
Downloading dapper
Submariner Dapper
Downloading Dockerfile.dapper
docker network create -d bridge kind
Error response from daemon: network with name kind already exists
make: [clusters] Error 1 (ignored)
./.dapper   -- make --debug=b clusters
Sending build context to Docker daemon   9.09MB
Step 1/8 : ARG BASE_BRANCH
Step 2/8 : FROM quay.io/submariner/shipyard-dapper-base:${BASE_BRANCH}
 ---> 1a2513f0257b
Step 3/8 : ARG PROJECT
 ---> Using cache
 ---> b14ff42e2fb8
Step 4/8 : ENV DAPPER_ENV="QUAY_USERNAME QUAY_PASSWORD CLUSTERS_ARGS DEPLOY_ARGS CLEANUP_ARGS E2E_ARGS RELEASE_ARGS MAKEFLAGS FOCUS SKIP PLUGIN E2E_TESTDIR GITHUB_USER GITHUB_TOKEN"     DAPPER_SOURCE=/go/src/github.com/submariner-io/${PROJECT} DAPPER_DOCKER_SOCKET=true
 ---> Using cache
 ---> 75002b8b7e4d
Step 5/8 : ENV DAPPER_OUTPUT=${DAPPER_SOURCE}/output
 ---> Using cache
 ---> 9a74234220de
Step 6/8 : WORKDIR ${DAPPER_SOURCE}
 ---> Using cache
 ---> 48d3b8677841
Step 7/8 : ENTRYPOINT ["/opt/shipyard/scripts/entry"]
 ---> Using cache
 ---> 7c3f8ac6cc15
Step 8/8 : CMD ["sh"]
 ---> Using cache
 ---> 88b117083c88
Successfully built 88b117083c88
Successfully tagged submariner:devel
[submariner]$ trap chown -R 1001:1001 . exit
[submariner]$ mkdir -p bin dist output
[submariner]$ make --debug=b clusters
GNU Make 4.3
Built for x86_64-redhat-linux-gnu
Copyright (C) 1988-2020 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.
Reading makefiles...
make: cmp: Operation not permitted
make: /bin/bash: Operation not permitted
make: /bin/bash: Operation not permitted
make: /bin/bash: Operation not permitted
make: /bin/bash: Operation not permitted
make: /bin/bash: Operation not permitted
make: /bin/bash: Operation not permitted
make: /bin/bash: Operation not permitted
Updating makefiles....
Updating goal targets....
 File 'clusters' does not exist.
Must remake target 'clusters'.
/opt/shipyard/scripts/clusters.sh --settings /go/src/github.com/submariner-io/submariner/.shipyard.e2e.yml
make: /bin/bash: Operation not permitted
make: *** [/opt/shipyard/Makefile.inc:106: clusters] Error 127
[submariner]$ make --debug=b clusters
make: *** [clusters] Error 2

Mac output:

% make clusters
Makefile:126: Makefile.dapper: No such file or directory
Downloading Makefile.dapper
Downloading dapper
Submariner Dapper
Downloading Dockerfile.dapper
docker network create -d bridge kind
Error response from daemon: network with name kind already exists
make: [clusters] Error 1 (ignored)
./.dapper   -- make --debug=b clusters
[+] Building 10.2s (3/3) FINISHED
 => [internal] load build definition from Dockerfile.dapper                                                                                                                                        0.1s
 => => transferring dockerfile: 524B                                                                                                                                                               0.0s
 => [internal] load .dockerignore                                                                                                                                                                  0.0s
 => => transferring context: 2B                                                                                                                                                                    0.0s
 => ERROR [internal] load metadata for quay.io/submariner/shipyard-dapper-base:devel                                                                                                              10.0s
------
 > [internal] load metadata for quay.io/submariner/shipyard-dapper-base:devel:
------
failed to solve with frontend dockerfile.v0: failed to create LLB definition: failed to do request: Head https://quay.io/v2/submariner/shipyard-dapper-base/manifests/devel: Bad Gateway
sed: 1: "/"DAPPER_CP=/{s/[[:spac ...": extra characters at the end of q command
sed: 1: "/"DAPPER_ENV=/{s/[[:spa ...": extra characters at the end of q command
sed: 1: "/"DAPPER_SOURCE=/{s/[[: ...": extra characters at the end of q command
sed: 1: "/"DAPPER_DOCKER_SOCKET= ...": extra characters at the end of q command
sed: 1: "/"DAPPER_RUN_ARGS=/{s/[ ...": extra characters at the end of q command
docker: invalid reference format: repository name must be lowercase.
See 'docker run --help'.
make: *** [clusters] Error 125

Environment:

It would also be helpful to have docs on the configuration required for kind clusters when running the kind CLI manually rather than through the Makefile (CIDRs, CNI, nodes, etc.). I also tried setting up kind clusters manually with non-overlapping CIDRs, but the gateways on the two clusters cannot connect. Pardon if this bug report is due to a newcomer's mistake.

skitt commented 3 years ago

Thanks for reporting this, this isn’t due to a newcomer’s mistake!

skitt commented 3 years ago

I don’t have a Mac to reproduce the error there, but I think this should fix it; could you try it and let me know?

Change line 61 in .dapper inside your submariner clone to read

    docker inspect "$1" | sed -n -E "/\"$2=/{s/[[:space:]]+\"$2=([^\"]+)\",/\1/;p} | head -n 1"

skitt commented 3 years ago

On CentOS, are you using Docker, or Podman?

davidohana commented 3 years ago

@skitt The fix you suggested didn't change the error output on macOS. However, I crafted another sed instruction in extract_var() that seems to work both on my mac and on CentOS:

docker inspect "$1" | grep $2 | sed -E "s/.*\"$2=(.*)\",?/\1/;q"

(I can send a PR if you want..)

On CentOS, this of course does not fix the other error which is later in the execution workflow. Docker is installed on that machine:

Docker version 19.03.13, build 4484c46d9d

skitt commented 3 years ago

> However, I crafted another sed instruction in extract_var() that seems to work both on my mac and on CentOS:
>
> docker inspect "$1" | grep $2 | sed -E "s/.*\"$2=(.*)\",?/\1/;q"
>
> (I can send a PR if you want..)

Yes please! With double quotes in grep "$2" ;-).
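For reference, here is a minimal sketch of that extraction with the variable name quoted in grep, as requested above. It reads sample text on standard input instead of calling docker inspect, so it can be tried without an image. The function name extract_var_from and the sample lines are assumptions made for illustration, not the contents of .dapper.

```shell
#!/bin/sh
# Hypothetical stand-in for extract_var() in .dapper: reads `docker inspect`-style
# text on stdin and prints the value of the environment variable named by $1.
extract_var_from() {
    # grep keeps only lines mentioning the variable; sed strips everything
    # around the value and quits after the first matching line.
    grep -- "$1" | sed -E "s/.*\"$1=(.*)\",?/\1/;q"
}

# Made-up sample resembling the "Env" array in `docker inspect` output.
sample='            "DAPPER_SOURCE=/go/src/github.com/submariner-io/submariner",
            "DAPPER_DOCKER_SOCKET=true"'

printf '%s\n' "$sample" | extract_var_from DAPPER_SOURCE
printf '%s\n' "$sample" | extract_var_from DAPPER_DOCKER_SOCKET
```

Ending the sed script with q, with nothing after it, seems to be what keeps the macOS sed from reporting "extra characters at the end of q command".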

davidohana commented 3 years ago

PR: https://github.com/submariner-io/shipyard/pull/684

davidohana commented 3 years ago

Tried also Ubuntu 18.04, same failure as CentOS:

% make clusters
Makefile:126: Makefile.dapper: No such file or directory
Downloading Makefile.dapper
Downloading dapper
Submariner Dapper
Downloading Dockerfile.dapper
docker network create -d bridge kind
af765c07730d2d737ab9cef01ff01dd8a7e6d919279487bff4a5f21e362536d9
./.dapper   -- make --debug=b clusters
Sending build context to Docker daemon  9.104MB
Step 1/8 : ARG BASE_BRANCH
Step 2/8 : FROM quay.io/submariner/shipyard-dapper-base:${BASE_BRANCH}
devel: Pulling from submariner/shipyard-dapper-base
fc811dadee24: Pull complete
865443303fbd: Pull complete
7e3d5872c873: Pull complete
8754e7e7fd33: Pull complete
730a0c2def2c: Pull complete
8a98e40c8eb2: Pull complete
f176b9b8e444: Pull complete
b4f15b33efc7: Pull complete
Digest: sha256:befdf30bdcdb0760e5fa54582471b800aab697606c811f898b88d2216edf13f8
Status: Downloaded newer image for quay.io/submariner/shipyard-dapper-base:devel
 ---> 4da2955b6c34
Step 3/8 : ARG PROJECT
 ---> Running in 76c54c18300d
Removing intermediate container 76c54c18300d
 ---> 26ef4e8714a0
Step 4/8 : ENV DAPPER_ENV="QUAY_USERNAME QUAY_PASSWORD CLUSTERS_ARGS DEPLOY_ARGS CLEANUP_ARGS E2E_ARGS RELEASE_ARGS MAKEFLAGS FOCUS SKIP PLUGIN E2E_TESTDIR GITHUB_USER GITHUB_TOKEN"     DAPPER_SOURCE=/go/src/github.com/submariner-io/${PROJECT} DAPPER_DOCKER_SOCKET=true
 ---> Running in 9f7ca6868a90
Removing intermediate container 9f7ca6868a90
 ---> 6e31c8b5c167
Step 5/8 : ENV DAPPER_OUTPUT=${DAPPER_SOURCE}/output
 ---> Running in 18172c2a4e8b
Removing intermediate container 18172c2a4e8b
 ---> 5ae4c03b5f52
Step 6/8 : WORKDIR ${DAPPER_SOURCE}
 ---> Running in ad3a16e53c8a
Removing intermediate container ad3a16e53c8a
 ---> eb0606753ddb
Step 7/8 : ENTRYPOINT ["/opt/shipyard/scripts/entry"]
 ---> Running in f38135a19641
Removing intermediate container f38135a19641
 ---> 07f03d5b8016
Step 8/8 : CMD ["sh"]
 ---> Running in a0f6aa1ef0d2
Removing intermediate container a0f6aa1ef0d2
 ---> aa3b7e90023a
Successfully built aa3b7e90023a
Successfully tagged submariner:devel
[submariner]$ trap chown -R 217779353:10002 . exit
[submariner]$ mkdir -p bin dist output
[submariner]$ make --debug=b clusters
GNU Make 4.3
Built for x86_64-redhat-linux-gnu
Copyright (C) 1988-2020 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.
Reading makefiles...
make: cmp: Operation not permitted
make: /bin/bash: Operation not permitted
make: /bin/bash: Operation not permitted
make: /bin/bash: Operation not permitted
make: /bin/bash: Operation not permitted
make: /bin/bash: Operation not permitted
make: /bin/bash: Operation not permitted
make: /bin/bash: Operation not permitted
Updating makefiles....
Updating goal targets....
 File 'clusters' does not exist.
Must remake target 'clusters'.
/opt/shipyard/scripts/clusters.sh --settings /go/src/github.com/submariner-io/submariner/.shipyard.e2e.yml
make: /bin/bash: Operation not permitted
make: *** [/opt/shipyard/Makefile.inc:106: clusters] Error 127
[submariner]$ make --debug=b clusters
Makefile.dapper:24: recipe for target 'clusters' failed
make: *** [clusters] Error 2

skitt commented 3 years ago

> Tried also Ubuntu 18.04, same failure as CentOS:

That’s surprising, CI runs on Ubuntu 20.04 and works on Ubuntu 18.04 (see https://github.com/skitt/shipyard/pull/8). I wonder what the difference is between CI and your system!

I can reproduce this issue with Buildah on RHEL, I haven’t figured out what’s causing it yet.

Jaanki commented 3 years ago

I am facing a similar error on Fedora 34.

Jaanki commented 2 years ago

I deleted .dapper and Makefile.dapper from my local submariner-operator clone and manually copied these files from shipyard; make bin/subctl is now working.

tpantelis commented 2 years ago

I'm seeing the same running any make target (I'm on Fedora as well). Eg make golangci-lint,

Reading makefiles...
make: cmp: Operation not permitted
make: /bin/bash: Operation not permitted
make: /bin/bash: Operation not permitted
make: /bin/bash: Operation not permitted
make: /bin/bash: Operation not permitted
make: /bin/bash: Operation not permitted
make: /bin/bash: Operation not permitted
make: /bin/bash: Operation not permitted
Updating makefiles....
Updating goal targets....
 File 'golangci-lint' does not exist.
Must remake target 'golangci-lint'.
golangci-lint linters
make: /bin/bash: Operation not permitted

This just started recently.

skitt commented 2 years ago

> I'm seeing the same running any make target (I'm on Fedora as well). Eg make golangci-lint,

Can you try deleting .dapper and Makefile.dapper?
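A minimal sketch of that cleanup step follows. The make output earlier in this thread shows Makefile.dapper and dapper being downloaded when Makefile.dapper is missing, so removing the cached copies should trigger a fresh download on the next make run. The function name reset_dapper is made up for this example.

```shell
#!/bin/sh
# Hypothetical helper: remove the cached Dapper files from a clone so the next
# `make` invocation downloads them again. $1 is the clone directory (default: .).
reset_dapper() {
    dir="${1:-.}"
    rm -f -- "$dir/.dapper" "$dir/Makefile.dapper"
}

# Usage (from inside the clone):
#   reset_dapper . && make clusters
```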

tpantelis commented 2 years ago

> > I'm seeing the same running any make target (I'm on Fedora as well). Eg make golangci-lint,
>
> Can you try deleting .dapper and Makefile.dapper?

I did - still failed.

skitt commented 2 years ago

> > > I'm seeing the same running any make target (I'm on Fedora as well). Eg make golangci-lint,
> >
> > Can you try deleting .dapper and Makefile.dapper?
>
> I did - still failed.

OK, can you try replacing fedora:35 with fedora:34 in package/Dockerfile.shipyard-dapper-base and then rebuilding the images? This might be similar to https://bugzilla.redhat.com/show_bug.cgi?id=2025899

tpantelis commented 2 years ago

> OK, can you try replacing fedora:35 with fedora:34 in package/Dockerfile.shipyard-dapper-base and then rebuilding the images? This might be similar to https://bugzilla.redhat.com/show_bug.cgi?id=2025899

I get failures:

#5 48.08 Fedora Modular 34 - x86_64                      181 kB/s | 562 kB     00:03    
#5 48.08 Errors during downloading metadata for repository 'fedora-modular':
#5 48.08   - Downloading successful, but checksum doesn't match. Calculated: 7d70d528ba308b58a51b3dcd0b0b33787c28ff800119d2cbfc4401289756c65c(sha256)  Expected: 4c61930e8ca0fe16c1dfeaad89a70ad871cc1e989d8b7e24da3b8ad0c7bfafef(sha256) 
#5 48.08   - Curl error (23): Failed writing received data to disk/application for http://mirror.siena.edu/fedora/linux/releases/34/Modular/x86_64/os/repodata/feba915c09693c88fc16a37c015cf9c89720aa009de7b5b402f45c0fd036ff40-primary.xml.zck [Failure writing output to destination]
#5 48.08   - Curl error (23): Failed writing received data to disk/application for http://mirror.siena.edu/fedora/linux/releases/34/Modular/x86_64/os/repodata/a51515cfe82aad9ac78f5cb3f461b67867626b2ec29f468882fbb60e756c15b7-filelists.xml.zck [Failure writing output to destination]
#5 48.10 Error: Failed to download metadata for repo 'fedora-modular': Yum repo downloading error: Downloading error(s): repodata/feba915c09693c88fc16a37c015cf9c89720aa009de7b5b402f45c0fd036ff40-primary.xml.zck - Download failed: Curl error (23): Failed writing received data to disk/application for http://mirror.siena.edu/fedora/linux/releases/34/Modular/x86_64/os/repodata/feba915c09693c88fc16a37c015cf9c89720aa009de7b5b402f45c0fd036ff40-primary.xml.zck [Failure writing output to destination]; repodata/a51515cfe82aad9ac78f5cb3f461b67867626b2ec29f468882fbb60e756c15b7-filelists.xml.zck - Download failed: Curl error (23): Failed writing received data to disk/application for http://mirror.siena.edu/fedora/linux/releases/34/Modular/x86_64/os/repodata/a51515cfe82aad9ac78f5cb3f461b67867626b2ec29f468882fbb60e756c15b7-filelists.xml.zck [Failure writing output to destination]

But I get weird errors on my VM with various things anyway.

nyechiel commented 2 years ago

Is this still being worked on? I don't see any such issues with my Fedora 35 system, but not sure about other OSs.

skitt commented 2 years ago

> Is this still being worked on? I don't see any such issues with my Fedora 35 system, but not sure about other OSs.

Yes, I’m still working on this.

dfarrell07 commented 2 years ago

I sometimes get weird errors that show up as failed networking inside a container (like you quoted above @tpantelis, or from a fresh repo I think it shows up as .shflags failing to download), and it seems to be a longevity issue with the Docker daemon. A basic test of in-container networking will show if this is the issue:

docker run --rm alpine sh -c 'wget -q -O- https://docs.docker.com | grep "<title"'

If that fails, restarting my Docker daemon typically fixes it:

sudo systemctl restart docker

(but I don't think this is related to the OP)

tpantelis commented 2 years ago

> If that fails, restarting my Docker daemon typically fixes it:

I restarted the Docker daemon but I still get the same issue, ie:

golangci-lint linters
make: /bin/bash: Operation not permitted

I'm on Fedora 33.

skitt commented 2 years ago

> > If that fails, restarting my Docker daemon typically fixes it:
>
> I restarted the Docker daemon but I still get the same issue, ie:
>
> golangci-lint linters
> make: /bin/bash: Operation not permitted
>
> I'm on Fedora 33.

OK — could you try https://github.com/submariner-io/submariner/issues/1595#issuecomment-979008446 and let me know if that fixes things?

tpantelis commented 2 years ago

> OK, could you try #1595 (comment) and let me know if that fixes things?

Building the images fails with another error related to SSL (tls: bad record MAC). I've had such issues for quite a while on my VM.

mkolesnik commented 2 years ago

I think this fails if the host OS version is older than the one we use for the build container; I'm now on F35 on my laptop and everything works as it should. Perhaps the older OSes lack the kernel support that's expected inside the container?

Just to clarify, I suspect it's because older Fedora versions are EOL. In Ubuntu LTS everything should be working (at least for 20.04 we know it does since that's what the CI uses).

skitt commented 2 years ago

The “operation not permitted” errors are caused by seccomp filtering in the container runtime not knowing about clone3, which is used by glibc 2.34. See https://pascalroeleven.nl/2021/09/09/ubuntu-21-10-and-fedora-35-in-docker/

I’ll change .dapper to use an unconfined security context for the time being.
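To check whether seccomp filtering is what breaks a given host, one option is to run the same container with and without the default profile. The sketch below only wraps docker run with --security-opt seccomp=unconfined. The helper name run_unconfined and the DOCKER override variable are assumptions for illustration, and this is not necessarily how .dapper will apply the change.

```shell
#!/bin/sh
# Hypothetical helper: run a container with seccomp filtering disabled.
# DOCKER can be overridden (e.g. DOCKER=podman); it defaults to docker.
run_unconfined() {
    "${DOCKER:-docker}" run --rm --security-opt seccomp=unconfined "$@"
}

# Usage: compare against a plain `docker run --rm fedora:35 bash -c 'echo ok'`.
#   run_unconfined fedora:35 bash -c 'echo ok'
```

If the plain run reports "Operation not permitted" while the unconfined run prints ok, the seccomp profile is the likely culprit. Disabling seccomp removes a layer of isolation, so this is best kept as a temporary measure.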