cockroachdb / cockroach

CockroachDB - the open source, cloud-native distributed SQL database.
https://www.cockroachlabs.com

build: verify RHEL/Centos 7 can execute CRDB after upgrading crosstool-ng configuration #84196

Open srosenberg opened 2 years ago

srosenberg commented 2 years ago

In [1], the toolchain's kernel and glibc versions were bumped to 3.10 and 2.17, respectively. We want to verify that this change did not break backward compatibility with RHEL/CentOS 7.

Once CI is using the new cross-build configuration, grab the latest build from master and do a 3-node TPC-C run. Record the results for posterity.
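Before the full TPC-C run, a quick sanity check is to look at the highest glibc symbol version the binary requires; RHEL/CentOS 7 ships glibc 2.17, so anything newer would fail at load time. The following is only a sketch: the helper name max_glibc_version and the ./cockroach path are assumptions, not part of the repo.

```shell
# Hypothetical helper (not part of the cockroach repo): read symbol listings on
# stdin and print the highest GLIBC_x.y version referenced.
max_glibc_version() {
  grep -oE 'GLIBC_[0-9]+(\.[0-9]+)+' \
    | sed 's/^GLIBC_//' \
    | sort -uV \
    | tail -n 1
}

# Usage (the binary path is an assumption):
#   objdump -T ./cockroach | max_glibc_version
# A result of 2.17 or lower suggests the binary can load against CentOS 7's glibc.
```

This only checks dynamic symbol requirements; it does not replace actually running the binary on a CentOS 7 node.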

[1] https://github.com/cockroachdb/cockroach/pull/83751

Jira issue: CRDB-17520

srosenberg commented 2 years ago

crosstool-ng has been updated in master as of this PR: https://github.com/cockroachdb/cockroach/pull/84283

srosenberg commented 2 years ago

Note to self: we should probably update mkrelease.sh and related TC scripts which refer to the previous kernel version,

grep -r 2.6.32 build/
build/teamcity-acceptance.sh:   -l "$TMPDIR" -b "$PWD/cockroach-linux-2.6.32-gnu-amd64"
build/toolchains/toolchainbuild/crosstool-ng/x86_64-unknown-linux-gnu.config:# CT_LINUX_V_2_6_32 is not set
build/toolchains/toolchainbuild/crosstool-ng/s390x-ibm-linux-gnu.config:CT_LINUX_V_2_6_32=y
build/toolchains/toolchainbuild/crosstool-ng/s390x-ibm-linux-gnu.config:CT_LINUX_VERSION="2.6.32.71"
build/toolchains/toolchainbuild/crosstool-ng/s390x-ibm-linux-gnu.config:CT_GLIBC_MIN_KERNEL="2.6.32.71"
build/teamcity-weekly-roachtest.sh:if [[ -f cockroach-linux-2.6.32-gnu-amd64 ]]; then
build/teamcity-weekly-roachtest.sh:  mv cockroach-linux-2.6.32-gnu-amd64 cockroach.linux-2.6.32-gnu-amd64
build/teamcity-weekly-roachtest.sh:chmod +x cockroach.linux-2.6.32-gnu-amd64
build/teamcity-weekly-roachtest.sh:  --cockroach "$PWD/cockroach.linux-2.6.32-gnu-amd64" \
build/bazelutil/distdir_files.bzl:    "https://storage.googleapis.com/cockroach-godeps/gomod/github.com/coreos/go-etcd/com_github_coreos_go_etcd-v2.0.0+incompatible.zip": "4b226732835b9298af65db5d075024a5971aa11ef4b456899a3830bccd435b07",
build/teamcity-compile-builds.sh:build/builder.sh mkrelease linux-gnu SUFFIX=.linux-2.6.32-gnu-amd64
build/builder/mkrelease.sh:#   - amd64-linux-gnu:      amd64, Linux 2.6.32, dynamically link glibc 2.12.2
build/builder/mkrelease.sh:#   - s390x-linux-gnu:      s390x, Linux 2.6.32, dynamically link glibc 2.12.2
build/builder/mkrelease.sh:      SUFFIX=-linux-2.6.32-gnu-amd64
build/builder/mkrelease.sh:      SUFFIX=-linux-2.6.32-gnu-s390x
build/teamcity/cockroach/ci/builds/build_antithesis_docker_image.sh:cp cockroachshort-linux-2.6.32-gnu-amd64 build/deploy/workload
build/README.md:go/src/github.com/cockroachdb/cockroach $ cp ./cockroach-linux-2.6.32-gnu-amd64 build/deploy/cockroach
build/teamcity-nightly-roachtest.sh:  --cockroach="${PWD}/cockroach-linux-2.6.32-gnu-amd64" \
build/teamcity-build-test-binary.sh:run mv cockroach-linux-2.6.32-gnu-amd64 artifacts/cockroach
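
Note that the distdir_files.bzl hit above is a false positive: in a regular expression the unescaped dots in 2.6.32 match any character, so the pattern also matches the substring 226732 inside the checksum. A fixed-string search avoids this. A minimal sketch, where the helper name find_version_refs is hypothetical:

```shell
# Hypothetical helper: list literal occurrences of a version string under a directory.
# -F makes grep treat the pattern as a fixed string, so "." matches only a dot.
find_version_refs() {
  local version="$1" dir="$2"
  grep -rF -- "${version}" "${dir}"
}

# Usage, from the repository root:
#   find_version_refs 2.6.32 build/
```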

srosenberg commented 1 year ago

Notes

roachprod expects a custom image to be in the same project (i.e., cockroach-ephemeral), so we need to copy it from centos-cloud,

gcloud compute --project="cockroach-ephemeral" images create "centos-7-v20220719" --description="CentOS, CentOS, 7, x86_64 built on 20220719, supports Shielded VM features" --family="centos-7" --labels="" --source-image="centos-7-v20220719" --source-image-project="centos-cloud"

The attempt to provision fails during SSH setup because the tempfile command is missing on the CentOS 7 image,

roachprod create stan-centos-bench -n 3 --clouds gce --gce-machine-type "n2-standard-8" --gce-zones="us-central1-a,us-central1-b,us-central1-c"  --gce-pd-volume-type="pd-ssd"  --gce-min-cpu-platform="Intel Cascade Lake" --local-ssd="false"   --gce-pd-volume-size="2500" --gce-image="centos-7-v20220719" --os-volume-size=100
(1) attached stack trace
  -- stack trace:
  | github.com/cockroachdb/cockroach/pkg/roachprod/install.(*SyncedCluster).SetupSSH.func4
  |     /Users/srosenberg/workspace/go/src/github.com/cockroachdb/cockroach/pkg/roachprod/install/cluster_synced.go:941
  | github.com/cockroachdb/cockroach/pkg/roachprod/install.(*SyncedCluster).ParallelE.func1.1
  |     /Users/srosenberg/workspace/go/src/github.com/cockroachdb/cockroach/pkg/roachprod/install/cluster_synced.go:1976
  | runtime.goexit
  |     /usr/local/go/src/runtime/asm_arm64.s:1133
Wraps: (2)
  | set -e
  | tmp="$(tempfile -d ~/.ssh -p 'roachprod' )"
  | on_exit() {
  |     rm -f "${tmp}"
  | }
  | trap on_exit EXIT
  | for i in {1..20}; do
  |   ssh-keyscan -T 60 -t rsa 10.128.0.33 10.128.0.27 10.128.0.24 34.66.241.153 34.135.143.29 34.172.28.47 > "${tmp}"
  |   if [[ "$(wc < ${tmp} -l)" -eq "6" ]]; then
  |     [[ -f .ssh/known_hosts ]] && cat .ssh/known_hosts >> "${tmp}"
  |     sort -u < "${tmp}"
  |     exit 0
  |   fi
  |   sleep 1
  | done
  | exit 1: stderr:
  | bash: line 2: tempfile: command not found

ajwerner commented 1 year ago

Seems easy enough to fix this one. We need to stop using tempfile, which apparently is not available in all of the Linux distros we want to support. One approach could be to have roachprod create a random temp filename.
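
One possible way to do that is with mktemp, which comes with coreutils on CentOS 7 as well as on Debian-based images. This is only a sketch of the idea, not necessarily the fix that roachprod adopted; the helper name make_tmp is hypothetical.

```shell
# Hypothetical helper: create a uniquely named temp file in a directory with a
# given prefix, using mktemp instead of the Debian-specific tempfile.
make_tmp() {
  local dir="$1" prefix="$2"
  mkdir -p "${dir}"
  # The trailing XXXXXX is replaced by mktemp with random characters.
  mktemp "${dir}/${prefix}XXXXXX"
}

# Usage mirroring the failing script:
#   tmp="$(make_tmp "${HOME}/.ssh" roachprod)"
#   trap 'rm -f "${tmp}"' EXIT
```

Passing the full template path keeps the call portable across GNU and BSD mktemp, which disagree on some of the option flags.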