Closed cockroach-teamcity closed 1 year ago
SHA: https://github.com/cockroachdb/cockroach/commits/5ebfeec052f9cee4e63757defe7c9120643293db
Parameters:
To repro, try:
# Don't forget to check out a clean suitable branch and experiment with the
# stress invocation until the desired results present themselves. For example,
# using stress instead of stressrace and passing the '-p' stressflag which
# controls concurrency.
./scripts/gceworker.sh start && ./scripts/gceworker.sh mosh
cd ~/go/src/github.com/cockroachdb/cockroach && \
stdbuf -oL -eL \
make stressrace TESTS=jepsen-batch1/bank-multitable/subcritical-skews PKG=roachtest TESTTIMEOUT=5m STRESSFLAGS='-maxtime 20m -timeout 10m' 2>&1 | tee /tmp/stress.log
Failed test: https://teamcity.cockroachdb.com/viewLog.html?buildId=1174810&tab=buildLog
The test failed on release-2.1:
jepsen.go:247,jepsen.go:308,test.go:1214: /home/agent/work/.go/src/github.com/cockroachdb/cockroach/bin/roachprod run teamcity-1174810-jepsen-batch1:6 -- bash -e -c "\
cd /mnt/data1/jepsen/cockroachdb && set -eo pipefail && \
~/lein run test \
--tarball file://${PWD}/cockroach.tgz \
--username ${USER} \
--ssh-private-key ~/.ssh/id_rsa \
--os ubuntu \
--time-limit 300 \
--concurrency 30 \
--recovery-time 25 \
--test-count 1 \
-n 10.142.0.47 -n 10.142.0.38 -n 10.142.0.44 -n 10.142.0.36 -n 10.142.0.41 \
--test bank-multitable --nemesis subcritical-skews \
> invoke.log 2>&1 \
" returned:
stderr:
stdout:
Error: exit status 255
: exit status 1
SHA: https://github.com/cockroachdb/cockroach/commits/7ce9188c6e64465d9dcb9f0ca0f113dd0e584da0
Parameters:
To repro, try:
# Don't forget to check out a clean suitable branch and experiment with the
# stress invocation until the desired results present themselves. For example,
# using stress instead of stressrace and passing the '-p' stressflag which
# controls concurrency.
./scripts/gceworker.sh start && ./scripts/gceworker.sh mosh
cd ~/go/src/github.com/cockroachdb/cockroach && \
stdbuf -oL -eL \
make stressrace TESTS=jepsen-batch1/bank-multitable/subcritical-skews PKG=roachtest TESTTIMEOUT=5m STRESSFLAGS='-maxtime 20m -timeout 10m' 2>&1 | tee /tmp/stress.log
Failed test: https://teamcity.cockroachdb.com/viewLog.html?buildId=1178908&tab=buildLog
The test failed on release-2.1:
jepsen.go:247,jepsen.go:308,test.go:1214: /home/agent/work/.go/src/github.com/cockroachdb/cockroach/bin/roachprod run teamcity-1178908-jepsen-batch1:6 -- bash -e -c "\
cd /mnt/data1/jepsen/cockroachdb && set -eo pipefail && \
~/lein run test \
--tarball file://${PWD}/cockroach.tgz \
--username ${USER} \
--ssh-private-key ~/.ssh/id_rsa \
--os ubuntu \
--time-limit 300 \
--concurrency 30 \
--recovery-time 25 \
--test-count 1 \
-n 10.142.0.39 -n 10.142.0.159 -n 10.142.0.38 -n 10.142.0.36 -n 10.142.0.160 \
--test bank-multitable --nemesis subcritical-skews \
> invoke.log 2>&1 \
" returned:
stderr:
stdout:
Error: exit status 255
: exit status 1
The subcritical-skews nemesis frequently resynchronizes with NTP. The test has recently started failing because we're getting rate-limited by the NTP server (the server address, ntp.ubuntu.com, is hard-coded).
We need to either
Clearing the milestone so this gets re-triaged.
While looking at other issues connected to the jepsen tests, I found that recent jepsen packages use pool.ntp.org instead of ntp.ubuntu.com.
I changed the server and gave it a try. Surprisingly, we are not throttled by the pool, and I see no more complaints in the log.
Since the server address is hardcoded in our tests, switching it should be a quick win that lets us re-enable the tests.
With the jepsen change in place, I'll make a diff and see whether it works. Running those tests with roachtest from dev looked fine.
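For anyone who wants to try the same swap locally, here is a rough sketch of it as a shell helper. `swap_ntp_host` is a hypothetical name and is not part of cockroach, roachtest, or jepsen; it assumes the host appears as a plain string in the checked-out sources and that both arguments are ordinary hostnames (no `/` or `&`).

```shell
# Hypothetical helper (not part of cockroach or jepsen): rewrite a hard-coded
# NTP host in every file under a directory.
# Usage: swap_ntp_host DIR [OLD_HOST] [NEW_HOST]
swap_ntp_host() {
  dir="$1"
  old="${2:-ntp.ubuntu.com}"
  new="${3:-pool.ntp.org}"
  # Escape dots so sed treats the old host as a literal string.
  old_re=$(printf '%s' "$old" | sed 's/\./\\./g')
  # List only the files that contain the old host, then rewrite each in place.
  grep -rlF -- "$old" "$dir" | while IFS= read -r f; do
    sed "s/$old_re/$new/g" "$f" > "$f.tmp" && cat "$f.tmp" > "$f" && rm -f "$f.tmp"
  done
}
```

Calling `swap_ntp_host path/to/jepsen/checkout` (a placeholder path) and then reviewing `git diff` before rerunning the roachtest should make it easy to confirm that only the server address changed.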
cc @cockroachdb/test-eng
SHA: https://github.com/cockroachdb/cockroach/commits/a119a3a158725c9e3f9b8084d9398601c0e67007
Parameters:
To repro, try:
Failed test: https://teamcity.cockroachdb.com/viewLog.html?buildId=1170795&tab=buildLog
Jira issue: CRDB-4573