Closed cockroach-teamcity closed 1 year ago
(ccl/backupccl).TestScheduleBackupRecoversFromClusterDown failed on master@8aceac3c99c3addece3a9ef9af04cc74715cdb37:
| github.com/cockroachdb/cockroach/pkg/jobs.(*Job).succeeded
| /go/src/github.com/cockroachdb/cockroach/pkg/jobs/jobs.go:594
| github.com/cockroachdb/cockroach/pkg/jobs.(*Registry).stepThroughStateMachine
| /go/src/github.com/cockroachdb/cockroach/pkg/jobs/registry.go:1139
| github.com/cockroachdb/cockroach/pkg/jobs.(*Registry).stepThroughStateMachine
| /go/src/github.com/cockroachdb/cockroach/pkg/jobs/registry.go:1093
| github.com/cockroachdb/cockroach/pkg/jobs.(*Registry).runJob
| /go/src/github.com/cockroachdb/cockroach/pkg/jobs/adopt.go:245
| github.com/cockroachdb/cockroach/pkg/jobs.(*Registry).resumeJob.func1
| /go/src/github.com/cockroachdb/cockroach/pkg/jobs/adopt.go:182
| github.com/cockroachdb/cockroach/pkg/util/stop.(*Stopper).RunAsyncTask.func1
| /go/src/github.com/cockroachdb/cockroach/pkg/util/stop/stopper.go:347
| runtime.goexit
| /usr/local/go/src/runtime/asm_amd64.s:1357
Wraps: (4) log-job
Wraps: (5) context canceled
Error types: (1) *withstack.withStack (2) *errutil.withPrefix (3) *withstack.withStack (4) *errutil.withPrefix (5) *errors.errorString
W201025 05:39:46.055576 474836 kv/txn.go:618 [n1] failure aborting transaction: node unavailable; try another peer; abort caused by: context canceled
E201025 05:39:46.056424 474836 jobs/adopt.go:256 [n1] job 601402731232919553: adoption completed with error job 601402731232919553: could not mark as reverting: could not mark job 601402731232919553 as succeeded: log-job: context canceled: context canceled
W201025 05:39:46.073884 475145 kv/txn.go:618 failure aborting transaction: node unavailable; try another peer; abort caused by: failed to fetch protectedts metadata: failed to read metadata: protectedts-GetMetadata: context canceled
I201025 05:39:46.089299 472691 kv/kvserver/queue.go:582 [n1,s1] rate limited in MaybeAdd (gc): node unavailable; try another peer
I201025 05:39:46.090071 472691 kv/kvserver/queue.go:582 [n1,s1] rate limited in MaybeAdd (merge): node unavailable; try another peer
I201025 05:39:46.090341 472691 kv/kvserver/queue.go:582 [n1,s1] rate limited in MaybeAdd (split): node unavailable; try another peer
I201025 05:39:46.090657 472691 kv/kvserver/queue.go:582 [n1,s1] rate limited in MaybeAdd (replicate): node unavailable; try another peer
I201025 05:39:46.091043 472691 kv/kvserver/queue.go:582 [n1,s1] rate limited in MaybeAdd (replicaGC): node unavailable; try another peer
W201025 05:39:46.091598 475300 kv/txn.go:618 [intExec=select-running/get-claimed-jobs] failure aborting transaction: node unavailable; try another peer; abort caused by: query execution canceled
I201025 05:39:46.092636 472691 kv/kvserver/queue.go:582 [n1,s1] rate limited in MaybeAdd (raftsnapshot): node unavailable; try another peer
I201025 05:39:46.093245 472691 kv/kvserver/queue.go:582 [n1,s1] rate limited in MaybeAdd (consistencyChecker): node unavailable; try another peer
E201025 05:39:46.093358 472813 jobs/registry.go:654 error processing claimed jobs: could query for claimed jobs: select-running/get-claimed-jobs: context canceled
I201025 05:39:46.093647 472691 kv/kvserver/queue.go:582 [n1,s1] rate limited in MaybeAdd (timeSeriesMaintenance): node unavailable; try another peer
W201025 05:39:46.094383 472813 kv/txn.go:618 failure aborting transaction: node unavailable; try another peer; abort caused by: context canceled
E201025 05:39:46.094960 472813 jobs/registry.go:644 error claiming jobs: context canceled
E201025 05:39:46.097088 472813 jobs/registry.go:654 error processing claimed jobs: could query for claimed jobs: select-running/get-claimed-jobs: context canceled
--- FAIL: TestScheduleBackupRecoversFromClusterDown/reschedule (438.56s)
create_scheduled_backup_test.go:113: condition failed to evaluate within 45s: sql: no rows in result set
goroutine 474603 [running]:
runtime/debug.Stack(0xc06c85bbe0, 0x80cea00, 0xc000438060)
/usr/local/go/src/runtime/debug/stack.go:24 +0xab
github.com/cockroachdb/cockroach/pkg/testutils.SucceedsSoon(0x82e2d40, 0xc035544900, 0xc06c85bbe0)
/go/src/github.com/cockroachdb/cockroach/pkg/testutils/soon.go:36 +0x87
github.com/cockroachdb/cockroach/pkg/ccl/backupccl.(*testHelper).waitForSuccessfulScheduledJob(0xc092907440, 0xc035544900, 0x8589c97e8f90001)
/go/src/github.com/cockroachdb/cockroach/pkg/ccl/backupccl/create_scheduled_backup_test.go:113 +0x12e
github.com/cockroachdb/cockroach/pkg/ccl/backupccl.TestScheduleBackupRecoversFromClusterDown.func4(0xc035544900, 0x61de8b3, 0xa, 0x0, 0x0, 0x0)
/go/src/github.com/cockroachdb/cockroach/pkg/ccl/backupccl/create_scheduled_backup_test.go:703 +0x575
github.com/cockroachdb/cockroach/pkg/ccl/backupccl.TestScheduleBackupRecoversFromClusterDown.func9(0xc035544900)
/go/src/github.com/cockroachdb/cockroach/pkg/ccl/backupccl/create_scheduled_backup_test.go:784 +0xba
testing.tRunner(0xc035544900, 0xc037588d00)
/usr/local/go/src/testing/testing.go:909 +0x19a
created by testing.(*T).Run
/usr/local/go/src/testing/testing.go:960 +0x652
Parameters: - TAGS= - GOFLAGS=-race -parallel=2 ``` make stressrace TESTS=TestScheduleBackupRecoversFromClusterDown PKG=./pkg/ccl/backupccl TESTTIMEOUT=5m STRESSFLAGS='-timeout 5m' 2>&1 ``` [See this test on roachdash](https://roachdash.crdb.dev/?filter=status%3Aopen+t%3A.%2ATestScheduleBackupRecoversFromClusterDown.%2A&sort=title&restgroup=false&display=lastcommented+project) powered by [pkg/cmd/internal/issues](https://github.com/cockroachdb/cockroach/tree/master/pkg/cmd/internal/issues)
(ccl/backupccl).TestScheduleBackupRecoversFromClusterDown failed on master@157b6dc59552b26fb7ef7968ea3b79577c336423:
W201029 05:49:50.828950 478124 kv/kvserver/store_raft.go:492 [n1,s1,r2/1:/System/NodeLiveness{-Max}] handle raft ready: 0.6s [applied=1, batches=1, state_assertions=0]
W201029 05:49:51.463233 477968 kv/kvserver/node_liveness.go:815 [n1,liveness-hb] slow heartbeat took 2.760854915s; err=<nil>
W201029 05:49:52.936013 478110 kv/kvserver/store_raft.go:492 [n1,s1,r4/1:/System{/tsd-tse}] handle raft ready: 0.7s [applied=1, batches=1, state_assertions=0]
W201029 05:49:54.478814 478115 kv/kvserver/store_raft.go:492 [n1,s1,r1/1:/{Min-System/NodeL…}] handle raft ready: 0.5s [applied=1, batches=1, state_assertions=0]
W201029 05:49:54.583169 478089 kv/kvserver/store_raft.go:492 [n1,s1,r2/1:/System/NodeLiveness{-Max}] handle raft ready: 0.7s [applied=1, batches=1, state_assertions=0]
W201029 05:49:54.926130 477968 kv/kvserver/node_liveness.go:815 [n1,liveness-hb] slow heartbeat took 1.617790754s; err=<nil>
I201029 05:49:55.811300 481196 ccl/backupccl/create_scheduled_backup_test.go:113 SucceedsSoon: sql: no rows in result set
I201029 05:49:55.882118 477966 server/status/runtime.go:522 [n1] runtime stats: 24 GiB RSS, 226 goroutines, 1.8 GiB/1.0 GiB/2.9 GiB GO alloc/idle/total, 29 MiB/73 MiB CGO alloc/total, 0.1 CGO/sec, 13.9/6.8 %(u/s)time, 0.0 %gc (0x), 1.8 KiB/1.8 KiB (r/w)net
I201029 05:49:56.315022 478000 util/stop/stopper.go:562 quiescing
I201029 05:49:56.500458 478179 kv/kvclient/kvcoord/transport_race.go:108 transport race promotion: ran 20 iterations on up to 8060 requests
E201029 05:49:56.506940 478009 jobs/registry.go:628 error expiring job sessions: expire-sessions: context canceled
W201029 05:49:56.475579 478008 sql/sqlliveness/slinstance/slinstance.go:181 [n1] exiting heartbeat loop
W201029 05:49:56.568919 478009 kv/txn.go:618 failure aborting transaction: node unavailable; try another peer; abort caused by: context canceled
E201029 05:49:56.625473 478009 jobs/registry.go:633 failed to serve pause and cancel requests: context canceled
W201029 05:49:56.625639 478009 jobs/registry.go:676 canceling all adopted jobs due to stopper quiescing
W201029 05:49:56.686637 478011 kv/txn.go:618 failure aborting transaction: node unavailable; try another peer; abort caused by: could not query jobs table: claim-jobs: context canceled
E201029 05:49:56.687382 478011 jobs/registry.go:643 error claiming jobs: could not query jobs table: claim-jobs: context canceled
W201029 05:49:56.705983 481553 kv/txn.go:618 [n1,job=602538332238446593] failure aborting transaction: node unavailable; try another peer; abort caused by: job-update: failed to send RPC: sending to all replicas failed; last error: node unavailable; try another peer
E201029 05:49:56.690347 477964 kv/kvserver/reports/reporter.go:141 [n1,replication-reporter] failed to generate replication reports: failed to save locality report: get-previous-timestamp: failed to send RPC: sending to all replicas failed; last error: node unavailable; try another peer
I201029 05:49:56.505039 478202 kv/kvserver/queue.go:582 [n1,s1] rate limited in MaybeAdd (gc): node unavailable; try another peer
I201029 05:49:56.773607 478202 kv/kvserver/queue.go:582 [n1,s1] rate limited in MaybeAdd (merge): node unavailable; try another peer
I201029 05:49:56.774094 478202 kv/kvserver/queue.go:582 [n1,s1] rate limited in MaybeAdd (split): node unavailable; try another peer
I201029 05:49:56.774320 478202 kv/kvserver/queue.go:582 [n1,s1] rate limited in MaybeAdd (replicate): node unavailable; try another peer
I201029 05:49:56.774531 478202 kv/kvserver/queue.go:582 [n1,s1] rate limited in MaybeAdd (replicaGC): node unavailable; try another peer
I201029 05:49:56.774740 478202 kv/kvserver/queue.go:582 [n1,s1] rate limited in MaybeAdd (raftlog): node unavailable; try another peer
I201029 05:49:56.775011 478202 kv/kvserver/queue.go:582 [n1,s1] rate limited in MaybeAdd (raftsnapshot): node unavailable; try another peer
I201029 05:49:56.775307 478202 kv/kvserver/queue.go:582 [n1,s1] rate limited in MaybeAdd (consistencyChecker): node unavailable; try another peer
I201029 05:49:56.775559 478202 kv/kvserver/queue.go:582 [n1,s1] rate limited in MaybeAdd (timeSeriesMaintenance): node unavailable; try another peer
E201029 05:49:56.786350 481543 jobs/adopt.go:256 [n1] job 602538332238446593: adoption completed with error job 602538332238446593: node liveness error: restarting in background
W201029 05:49:56.832131 482347 sql/catalog/lease/lease.go:295 error releasing lease &{'5' '\x01' {{%!q(uint64=396727000) %!q(int64=63739547530) %!q(*time.Location=<nil>)}}}: lease-release: failed to send RPC: sending to all replicas failed; last error: node unavailable; try another peer
W201029 05:49:56.822858 482405 sql/catalog/lease/lease.go:295 error releasing lease &{'\x17' '\x01' {{%!q(uint64=73482000) %!q(int64=63739547550) %!q(*time.Location=<nil>)}}}: lease-release: failed to send RPC: sending to all replicas failed; last error: node unavailable; try another peer
W201029 05:49:56.832173 482349 sql/catalog/lease/lease.go:295 error releasing lease &{'4' '\x01' {{%!q(uint64=862217000) %!q(int64=63739547539) %!q(*time.Location=<nil>)}}}: lease-release: failed to send RPC: sending to all replicas failed; last error: node unavailable; try another peer
E201029 05:49:56.927694 478011 jobs/registry.go:653 error processing claimed jobs: could query for claimed jobs: select-running/get-claimed-jobs: context canceled
--- FAIL: TestScheduleBackupRecoversFromClusterDown/reschedule (100.63s)
create_scheduled_backup_test.go:113: condition failed to evaluate within 45s: sql: no rows in result set
goroutine 481196 [running]:
runtime/debug.Stack(0xc06f8ebbe0, 0x80f16e0, 0xc0000fe940)
/usr/local/go/src/runtime/debug/stack.go:24 +0xab
github.com/cockroachdb/cockroach/pkg/testutils.SucceedsSoon(0x83048c0, 0xc0625d3900, 0xc06f8ebbe0)
/go/src/github.com/cockroachdb/cockroach/pkg/testutils/soon.go:36 +0x87
github.com/cockroachdb/cockroach/pkg/ccl/backupccl.(*testHelper).waitForSuccessfulScheduledJob(0xc0be421400, 0xc0625d3900, 0x85ca56729218001)
/go/src/github.com/cockroachdb/cockroach/pkg/ccl/backupccl/create_scheduled_backup_test.go:113 +0x12e
github.com/cockroachdb/cockroach/pkg/ccl/backupccl.TestScheduleBackupRecoversFromClusterDown.func4(0xc0625d3900, 0x61e773b, 0xa, 0x0, 0x0, 0x0)
/go/src/github.com/cockroachdb/cockroach/pkg/ccl/backupccl/create_scheduled_backup_test.go:703 +0x4b2
github.com/cockroachdb/cockroach/pkg/ccl/backupccl.TestScheduleBackupRecoversFromClusterDown.func9(0xc0625d3900)
/go/src/github.com/cockroachdb/cockroach/pkg/ccl/backupccl/create_scheduled_backup_test.go:789 +0xba
testing.tRunner(0xc0625d3900, 0xc078d733c0)
/usr/local/go/src/testing/testing.go:909 +0x19a
created by testing.(*T).Run
/usr/local/go/src/testing/testing.go:960 +0x652
Parameters: - TAGS= - GOFLAGS=-race -parallel=2 ``` make stressrace TESTS=TestScheduleBackupRecoversFromClusterDown PKG=./pkg/ccl/backupccl TESTTIMEOUT=5m STRESSFLAGS='-timeout 5m' 2>&1 ``` [See this test on roachdash](https://roachdash.crdb.dev/?filter=status%3Aopen+t%3A.%2ATestScheduleBackupRecoversFromClusterDown.%2A&sort=title&restgroup=false&display=lastcommented+project) powered by [pkg/cmd/internal/issues](https://github.com/cockroachdb/cockroach/tree/master/pkg/cmd/internal/issues)
Working on it.
(ccl/backupccl).TestScheduleBackupRecoversFromClusterDown failed on master@9ec0151e0079c51afeec2f3f75a4747ce80a615b:
Fatal error:
F201102 06:49:57.244259 512468 sqlmigrations/migrations.go:667 [n1] not enough time left on migration lease, terminating for safety
Stack:
goroutine 512468 [running]:
github.com/cockroachdb/cockroach/pkg/util/log.getStacks(0xb953001, 0x0, 0x0, 0x61ec2cc)
/go/src/github.com/cockroachdb/cockroach/pkg/util/log/get_stacks.go:25 +0xc6
github.com/cockroachdb/cockroach/pkg/util/log.(*loggerT).outputLogEntry(0xaf96be0, 0x4, 0x16439e2a67f9ac4e, 0x7d1d4, 0xa9c9627, 0x1b, 0x29b, 0xc0ceb52440, 0x3f, 0xc0d35c2bb8, ...)
/go/src/github.com/cockroachdb/cockroach/pkg/util/log/clog.go:243 +0xb31
github.com/cockroachdb/cockroach/pkg/util/log.addStructured(0x81a7000, 0xc05a331ad0, 0x4, 0x2, 0x62d8a20, 0x3f, 0x0, 0x0, 0x0)
/go/src/github.com/cockroachdb/cockroach/pkg/util/log/structured.go:52 +0x19a
github.com/cockroachdb/cockroach/pkg/util/log.logDepth(0x81a7000, 0xc05a331ad0, 0x1, 0xc000000004, 0x62d8a20, 0x3f, 0x0, 0x0, 0x0)
/go/src/github.com/cockroachdb/cockroach/pkg/util/log/log.go:43 +0x9a
github.com/cockroachdb/cockroach/pkg/util/log.Fatal(...)
/go/src/github.com/cockroachdb/cockroach/pkg/util/log/log.go:172
github.com/cockroachdb/cockroach/pkg/sqlmigrations.(*Manager).EnsureMigrations.func2(0x81a7000, 0xc05a331ad0)
/go/src/github.com/cockroachdb/cockroach/pkg/sqlmigrations/migrations.go:667 +0x347
github.com/cockroachdb/cockroach/pkg/util/stop.(*Stopper).RunAsyncTask.func1(0xc07c849400, 0x81a7000, 0xc05a331ad0, 0xc05ac1e9f0, 0x29, 0x0, 0xc040b8f120)
/go/src/github.com/cockroachdb/cockroach/pkg/util/stop/stopper.go:346 +0x156
created by github.com/cockroachdb/cockroach/pkg/util/stop.(*Stopper).RunAsyncTask
/go/src/github.com/cockroachdb/cockroach/pkg/util/stop/stopper.go:341 +0x135
```
An inability to maintain liveness will prevent a node from participating in a
cluster. If this problem persists, it may be a sign of resource starvation or
of network connectivity problems. For help troubleshooting, visit:
https://www.cockroachlabs.com/docs/stable/cluster-setup-troubleshooting.html#node-liveness-issues
E201102 06:49:55.169812 512538 kv/kvserver/queue.go:1087 [n1,consistencyChecker,s1,r16/1:/Table/2{0-1}] operation "consistencyChecker queue process replica 16" timed out after 1m0s
I201102 06:49:55.189940 512285 kv/kvserver/store.go:2618 [n1,s1]
__level_____count____size___score______in__ingest(sz_cnt)____move(sz_cnt)___write(sz_cnt)____read___r-amp___w-amp
WAL 2 306 K - 304 K - - - - 306 K - - - 1.0
0 0 0 B 0.00 0 B 0 B 0 0 B 0 0 B 0 0 B 0 0.0
1 0 0 B 0.00 0 B 0 B 0 0 B 0 0 B 0 0 B 0 0.0
2 0 0 B 0.00 0 B 0 B 0 0 B 0 0 B 0 0 B 0 0.0
3 0 0 B 0.00 0 B 0 B 0 0 B 0 0 B 0 0 B 0 0.0
4 0 0 B 0.00 0 B 0 B 0 0 B 0 0 B 0 0 B 0 0.0
5 0 0 B 0.00 0 B 0 B 0 0 B 0 0 B 0 0 B 0 0.0
6 0 0 B - 0 B 0 B 0 0 B 0 0 B 0 0 B 0 0.0
total 0 0 B - 306 K 0 B 0 0 B 0 306 K 0 0 B 0 1.0
flush 0
compact 0 0 B 0 B (size == estimated-debt, in = in-progress-bytes)
memtbl 2 768 K
zmemtbl 0 0 B
ztbl 0 0 B
bcache 0 0 B 0.0% (score == hit-rate)
tcache 0 0 B 0.0% (score == hit-rate)
titers 0
filter - - 0.0% (score == utility)
I201102 06:49:55.237873 512329 kv/kvserver/replica_range_lease.go:1083 [n1,s1,r2/1:/System/NodeLiveness{-Max}] slow lease acquisition finished after 11m45.053605946s with error
Parameters: - TAGS= - GOFLAGS=-race -parallel=2 ``` make stressrace TESTS=TestScheduleBackupRecoversFromClusterDown PKG=./pkg/ccl/backupccl TESTTIMEOUT=5m STRESSFLAGS='-timeout 5m' 2>&1 ``` [See this test on roachdash](https://roachdash.crdb.dev/?filter=status%3Aopen+t%3A.%2ATestScheduleBackupRecoversFromClusterDown.%2A&sort=title&restgroup=false&display=lastcommented+project) powered by [pkg/cmd/internal/issues](https://github.com/cockroachdb/cockroach/tree/master/pkg/cmd/internal/issues)
(ccl/backupccl).TestScheduleBackupRecoversFromClusterDown failed on master@df21e5065a79d1dd5b0fd76018724dd26660beb2:
Fatal error:
F201111 06:39:46.312768 510181 sqlmigrations/migrations.go:667 [n1] not enough time left on migration lease, terminating for safety
Stack:
goroutine 510181 [running]:
github.com/cockroachdb/cockroach/pkg/util/log.getStacks(0xbc74f01, 0x0, 0x0, 0x6404598)
/go/src/github.com/cockroachdb/cockroach/pkg/util/log/get_stacks.go:25 +0xc6
github.com/cockroachdb/cockroach/pkg/util/log.(*loggerT).outputLogEntry(0xb2b8a20, 0x4, 0x164660d54565ec16, 0x7c8e5, 0xacea874, 0x1b, 0x29b, 0xc076875740, 0x3f, 0xc08b2f3e50, ...)
/go/src/github.com/cockroachdb/cockroach/pkg/util/log/clog.go:243 +0xb31
github.com/cockroachdb/cockroach/pkg/util/log.addStructured(0x8424000, 0xc06726d3b0, 0x4, 0x2, 0x64f189a, 0x3f, 0x0, 0x0, 0x0)
/go/src/github.com/cockroachdb/cockroach/pkg/util/log/structured.go:52 +0x19a
github.com/cockroachdb/cockroach/pkg/util/log.logDepth(0x8424000, 0xc06726d3b0, 0x1, 0xc000000004, 0x64f189a, 0x3f, 0x0, 0x0, 0x0)
/go/src/github.com/cockroachdb/cockroach/pkg/util/log/log.go:43 +0x9a
github.com/cockroachdb/cockroach/pkg/util/log.Fatal(...)
/go/src/github.com/cockroachdb/cockroach/pkg/util/log/log.go:172
github.com/cockroachdb/cockroach/pkg/sqlmigrations.(*Manager).EnsureMigrations.func2(0x8424000, 0xc06726d3b0)
/go/src/github.com/cockroachdb/cockroach/pkg/sqlmigrations/migrations.go:667 +0x347
github.com/cockroachdb/cockroach/pkg/util/stop.(*Stopper).RunAsyncTask.func1(0xc079d1d220, 0x8424000, 0xc06726d3b0, 0xc08c0e6ba0, 0x29, 0x0, 0xc080900bc0)
/go/src/github.com/cockroachdb/cockroach/pkg/util/stop/stopper.go:346 +0x156
created by github.com/cockroachdb/cockroach/pkg/util/stop.(*Stopper).RunAsyncTask
/go/src/github.com/cockroachdb/cockroach/pkg/util/stop/stopper.go:341 +0x135
```
| runtime.doInit
| /usr/local/go/src/runtime/proc.go:5223
| runtime.doInit
| /usr/local/go/src/runtime/proc.go:5223
| runtime.doInit
| /usr/local/go/src/runtime/proc.go:5223
| runtime.main
| /usr/local/go/src/runtime/proc.go:190
| runtime.goexit
| /usr/local/go/src/runtime/asm_amd64.s:1357
Wraps: (2) not live
Error types: (1) *withstack.withStack (2) *errutil.leafError
I201111 06:39:45.283786 509912 server/status/runtime.go:522 [n1] runtime stats: 25 GiB RSS, 233 goroutines, 2.0 GiB/1.0 GiB/3.6 GiB GO alloc/idle/total(stale), 22 MiB/62 MiB CGO alloc/total, 0.6 CGO/sec, 3606.7/605.7 %(u/s)time, 8.7 %gc (1x), 208 B/208 B (r/w)net
I201111 06:39:44.932218 510613 kv/kvserver/replica_write.go:182 [n1,s1,r2/1:/System/NodeLiveness{-Max}] slow command RequestLease [/System/NodeLiveness,/Min) finished after 205.66s with error
Parameters: - TAGS= - GOFLAGS=-race -parallel=2 ``` make stressrace TESTS=TestScheduleBackupRecoversFromClusterDown PKG=./pkg/ccl/backupccl TESTTIMEOUT=5m STRESSFLAGS='-timeout 5m' 2>&1 ``` [See this test on roachdash](https://roachdash.crdb.dev/?filter=status%3Aopen+t%3A.%2ATestScheduleBackupRecoversFromClusterDown.%2A&sort=title&restgroup=false&display=lastcommented+project) powered by [pkg/cmd/internal/issues](https://github.com/cockroachdb/cockroach/tree/master/pkg/cmd/internal/issues)
ccl/backupccl.TestScheduleBackupRecoversFromClusterDown failed with artifacts on master @ 2f051c95ebb991e18134d2454807f992b05e4929:
=== RUN TestScheduleBackupRecoversFromClusterDown/retry
See also: [How To Investigate a Go Test Failure \(internal\)](https://cockroachlabs.atlassian.net/l/c/HgfXfJgM) Parameters in this failure: - TAGS=bazel,gss
ccl/backupccl.TestScheduleBackupRecoversFromClusterDown failed with artifacts on master @ 29d21029fa8e3936c1981ecaacd2b882a4bbb1cc:
[signal SIGSEGV: segmentation violation code=0x1 addr=0x78 pc=0x39f327c]
goroutine 38375 [running]:
github.com/cockroachdb/cockroach/pkg/util/stop.(*Stopper).recover(0x4003aee6a8?, {0x6867890, 0x400679ad80})
github.com/cockroachdb/cockroach/pkg/util/stop/stopper.go:229 +0x68
panic({0x4a64b80, 0x96967d0})
GOROOT/src/runtime/panic.go:884 +0x20c
github.com/cockroachdb/cockroach/pkg/jobs.(*ScheduledJob).ExecutorType(...)
github.com/cockroachdb/cockroach/pkg/jobs/scheduled_job.go:173
github.com/cockroachdb/cockroach/pkg/jobs/metricspoller.processSchedulePTSRecord({0x6867890?, 0x4003193fb0?}, 0x4003aee968?, {{0x400c32e600, 0x10, 0x10}, {0x174c7a3ad4053bc8, 0x0, 0x0}, 0x0, ...}, ...)
github.com/cockroachdb/cockroach/pkg/jobs/metricspoller/job_statistics.go:227 +0x9c
github.com/cockroachdb/cockroach/pkg/jobs/metricspoller.manageProtectedTimestamps.func1({0x6867890, 0x4003193fb0}, {0x68be520, 0x4004a6f860})
github.com/cockroachdb/cockroach/pkg/jobs/metricspoller/job_statistics.go:124 +0x268
github.com/cockroachdb/cockroach/pkg/sql.(*InternalDB).Txn.func1({0x6867890?, 0x4003193fb0?}, 0x68ce400?)
github.com/cockroachdb/cockroach/pkg/sql/internal.go:1406 +0x38
github.com/cockroachdb/cockroach/pkg/sql.(*InternalDB).txn.func4({0x6867890, 0x4003193fb0}, 0x4004bb7ef0)
github.com/cockroachdb/cockroach/pkg/sql/internal.go:1484 +0x2ac
github.com/cockroachdb/cockroach/pkg/kv.runTxn.func1({0x6867890?, 0x4003193fb0?}, 0x4003aeedb8?)
github.com/cockroachdb/cockroach/pkg/kv/db.go:965 +0x2c
github.com/cockroachdb/cockroach/pkg/kv.(*Txn).exec(0x4004bb7ef0, {0x6867890, 0x4003193fb0}, 0x4003aef0e8)
github.com/cockroachdb/cockroach/pkg/kv/txn.go:928 +0x70
github.com/cockroachdb/cockroach/pkg/kv.runTxn({0x6867890, 0x4003193fb0}, 0x4003193fb0?, 0x45f4ec?)
github.com/cockroachdb/cockroach/pkg/kv/db.go:964 +0x58
github.com/cockroachdb/cockroach/pkg/kv.(*DB).TxnWithAdmissionControl(0xffffb6643d28?, {0x6867890, 0x4003193fb0}, 0xb9af640?, 0x40?, 0x0?, 0x0?)
github.com/cockroachdb/cockroach/pkg/kv/db.go:927 +0xa8
github.com/cockroachdb/cockroach/pkg/kv.(*DB).Txn(0x38?, {0x6867890?, 0x4003193fb0?}, 0x589400?)
github.com/cockroachdb/cockroach/pkg/kv/db.go:902 +0x34
github.com/cockroachdb/cockroach/pkg/sql.(*InternalDB).txn(0x40017e9dd0, {0x6867890, 0x4003193fb0}, 0x400950e400, {0x0, 0x0, 0x10000000009d01?})
github.com/cockroachdb/cockroach/pkg/sql/internal.go:1472 +0x20c
github.com/cockroachdb/cockroach/pkg/sql.(*InternalDB).Txn(0x4a5ea60?, {0x6867890, 0x4003193fb0}, 0x4000ddbc20, {0x0, 0x0, 0x0})
github.com/cockroachdb/cockroach/pkg/sql/internal.go:1407 +0x94
github.com/cockroachdb/cockroach/pkg/jobs/metricspoller.manageProtectedTimestamps({0x6867890, 0x4003193fb0}, {0x68b9990, 0x4008a3c218})
github.com/cockroachdb/cockroach/pkg/jobs/metricspoller/job_statistics.go:105 +0xec
github.com/cockroachdb/cockroach/pkg/jobs/metricspoller.(*metricsPoller).Resume.func1({0x52e00e2?, 0x4003aef448?}, 0x5689c48)
github.com/cockroachdb/cockroach/pkg/jobs/metricspoller/poller.go:64 +0xb8
github.com/cockroachdb/cockroach/pkg/jobs/metricspoller.(*metricsPoller).Resume(0x4004a0e1a0?, {0x6867890, 0x4003193fb0}, {0x4ffa180?, 0x4008a3c218})
github.com/cockroachdb/cockroach/pkg/jobs/metricspoller/poller.go:75 +0x238
github.com/cockroachdb/cockroach/pkg/jobs.(*Registry).stepThroughStateMachine.func2(0x4004a0e1a0?, 0x0?, {0x683ac38?, 0x4008a3c200?}, {0x6867890?, 0x400b0daa80?}, {0x4ffa180?, 0x4008a3c218?}, 0x4003aef860)
github.com/cockroachdb/cockroach/pkg/jobs/registry.go:1529 +0xf4
github.com/cockroachdb/cockroach/pkg/jobs.(*Registry).stepThroughStateMachine(0x4007974d80, {0x6867890, 0x400679afc0}, {0x4ffa180, 0x4008a3c218}, {0x683ac38, 0x4008a3c200}, 0x4004a0e1a0, {0x400bf37528, 0x7}, ...)
github.com/cockroachdb/cockroach/pkg/jobs/registry.go:1530 +0x6a0
github.com/cockroachdb/cockroach/pkg/jobs.(*Registry).runJob(0x4007974d80, {0x6867890, 0x400679ad50}, {0x683ac38, 0x4008a3c200}, 0x4004a0e1a0, {0x400bf37528, 0x7}, {0x400bf37700, 0x7})
github.com/cockroachdb/cockroach/pkg/jobs/adopt.go:417 +0x500
github.com/cockroachdb/cockroach/pkg/jobs.(*Registry).resumeJob.func1({0x4004b06000?, 0x4c09f4?})
github.com/cockroachdb/cockroach/pkg/jobs/adopt.go:335 +0xc0
github.com/cockroachdb/cockroach/pkg/util/stop.(*Stopper).RunAsyncTaskEx.func2()
github.com/cockroachdb/cockroach/pkg/util/stop/stopper.go:470 +0x128
created by github.com/cockroachdb/cockroach/pkg/util/stop.(*Stopper).RunAsyncTaskEx
github.com/cockroachdb/cockroach/pkg/util/stop/stopper.go:461 +0x358
=== RUN TestScheduleBackupRecoversFromClusterDown/retry
See also: [How To Investigate a Go Test Failure \(internal\)](https://cockroachlabs.atlassian.net/l/c/HgfXfJgM)
ccl/backupccl.TestScheduleBackupRecoversFromClusterDown failed with artifacts on master @ 9b43dc50d468171bc9919dc684b7274b63e13da2:
Fatal error:
panic: runtime error: invalid memory address or nil pointer dereference [recovered]
panic: runtime error: invalid memory address or nil pointer dereference
[signal SIGSEGV: segmentation violation code=0x1 addr=0x78 pc=0x40ca27b]
Stack:
goroutine 944694 [running]:
github.com/cockroachdb/cockroach/pkg/util/stop.(*Stopper).recover(0x1e8ee8a?, {0x70ace30, 0xc02272d6b0})
github.com/cockroachdb/cockroach/pkg/util/stop/stopper.go:229 +0x6a
panic({0x52ab880, 0x9f9f7d0})
GOROOT/src/runtime/panic.go:884 +0x212
github.com/cockroachdb/cockroach/pkg/jobs.(*ScheduledJob).ExecutorType(...)
github.com/cockroachdb/cockroach/pkg/jobs/scheduled_job.go:173
github.com/cockroachdb/cockroach/pkg/jobs/metricspoller.processSchedulePTSRecord({0x70ace30?, 0xc01089a570?}, 0xc002bf29b8?, {{0xc021899970, 0x10, 0x10}, {0x174c8100c9c8e9f8, 0x0, 0x0}, 0x0, ...}, ...)
github.com/cockroachdb/cockroach/pkg/jobs/metricspoller/job_statistics.go:227 +0x9b
github.com/cockroachdb/cockroach/pkg/jobs/metricspoller.manageProtectedTimestamps.func1({0x70ace30, 0xc01089a570}, {0x7103820, 0xc0127d3ea0})
github.com/cockroachdb/cockroach/pkg/jobs/metricspoller/job_statistics.go:124 +0x345
github.com/cockroachdb/cockroach/pkg/sql.(*InternalDB).Txn.func1({0x70ace30?, 0xc01089a570?}, 0x7113798?)
github.com/cockroachdb/cockroach/pkg/sql/internal.go:1406 +0x31
github.com/cockroachdb/cockroach/pkg/sql.(*InternalDB).txn.func4({0x70ace30, 0xc01089a570}, 0xc020f9b550)
github.com/cockroachdb/cockroach/pkg/sql/internal.go:1484 +0x3b4
github.com/cockroachdb/cockroach/pkg/kv.runTxn.func1({0x70ace30?, 0xc01089a570?}, 0xc010a06ea0?)
github.com/cockroachdb/cockroach/pkg/kv/db.go:965 +0x27
github.com/cockroachdb/cockroach/pkg/kv.(*Txn).exec(0xc020f9b550, {0x70ace30, 0xc01089a570}, 0xc002bf3110)
github.com/cockroachdb/cockroach/pkg/kv/txn.go:928 +0xae
github.com/cockroachdb/cockroach/pkg/kv.runTxn({0x70ace30, 0xc01089a570}, 0xc01089a570?, 0x3b9aca00?)
github.com/cockroachdb/cockroach/pkg/kv/db.go:964 +0x6b
github.com/cockroachdb/cockroach/pkg/kv.(*DB).TxnWithAdmissionControl(0x7fd352653108?, {0x70ace30, 0xc01089a570}, 0x20c514c0?, 0xc0?, 0x0?, 0x0?)
github.com/cockroachdb/cockroach/pkg/kv/db.go:927 +0xa7
github.com/cockroachdb/cockroach/pkg/kv.(*DB).Txn(0x463207?, {0x70ace30?, 0xc01089a570?}, 0x1?)
github.com/cockroachdb/cockroach/pkg/kv/db.go:902 +0x2d
github.com/cockroachdb/cockroach/pkg/sql.(*InternalDB).txn(0xc006aa1830, {0x70ace30, 0xc01089a570}, 0xc000da07e0, {0x0, 0x0, 0x56921e0?})
github.com/cockroachdb/cockroach/pkg/sql/internal.go:1472 +0x2a7
github.com/cockroachdb/cockroach/pkg/sql.(*InternalDB).Txn(0x52a5760?, {0x70ace30, 0xc01089a570}, 0xc009b7fa80, {0x0, 0x0, 0x0})
github.com/cockroachdb/cockroach/pkg/sql/internal.go:1407 +0x9f
github.com/cockroachdb/cockroach/pkg/jobs/metricspoller.manageProtectedTimestamps({0x70ace30, 0xc01089a570}, {0x70fec10, 0xc0004b46a0})
github.com/cockroachdb/cockroach/pkg/jobs/metricspoller/job_statistics.go:105 +0xf0
github.com/cockroachdb/cockroach/pkg/jobs/metricspoller.(*metricsPoller).Resume.func1({0x5b26d67?, 0xc001c9b460?}, 0x5ed0698)
github.com/cockroachdb/cockroach/pkg/jobs/metricspoller/poller.go:64 +0xb1
github.com/cockroachdb/cockroach/pkg/jobs/metricspoller.(*metricsPoller).Resume(0xc01439f1e0?, {0x70ace30, 0xc01089a570}, {0x5840f80?, 0xc0004b46a0})
github.com/cockroachdb/cockroach/pkg/jobs/metricspoller/poller.go:75 +0x31a
github.com/cockroachdb/cockroach/pkg/jobs.(*Registry).stepThroughStateMachine.func2(0xc01439f1e0?, 0x0?, {0x7080468?, 0xc0004b4648?}, {0x70ace30?, 0xc012c16900?}, {0x5840f80?, 0xc0004b46a0?}, 0xc002bf3870)
github.com/cockroachdb/cockroach/pkg/jobs/registry.go:1529 +0xcc
github.com/cockroachdb/cockroach/pkg/jobs.(*Registry).stepThroughStateMachine(0xc0055f6900, {0x70ace30, 0xc02272d920}, {0x5840f80, 0xc0004b46a0}, {0x7080468, 0xc0004b4648}, 0xc01439f1e0, {0xc017b31738, 0x7}, ...)
github.com/cockroachdb/cockroach/pkg/jobs/registry.go:1530 +0x8ed
github.com/cockroachdb/cockroach/pkg/jobs.(*Registry).runJob(0xc0055f6900, {0x70ace30, 0xc02272d680}, {0x7080468, 0xc0004b4648}, 0xc01439f1e0, {0xc017b31738, 0x7}, {0xc017b318d0, 0x7})
github.com/cockroachdb/cockroach/pkg/jobs/adopt.go:417 +0x5d7
github.com/cockroachdb/cockroach/pkg/jobs.(*Registry).resumeJob.func1({0xc0048f9800?, 0xc02272cd80?})
github.com/cockroachdb/cockroach/pkg/jobs/adopt.go:335 +0xfa
github.com/cockroachdb/cockroach/pkg/util/stop.(*Stopper).RunAsyncTaskEx.func2()
github.com/cockroachdb/cockroach/pkg/util/stop/stopper.go:470 +0x146
created by github.com/cockroachdb/cockroach/pkg/util/stop.(*Stopper).RunAsyncTaskEx
github.com/cockroachdb/cockroach/pkg/util/stop/stopper.go:461 +0x43b
``` * | GOROOT/src/runtime/panic.go:884 * | runtime.panicmem * | GOROOT/src/runtime/panic.go:260 * | runtime.sigpanic * | GOROOT/src/runtime/signal_unix.go:835 * | github.com/cockroachdb/cockroach/pkg/jobs.(*ScheduledJob).ExecutorType * | github.com/cockroachdb/cockroach/pkg/jobs/scheduled_job.go:173 * | github.com/cockroachdb/cockroach/pkg/jobs/metricspoller.processSchedulePTSRecord * | github.com/cockroachdb/cockroach/pkg/jobs/metricspoller/job_statistics.go:227 * | github.com/cockroachdb/cockroach/pkg/jobs/metricspoller.manageProtectedTimestamps.func1 * | github.com/cockroachdb/cockroach/pkg/jobs/metricspoller/job_statistics.go:124 * | github.com/cockroachdb/cockroach/pkg/sql.(*InternalDB).Txn.func1 * | github.com/cockroachdb/cockroach/pkg/sql/internal.go:1406 * | github.com/cockroachdb/cockroach/pkg/sql.(*InternalDB).txn.func4 * | github.com/cockroachdb/cockroach/pkg/sql/internal.go:1484 * | github.com/cockroachdb/cockroach/pkg/kv.runTxn.func1 * | github.com/cockroachdb/cockroach/pkg/kv/db.go:965 * | github.com/cockroachdb/cockroach/pkg/kv.(*Txn).exec * | github.com/cockroachdb/cockroach/pkg/kv/txn.go:928 * | github.com/cockroachdb/cockroach/pkg/kv.runTxn * | github.com/cockroachdb/cockroach/pkg/kv/db.go:964 * | github.com/cockroachdb/cockroach/pkg/kv.(*DB).TxnWithAdmissionControl * | github.com/cockroachdb/cockroach/pkg/kv/db.go:927 * | github.com/cockroachdb/cockroach/pkg/kv.(*DB).Txn * | github.com/cockroachdb/cockroach/pkg/kv/db.go:902 * | github.com/cockroachdb/cockroach/pkg/sql.(*InternalDB).txn * | github.com/cockroachdb/cockroach/pkg/sql/internal.go:1472 * | github.com/cockroachdb/cockroach/pkg/sql.(*InternalDB).Txn * | github.com/cockroachdb/cockroach/pkg/sql/internal.go:1407 * | github.com/cockroachdb/cockroach/pkg/jobs/metricspoller.manageProtectedTimestamps * | github.com/cockroachdb/cockroach/pkg/jobs/metricspoller/job_statistics.go:105 * | github.com/cockroachdb/cockroach/pkg/jobs/metricspoller.(*metricsPoller).Resume.func1 * | github.com/cockroachdb/cockroach/pkg/jobs/metricspoller/poller.go:64 * | github.com/cockroachdb/cockroach/pkg/jobs/metricspoller.(*metricsPoller).Resume * | github.com/cockroachdb/cockroach/pkg/jobs/metricspoller/poller.go:75 * | github.com/cockroachdb/cockroach/pkg/jobs.(*Registry).stepThroughStateMachine.func2 * | github.com/cockroachdb/cockroach/pkg/jobs/registry.go:1529 * | github.com/cockroachdb/cockroach/pkg/jobs.(*Registry).stepThroughStateMachine * | github.com/cockroachdb/cockroach/pkg/jobs/registry.go:1530 * | github.com/cockroachdb/cockroach/pkg/jobs.(*Registry).runJob * | github.com/cockroachdb/cockroach/pkg/jobs/adopt.go:417 * | github.com/cockroachdb/cockroach/pkg/jobs.(*Registry).resumeJob.func1 * | github.com/cockroachdb/cockroach/pkg/jobs/adopt.go:335 * | github.com/cockroachdb/cockroach/pkg/util/stop.(*Stopper).RunAsyncTaskEx.func2 * | github.com/cockroachdb/cockroach/pkg/util/stop/stopper.go:470 * | runtime.goexit * | GOROOT/src/runtime/asm_amd64.s:1594 * Wraps: (2) runtime error: invalid memory address or nil pointer dereference * Error types: (1) *withstack.withStack (2) runtime.errorString * ```
Parameters: TAGS=bazel,gss
See also: [How To Investigate a Go Test Failure \(internal\)](https://cockroachlabs.atlassian.net/l/c/HgfXfJgM)
cc @cockroachdb/disaster-recovery
(ccl/backupccl).TestScheduleBackupRecoversFromClusterDown failed on master@75234294d22cc9a6e12724b043f49e78f0ac92d2:
More
Parameters: - TAGS= - GOFLAGS=-race -parallel=2 ``` make stressrace TESTS=TestScheduleBackupRecoversFromClusterDown PKG=./pkg/ccl/backupccl TESTTIMEOUT=5m STRESSFLAGS='-timeout 5m' 2>&1 ``` [See this test on roachdash](https://roachdash.crdb.dev/?filter=status%3Aopen+t%3A.%2ATestScheduleBackupRecoversFromClusterDown.%2A&sort=title&restgroup=false&display=lastcommented+project) powered by [pkg/cmd/internal/issues](https://github.com/cockroachdb/cockroach/tree/master/pkg/cmd/internal/issues)
Jira issue: CRDB-3613