cockroachdb / cockroach

CockroachDB — the cloud native, distributed SQL database designed for high availability, effortless scale, and control over data placement.
https://www.cockroachlabs.com

roachtest: replicagc-changed-peers/restart=true failed #68269

Closed: cockroach-teamcity closed this issue 3 years ago

cockroach-teamcity commented 3 years ago

roachtest.replicagc-changed-peers/restart=true failed with artifacts on master @ fe1fb73ae989142193643db30dd4b1b6dd6fe7dd:

The test failed on branch=master, cloud=gce:
test artifacts and logs in: /home/agent/work/.go/src/github.com/cockroachdb/cockroach/artifacts/replicagc-changed-peers/restart=true/run_1
    replicagc.go:288,replicagc.go:102,replicagc.go:36,test_runner.go:777: ranges remained on n3 (according to meta2): map[5:{3,4,5} 6:{3,4,6} 7:{3,4,6} 9:{3,4,5} 12:{3,4,6} 32:{3,5,6} 34:{3,5,6} 39:{3,4,6} 40:{3,5,6} 52:{3,5,6} 59:{3,5,6} 62:{3,5,6} 69:{3,5,6} 85:{3,4,5} 93:{3,4,5} 106:{3,5,6} 113:{3,4,6} 114:{3,4,6} 115:{3,4,5} 117:{3,4,6} 123:{3,5,6} 127:{3,5,6} 136:{3,4,5} 141:{3,4,5} 142:{3,5,6} 144:{3,4,6}]
        (1) attached stack trace
          -- stack trace:
          | github.com/cockroachdb/cockroach/pkg/cmd/roachtest/tests.waitForZeroReplicasOnN3.func1
          |     /home/agent/work/.go/src/github.com/cockroachdb/cockroach/pkg/cmd/roachtest/tests/replicagc.go:286
          | github.com/cockroachdb/cockroach/pkg/util/retry.ForDuration
          |     /home/agent/work/.go/src/github.com/cockroachdb/cockroach/pkg/util/retry/retry.go:197
          | github.com/cockroachdb/cockroach/pkg/cmd/roachtest/tests.waitForZeroReplicasOnN3
          |     /home/agent/work/.go/src/github.com/cockroachdb/cockroach/pkg/cmd/roachtest/tests/replicagc.go:265
          | github.com/cockroachdb/cockroach/pkg/cmd/roachtest/tests.runReplicaGCChangedPeers
          |     /home/agent/work/.go/src/github.com/cockroachdb/cockroach/pkg/cmd/roachtest/tests/replicagc.go:102
          | github.com/cockroachdb/cockroach/pkg/cmd/roachtest/tests.registerReplicaGC.func1
          |     /home/agent/work/.go/src/github.com/cockroachdb/cockroach/pkg/cmd/roachtest/tests/replicagc.go:36
          | main.(*testRunner).runTest.func2
          |     /home/agent/work/.go/src/github.com/cockroachdb/cockroach/pkg/cmd/roachtest/test_runner.go:777
          | runtime.goexit
          |     /usr/local/go/src/runtime/asm_amd64.s:1371
        Wraps: (2) ranges remained on n3 (according to meta2): map[5:{3,4,5} 6:{3,4,6} 7:{3,4,6} 9:{3,4,5} 12:{3,4,6} 32:{3,5,6} 34:{3,5,6} 39:{3,4,6} 40:{3,5,6} 52:{3,5,6} 59:{3,5,6} 62:{3,5,6} 69:{3,5,6} 85:{3,4,5} 93:{3,4,5} 106:{3,5,6} 113:{3,4,6} 114:{3,4,6} 115:{3,4,5} 117:{3,4,6} 123:{3,5,6} 127:{3,5,6} 136:{3,4,5} 141:{3,4,5} 142:{3,5,6} 144:{3,4,6}]
        Error types: (1) *withstack.withStack (2) *errutil.leafError

    cluster.go:1245,context.go:89,cluster.go:1233,test_runner.go:867: dead node detection: /home/agent/work/.go/src/github.com/cockroachdb/cockroach/bin/roachprod monitor teamcity-3242835-1627625894-68-n6cpu4 --oneshot --ignore-empty-nodes: exit status 1 4: 11162
        1: 11098
        6: 10654
        2: 10558
        3: dead (exit status 137)
        5: 11242
        Error: UNCLASSIFIED_PROBLEM: 3: dead (exit status 137)
        (1) UNCLASSIFIED_PROBLEM
        Wraps: (2) attached stack trace
          -- stack trace:
          | main.glob..func14
          |     /home/agent/work/.go/src/github.com/cockroachdb/cockroach/pkg/cmd/roachprod/main.go:1171
          | main.wrap.func1
          |     /home/agent/work/.go/src/github.com/cockroachdb/cockroach/pkg/cmd/roachprod/main.go:279
          | github.com/spf13/cobra.(*Command).execute
          |     /home/agent/work/.go/src/github.com/cockroachdb/cockroach/vendor/github.com/spf13/cobra/command.go:856
          | github.com/spf13/cobra.(*Command).ExecuteC
          |     /home/agent/work/.go/src/github.com/cockroachdb/cockroach/vendor/github.com/spf13/cobra/command.go:960
          | github.com/spf13/cobra.(*Command).Execute
          |     /home/agent/work/.go/src/github.com/cockroachdb/cockroach/vendor/github.com/spf13/cobra/command.go:897
          | main.main
          |     /home/agent/work/.go/src/github.com/cockroachdb/cockroach/pkg/cmd/roachprod/main.go:2094
          | runtime.main
          |     /usr/local/go/src/runtime/proc.go:225
          | runtime.goexit
          |     /usr/local/go/src/runtime/asm_amd64.s:1371
        Wraps: (3) 3: dead (exit status 137)
        Error types: (1) errors.Unclassified (2) *withstack.withStack (3) *errutil.leafError
Reproduce

See: [roachtest README](https://github.com/cockroachdb/cockroach/tree/master/pkg/cmd/roachtest)

/cc @cockroachdb/kv-triage

This test on roachdash | Improve this report!

erikgrinaker commented 3 years ago

See #66944 for details.

erikgrinaker commented 3 years ago

There was an attempted fix for this in #67916, but it seems it didn't take.

cockroach-teamcity commented 3 years ago

roachtest.replicagc-changed-peers/restart=true failed with artifacts on master @ 701b177d8f4b81d8654dfb4090a2cd3cf82e63a7:

The test failed on branch=master, cloud=gce:
test artifacts and logs in: /home/agent/work/.go/src/github.com/cockroachdb/cockroach/artifacts/replicagc-changed-peers/restart=true/run_1
    replicagc.go:288,replicagc.go:102,replicagc.go:36,test_runner.go:777: ranges remained on n3 (according to meta2): map[1:{3,6,7} 4:{3,7,8} 5:{3,7,8} 7:{3,6,8} 8:{3,7,8} 17:{3,6,7} 35:{3,6,7} 49:{3,6,7} 52:{3,6,8} 56:{3,7,8} 63:{3,6,8} 66:{3,6,7} 74:{3,7,8} 77:{3,6,7} 79:{3,6,7} 90:{3,7,8} 104:{3,6,8} 106:{3,6,8} 115:{3,7,8} 117:{3,6,7} 121:{3,6,8} 138:{3,6,8} 151:{3,6,8} 154:{3,6,7} 155:{3,6,8}]
        (1) attached stack trace
          -- stack trace:
          | github.com/cockroachdb/cockroach/pkg/cmd/roachtest/tests.waitForZeroReplicasOnN3.func1
          |     /home/agent/work/.go/src/github.com/cockroachdb/cockroach/pkg/cmd/roachtest/tests/replicagc.go:286
          | github.com/cockroachdb/cockroach/pkg/util/retry.ForDuration
          |     /home/agent/work/.go/src/github.com/cockroachdb/cockroach/pkg/util/retry/retry.go:197
          | github.com/cockroachdb/cockroach/pkg/cmd/roachtest/tests.waitForZeroReplicasOnN3
          |     /home/agent/work/.go/src/github.com/cockroachdb/cockroach/pkg/cmd/roachtest/tests/replicagc.go:265
          | github.com/cockroachdb/cockroach/pkg/cmd/roachtest/tests.runReplicaGCChangedPeers
          |     /home/agent/work/.go/src/github.com/cockroachdb/cockroach/pkg/cmd/roachtest/tests/replicagc.go:102
          | github.com/cockroachdb/cockroach/pkg/cmd/roachtest/tests.registerReplicaGC.func1
          |     /home/agent/work/.go/src/github.com/cockroachdb/cockroach/pkg/cmd/roachtest/tests/replicagc.go:36
          | main.(*testRunner).runTest.func2
          |     /home/agent/work/.go/src/github.com/cockroachdb/cockroach/pkg/cmd/roachtest/test_runner.go:777
          | runtime.goexit
          |     /usr/local/go/src/runtime/asm_amd64.s:1371
        Wraps: (2) ranges remained on n3 (according to meta2): map[1:{3,6,7} 4:{3,7,8} 5:{3,7,8} 7:{3,6,8} 8:{3,7,8} 17:{3,6,7} 35:{3,6,7} 49:{3,6,7} 52:{3,6,8} 56:{3,7,8} 63:{3,6,8} 66:{3,6,7} 74:{3,7,8} 77:{3,6,7} 79:{3,6,7} 90:{3,7,8} 104:{3,6,8} 106:{3,6,8} 115:{3,7,8} 117:{3,6,7} 121:{3,6,8} 138:{3,6,8} 151:{3,6,8} 154:{3,6,7} 155:{3,6,8}]
        Error types: (1) *withstack.withStack (2) *errutil.leafError

    cluster.go:1245,context.go:89,cluster.go:1233,test_runner.go:867: dead node detection: /home/agent/work/.go/src/github.com/cockroachdb/cockroach/bin/roachprod monitor teamcity-3246869-1627712255-68-n6cpu4 --oneshot --ignore-empty-nodes: exit status 1 6: 10937
        4: 10773
        3: dead (exit status 137)
        1: 11722
        2: 10501
        5: 11157
        Error: UNCLASSIFIED_PROBLEM: 3: dead (exit status 137)
        (1) UNCLASSIFIED_PROBLEM
        Wraps: (2) attached stack trace
          -- stack trace:
          | main.glob..func14
          |     /home/agent/work/.go/src/github.com/cockroachdb/cockroach/pkg/cmd/roachprod/main.go:1171
          | main.wrap.func1
          |     /home/agent/work/.go/src/github.com/cockroachdb/cockroach/pkg/cmd/roachprod/main.go:279
          | github.com/spf13/cobra.(*Command).execute
          |     /home/agent/work/.go/src/github.com/cockroachdb/cockroach/vendor/github.com/spf13/cobra/command.go:856
          | github.com/spf13/cobra.(*Command).ExecuteC
          |     /home/agent/work/.go/src/github.com/cockroachdb/cockroach/vendor/github.com/spf13/cobra/command.go:960
          | github.com/spf13/cobra.(*Command).Execute
          |     /home/agent/work/.go/src/github.com/cockroachdb/cockroach/vendor/github.com/spf13/cobra/command.go:897
          | main.main
          |     /home/agent/work/.go/src/github.com/cockroachdb/cockroach/pkg/cmd/roachprod/main.go:2094
          | runtime.main
          |     /usr/local/go/src/runtime/proc.go:225
          | runtime.goexit
          |     /usr/local/go/src/runtime/asm_amd64.s:1371
        Wraps: (3) 3: dead (exit status 137)
        Error types: (1) errors.Unclassified (2) *withstack.withStack (3) *errutil.leafError
Reproduce

See: [roachtest README](https://github.com/cockroachdb/cockroach/tree/master/pkg/cmd/roachtest)

/cc @cockroachdb/kv-triage
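The dead node detection output reports `3: dead (exit status 137)`. Under the common shell convention, an exit status above 128 means the process was terminated by signal number `status - 128`, so 137 corresponds to signal 9 (SIGKILL on Linux). The log does not say what sent the signal. A small sketch of the arithmetic, with `signalFromExitStatus` as a hypothetical helper name:

```go
package main

import "fmt"

// signalFromExitStatus applies the shell convention that a process
// terminated by signal N is reported with exit status 128+N.
// It returns the signal number and true when the status is in that range.
func signalFromExitStatus(status int) (int, bool) {
	if status > 128 && status < 256 {
		return status - 128, true
	}
	return 0, false
}

func main() {
	if sig, ok := signalFromExitStatus(137); ok {
		fmt.Printf("exit status 137 -> signal %d\n", sig)
	}
}
```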


cockroach-teamcity commented 3 years ago

roachtest.replicagc-changed-peers/restart=true failed with artifacts on master @ 31af9e32a55a166166e9ba9c5327b7cd847ae236:

The test failed on branch=master, cloud=gce:
test artifacts and logs in: /home/agent/work/.go/src/github.com/cockroachdb/cockroach/artifacts/replicagc-changed-peers/restart=true/run_1
    replicagc.go:288,replicagc.go:102,replicagc.go:36,test_runner.go:777: ranges remained on n3 (according to meta2): map[17:{3,5,6} 23:{3,4,6} 37:{3,4,6} 43:{3,5,6} 54:{3,4,6} 60:{3,4,6} 62:{3,5,6} 81:{3,4,5} 97:{3,4,6} 101:{3,4,6} 111:{3,4,6} 118:{3,5,6} 120:{3,4,6} 131:{3,4,5} 133:{3,4,5} 135:{3,5,6}]
        (1) attached stack trace
          -- stack trace:
          | github.com/cockroachdb/cockroach/pkg/cmd/roachtest/tests.waitForZeroReplicasOnN3.func1
          |     /home/agent/work/.go/src/github.com/cockroachdb/cockroach/pkg/cmd/roachtest/tests/replicagc.go:286
          | github.com/cockroachdb/cockroach/pkg/util/retry.ForDuration
          |     /home/agent/work/.go/src/github.com/cockroachdb/cockroach/pkg/util/retry/retry.go:197
          | github.com/cockroachdb/cockroach/pkg/cmd/roachtest/tests.waitForZeroReplicasOnN3
          |     /home/agent/work/.go/src/github.com/cockroachdb/cockroach/pkg/cmd/roachtest/tests/replicagc.go:265
          | github.com/cockroachdb/cockroach/pkg/cmd/roachtest/tests.runReplicaGCChangedPeers
          |     /home/agent/work/.go/src/github.com/cockroachdb/cockroach/pkg/cmd/roachtest/tests/replicagc.go:102
          | github.com/cockroachdb/cockroach/pkg/cmd/roachtest/tests.registerReplicaGC.func1
          |     /home/agent/work/.go/src/github.com/cockroachdb/cockroach/pkg/cmd/roachtest/tests/replicagc.go:36
          | main.(*testRunner).runTest.func2
          |     /home/agent/work/.go/src/github.com/cockroachdb/cockroach/pkg/cmd/roachtest/test_runner.go:777
          | runtime.goexit
          |     /usr/local/go/src/runtime/asm_amd64.s:1371
        Wraps: (2) ranges remained on n3 (according to meta2): map[17:{3,5,6} 23:{3,4,6} 37:{3,4,6} 43:{3,5,6} 54:{3,4,6} 60:{3,4,6} 62:{3,5,6} 81:{3,4,5} 97:{3,4,6} 101:{3,4,6} 111:{3,4,6} 118:{3,5,6} 120:{3,4,6} 131:{3,4,5} 133:{3,4,5} 135:{3,5,6}]
        Error types: (1) *withstack.withStack (2) *errutil.leafError

    cluster.go:1245,context.go:89,cluster.go:1233,test_runner.go:867: dead node detection: /home/agent/work/.go/src/github.com/cockroachdb/cockroach/bin/roachprod monitor teamcity-3248647-1627798575-67-n6cpu4 --oneshot --ignore-empty-nodes: exit status 1 6: 10877
        5: 10647
        3: dead (exit status 137)
        4: 10700
        2: 11419
        1: 11263
        Error: UNCLASSIFIED_PROBLEM: 3: dead (exit status 137)
        (1) UNCLASSIFIED_PROBLEM
        Wraps: (2) attached stack trace
          -- stack trace:
          | main.glob..func14
          |     /home/agent/work/.go/src/github.com/cockroachdb/cockroach/pkg/cmd/roachprod/main.go:1171
          | main.wrap.func1
          |     /home/agent/work/.go/src/github.com/cockroachdb/cockroach/pkg/cmd/roachprod/main.go:279
          | github.com/spf13/cobra.(*Command).execute
          |     /home/agent/work/.go/src/github.com/cockroachdb/cockroach/vendor/github.com/spf13/cobra/command.go:856
          | github.com/spf13/cobra.(*Command).ExecuteC
          |     /home/agent/work/.go/src/github.com/cockroachdb/cockroach/vendor/github.com/spf13/cobra/command.go:960
          | github.com/spf13/cobra.(*Command).Execute
          |     /home/agent/work/.go/src/github.com/cockroachdb/cockroach/vendor/github.com/spf13/cobra/command.go:897
          | main.main
          |     /home/agent/work/.go/src/github.com/cockroachdb/cockroach/pkg/cmd/roachprod/main.go:2094
          | runtime.main
          |     /usr/local/go/src/runtime/proc.go:225
          | runtime.goexit
          |     /usr/local/go/src/runtime/asm_amd64.s:1371
        Wraps: (3) 3: dead (exit status 137)
        Error types: (1) errors.Unclassified (2) *withstack.withStack (3) *errutil.leafError
Reproduce

See: [roachtest README](https://github.com/cockroachdb/cockroach/tree/master/pkg/cmd/roachtest)

/cc @cockroachdb/kv-triage


cockroach-teamcity commented 3 years ago

roachtest.replicagc-changed-peers/restart=true failed with artifacts on master @ c995342ead51e08f8ed1155de4218d30a00d86d2:

The test failed on branch=master, cloud=gce:
test artifacts and logs in: /home/agent/work/.go/src/github.com/cockroachdb/cockroach/artifacts/replicagc-changed-peers/restart=true/run_1
    replicagc.go:288,replicagc.go:102,replicagc.go:36,test_runner.go:777: ranges remained on n3 (according to meta2): map[2:{3,4,5} 22:{3,4,5} 40:{3,5,6} 44:{3,4,5} 100:{3,5,6}]
        (1) attached stack trace
          -- stack trace:
          | github.com/cockroachdb/cockroach/pkg/cmd/roachtest/tests.waitForZeroReplicasOnN3.func1
          |     /home/agent/work/.go/src/github.com/cockroachdb/cockroach/pkg/cmd/roachtest/tests/replicagc.go:286
          | github.com/cockroachdb/cockroach/pkg/util/retry.ForDuration
          |     /home/agent/work/.go/src/github.com/cockroachdb/cockroach/pkg/util/retry/retry.go:197
          | github.com/cockroachdb/cockroach/pkg/cmd/roachtest/tests.waitForZeroReplicasOnN3
          |     /home/agent/work/.go/src/github.com/cockroachdb/cockroach/pkg/cmd/roachtest/tests/replicagc.go:265
          | github.com/cockroachdb/cockroach/pkg/cmd/roachtest/tests.runReplicaGCChangedPeers
          |     /home/agent/work/.go/src/github.com/cockroachdb/cockroach/pkg/cmd/roachtest/tests/replicagc.go:102
          | github.com/cockroachdb/cockroach/pkg/cmd/roachtest/tests.registerReplicaGC.func1
          |     /home/agent/work/.go/src/github.com/cockroachdb/cockroach/pkg/cmd/roachtest/tests/replicagc.go:36
          | main.(*testRunner).runTest.func2
          |     /home/agent/work/.go/src/github.com/cockroachdb/cockroach/pkg/cmd/roachtest/test_runner.go:777
          | runtime.goexit
          |     /usr/local/go/src/runtime/asm_amd64.s:1371
        Wraps: (2) ranges remained on n3 (according to meta2): map[2:{3,4,5} 22:{3,4,5} 40:{3,5,6} 44:{3,4,5} 100:{3,5,6}]
        Error types: (1) *withstack.withStack (2) *errutil.leafError

    cluster.go:1245,context.go:89,cluster.go:1233,test_runner.go:867: dead node detection: /home/agent/work/.go/src/github.com/cockroachdb/cockroach/bin/roachprod monitor teamcity-3250445-1627885444-69-n6cpu4 --oneshot --ignore-empty-nodes: exit status 1 3: dead (exit status 137)
        6: 10662
        5: 10721
        4: 10664
        2: 10628
        1: 11074
        Error: UNCLASSIFIED_PROBLEM: 3: dead (exit status 137)
        (1) UNCLASSIFIED_PROBLEM
        Wraps: (2) attached stack trace
          -- stack trace:
          | main.glob..func14
          |     /home/agent/work/.go/src/github.com/cockroachdb/cockroach/pkg/cmd/roachprod/main.go:1171
          | main.wrap.func1
          |     /home/agent/work/.go/src/github.com/cockroachdb/cockroach/pkg/cmd/roachprod/main.go:279
          | github.com/spf13/cobra.(*Command).execute
          |     /home/agent/work/.go/src/github.com/cockroachdb/cockroach/vendor/github.com/spf13/cobra/command.go:856
          | github.com/spf13/cobra.(*Command).ExecuteC
          |     /home/agent/work/.go/src/github.com/cockroachdb/cockroach/vendor/github.com/spf13/cobra/command.go:960
          | github.com/spf13/cobra.(*Command).Execute
          |     /home/agent/work/.go/src/github.com/cockroachdb/cockroach/vendor/github.com/spf13/cobra/command.go:897
          | main.main
          |     /home/agent/work/.go/src/github.com/cockroachdb/cockroach/pkg/cmd/roachprod/main.go:2094
          | runtime.main
          |     /usr/local/go/src/runtime/proc.go:225
          | runtime.goexit
          |     /usr/local/go/src/runtime/asm_amd64.s:1371
        Wraps: (3) 3: dead (exit status 137)
        Error types: (1) errors.Unclassified (2) *withstack.withStack (3) *errutil.leafError
Reproduce

See: [roachtest README](https://github.com/cockroachdb/cockroach/tree/master/pkg/cmd/roachtest)

/cc @cockroachdb/kv-triage
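When comparing reports, it can help to summarize the `map[...]` part of the failure message. The sketch below assumes each entry has the form `rangeID:{members}`, where the braces list the nodes or stores that still hold a replica; that reading is inferred from the message text, not confirmed by the source. `parseRemaining` and `countByPeerSet` are hypothetical helper names.

```go
package main

import (
	"fmt"
	"regexp"
	"sort"
	"strings"
)

// entryRE matches one "id:{a,b,c}" entry of the printed map.
var entryRE = regexp.MustCompile(`(\d+):\{([\d,]+)\}`)

// parseRemaining extracts "id:{a,b,c}" entries from a failure message and
// returns a map from the id to its list of members.
func parseRemaining(msg string) map[string][]string {
	out := make(map[string][]string)
	for _, m := range entryRE.FindAllStringSubmatch(msg, -1) {
		out[m[1]] = strings.Split(m[2], ",")
	}
	return out
}

// countByPeerSet tallies how many entries share each member set.
func countByPeerSet(parsed map[string][]string) map[string]int {
	counts := make(map[string]int)
	for _, members := range parsed {
		counts[strings.Join(members, ",")]++
	}
	return counts
}

func main() {
	// Message copied from the report above.
	msg := "ranges remained on n3 (according to meta2): " +
		"map[2:{3,4,5} 22:{3,4,5} 40:{3,5,6} 44:{3,4,5} 100:{3,5,6}]"
	counts := countByPeerSet(parseRemaining(msg))

	// Sort the keys so the output order is stable.
	keys := make([]string, 0, len(counts))
	for k := range counts {
		keys = append(keys, k)
	}
	sort.Strings(keys)
	for _, k := range keys {
		fmt.Printf("{%s}: %d\n", k, counts[k])
	}
}
```

For the message above, three entries list `{3,4,5}` and two list `{3,5,6}`; every entry includes 3, matching the "remained on n3" wording.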


cockroach-teamcity commented 3 years ago

roachtest.replicagc-changed-peers/restart=true failed with artifacts on master @ 1c46e1cd4e5be986bf9d13799bb7e13ddc896ed2:

The test failed on branch=master, cloud=gce:
test artifacts and logs in: /home/agent/work/.go/src/github.com/cockroachdb/cockroach/artifacts/replicagc-changed-peers/restart=true/run_1
    replicagc.go:288,replicagc.go:102,replicagc.go:36,test_runner.go:777: ranges remained on n3 (according to meta2): map[1:{3,4,5} 7:{3,5,6} 10:{3,4,6} 13:{3,5,6} 15:{3,5,6} 25:{3,4,6} 26:{3,4,6} 31:{3,5,6} 32:{3,5,6} 33:{3,4,5} 35:{3,4,6} 37:{3,5,6} 50:{3,4,5} 51:{3,5,6} 54:{3,4,6} 72:{3,5,6} 76:{3,4,6} 80:{3,5,6} 90:{3,4,6} 96:{3,4,6} 123:{3,4,5} 130:{3,4,5} 131:{3,4,6} 139:{3,5,6} 140:{3,4,5} 142:{3,4,6}]
        (1) attached stack trace
          -- stack trace:
          | github.com/cockroachdb/cockroach/pkg/cmd/roachtest/tests.waitForZeroReplicasOnN3.func1
          |     /home/agent/work/.go/src/github.com/cockroachdb/cockroach/pkg/cmd/roachtest/tests/replicagc.go:286
          | github.com/cockroachdb/cockroach/pkg/util/retry.ForDuration
          |     /home/agent/work/.go/src/github.com/cockroachdb/cockroach/pkg/util/retry/retry.go:197
          | github.com/cockroachdb/cockroach/pkg/cmd/roachtest/tests.waitForZeroReplicasOnN3
          |     /home/agent/work/.go/src/github.com/cockroachdb/cockroach/pkg/cmd/roachtest/tests/replicagc.go:265
          | github.com/cockroachdb/cockroach/pkg/cmd/roachtest/tests.runReplicaGCChangedPeers
          |     /home/agent/work/.go/src/github.com/cockroachdb/cockroach/pkg/cmd/roachtest/tests/replicagc.go:102
          | github.com/cockroachdb/cockroach/pkg/cmd/roachtest/tests.registerReplicaGC.func1
          |     /home/agent/work/.go/src/github.com/cockroachdb/cockroach/pkg/cmd/roachtest/tests/replicagc.go:36
          | main.(*testRunner).runTest.func2
          |     /home/agent/work/.go/src/github.com/cockroachdb/cockroach/pkg/cmd/roachtest/test_runner.go:777
          | runtime.goexit
          |     /usr/local/go/src/runtime/asm_amd64.s:1371
        Wraps: (2) ranges remained on n3 (according to meta2): map[1:{3,4,5} 7:{3,5,6} 10:{3,4,6} 13:{3,5,6} 15:{3,5,6} 25:{3,4,6} 26:{3,4,6} 31:{3,5,6} 32:{3,5,6} 33:{3,4,5} 35:{3,4,6} 37:{3,5,6} 50:{3,4,5} 51:{3,5,6} 54:{3,4,6} 72:{3,5,6} 76:{3,4,6} 80:{3,5,6} 90:{3,4,6} 96:{3,4,6} 123:{3,4,5} 130:{3,4,5} 131:{3,4,6} 139:{3,5,6} 140:{3,4,5} 142:{3,4,6}]
        Error types: (1) *withstack.withStack (2) *errutil.leafError

    cluster.go:1245,context.go:89,cluster.go:1233,test_runner.go:867: dead node detection: /home/agent/work/.go/src/github.com/cockroachdb/cockroach/bin/roachprod monitor teamcity-3254514-1627971576-62-n6cpu4 --oneshot --ignore-empty-nodes: exit status 1 3: dead (exit status 137)
        4: 11082
        2: 10898
        1: 11978
        5: 11084
        6: 11071
        Error: UNCLASSIFIED_PROBLEM: 3: dead (exit status 137)
        (1) UNCLASSIFIED_PROBLEM
        Wraps: (2) attached stack trace
          -- stack trace:
          | main.glob..func14
          |     /home/agent/work/.go/src/github.com/cockroachdb/cockroach/pkg/cmd/roachprod/main.go:1171
          | main.wrap.func1
          |     /home/agent/work/.go/src/github.com/cockroachdb/cockroach/pkg/cmd/roachprod/main.go:279
          | github.com/spf13/cobra.(*Command).execute
          |     /home/agent/work/.go/src/github.com/cockroachdb/cockroach/vendor/github.com/spf13/cobra/command.go:856
          | github.com/spf13/cobra.(*Command).ExecuteC
          |     /home/agent/work/.go/src/github.com/cockroachdb/cockroach/vendor/github.com/spf13/cobra/command.go:960
          | github.com/spf13/cobra.(*Command).Execute
          |     /home/agent/work/.go/src/github.com/cockroachdb/cockroach/vendor/github.com/spf13/cobra/command.go:897
          | main.main
          |     /home/agent/work/.go/src/github.com/cockroachdb/cockroach/pkg/cmd/roachprod/main.go:2094
          | runtime.main
          |     /usr/local/go/src/runtime/proc.go:225
          | runtime.goexit
          |     /usr/local/go/src/runtime/asm_amd64.s:1371
        Wraps: (3) 3: dead (exit status 137)
        Error types: (1) errors.Unclassified (2) *withstack.withStack (3) *errutil.leafError
Reproduce

See: [roachtest README](https://github.com/cockroachdb/cockroach/tree/master/pkg/cmd/roachtest)

See: [CI job to stress roachtests](https://teamcity.cockroachdb.com/buildConfiguration/Cockroach_Nightlies_RoachtestStress)

For the CI stress job, click the ellipsis (...) next to the Run button and fill in:

* Changes / Build branch: master
* Parameters / `env.TESTS`: `^replicagc-changed-peers/restart=true$`
* Parameters / `env.COUNT`: <number of runs>
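The `^` and `$` anchors in the `env.TESTS` value suggest it is treated as a regular expression over test names, which is an assumption here. Under that assumption, the sketch below shows why the anchored pattern picks out only this test; `selects` is a hypothetical helper and the second test name is illustrative.

```go
package main

import (
	"fmt"
	"regexp"
)

// selects reports whether a test-name filter, treated as a Go regular
// expression, matches the given test name.
func selects(filter, name string) bool {
	return regexp.MustCompile(filter).MatchString(name)
}

func main() {
	filter := `^replicagc-changed-peers/restart=true$`
	for _, name := range []string{
		"replicagc-changed-peers/restart=true",
		"replicagc-changed-peers/restart=false", // illustrative sibling name
	} {
		fmt.Printf("%s: %v\n", name, selects(filter, name))
	}
}
```

Without the anchors, the same pattern would also match any longer name that merely contains the string.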

/cc @cockroachdb/kv-triage


cockroach-teamcity commented 3 years ago

roachtest.replicagc-changed-peers/restart=true failed with artifacts on master @ eef03a46f2e43ff70485dadf7d9ad445db05cab4:

The test failed on branch=master, cloud=gce:
test artifacts and logs in: /home/agent/work/.go/src/github.com/cockroachdb/cockroach/artifacts/replicagc-changed-peers/restart=true/run_1
    replicagc.go:288,replicagc.go:102,replicagc.go:36,test_runner.go:777: ranges remained on n3 (according to meta2): map[9:{3,4,6} 14:{3,4,5} 19:{3,5,6} 26:{3,5,6} 28:{3,4,6} 29:{3,4,6} 31:{3,4,6} 34:{3,5,6} 39:{3,4,5} 64:{3,5,6} 65:{3,4,5} 85:{3,4,5} 88:{3,4,5} 95:{3,5,6} 98:{3,5,6} 111:{3,4,6} 112:{3,4,5} 122:{3,4,6} 133:{3,5,6} 135:{3,4,5} 136:{3,4,5} 140:{3,5,6} 153:{3,5,6} 159:{3,4,6}]
        (1) attached stack trace
          -- stack trace:
          | github.com/cockroachdb/cockroach/pkg/cmd/roachtest/tests.waitForZeroReplicasOnN3.func1
          |     /home/agent/work/.go/src/github.com/cockroachdb/cockroach/pkg/cmd/roachtest/tests/replicagc.go:286
          | github.com/cockroachdb/cockroach/pkg/util/retry.ForDuration
          |     /home/agent/work/.go/src/github.com/cockroachdb/cockroach/pkg/util/retry/retry.go:197
          | github.com/cockroachdb/cockroach/pkg/cmd/roachtest/tests.waitForZeroReplicasOnN3
          |     /home/agent/work/.go/src/github.com/cockroachdb/cockroach/pkg/cmd/roachtest/tests/replicagc.go:265
          | github.com/cockroachdb/cockroach/pkg/cmd/roachtest/tests.runReplicaGCChangedPeers
          |     /home/agent/work/.go/src/github.com/cockroachdb/cockroach/pkg/cmd/roachtest/tests/replicagc.go:102
          | github.com/cockroachdb/cockroach/pkg/cmd/roachtest/tests.registerReplicaGC.func1
          |     /home/agent/work/.go/src/github.com/cockroachdb/cockroach/pkg/cmd/roachtest/tests/replicagc.go:36
          | main.(*testRunner).runTest.func2
          |     /home/agent/work/.go/src/github.com/cockroachdb/cockroach/pkg/cmd/roachtest/test_runner.go:777
          | runtime.goexit
          |     /usr/local/go/src/runtime/asm_amd64.s:1371
        Wraps: (2) ranges remained on n3 (according to meta2): map[9:{3,4,6} 14:{3,4,5} 19:{3,5,6} 26:{3,5,6} 28:{3,4,6} 29:{3,4,6} 31:{3,4,6} 34:{3,5,6} 39:{3,4,5} 64:{3,5,6} 65:{3,4,5} 85:{3,4,5} 88:{3,4,5} 95:{3,5,6} 98:{3,5,6} 111:{3,4,6} 112:{3,4,5} 122:{3,4,6} 133:{3,5,6} 135:{3,4,5} 136:{3,4,5} 140:{3,5,6} 153:{3,5,6} 159:{3,4,6}]
        Error types: (1) *withstack.withStack (2) *errutil.leafError

    cluster.go:1245,context.go:89,cluster.go:1233,test_runner.go:867: dead node detection: /home/agent/work/.go/src/github.com/cockroachdb/cockroach/bin/roachprod monitor teamcity-3259029-1628057587-61-n6cpu4 --oneshot --ignore-empty-nodes: exit status 1 6: 11403
        1: 11443
        5: 11375
        3: dead (exit status 137)
        2: 11582
        4: 11808
        Error: UNCLASSIFIED_PROBLEM: 3: dead (exit status 137)
        (1) UNCLASSIFIED_PROBLEM
        Wraps: (2) attached stack trace
          -- stack trace:
          | main.glob..func14
          |     /home/agent/work/.go/src/github.com/cockroachdb/cockroach/pkg/cmd/roachprod/main.go:1171
          | main.wrap.func1
          |     /home/agent/work/.go/src/github.com/cockroachdb/cockroach/pkg/cmd/roachprod/main.go:279
          | github.com/spf13/cobra.(*Command).execute
          |     /home/agent/work/.go/src/github.com/cockroachdb/cockroach/vendor/github.com/spf13/cobra/command.go:856
          | github.com/spf13/cobra.(*Command).ExecuteC
          |     /home/agent/work/.go/src/github.com/cockroachdb/cockroach/vendor/github.com/spf13/cobra/command.go:960
          | github.com/spf13/cobra.(*Command).Execute
          |     /home/agent/work/.go/src/github.com/cockroachdb/cockroach/vendor/github.com/spf13/cobra/command.go:897
          | main.main
          |     /home/agent/work/.go/src/github.com/cockroachdb/cockroach/pkg/cmd/roachprod/main.go:2094
          | runtime.main
          |     /usr/local/go/src/runtime/proc.go:225
          | runtime.goexit
          |     /usr/local/go/src/runtime/asm_amd64.s:1371
        Wraps: (3) 3: dead (exit status 137)
        Error types: (1) errors.Unclassified (2) *withstack.withStack (3) *errutil.leafError
Reproduce

See: [roachtest README](https://github.com/cockroachdb/cockroach/tree/master/pkg/cmd/roachtest)

See: [CI job to stress roachtests](https://teamcity.cockroachdb.com/buildConfiguration/Cockroach_Nightlies_RoachtestStress)

For the CI stress job, click the ellipsis (...) next to the Run button and fill in:

* Changes / Build branch: master
* Parameters / `env.TESTS`: `^replicagc-changed-peers/restart=true$`
* Parameters / `env.COUNT`: <number of runs>

/cc @cockroachdb/kv-triage


cockroach-teamcity commented 3 years ago

roachtest.replicagc-changed-peers/restart=true failed with artifacts on master @ 6b8d59327add74cf1342345fb3eaffc3a3e765d2:

The test failed on branch=master, cloud=gce:
test artifacts and logs in: /home/agent/work/.go/src/github.com/cockroachdb/cockroach/artifacts/replicagc-changed-peers/restart=true/run_1
    replicagc.go:288,replicagc.go:102,replicagc.go:36,test_runner.go:777: ranges remained on n3 (according to meta2): map[4:{3,6,8} 15:{3,7,8} 19:{3,7,8} 39:{3,6,8} 42:{3,7,8} 47:{3,6,7} 58:{3,6,7} 64:{3,6,7} 65:{3,7,8} 67:{3,6,8} 68:{3,6,7} 72:{3,6,7} 74:{3,7,8} 80:{3,6,7} 82:{3,7,8} 86:{3,6,8} 87:{3,6,7} 94:{3,6,8} 97:{3,7,8} 98:{3,6,7} 105:{3,6,8} 106:{3,6,7} 113:{3,6,8} 117:{3,6,7} 124:{3,6,7} 126:{3,6,8} 133:{3,7,8} 135:{3,6,8} 144:{3,6,7} 146:{3,6,7} 154:{3,6,7}]
        (1) attached stack trace
          -- stack trace:
          | github.com/cockroachdb/cockroach/pkg/cmd/roachtest/tests.waitForZeroReplicasOnN3.func1
          |     /home/agent/work/.go/src/github.com/cockroachdb/cockroach/pkg/cmd/roachtest/tests/replicagc.go:286
          | github.com/cockroachdb/cockroach/pkg/util/retry.ForDuration
          |     /home/agent/work/.go/src/github.com/cockroachdb/cockroach/pkg/util/retry/retry.go:197
          | github.com/cockroachdb/cockroach/pkg/cmd/roachtest/tests.waitForZeroReplicasOnN3
          |     /home/agent/work/.go/src/github.com/cockroachdb/cockroach/pkg/cmd/roachtest/tests/replicagc.go:265
          | github.com/cockroachdb/cockroach/pkg/cmd/roachtest/tests.runReplicaGCChangedPeers
          |     /home/agent/work/.go/src/github.com/cockroachdb/cockroach/pkg/cmd/roachtest/tests/replicagc.go:102
          | github.com/cockroachdb/cockroach/pkg/cmd/roachtest/tests.registerReplicaGC.func1
          |     /home/agent/work/.go/src/github.com/cockroachdb/cockroach/pkg/cmd/roachtest/tests/replicagc.go:36
          | main.(*testRunner).runTest.func2
          |     /home/agent/work/.go/src/github.com/cockroachdb/cockroach/pkg/cmd/roachtest/test_runner.go:777
          | runtime.goexit
          |     /usr/local/go/src/runtime/asm_amd64.s:1371
        Wraps: (2) ranges remained on n3 (according to meta2): map[4:{3,6,8} 15:{3,7,8} 19:{3,7,8} 39:{3,6,8} 42:{3,7,8} 47:{3,6,7} 58:{3,6,7} 64:{3,6,7} 65:{3,7,8} 67:{3,6,8} 68:{3,6,7} 72:{3,6,7} 74:{3,7,8} 80:{3,6,7} 82:{3,7,8} 86:{3,6,8} 87:{3,6,7} 94:{3,6,8} 97:{3,7,8} 98:{3,6,7} 105:{3,6,8} 106:{3,6,7} 113:{3,6,8} 117:{3,6,7} 124:{3,6,7} 126:{3,6,8} 133:{3,7,8} 135:{3,6,8} 144:{3,6,7} 146:{3,6,7} 154:{3,6,7}]
        Error types: (1) *withstack.withStack (2) *errutil.leafError

    cluster.go:1245,context.go:89,cluster.go:1233,test_runner.go:867: dead node detection: /home/agent/work/.go/src/github.com/cockroachdb/cockroach/bin/roachprod monitor teamcity-3264559-1628144240-61-n6cpu4 --oneshot --ignore-empty-nodes: exit status 1 1: 11538
        5: 11366
        2: 11881
        4: 12305
        3: dead (exit status 137)
        6: 11463
        Error: UNCLASSIFIED_PROBLEM: 3: dead (exit status 137)
        (1) UNCLASSIFIED_PROBLEM
        Wraps: (2) attached stack trace
          -- stack trace:
          | main.glob..func14
          |     /home/agent/work/.go/src/github.com/cockroachdb/cockroach/pkg/cmd/roachprod/main.go:1171
          | main.wrap.func1
          |     /home/agent/work/.go/src/github.com/cockroachdb/cockroach/pkg/cmd/roachprod/main.go:279
          | github.com/spf13/cobra.(*Command).execute
          |     /home/agent/work/.go/src/github.com/cockroachdb/cockroach/vendor/github.com/spf13/cobra/command.go:856
          | github.com/spf13/cobra.(*Command).ExecuteC
          |     /home/agent/work/.go/src/github.com/cockroachdb/cockroach/vendor/github.com/spf13/cobra/command.go:960
          | github.com/spf13/cobra.(*Command).Execute
          |     /home/agent/work/.go/src/github.com/cockroachdb/cockroach/vendor/github.com/spf13/cobra/command.go:897
          | main.main
          |     /home/agent/work/.go/src/github.com/cockroachdb/cockroach/pkg/cmd/roachprod/main.go:2094
          | runtime.main
          |     /usr/local/go/src/runtime/proc.go:225
          | runtime.goexit
          |     /usr/local/go/src/runtime/asm_amd64.s:1371
        Wraps: (3) 3: dead (exit status 137)
        Error types: (1) errors.Unclassified (2) *withstack.withStack (3) *errutil.leafError
Reproduce

See: [roachtest README](https://github.com/cockroachdb/cockroach/tree/master/pkg/cmd/roachtest)

See: [CI job to stress roachtests](https://teamcity.cockroachdb.com/buildConfiguration/Cockroach_Nightlies_RoachtestStress)

For the CI stress job, click the ellipsis (...) next to the Run button and fill in:

* Changes / Build branch: master
* Parameters / `env.TESTS`: `^replicagc-changed-peers/restart=true$`
* Parameters / `env.COUNT`: <number of runs>

/cc @cockroachdb/kv-triage


cockroach-teamcity commented 3 years ago

roachtest.replicagc-changed-peers/restart=true failed with artifacts on master @ 50ef2fc205baa65c5a740c2d614fe1de279367e9:

The test failed on branch=master, cloud=gce:
test artifacts and logs in: /home/agent/work/.go/src/github.com/cockroachdb/cockroach/artifacts/replicagc-changed-peers/restart=true/run_1
    replicagc.go:288,replicagc.go:102,replicagc.go:36,test_runner.go:777: ranges remained on n3 (according to meta2): map[8:{3,6,8} 9:{3,6,8} 10:{3,6,7} 16:{3,6,8} 18:{3,6,7} 24:{3,7,8} 28:{3,7,8} 31:{3,6,7} 32:{3,6,7} 35:{3,6,8} 41:{3,7,8} 53:{3,7,8} 57:{3,7,8} 58:{3,6,7} 60:{3,6,8} 62:{3,6,7} 67:{3,6,8} 68:{3,6,7} 72:{3,6,7} 75:{3,6,7} 76:{3,6,7} 77:{3,7,8} 80:{3,6,7} 86:{3,6,8} 88:{3,6,8} 99:{3,7,8} 102:{3,6,8} 103:{3,6,7} 109:{3,6,7} 115:{3,7,8} 120:{3,7,8} 121:{3,7,8} 123:{3,6,8} 133:{3,6,8} 137:{3,7,8}]
        (1) attached stack trace
          -- stack trace:
          | github.com/cockroachdb/cockroach/pkg/cmd/roachtest/tests.waitForZeroReplicasOnN3.func1
          |     /home/agent/work/.go/src/github.com/cockroachdb/cockroach/pkg/cmd/roachtest/tests/replicagc.go:286
          | github.com/cockroachdb/cockroach/pkg/util/retry.ForDuration
          |     /home/agent/work/.go/src/github.com/cockroachdb/cockroach/pkg/util/retry/retry.go:197
          | github.com/cockroachdb/cockroach/pkg/cmd/roachtest/tests.waitForZeroReplicasOnN3
          |     /home/agent/work/.go/src/github.com/cockroachdb/cockroach/pkg/cmd/roachtest/tests/replicagc.go:265
          | github.com/cockroachdb/cockroach/pkg/cmd/roachtest/tests.runReplicaGCChangedPeers
          |     /home/agent/work/.go/src/github.com/cockroachdb/cockroach/pkg/cmd/roachtest/tests/replicagc.go:102
          | github.com/cockroachdb/cockroach/pkg/cmd/roachtest/tests.registerReplicaGC.func1
          |     /home/agent/work/.go/src/github.com/cockroachdb/cockroach/pkg/cmd/roachtest/tests/replicagc.go:36
          | main.(*testRunner).runTest.func2
          |     /home/agent/work/.go/src/github.com/cockroachdb/cockroach/pkg/cmd/roachtest/test_runner.go:777
          | runtime.goexit
          |     /usr/local/go/src/runtime/asm_amd64.s:1371
        Wraps: (2) ranges remained on n3 (according to meta2): map[8:{3,6,8} 9:{3,6,8} 10:{3,6,7} 16:{3,6,8} 18:{3,6,7} 24:{3,7,8} 28:{3,7,8} 31:{3,6,7} 32:{3,6,7} 35:{3,6,8} 41:{3,7,8} 53:{3,7,8} 57:{3,7,8} 58:{3,6,7} 60:{3,6,8} 62:{3,6,7} 67:{3,6,8} 68:{3,6,7} 72:{3,6,7} 75:{3,6,7} 76:{3,6,7} 77:{3,7,8} 80:{3,6,7} 86:{3,6,8} 88:{3,6,8} 99:{3,7,8} 102:{3,6,8} 103:{3,6,7} 109:{3,6,7} 115:{3,7,8} 120:{3,7,8} 121:{3,7,8} 123:{3,6,8} 133:{3,6,8} 137:{3,7,8}]
        Error types: (1) *withstack.withStack (2) *errutil.leafError

    cluster.go:1245,context.go:89,cluster.go:1233,test_runner.go:867: dead node detection: /home/agent/work/.go/src/github.com/cockroachdb/cockroach/bin/roachprod monitor teamcity-3269276-1628230445-62-n6cpu4 --oneshot --ignore-empty-nodes: exit status 1 1: 11718
        6: 11567
        2: 11901
        3: dead (exit status 137)
        4: 11709
        5: 11635
        Error: UNCLASSIFIED_PROBLEM: 3: dead (exit status 137)
        (1) UNCLASSIFIED_PROBLEM
        Wraps: (2) attached stack trace
          -- stack trace:
          | main.glob..func14
          |     /home/agent/work/.go/src/github.com/cockroachdb/cockroach/pkg/cmd/roachprod/main.go:1171
          | main.wrap.func1
          |     /home/agent/work/.go/src/github.com/cockroachdb/cockroach/pkg/cmd/roachprod/main.go:279
          | github.com/spf13/cobra.(*Command).execute
          |     /home/agent/work/.go/src/github.com/cockroachdb/cockroach/vendor/github.com/spf13/cobra/command.go:856
          | github.com/spf13/cobra.(*Command).ExecuteC
          |     /home/agent/work/.go/src/github.com/cockroachdb/cockroach/vendor/github.com/spf13/cobra/command.go:960
          | github.com/spf13/cobra.(*Command).Execute
          |     /home/agent/work/.go/src/github.com/cockroachdb/cockroach/vendor/github.com/spf13/cobra/command.go:897
          | main.main
          |     /home/agent/work/.go/src/github.com/cockroachdb/cockroach/pkg/cmd/roachprod/main.go:2094
          | runtime.main
          |     /usr/local/go/src/runtime/proc.go:225
          | runtime.goexit
          |     /usr/local/go/src/runtime/asm_amd64.s:1371
        Wraps: (3) 3: dead (exit status 137)
        Error types: (1) errors.Unclassified (2) *withstack.withStack (3) *errutil.leafError
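
The monitor output reports `3: dead (exit status 137)`. By common Unix shell convention, an exit status above 128 means the process was terminated by signal number `status - 128`, so 137 corresponds to signal 9 (SIGKILL). The report does not say who sent the signal, so the sketch below only decodes the number. `signalFromExitStatus` is a hypothetical helper, not something roachprod provides.

```go
package main

import (
	"fmt"
	"syscall"
)

// signalFromExitStatus decodes the shell convention "128 + signal number".
// Hypothetical helper for illustration; it returns false for statuses that
// do not follow that convention.
func signalFromExitStatus(status int) (syscall.Signal, bool) {
	if status > 128 && status < 256 {
		return syscall.Signal(status - 128), true
	}
	return 0, false
}

func main() {
	if sig, ok := signalFromExitStatus(137); ok {
		// 137 - 128 = 9, which is SIGKILL.
		fmt.Printf("exit status 137 -> signal %d, SIGKILL: %t\n", int(sig), sig == syscall.SIGKILL)
	}
}
```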
Reproduce

See: [roachtest README](https://github.com/cockroachdb/cockroach/tree/master/pkg/cmd/roachtest)

See: [CI job to stress roachtests](https://teamcity.cockroachdb.com/buildConfiguration/Cockroach_Nightlies_RoachtestStress)

For the CI stress job, click the ellipsis (...) next to the Run button and fill in:

* Changes / Build branch: master
* Parameters / `env.TESTS`: `^replicagc-changed-peers/restart=true$`
* Parameters / `env.COUNT`: `<number of runs>`

/cc @cockroachdb/kv-triage


cockroach-teamcity commented 3 years ago

roachtest.replicagc-changed-peers/restart=true failed with artifacts on master @ cab185ff71f0924953d987fe6ffd14efdd32a3a0:

The test failed on branch=master, cloud=gce:
test artifacts and logs in: /home/agent/work/.go/src/github.com/cockroachdb/cockroach/artifacts/replicagc-changed-peers/restart=true/run_1
    replicagc.go:288,replicagc.go:102,replicagc.go:36,test_runner.go:777: ranges remained on n3 (according to meta2): map[1:{3,5,6} 16:{3,4,5} 18:{3,4,5} 19:{3,5,6} 22:{3,5,6} 24:{3,4,5} 34:{3,5,6} 36:{3,4,6} 37:{3,4,6} 39:{3,4,6} 41:{3,4,5} 46:{3,4,6} 47:{3,5,6} 50:{3,5,6} 51:{3,5,6} 53:{3,4,5} 54:{3,5,6} 58:{3,5,6} 62:{3,4,5} 68:{3,5,6} 70:{3,4,5} 75:{3,4,5} 76:{3,5,6} 77:{3,4,6} 80:{3,4,6} 84:{3,5,6} 92:{3,5,6} 93:{3,5,6} 95:{3,4,5} 100:{3,4,6} 101:{3,4,6} 103:{3,4,6} 104:{3,4,5} 107:{3,5,6} 108:{3,5,6} 111:{3,5,6} 112:{3,5,6} 115:{3,4,6} 116:{3,4,6} 119:{3,5,6} 120:{3,5,6} 123:{3,4,6} 136:{3,4,6} 137:{3,4,6} 138:{3,4,5} 141:{3,4,5}]
        (1) attached stack trace
          -- stack trace:
          | github.com/cockroachdb/cockroach/pkg/cmd/roachtest/tests.waitForZeroReplicasOnN3.func1
          |     /home/agent/work/.go/src/github.com/cockroachdb/cockroach/pkg/cmd/roachtest/tests/replicagc.go:286
          | github.com/cockroachdb/cockroach/pkg/util/retry.ForDuration
          |     /home/agent/work/.go/src/github.com/cockroachdb/cockroach/pkg/util/retry/retry.go:197
          | github.com/cockroachdb/cockroach/pkg/cmd/roachtest/tests.waitForZeroReplicasOnN3
          |     /home/agent/work/.go/src/github.com/cockroachdb/cockroach/pkg/cmd/roachtest/tests/replicagc.go:265
          | github.com/cockroachdb/cockroach/pkg/cmd/roachtest/tests.runReplicaGCChangedPeers
          |     /home/agent/work/.go/src/github.com/cockroachdb/cockroach/pkg/cmd/roachtest/tests/replicagc.go:102
          | github.com/cockroachdb/cockroach/pkg/cmd/roachtest/tests.registerReplicaGC.func1
          |     /home/agent/work/.go/src/github.com/cockroachdb/cockroach/pkg/cmd/roachtest/tests/replicagc.go:36
          | main.(*testRunner).runTest.func2
          |     /home/agent/work/.go/src/github.com/cockroachdb/cockroach/pkg/cmd/roachtest/test_runner.go:777
          | runtime.goexit
          |     /usr/local/go/src/runtime/asm_amd64.s:1371
        Wraps: (2) ranges remained on n3 (according to meta2): map[1:{3,5,6} 16:{3,4,5} 18:{3,4,5} 19:{3,5,6} 22:{3,5,6} 24:{3,4,5} 34:{3,5,6} 36:{3,4,6} 37:{3,4,6} 39:{3,4,6} 41:{3,4,5} 46:{3,4,6} 47:{3,5,6} 50:{3,5,6} 51:{3,5,6} 53:{3,4,5} 54:{3,5,6} 58:{3,5,6} 62:{3,4,5} 68:{3,5,6} 70:{3,4,5} 75:{3,4,5} 76:{3,5,6} 77:{3,4,6} 80:{3,4,6} 84:{3,5,6} 92:{3,5,6} 93:{3,5,6} 95:{3,4,5} 100:{3,4,6} 101:{3,4,6} 103:{3,4,6} 104:{3,4,5} 107:{3,5,6} 108:{3,5,6} 111:{3,5,6} 112:{3,5,6} 115:{3,4,6} 116:{3,4,6} 119:{3,5,6} 120:{3,5,6} 123:{3,4,6} 136:{3,4,6} 137:{3,4,6} 138:{3,4,5} 141:{3,4,5}]
        Error types: (1) *withstack.withStack (2) *errutil.leafError

    cluster.go:1245,context.go:89,cluster.go:1233,test_runner.go:867: dead node detection: /home/agent/work/.go/src/github.com/cockroachdb/cockroach/bin/roachprod monitor teamcity-3273416-1628317362-65-n6cpu4 --oneshot --ignore-empty-nodes: exit status 1 1: 12287
        4: 12163
        3: dead (exit status 137)
        2: 11923
        6: 12283
        5: 12059
        Error: UNCLASSIFIED_PROBLEM: 3: dead (exit status 137)
        (1) UNCLASSIFIED_PROBLEM
        Wraps: (2) attached stack trace
          -- stack trace:
          | main.glob..func14
          |     /home/agent/work/.go/src/github.com/cockroachdb/cockroach/pkg/cmd/roachprod/main.go:1171
          | main.wrap.func1
          |     /home/agent/work/.go/src/github.com/cockroachdb/cockroach/pkg/cmd/roachprod/main.go:279
          | github.com/spf13/cobra.(*Command).execute
          |     /home/agent/work/.go/src/github.com/cockroachdb/cockroach/vendor/github.com/spf13/cobra/command.go:856
          | github.com/spf13/cobra.(*Command).ExecuteC
          |     /home/agent/work/.go/src/github.com/cockroachdb/cockroach/vendor/github.com/spf13/cobra/command.go:960
          | github.com/spf13/cobra.(*Command).Execute
          |     /home/agent/work/.go/src/github.com/cockroachdb/cockroach/vendor/github.com/spf13/cobra/command.go:897
          | main.main
          |     /home/agent/work/.go/src/github.com/cockroachdb/cockroach/pkg/cmd/roachprod/main.go:2094
          | runtime.main
          |     /usr/local/go/src/runtime/proc.go:225
          | runtime.goexit
          |     /usr/local/go/src/runtime/asm_amd64.s:1371
        Wraps: (3) 3: dead (exit status 137)
        Error types: (1) errors.Unclassified (2) *withstack.withStack (3) *errutil.leafError
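
The `ranges remained on n3 (according to meta2)` message in the report above prints a Go map from range ID to a brace-delimited set of IDs. A rough sketch for turning that text into a structure that can be counted or compared across runs follows. It assumes each set lists the node (or store) IDs that hold a replica, which the report itself does not spell out. `parseRangeMap` is a hypothetical helper, and the input is only the first four entries of the map from this report.

```go
package main

import (
	"fmt"
	"regexp"
	"strconv"
	"strings"
)

// entryRE matches one "rangeID:{a,b,c}" entry of the printed map.
var entryRE = regexp.MustCompile(`(\d+):\{([0-9,]+)\}`)

// parseRangeMap converts the printed map into rangeID -> list of IDs.
// Hypothetical helper for illustration; it is not part of roachtest.
func parseRangeMap(s string) (map[int][]int, error) {
	out := make(map[int][]int)
	for _, m := range entryRE.FindAllStringSubmatch(s, -1) {
		rangeID, err := strconv.Atoi(m[1])
		if err != nil {
			return nil, err
		}
		for _, field := range strings.Split(m[2], ",") {
			id, err := strconv.Atoi(field)
			if err != nil {
				return nil, err
			}
			out[rangeID] = append(out[rangeID], id)
		}
	}
	return out, nil
}

func main() {
	// Excerpt: the first four entries of the map in the report.
	excerpt := `map[1:{3,5,6} 16:{3,4,5} 18:{3,4,5} 19:{3,5,6}]`
	ranges, err := parseRangeMap(excerpt)
	if err != nil {
		panic(err)
	}
	onN3 := 0
	for _, ids := range ranges {
		for _, id := range ids {
			if id == 3 {
				onN3++
			}
		}
	}
	fmt.Printf("%d ranges parsed, %d still list ID 3\n", len(ranges), onN3)
}
```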
Reproduce

See: [roachtest README](https://github.com/cockroachdb/cockroach/tree/master/pkg/cmd/roachtest)

See: [CI job to stress roachtests](https://teamcity.cockroachdb.com/buildConfiguration/Cockroach_Nightlies_RoachtestStress)

For the CI stress job, click the ellipsis (...) next to the Run button and fill in:

* Changes / Build branch: master
* Parameters / `env.TESTS`: `^replicagc-changed-peers/restart=true$`
* Parameters / `env.COUNT`: `<number of runs>`

/cc @cockroachdb/kv-triage


cockroach-teamcity commented 3 years ago

roachtest.replicagc-changed-peers/restart=true failed with artifacts on master @ 62ec88c61edcaa023a579199cc5b43d3ee951cef:

The test failed on branch=master, cloud=gce:
test artifacts and logs in: /home/agent/work/.go/src/github.com/cockroachdb/cockroach/artifacts/replicagc-changed-peers/restart=true/run_1
    replicagc.go:288,replicagc.go:102,replicagc.go:36,test_runner.go:777: ranges remained on n3 (according to meta2): map[12:{3,5,6} 19:{3,4,6} 34:{3,5,6} 40:{3,5,6} 43:{3,5,6} 45:{3,4,6} 58:{3,4,5} 65:{3,4,6} 72:{3,5,6} 78:{3,5,6} 92:{3,4,5} 95:{3,5,6} 110:{3,4,5} 113:{3,5,6} 118:{3,4,5} 119:{3,4,6} 121:{3,5,6} 123:{3,5,6} 129:{3,4,5} 130:{3,5,6}]
        (1) attached stack trace
          -- stack trace:
          | github.com/cockroachdb/cockroach/pkg/cmd/roachtest/tests.waitForZeroReplicasOnN3.func1
          |     /home/agent/work/.go/src/github.com/cockroachdb/cockroach/pkg/cmd/roachtest/tests/replicagc.go:286
          | github.com/cockroachdb/cockroach/pkg/util/retry.ForDuration
          |     /home/agent/work/.go/src/github.com/cockroachdb/cockroach/pkg/util/retry/retry.go:197
          | github.com/cockroachdb/cockroach/pkg/cmd/roachtest/tests.waitForZeroReplicasOnN3
          |     /home/agent/work/.go/src/github.com/cockroachdb/cockroach/pkg/cmd/roachtest/tests/replicagc.go:265
          | github.com/cockroachdb/cockroach/pkg/cmd/roachtest/tests.runReplicaGCChangedPeers
          |     /home/agent/work/.go/src/github.com/cockroachdb/cockroach/pkg/cmd/roachtest/tests/replicagc.go:102
          | github.com/cockroachdb/cockroach/pkg/cmd/roachtest/tests.registerReplicaGC.func1
          |     /home/agent/work/.go/src/github.com/cockroachdb/cockroach/pkg/cmd/roachtest/tests/replicagc.go:36
          | main.(*testRunner).runTest.func2
          |     /home/agent/work/.go/src/github.com/cockroachdb/cockroach/pkg/cmd/roachtest/test_runner.go:777
          | runtime.goexit
          |     /usr/local/go/src/runtime/asm_amd64.s:1371
        Wraps: (2) ranges remained on n3 (according to meta2): map[12:{3,5,6} 19:{3,4,6} 34:{3,5,6} 40:{3,5,6} 43:{3,5,6} 45:{3,4,6} 58:{3,4,5} 65:{3,4,6} 72:{3,5,6} 78:{3,5,6} 92:{3,4,5} 95:{3,5,6} 110:{3,4,5} 113:{3,5,6} 118:{3,4,5} 119:{3,4,6} 121:{3,5,6} 123:{3,5,6} 129:{3,4,5} 130:{3,5,6}]
        Error types: (1) *withstack.withStack (2) *errutil.leafError

    cluster.go:1245,context.go:89,cluster.go:1233,test_runner.go:867: dead node detection: /home/agent/work/.go/src/github.com/cockroachdb/cockroach/bin/roachprod monitor teamcity-3275637-1628489663-60-n6cpu4 --oneshot --ignore-empty-nodes: exit status 1 3: dead (exit status 137)
        2: 11867
        5: 11818
        4: 11745
        1: 12371
        6: 11883
        Error: UNCLASSIFIED_PROBLEM: 3: dead (exit status 137)
        (1) UNCLASSIFIED_PROBLEM
        Wraps: (2) attached stack trace
          -- stack trace:
          | main.glob..func14
          |     /home/agent/work/.go/src/github.com/cockroachdb/cockroach/pkg/cmd/roachprod/main.go:1171
          | main.wrap.func1
          |     /home/agent/work/.go/src/github.com/cockroachdb/cockroach/pkg/cmd/roachprod/main.go:279
          | github.com/spf13/cobra.(*Command).execute
          |     /home/agent/work/.go/src/github.com/cockroachdb/cockroach/vendor/github.com/spf13/cobra/command.go:856
          | github.com/spf13/cobra.(*Command).ExecuteC
          |     /home/agent/work/.go/src/github.com/cockroachdb/cockroach/vendor/github.com/spf13/cobra/command.go:960
          | github.com/spf13/cobra.(*Command).Execute
          |     /home/agent/work/.go/src/github.com/cockroachdb/cockroach/vendor/github.com/spf13/cobra/command.go:897
          | main.main
          |     /home/agent/work/.go/src/github.com/cockroachdb/cockroach/pkg/cmd/roachprod/main.go:2094
          | runtime.main
          |     /usr/local/go/src/runtime/proc.go:225
          | runtime.goexit
          |     /usr/local/go/src/runtime/asm_amd64.s:1371
        Wraps: (3) 3: dead (exit status 137)
        Error types: (1) errors.Unclassified (2) *withstack.withStack (3) *errutil.leafError
Reproduce

See: [roachtest README](https://github.com/cockroachdb/cockroach/tree/master/pkg/cmd/roachtest)

See: [CI job to stress roachtests](https://teamcity.cockroachdb.com/buildConfiguration/Cockroach_Nightlies_RoachtestStress)

For the CI stress job, click the ellipsis (...) next to the Run button and fill in:

* Changes / Build branch: master
* Parameters / `env.TESTS`: `^replicagc-changed-peers/restart=true$`
* Parameters / `env.COUNT`: `<number of runs>`

/cc @cockroachdb/kv-triage


cockroach-teamcity commented 3 years ago

roachtest.replicagc-changed-peers/restart=true failed with artifacts on master @ 847514dab6354d4cc4ccf7b2857487b32119fb37:

The test failed on branch=master, cloud=gce:
test artifacts and logs in: /home/agent/work/.go/src/github.com/cockroachdb/cockroach/artifacts/replicagc-changed-peers/restart=true/run_1
    replicagc.go:288,replicagc.go:102,replicagc.go:36,test_runner.go:777: ranges remained on n3 (according to meta2): map[5:{3,6,8} 8:{3,6,7} 13:{3,7,8} 22:{3,7,8} 33:{3,6,8} 38:{3,7,8} 46:{3,6,7} 50:{3,6,8} 51:{3,6,8} 56:{3,6,8} 58:{3,6,8} 62:{3,6,8} 67:{3,7,8} 68:{3,6,8} 71:{3,6,7} 79:{3,6,8} 83:{3,6,8} 85:{3,6,8} 91:{3,7,8} 92:{3,6,8} 96:{3,7,8} 108:{3,6,7} 116:{3,6,7} 117:{3,7,8} 118:{3,6,8} 120:{3,6,7} 122:{3,7,8} 127:{3,6,7} 135:{3,6,7}]
        (1) attached stack trace
          -- stack trace:
          | github.com/cockroachdb/cockroach/pkg/cmd/roachtest/tests.waitForZeroReplicasOnN3.func1
          |     /home/agent/work/.go/src/github.com/cockroachdb/cockroach/pkg/cmd/roachtest/tests/replicagc.go:286
          | github.com/cockroachdb/cockroach/pkg/util/retry.ForDuration
          |     /home/agent/work/.go/src/github.com/cockroachdb/cockroach/pkg/util/retry/retry.go:197
          | github.com/cockroachdb/cockroach/pkg/cmd/roachtest/tests.waitForZeroReplicasOnN3
          |     /home/agent/work/.go/src/github.com/cockroachdb/cockroach/pkg/cmd/roachtest/tests/replicagc.go:265
          | github.com/cockroachdb/cockroach/pkg/cmd/roachtest/tests.runReplicaGCChangedPeers
          |     /home/agent/work/.go/src/github.com/cockroachdb/cockroach/pkg/cmd/roachtest/tests/replicagc.go:102
          | github.com/cockroachdb/cockroach/pkg/cmd/roachtest/tests.registerReplicaGC.func1
          |     /home/agent/work/.go/src/github.com/cockroachdb/cockroach/pkg/cmd/roachtest/tests/replicagc.go:36
          | main.(*testRunner).runTest.func2
          |     /home/agent/work/.go/src/github.com/cockroachdb/cockroach/pkg/cmd/roachtest/test_runner.go:777
          | runtime.goexit
          |     /usr/local/go/src/runtime/asm_amd64.s:1371
        Wraps: (2) ranges remained on n3 (according to meta2): map[5:{3,6,8} 8:{3,6,7} 13:{3,7,8} 22:{3,7,8} 33:{3,6,8} 38:{3,7,8} 46:{3,6,7} 50:{3,6,8} 51:{3,6,8} 56:{3,6,8} 58:{3,6,8} 62:{3,6,8} 67:{3,7,8} 68:{3,6,8} 71:{3,6,7} 79:{3,6,8} 83:{3,6,8} 85:{3,6,8} 91:{3,7,8} 92:{3,6,8} 96:{3,7,8} 108:{3,6,7} 116:{3,6,7} 117:{3,7,8} 118:{3,6,8} 120:{3,6,7} 122:{3,7,8} 127:{3,6,7} 135:{3,6,7}]
        Error types: (1) *withstack.withStack (2) *errutil.leafError

    cluster.go:1245,context.go:89,cluster.go:1233,test_runner.go:867: dead node detection: /home/agent/work/.go/src/github.com/cockroachdb/cockroach/bin/roachprod monitor teamcity-3281225-1628577374-61-n6cpu4 --oneshot --ignore-empty-nodes: exit status 1 2: 12124
        6: 12279
        1: 12224
        3: dead (exit status 137)
        5: 11976
        4: 11516
        Error: UNCLASSIFIED_PROBLEM: 3: dead (exit status 137)
        (1) UNCLASSIFIED_PROBLEM
        Wraps: (2) attached stack trace
          -- stack trace:
          | main.glob..func14
          |     /home/agent/work/.go/src/github.com/cockroachdb/cockroach/pkg/cmd/roachprod/main.go:1171
          | main.wrap.func1
          |     /home/agent/work/.go/src/github.com/cockroachdb/cockroach/pkg/cmd/roachprod/main.go:279
          | github.com/spf13/cobra.(*Command).execute
          |     /home/agent/work/.go/src/github.com/cockroachdb/cockroach/vendor/github.com/spf13/cobra/command.go:856
          | github.com/spf13/cobra.(*Command).ExecuteC
          |     /home/agent/work/.go/src/github.com/cockroachdb/cockroach/vendor/github.com/spf13/cobra/command.go:960
          | github.com/spf13/cobra.(*Command).Execute
          |     /home/agent/work/.go/src/github.com/cockroachdb/cockroach/vendor/github.com/spf13/cobra/command.go:897
          | main.main
          |     /home/agent/work/.go/src/github.com/cockroachdb/cockroach/pkg/cmd/roachprod/main.go:2094
          | runtime.main
          |     /usr/local/go/src/runtime/proc.go:225
          | runtime.goexit
          |     /usr/local/go/src/runtime/asm_amd64.s:1371
        Wraps: (3) 3: dead (exit status 137)
        Error types: (1) errors.Unclassified (2) *withstack.withStack (3) *errutil.leafError
Reproduce

See: [roachtest README](https://github.com/cockroachdb/cockroach/tree/master/pkg/cmd/roachtest)

See: [CI job to stress roachtests](https://teamcity.cockroachdb.com/buildConfiguration/Cockroach_Nightlies_RoachtestStress)

For the CI stress job, click the ellipsis (...) next to the Run button and fill in:

* Changes / Build branch: master
* Parameters / `env.TESTS`: `^replicagc-changed-peers/restart=true$`
* Parameters / `env.COUNT`: `<number of runs>`

/cc @cockroachdb/kv-triage


cockroach-teamcity commented 3 years ago

roachtest.replicagc-changed-peers/restart=true failed with artifacts on master @ 3928f1963833fbf51ae47bd2a42ae6a200ebbb14:

The test failed on branch=master, cloud=gce:
test artifacts and logs in: /home/agent/work/.go/src/github.com/cockroachdb/cockroach/artifacts/replicagc-changed-peers/restart=true/run_1
    replicagc.go:288,replicagc.go:102,replicagc.go:36,test_runner.go:777: ranges remained on n3 (according to meta2): map[8:{3,4,6} 15:{3,5,6} 19:{3,5,6} 29:{3,4,5} 35:{3,4,5} 41:{3,5,6} 46:{3,4,5} 48:{3,4,5} 67:{3,4,5} 75:{3,4,5} 79:{3,4,5} 81:{3,4,6} 84:{3,5,6} 87:{3,4,5} 88:{3,5,6} 89:{3,4,5} 92:{3,5,6} 100:{3,4,5} 101:{3,5,6} 106:{3,4,5} 115:{3,5,6} 118:{3,4,5} 121:{3,4,6} 126:{3,4,6} 127:{3,4,5} 134:{3,4,6} 135:{3,4,5} 153:{3,4,6}]
        (1) attached stack trace
          -- stack trace:
          | github.com/cockroachdb/cockroach/pkg/cmd/roachtest/tests.waitForZeroReplicasOnN3.func1
          |     /home/agent/work/.go/src/github.com/cockroachdb/cockroach/pkg/cmd/roachtest/tests/replicagc.go:286
          | github.com/cockroachdb/cockroach/pkg/util/retry.ForDuration
          |     /home/agent/work/.go/src/github.com/cockroachdb/cockroach/pkg/util/retry/retry.go:197
          | github.com/cockroachdb/cockroach/pkg/cmd/roachtest/tests.waitForZeroReplicasOnN3
          |     /home/agent/work/.go/src/github.com/cockroachdb/cockroach/pkg/cmd/roachtest/tests/replicagc.go:265
          | github.com/cockroachdb/cockroach/pkg/cmd/roachtest/tests.runReplicaGCChangedPeers
          |     /home/agent/work/.go/src/github.com/cockroachdb/cockroach/pkg/cmd/roachtest/tests/replicagc.go:102
          | github.com/cockroachdb/cockroach/pkg/cmd/roachtest/tests.registerReplicaGC.func1
          |     /home/agent/work/.go/src/github.com/cockroachdb/cockroach/pkg/cmd/roachtest/tests/replicagc.go:36
          | main.(*testRunner).runTest.func2
          |     /home/agent/work/.go/src/github.com/cockroachdb/cockroach/pkg/cmd/roachtest/test_runner.go:777
          | runtime.goexit
          |     /usr/local/go/src/runtime/asm_amd64.s:1371
        Wraps: (2) ranges remained on n3 (according to meta2): map[8:{3,4,6} 15:{3,5,6} 19:{3,5,6} 29:{3,4,5} 35:{3,4,5} 41:{3,5,6} 46:{3,4,5} 48:{3,4,5} 67:{3,4,5} 75:{3,4,5} 79:{3,4,5} 81:{3,4,6} 84:{3,5,6} 87:{3,4,5} 88:{3,5,6} 89:{3,4,5} 92:{3,5,6} 100:{3,4,5} 101:{3,5,6} 106:{3,4,5} 115:{3,5,6} 118:{3,4,5} 121:{3,4,6} 126:{3,4,6} 127:{3,4,5} 134:{3,4,6} 135:{3,4,5} 153:{3,4,6}]
        Error types: (1) *withstack.withStack (2) *errutil.leafError

    cluster.go:1245,context.go:89,cluster.go:1233,test_runner.go:867: dead node detection: /home/agent/work/.go/src/github.com/cockroachdb/cockroach/bin/roachprod monitor teamcity-3296712-1628835955-63-n6cpu4 --oneshot --ignore-empty-nodes: exit status 1 3: dead (exit status 137)
        5: 11804
        1: 11858
        4: 11889
        6: 11979
        2: 12923
        Error: UNCLASSIFIED_PROBLEM: 3: dead (exit status 137)
        (1) UNCLASSIFIED_PROBLEM
        Wraps: (2) attached stack trace
          -- stack trace:
          | main.glob..func14
          |     /home/agent/work/.go/src/github.com/cockroachdb/cockroach/pkg/cmd/roachprod/main.go:1173
          | main.wrap.func1
          |     /home/agent/work/.go/src/github.com/cockroachdb/cockroach/pkg/cmd/roachprod/main.go:281
          | github.com/spf13/cobra.(*Command).execute
          |     /home/agent/work/.go/src/github.com/cockroachdb/cockroach/vendor/github.com/spf13/cobra/command.go:856
          | github.com/spf13/cobra.(*Command).ExecuteC
          |     /home/agent/work/.go/src/github.com/cockroachdb/cockroach/vendor/github.com/spf13/cobra/command.go:960
          | github.com/spf13/cobra.(*Command).Execute
          |     /home/agent/work/.go/src/github.com/cockroachdb/cockroach/vendor/github.com/spf13/cobra/command.go:897
          | main.main
          |     /home/agent/work/.go/src/github.com/cockroachdb/cockroach/pkg/cmd/roachprod/main.go:2107
          | runtime.main
          |     /usr/local/go/src/runtime/proc.go:225
          | runtime.goexit
          |     /usr/local/go/src/runtime/asm_amd64.s:1371
        Wraps: (3) 3: dead (exit status 137)
        Error types: (1) errors.Unclassified (2) *withstack.withStack (3) *errutil.leafError
Reproduce

See: [roachtest README](https://github.com/cockroachdb/cockroach/tree/master/pkg/cmd/roachtest)

See: [CI job to stress roachtests](https://teamcity.cockroachdb.com/buildConfiguration/Cockroach_Nightlies_RoachtestStress)

For the CI stress job, click the ellipsis (...) next to the Run button and fill in:

* Changes / Build branch: master
* Parameters / `env.TESTS`: `^replicagc-changed-peers/restart=true$`
* Parameters / `env.COUNT`: `<number of runs>`

/cc @cockroachdb/kv-triage


cockroach-teamcity commented 3 years ago

roachtest.replicagc-changed-peers/restart=true failed with artifacts on master @ 0880e83e30ee5eb9aab7bb2297324e098d028225:

The test failed on branch=master, cloud=gce:
test artifacts and logs in: /home/agent/work/.go/src/github.com/cockroachdb/cockroach/artifacts/replicagc-changed-peers/restart=true/run_1
    replicagc.go:288,replicagc.go:102,replicagc.go:36,test_runner.go:777: ranges remained on n3 (according to meta2): map[7:{3,5,6} 9:{3,4,5} 11:{3,4,5} 16:{3,4,5} 25:{3,4,6} 27:{3,5,6} 37:{3,5,6} 48:{3,5,6} 52:{3,4,5} 56:{3,5,6} 57:{3,4,6} 60:{3,5,6} 62:{3,5,6} 63:{3,5,6} 86:{3,5,6} 98:{3,4,5} 103:{3,4,6} 108:{3,4,5} 127:{3,4,5} 133:{3,4,6} 157:{3,4,5}]
        (1) attached stack trace
          -- stack trace:
          | github.com/cockroachdb/cockroach/pkg/cmd/roachtest/tests.waitForZeroReplicasOnN3.func1
          |     /home/agent/work/.go/src/github.com/cockroachdb/cockroach/pkg/cmd/roachtest/tests/replicagc.go:286
          | github.com/cockroachdb/cockroach/pkg/util/retry.ForDuration
          |     /home/agent/work/.go/src/github.com/cockroachdb/cockroach/pkg/util/retry/retry.go:197
          | github.com/cockroachdb/cockroach/pkg/cmd/roachtest/tests.waitForZeroReplicasOnN3
          |     /home/agent/work/.go/src/github.com/cockroachdb/cockroach/pkg/cmd/roachtest/tests/replicagc.go:265
          | github.com/cockroachdb/cockroach/pkg/cmd/roachtest/tests.runReplicaGCChangedPeers
          |     /home/agent/work/.go/src/github.com/cockroachdb/cockroach/pkg/cmd/roachtest/tests/replicagc.go:102
          | github.com/cockroachdb/cockroach/pkg/cmd/roachtest/tests.registerReplicaGC.func1
          |     /home/agent/work/.go/src/github.com/cockroachdb/cockroach/pkg/cmd/roachtest/tests/replicagc.go:36
          | main.(*testRunner).runTest.func2
          |     /home/agent/work/.go/src/github.com/cockroachdb/cockroach/pkg/cmd/roachtest/test_runner.go:777
          | runtime.goexit
          |     /usr/local/go/src/runtime/asm_amd64.s:1371
        Wraps: (2) ranges remained on n3 (according to meta2): map[7:{3,5,6} 9:{3,4,5} 11:{3,4,5} 16:{3,4,5} 25:{3,4,6} 27:{3,5,6} 37:{3,5,6} 48:{3,5,6} 52:{3,4,5} 56:{3,5,6} 57:{3,4,6} 60:{3,5,6} 62:{3,5,6} 63:{3,5,6} 86:{3,5,6} 98:{3,4,5} 103:{3,4,6} 108:{3,4,5} 127:{3,4,5} 133:{3,4,6} 157:{3,4,5}]
        Error types: (1) *withstack.withStack (2) *errutil.leafError

    cluster.go:1245,context.go:89,cluster.go:1233,test_runner.go:867: dead node detection: /home/agent/work/.go/src/github.com/cockroachdb/cockroach/bin/roachprod monitor teamcity-3301517-1628922680-62-n6cpu4 --oneshot --ignore-empty-nodes: exit status 1 6: 11966
        1: 11701
        3: dead (exit status 137)
        5: 12309
        2: 11957
        4: 11392
        Error: UNCLASSIFIED_PROBLEM: 3: dead (exit status 137)
        (1) UNCLASSIFIED_PROBLEM
        Wraps: (2) attached stack trace
          -- stack trace:
          | main.glob..func14
          |     /home/agent/work/.go/src/github.com/cockroachdb/cockroach/pkg/cmd/roachprod/main.go:1173
          | main.wrap.func1
          |     /home/agent/work/.go/src/github.com/cockroachdb/cockroach/pkg/cmd/roachprod/main.go:281
          | github.com/spf13/cobra.(*Command).execute
          |     /home/agent/work/.go/src/github.com/cockroachdb/cockroach/vendor/github.com/spf13/cobra/command.go:856
          | github.com/spf13/cobra.(*Command).ExecuteC
          |     /home/agent/work/.go/src/github.com/cockroachdb/cockroach/vendor/github.com/spf13/cobra/command.go:960
          | github.com/spf13/cobra.(*Command).Execute
          |     /home/agent/work/.go/src/github.com/cockroachdb/cockroach/vendor/github.com/spf13/cobra/command.go:897
          | main.main
          |     /home/agent/work/.go/src/github.com/cockroachdb/cockroach/pkg/cmd/roachprod/main.go:2107
          | runtime.main
          |     /usr/local/go/src/runtime/proc.go:225
          | runtime.goexit
          |     /usr/local/go/src/runtime/asm_amd64.s:1371
        Wraps: (3) 3: dead (exit status 137)
        Error types: (1) errors.Unclassified (2) *withstack.withStack (3) *errutil.leafError
Reproduce

See: [roachtest README](https://github.com/cockroachdb/cockroach/tree/master/pkg/cmd/roachtest)

See: [CI job to stress roachtests](https://teamcity.cockroachdb.com/buildConfiguration/Cockroach_Nightlies_RoachtestStress)

For the CI stress job, click the ellipsis (...) next to the Run button and fill in:

* Changes / Build branch: master
* Parameters / `env.TESTS`: `^replicagc-changed-peers/restart=true$`
* Parameters / `env.COUNT`: `<number of runs>`

/cc @cockroachdb/kv-triage


cockroach-teamcity commented 3 years ago

roachtest.replicagc-changed-peers/restart=true failed with artifacts on master @ ee3efd6b1e24a3e1676778f5028fa0a35266f683:

The test failed on branch=master, cloud=gce:
test artifacts and logs in: /home/agent/work/.go/src/github.com/cockroachdb/cockroach/artifacts/replicagc-changed-peers/restart=true/run_1
    replicagc.go:288,replicagc.go:102,replicagc.go:36,test_runner.go:777: ranges remained on n3 (according to meta2): map[2:{3,4,5} 3:{3,4,6} 6:{3,5,6} 12:{3,4,6} 13:{3,4,6} 15:{3,4,5} 16:{3,4,5} 30:{3,4,6} 37:{3,4,6} 43:{3,5,6} 55:{3,4,5} 57:{3,4,6} 59:{3,4,6} 69:{3,4,5} 75:{3,5,6} 85:{3,4,6} 95:{3,4,6} 107:{3,4,5} 111:{3,5,6} 116:{3,4,5} 122:{3,5,6} 142:{3,4,5}]
        (1) attached stack trace
          -- stack trace:
          | github.com/cockroachdb/cockroach/pkg/cmd/roachtest/tests.waitForZeroReplicasOnN3.func1
          |     /home/agent/work/.go/src/github.com/cockroachdb/cockroach/pkg/cmd/roachtest/tests/replicagc.go:286
          | github.com/cockroachdb/cockroach/pkg/util/retry.ForDuration
          |     /home/agent/work/.go/src/github.com/cockroachdb/cockroach/pkg/util/retry/retry.go:197
          | github.com/cockroachdb/cockroach/pkg/cmd/roachtest/tests.waitForZeroReplicasOnN3
          |     /home/agent/work/.go/src/github.com/cockroachdb/cockroach/pkg/cmd/roachtest/tests/replicagc.go:265
          | github.com/cockroachdb/cockroach/pkg/cmd/roachtest/tests.runReplicaGCChangedPeers
          |     /home/agent/work/.go/src/github.com/cockroachdb/cockroach/pkg/cmd/roachtest/tests/replicagc.go:102
          | github.com/cockroachdb/cockroach/pkg/cmd/roachtest/tests.registerReplicaGC.func1
          |     /home/agent/work/.go/src/github.com/cockroachdb/cockroach/pkg/cmd/roachtest/tests/replicagc.go:36
          | main.(*testRunner).runTest.func2
          |     /home/agent/work/.go/src/github.com/cockroachdb/cockroach/pkg/cmd/roachtest/test_runner.go:777
          | runtime.goexit
          |     /usr/local/go/src/runtime/asm_amd64.s:1371
        Wraps: (2) ranges remained on n3 (according to meta2): map[2:{3,4,5} 3:{3,4,6} 6:{3,5,6} 12:{3,4,6} 13:{3,4,6} 15:{3,4,5} 16:{3,4,5} 30:{3,4,6} 37:{3,4,6} 43:{3,5,6} 55:{3,4,5} 57:{3,4,6} 59:{3,4,6} 69:{3,4,5} 75:{3,5,6} 85:{3,4,6} 95:{3,4,6} 107:{3,4,5} 111:{3,5,6} 116:{3,4,5} 122:{3,5,6} 142:{3,4,5}]
        Error types: (1) *withstack.withStack (2) *errutil.leafError

    cluster.go:1245,context.go:89,cluster.go:1233,test_runner.go:867: dead node detection: /home/agent/work/.go/src/github.com/cockroachdb/cockroach/bin/roachprod monitor teamcity-3303319-1629008692-61-n6cpu4 --oneshot --ignore-empty-nodes: exit status 1 6: 11777
        3: dead (exit status 137)
        5: 12125
        1: 12403
        2: 12129
        4: 12227
        Error: UNCLASSIFIED_PROBLEM: 3: dead (exit status 137)
        (1) UNCLASSIFIED_PROBLEM
        Wraps: (2) attached stack trace
          -- stack trace:
          | main.glob..func14
          |     /home/agent/work/.go/src/github.com/cockroachdb/cockroach/pkg/cmd/roachprod/main.go:1173
          | main.wrap.func1
          |     /home/agent/work/.go/src/github.com/cockroachdb/cockroach/pkg/cmd/roachprod/main.go:281
          | github.com/spf13/cobra.(*Command).execute
          |     /home/agent/work/.go/src/github.com/cockroachdb/cockroach/vendor/github.com/spf13/cobra/command.go:856
          | github.com/spf13/cobra.(*Command).ExecuteC
          |     /home/agent/work/.go/src/github.com/cockroachdb/cockroach/vendor/github.com/spf13/cobra/command.go:960
          | github.com/spf13/cobra.(*Command).Execute
          |     /home/agent/work/.go/src/github.com/cockroachdb/cockroach/vendor/github.com/spf13/cobra/command.go:897
          | main.main
          |     /home/agent/work/.go/src/github.com/cockroachdb/cockroach/pkg/cmd/roachprod/main.go:2107
          | runtime.main
          |     /usr/local/go/src/runtime/proc.go:225
          | runtime.goexit
          |     /usr/local/go/src/runtime/asm_amd64.s:1371
        Wraps: (3) 3: dead (exit status 137)
        Error types: (1) errors.Unclassified (2) *withstack.withStack (3) *errutil.leafError
Reproduce

See: [roachtest README](https://github.com/cockroachdb/cockroach/tree/master/pkg/cmd/roachtest)

See: [CI job to stress roachtests](https://teamcity.cockroachdb.com/buildConfiguration/Cockroach_Nightlies_RoachtestStress)

For the CI stress job, click the ellipsis (...) next to the Run button and fill in:
* Changes / Build branch: master
* Parameters / `env.TESTS`: `^replicagc-changed-peers/restart=true$`
* Parameters / `env.COUNT`: <number of runs>

/cc @cockroachdb/kv-triage

cockroach-teamcity commented 3 years ago

roachtest.replicagc-changed-peers/restart=true failed with artifacts on master @ 04a41e7915f4a89dcc1d0dbd92466c6adf79ec9f:

The test failed on branch=master, cloud=gce:
test artifacts and logs in: /home/agent/work/.go/src/github.com/cockroachdb/cockroach/artifacts/replicagc-changed-peers/restart=true/run_1
    replicagc.go:288,replicagc.go:102,replicagc.go:36,test_runner.go:777: ranges remained on n3 (according to meta2): map[3:{3,4,5} 13:{3,4,6} 15:{3,4,6} 19:{3,4,6} 21:{3,4,6} 23:{3,5,6} 29:{3,5,6} 30:{3,5,6} 39:{3,5,6} 40:{3,4,5} 41:{3,5,6} 47:{3,5,6} 52:{3,5,6} 55:{3,4,5} 60:{3,4,6} 63:{3,4,6} 67:{3,5,6} 68:{3,4,6} 73:{3,4,6} 79:{3,4,5} 81:{3,4,6} 83:{3,4,6} 84:{3,4,6} 88:{3,4,5} 95:{3,5,6} 97:{3,4,6} 110:{3,5,6} 112:{3,4,5} 116:{3,4,5} 117:{3,5,6} 118:{3,5,6} 119:{3,4,5} 124:{3,4,5} 127:{3,4,6} 133:{3,4,6}]
        (1) attached stack trace
          -- stack trace:
          | github.com/cockroachdb/cockroach/pkg/cmd/roachtest/tests.waitForZeroReplicasOnN3.func1
          |     /home/agent/work/.go/src/github.com/cockroachdb/cockroach/pkg/cmd/roachtest/tests/replicagc.go:286
          | github.com/cockroachdb/cockroach/pkg/util/retry.ForDuration
          |     /home/agent/work/.go/src/github.com/cockroachdb/cockroach/pkg/util/retry/retry.go:197
          | github.com/cockroachdb/cockroach/pkg/cmd/roachtest/tests.waitForZeroReplicasOnN3
          |     /home/agent/work/.go/src/github.com/cockroachdb/cockroach/pkg/cmd/roachtest/tests/replicagc.go:265
          | github.com/cockroachdb/cockroach/pkg/cmd/roachtest/tests.runReplicaGCChangedPeers
          |     /home/agent/work/.go/src/github.com/cockroachdb/cockroach/pkg/cmd/roachtest/tests/replicagc.go:102
          | github.com/cockroachdb/cockroach/pkg/cmd/roachtest/tests.registerReplicaGC.func1
          |     /home/agent/work/.go/src/github.com/cockroachdb/cockroach/pkg/cmd/roachtest/tests/replicagc.go:36
          | main.(*testRunner).runTest.func2
          |     /home/agent/work/.go/src/github.com/cockroachdb/cockroach/pkg/cmd/roachtest/test_runner.go:777
          | runtime.goexit
          |     /usr/local/go/src/runtime/asm_amd64.s:1371
        Wraps: (2) ranges remained on n3 (according to meta2): map[3:{3,4,5} 13:{3,4,6} 15:{3,4,6} 19:{3,4,6} 21:{3,4,6} 23:{3,5,6} 29:{3,5,6} 30:{3,5,6} 39:{3,5,6} 40:{3,4,5} 41:{3,5,6} 47:{3,5,6} 52:{3,5,6} 55:{3,4,5} 60:{3,4,6} 63:{3,4,6} 67:{3,5,6} 68:{3,4,6} 73:{3,4,6} 79:{3,4,5} 81:{3,4,6} 83:{3,4,6} 84:{3,4,6} 88:{3,4,5} 95:{3,5,6} 97:{3,4,6} 110:{3,5,6} 112:{3,4,5} 116:{3,4,5} 117:{3,5,6} 118:{3,5,6} 119:{3,4,5} 124:{3,4,5} 127:{3,4,6} 133:{3,4,6}]
        Error types: (1) *withstack.withStack (2) *errutil.leafError

    cluster.go:1245,context.go:89,cluster.go:1233,test_runner.go:867: dead node detection: /home/agent/work/.go/src/github.com/cockroachdb/cockroach/bin/roachprod monitor teamcity-3305136-1629094127-63-n6cpu4 --oneshot --ignore-empty-nodes: exit status 1 2: 11679
        5: 11828
        3: dead (exit status 137)
        1: 12262
        6: 11734
        4: 11941
        Error: UNCLASSIFIED_PROBLEM: 3: dead (exit status 137)
        (1) UNCLASSIFIED_PROBLEM
        Wraps: (2) attached stack trace
          -- stack trace:
          | main.glob..func14
          |     /home/agent/work/.go/src/github.com/cockroachdb/cockroach/pkg/cmd/roachprod/main.go:1173
          | main.wrap.func1
          |     /home/agent/work/.go/src/github.com/cockroachdb/cockroach/pkg/cmd/roachprod/main.go:281
          | github.com/spf13/cobra.(*Command).execute
          |     /home/agent/work/.go/src/github.com/cockroachdb/cockroach/vendor/github.com/spf13/cobra/command.go:856
          | github.com/spf13/cobra.(*Command).ExecuteC
          |     /home/agent/work/.go/src/github.com/cockroachdb/cockroach/vendor/github.com/spf13/cobra/command.go:960
          | github.com/spf13/cobra.(*Command).Execute
          |     /home/agent/work/.go/src/github.com/cockroachdb/cockroach/vendor/github.com/spf13/cobra/command.go:897
          | main.main
          |     /home/agent/work/.go/src/github.com/cockroachdb/cockroach/pkg/cmd/roachprod/main.go:2107
          | runtime.main
          |     /usr/local/go/src/runtime/proc.go:225
          | runtime.goexit
          |     /usr/local/go/src/runtime/asm_amd64.s:1371
        Wraps: (3) 3: dead (exit status 137)
        Error types: (1) errors.Unclassified (2) *withstack.withStack (3) *errutil.leafError
Reproduce

See: [roachtest README](https://github.com/cockroachdb/cockroach/tree/master/pkg/cmd/roachtest)

See: [CI job to stress roachtests](https://teamcity.cockroachdb.com/buildConfiguration/Cockroach_Nightlies_RoachtestStress)

For the CI stress job, click the ellipsis (...) next to the Run button and fill in:
* Changes / Build branch: master
* Parameters / `env.TESTS`: `^replicagc-changed-peers/restart=true$`
* Parameters / `env.COUNT`: <number of runs>

/cc @cockroachdb/kv-triage


cockroach-teamcity commented 3 years ago

roachtest.replicagc-changed-peers/restart=true failed with artifacts on master @ dd82053908203cf6d77c36c06a8280831bb93d57:

The test failed on branch=master, cloud=gce:
test artifacts and logs in: /home/agent/work/.go/src/github.com/cockroachdb/cockroach/artifacts/replicagc-changed-peers/restart=true/run_1
    replicagc.go:288,replicagc.go:102,replicagc.go:36,test_runner.go:777: ranges remained on n3 (according to meta2): map[31:{3,5,6} 39:{3,5,6} 47:{3,5,6} 67:{3,4,6} 88:{3,5,6} 97:{3,5,6} 102:{3,4,6} 128:{3,4,6} 144:{3,5,6}]
        (1) attached stack trace
          -- stack trace:
          | github.com/cockroachdb/cockroach/pkg/cmd/roachtest/tests.waitForZeroReplicasOnN3.func1
          |     /home/agent/work/.go/src/github.com/cockroachdb/cockroach/pkg/cmd/roachtest/tests/replicagc.go:286
          | github.com/cockroachdb/cockroach/pkg/util/retry.ForDuration
          |     /home/agent/work/.go/src/github.com/cockroachdb/cockroach/pkg/util/retry/retry.go:197
          | github.com/cockroachdb/cockroach/pkg/cmd/roachtest/tests.waitForZeroReplicasOnN3
          |     /home/agent/work/.go/src/github.com/cockroachdb/cockroach/pkg/cmd/roachtest/tests/replicagc.go:265
          | github.com/cockroachdb/cockroach/pkg/cmd/roachtest/tests.runReplicaGCChangedPeers
          |     /home/agent/work/.go/src/github.com/cockroachdb/cockroach/pkg/cmd/roachtest/tests/replicagc.go:102
          | github.com/cockroachdb/cockroach/pkg/cmd/roachtest/tests.registerReplicaGC.func1
          |     /home/agent/work/.go/src/github.com/cockroachdb/cockroach/pkg/cmd/roachtest/tests/replicagc.go:36
          | main.(*testRunner).runTest.func2
          |     /home/agent/work/.go/src/github.com/cockroachdb/cockroach/pkg/cmd/roachtest/test_runner.go:777
          | runtime.goexit
          |     /usr/local/go/src/runtime/asm_amd64.s:1371
        Wraps: (2) ranges remained on n3 (according to meta2): map[31:{3,5,6} 39:{3,5,6} 47:{3,5,6} 67:{3,4,6} 88:{3,5,6} 97:{3,5,6} 102:{3,4,6} 128:{3,4,6} 144:{3,5,6}]
        Error types: (1) *withstack.withStack (2) *errutil.leafError

    cluster.go:1245,context.go:89,cluster.go:1233,test_runner.go:866: dead node detection: /home/agent/work/.go/src/github.com/cockroachdb/cockroach/bin/roachprod monitor teamcity-3311212-1629181178-65-n6cpu4 --oneshot --ignore-empty-nodes: exit status 1 4: 11410
        6: 12151
        3: dead (exit status 137)
        2: 11949
        1: 11990
        5: 12220
        Error: UNCLASSIFIED_PROBLEM: 3: dead (exit status 137)
        (1) UNCLASSIFIED_PROBLEM
        Wraps: (2) attached stack trace
          -- stack trace:
          | main.glob..func14
          |     /home/agent/work/.go/src/github.com/cockroachdb/cockroach/pkg/cmd/roachprod/main.go:1173
          | main.wrap.func1
          |     /home/agent/work/.go/src/github.com/cockroachdb/cockroach/pkg/cmd/roachprod/main.go:281
          | github.com/spf13/cobra.(*Command).execute
          |     /home/agent/work/.go/src/github.com/cockroachdb/cockroach/vendor/github.com/spf13/cobra/command.go:856
          | github.com/spf13/cobra.(*Command).ExecuteC
          |     /home/agent/work/.go/src/github.com/cockroachdb/cockroach/vendor/github.com/spf13/cobra/command.go:960
          | github.com/spf13/cobra.(*Command).Execute
          |     /home/agent/work/.go/src/github.com/cockroachdb/cockroach/vendor/github.com/spf13/cobra/command.go:897
          | main.main
          |     /home/agent/work/.go/src/github.com/cockroachdb/cockroach/pkg/cmd/roachprod/main.go:2107
          | runtime.main
          |     /usr/local/go/src/runtime/proc.go:225
          | runtime.goexit
          |     /usr/local/go/src/runtime/asm_amd64.s:1371
        Wraps: (3) 3: dead (exit status 137)
        Error types: (1) errors.Unclassified (2) *withstack.withStack (3) *errutil.leafError
Reproduce

See: [roachtest README](https://github.com/cockroachdb/cockroach/tree/master/pkg/cmd/roachtest)

See: [CI job to stress roachtests](https://teamcity.cockroachdb.com/buildConfiguration/Cockroach_Nightlies_RoachtestStress)

For the CI stress job, click the ellipsis (...) next to the Run button and fill in:
* Changes / Build branch: master
* Parameters / `env.TESTS`: `^replicagc-changed-peers/restart=true$`
* Parameters / `env.COUNT`: <number of runs>

/cc @cockroachdb/kv-triage


tbg commented 3 years ago

Going to run a roachstress build (https://teamcity.cockroachdb.com/viewQueued.html?itemId=3313313&tab=queuedBuildOverviewTab) on https://github.com/cockroachdb/cockroach/pull/67916 itself. That landed 22 days ago, but the failures here and in the restart=false sibling only started 18 days ago (plus I tested the original commit and did not see this issue). Perhaps roachtest didn't run, perhaps something else broke this test. The failure in this test is that replicas don't move off n3 within five minutes, which is unexpected to me; n3 is decommissioning and the node is not running, but there are five other nodes around. The replicas that get stuck on n3 are all vanilla table ranges.

tbg commented 3 years ago

That didn't work because the roachstress script the job uses wasn't checked in yet. I'll try with the PR that introduces it instead: https://teamcity.cockroachdb.com/viewLog.html?buildId=3313386&buildTypeId=Cockroach_Nightlies_RoachtestStress

tbg commented 3 years ago

That didn't work either:


```
stderr: ERROR: (gcloud.compute.instances.list) Some requests did not succeed:
 - Required 'compute.instances.list' permission for 'projects/andrei-jepsen'
Error: UNCLASSIFIED_PROBLEM: failed to run: gcloud compute instances list --project andrei-jepsen --format json: exit status 1
(1) UNCLASSIFIED_PROBLEM
```

I'll have to use cockroach-ephemeral I guess.

Will put this on ice for now, and use `roachstress.sh` when I get back to this.

erikgrinaker commented 3 years ago

Must've changed the service account permissions since I set the stress job up. I opened a dev-inf ticket to get a new project for these: https://cockroachlabs.atlassian.net/browse/DEVINF-140.

tbg commented 3 years ago

Long fixed on master (which is this branch)