cockroachdb / cockroach

CockroachDB — the cloud native, distributed SQL database designed for high availability, effortless scale, and control over data placement.
https://www.cockroachlabs.com

roachtest: backup/mvcc-range-tombstones failed #130233

Closed: cockroach-teamcity closed this issue 1 month ago

cockroach-teamcity commented 2 months ago

roachtest.backup/mvcc-range-tombstones failed with artifacts on master @ 950a090603bffa9216f01b03aeeb7dd093c6be64:

(assertions.go:363).Fail: 
    Error Trace:    github.com/cockroachdb/cockroach/pkg/cmd/roachtest/tests/backup.go:663
                                github.com/cockroachdb/cockroach/pkg/util/retry/retry.go:177
                                github.com/cockroachdb/cockroach/pkg/cmd/roachtest/tests/backup.go:660
                                github.com/cockroachdb/cockroach/pkg/cmd/roachtest/tests/backup.go:771
                                github.com/cockroachdb/cockroach/pkg/cmd/roachtest/tests/backup.go:579
                                main/pkg/cmd/roachtest/test_runner.go:1284
                                src/runtime/asm_amd64.s:1695
    Error:          Received unexpected error:
                    read tcp 172.17.0.3:56676->34.45.219.230:26257: read: connection reset by peer
    Test:           backup/mvcc-range-tombstones
(require.go:1357).NoError: FailNow called
test artifacts and logs in: /artifacts/backup/mvcc-range-tombstones/cpu_arch=arm64/run_1

/cc @cockroachdb/disaster-recovery

Jira issue: CRDB-41945

msbutler commented 2 months ago

Looks like node 1 oom'd during an import:

[image]

msbutler commented 2 months ago

Uh oh, I wonder if there's some sort of memory leak in the buffering adder. The max buffer size per import processor is 128 MB for the pk buffer and 32 MB for the secondary index buffer.

stevendanna commented 2 months ago

@msbutler I wonder if the ImportEpoch work may be a villain here.

msbutler commented 2 months ago

Oh, I'm a doofus; import is fine. This test sets the following, for unknown reasons:

 kv.bulk_ingest.max_index_buffer_size = '2gb'
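
For a sense of scale, here is a rough back-of-the-envelope sketch in Go comparing the buffer sizes quoted earlier in the thread with the override this test sets. It is not CockroachDB code: `worstCaseBufferBytes` and the processor count are hypothetical, MB and gb are treated as MiB and GiB for round numbers, and it assumes each import processor can fill one pk buffer and one secondary index buffer at the same time.

```go
package main

import "fmt"

// mib is one mebibyte. The thread says "MB" and "gb"; binary units are
// an assumption made here to keep the numbers round.
const mib = 1 << 20

// worstCaseBufferBytes returns an upper bound on buffered bytes when each
// of procs import processors holds one full pk buffer and one full
// secondary index buffer. Hypothetical helper for illustration only.
func worstCaseBufferBytes(procs int, pkBuf, indexBuf int64) int64 {
	return int64(procs) * (pkBuf + indexBuf)
}

func main() {
	const procs = 1 // assumption: a single import processor on the node

	// Sizes quoted in the thread: 128 MB pk buffer, 32 MB index buffer.
	quoted := worstCaseBufferBytes(procs, 128*mib, 32*mib)

	// Same pk buffer, with the index buffer raised to '2gb' by the test.
	override := worstCaseBufferBytes(procs, 128*mib, 2048*mib)

	fmt.Printf("quoted sizes:      %d MiB\n", quoted/mib)
	fmt.Printf("with 2gb override: %d MiB\n", override/mib)
}
```

Under these assumptions the bound per processor grows from 160 MiB to 2176 MiB, which would be consistent with a node running out of memory during the import even without a leak in the buffering adder.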