cockroachdb / cockroach

ccl/changefeedccl: TestNoStopAfterNonTargetAddColumnWithBackfill failed #129045

Open cockroach-teamcity opened 4 weeks ago

cockroach-teamcity commented 4 weeks ago

ccl/changefeedccl.TestNoStopAfterNonTargetAddColumnWithBackfill failed on master @ 575cdd4696dfcac8f311d1ea546683271102f73e:

test logs left over in: outputs.zip/logTestNoStopAfterNonTargetAddColumnWithBackfill2410879673
--- FAIL: TestNoStopAfterNonTargetAddColumnWithBackfill (15.18s)
=== RUN   TestNoStopAfterNonTargetAddColumnWithBackfill/sinkless
    helpers_test.go:872: making server as secondary tenant
    helpers_test.go:951: making sinkless feed factory
    helpers_test.go:1016: pgURL sinkless SinklessFeedUser
    helpers_test.go:1016: pgURL sinkless root
    testfeed_test.go:273: sinkless feed creating changefeed: CREATE CHANGEFEED FOR TABLE hasfams FAMILY b_and_c WITH schema_change_policy='stop'
    changefeed_test.go:1787: 
            Error Trace:    github.com/cockroachdb/cockroach/pkg/ccl/changefeedccl/helpers_test.go:200
                                        github.com/cockroachdb/cockroach/pkg/ccl/changefeedccl/helpers_test.go:275
                                        github.com/cockroachdb/cockroach/pkg/ccl/changefeedccl/changefeed_test.go:1787
                                        github.com/cockroachdb/cockroach/pkg/ccl/changefeedccl/helpers_test.go:1123
                                        github.com/cockroachdb/cockroach/pkg/ccl/changefeedccl/helpers_test.go:1158
            Error:          Received unexpected error:
                            expected message
                            (1) assertion failure
                            Wraps: (2) attached stack trace
                              -- stack trace:
                              | github.com/cockroachdb/cockroach/pkg/ccl/changefeedccl.readNextMessages
                              |     github.com/cockroachdb/cockroach/pkg/ccl/changefeedccl/helpers_test.go:120
                              | github.com/cockroachdb/cockroach/pkg/ccl/changefeedccl.assertPayloadsBaseErr
                              |     github.com/cockroachdb/cockroach/pkg/ccl/changefeedccl/helpers_test.go:211
                              | github.com/cockroachdb/cockroach/pkg/ccl/changefeedccl.assertPayloadsBase.func1
                              |     github.com/cockroachdb/cockroach/pkg/ccl/changefeedccl/helpers_test.go:203
                              | github.com/cockroachdb/cockroach/pkg/ccl/changefeedccl.withTimeout.func1
                              |     github.com/cockroachdb/cockroach/pkg/ccl/changefeedccl/helpers_test.go:268
                              | github.com/cockroachdb/cockroach/pkg/util/timeutil.RunWithTimeout
                              |     github.com/cockroachdb/cockroach/pkg/util/timeutil/timeout.go:33
                              | github.com/cockroachdb/cockroach/pkg/ccl/changefeedccl.withTimeout
                              |     github.com/cockroachdb/cockroach/pkg/ccl/changefeedccl/helpers_test.go:264
                              | github.com/cockroachdb/cockroach/pkg/ccl/changefeedccl.assertPayloadsBase
                              |     github.com/cockroachdb/cockroach/pkg/ccl/changefeedccl/helpers_test.go:201
                              | github.com/cockroachdb/cockroach/pkg/ccl/changefeedccl.assertPayloads
                              |     github.com/cockroachdb/cockroach/pkg/ccl/changefeedccl/helpers_test.go:275
                              | github.com/cockroachdb/cockroach/pkg/ccl/changefeedccl.TestNoStopAfterNonTargetAddColumnWithBackfill.func1
                              |     github.com/cockroachdb/cockroach/pkg/ccl/changefeedccl/changefeed_test.go:1787
                              | github.com/cockroachdb/cockroach/pkg/ccl/changefeedccl.cdcTestNamed.func1
                              |     github.com/cockroachdb/cockroach/pkg/ccl/changefeedccl/helpers_test.go:1123
                              | github.com/cockroachdb/cockroach/pkg/ccl/changefeedccl.cdcTestNamedWithSystem.func1
                              |     github.com/cockroachdb/cockroach/pkg/ccl/changefeedccl/helpers_test.go:1158
                              | testing.tRunner
                              |     GOROOT/src/testing/testing.go:1689
                              | runtime.goexit
                              |     src/runtime/asm_amd64.s:1695
                            Wraps: (3) expected message
                            Error types: (1) *assert.withAssertionFailure (2) *withstack.withStack (3) *errutil.leafError
            Test:           TestNoStopAfterNonTargetAddColumnWithBackfill/sinkless
    testfeed_test.go:280: closing sinkless feed
    --- FAIL: TestNoStopAfterNonTargetAddColumnWithBackfill/sinkless (15.17s)

Parameters:

See also: How To Investigate a Go Test Failure (internal)

/cc @cockroachdb/cdc

Jira issue: CRDB-41352

andyyang890 commented 4 weeks ago

Reviewing the logs, I see the test hit neither the core changefeed error logs in #127530 nor the timeout log in #127553. It looks like the test server just shut down unexpectedly, logging a "server shutting down: instructing cmux to stop accepting" message. Spot-checking a few similar past failures, I found they were all running with the secondary tenant:

I asked for help in #multi-tenant here: https://cockroachlabs.slack.com/archives/C02HWA24541/p1723839743273609

stevendanna commented 3 weeks ago

Looking at the logs from just this failure, it looks to me like the schema change stopped the feed despite our expectation that it wouldn't.

I240815 09:30:27.016476 14927345 ccl/changefeedccl/kvfeed/kv_feed.go:155 ⋮ [T10,Vcluster-10,nsql1,client=127.0.0.1:60082,hostssl,user=‹sinklessfeeduser›] 404  stopping kv feed due to schema change at 1723714222.413838743,1

andyyang890 commented 3 weeks ago

Thanks for taking a look.

My interpretation (which might be wrong) was that the changefeed was going to restart, but I guess we can't really tell from the error message, since the same error is returned for both restart and exit: https://github.com/cockroachdb/cockroach/blob/2f8519c1ae5020614ee1616c829e1d5b3702f942/pkg/ccl/changefeedccl/kvfeed/kv_feed.go#L405-L407

(Aside: we have an issue to improve observability for this: https://github.com/cockroachdb/cockroach/issues/124635.)

I think another piece of evidence that the changefeed might not have stopped is that I don't see the logs that were added in this PR: https://github.com/cockroachdb/cockroach/pull/127530