cockroachdb / cockroach

ccl/streamingccl/streamingest: TestDataDriven failed #107930

Status: Closed (closed by cockroach-teamcity 1 year ago)

cockroach-teamcity commented 1 year ago

ccl/streamingccl/streamingest.TestDataDriven failed with artifacts on release-23.1 @ f6c68f6626497c43f2e5bef6f7e189b8792cfefb:

      github.com/cockroachdb/cockroach/pkg/util/stop/stopper.go:461 +0x619
  github.com/cockroachdb/cockroach/pkg/util/stop.(*Stopper).RunAsyncTask()
      github.com/cockroachdb/cockroach/pkg/util/stop/stopper.go:332 +0xf4c
  github.com/cockroachdb/cockroach/pkg/server.(*channelOrchestrator).startControlledServer()
      github.com/cockroachdb/cockroach/pkg/server/server_controller_channel_orchestrator.go:292 +0x29
  github.com/cockroachdb/cockroach/pkg/server.(*serverController).createServerEntryLocked()
      github.com/cockroachdb/cockroach/pkg/server/server_controller_orchestration.go:173 +0x2b0
  github.com/cockroachdb/cockroach/pkg/server.(*serverController).scanTenantsForRunnableServices()
      github.com/cockroachdb/cockroach/pkg/server/server_controller_orchestration.go:134 +0x538
  github.com/cockroachdb/cockroach/pkg/server.(*serverController).start.func1()
      github.com/cockroachdb/cockroach/pkg/server/server_controller_orchestration.go:60 +0x21a
  github.com/cockroachdb/cockroach/pkg/util/stop.(*Stopper).RunAsyncTaskEx.func2()
      github.com/cockroachdb/cockroach/pkg/util/stop/stopper.go:470 +0x1f6

Goroutine 147620 (running) created at:
  github.com/cockroachdb/cockroach/pkg/util/stop.(*Stopper).RunAsyncTaskEx()
      github.com/cockroachdb/cockroach/pkg/util/stop/stopper.go:461 +0x619
  github.com/cockroachdb/cockroach/pkg/util/stop.(*Stopper).RunAsyncTask()
      github.com/cockroachdb/cockroach/pkg/util/stop/stopper.go:332 +0x404
  github.com/cockroachdb/cockroach/pkg/util/netutil.(*TCPServer).ServeWith()
      github.com/cockroachdb/cockroach/pkg/util/netutil/net.go:185 +0x36
  github.com/cockroachdb/cockroach/pkg/server.startServeSQL.func1()
      github.com/cockroachdb/cockroach/pkg/server/server_sql.go:1756 +0x17b
  github.com/cockroachdb/cockroach/pkg/util/stop.(*Stopper).RunAsyncTaskEx.func2()
      github.com/cockroachdb/cockroach/pkg/util/stop/stopper.go:470 +0x1f6
==================
=== RUN   TestDataDriven/alter_tenant
    datadriven_test.go:102: 
        /home/roach/.cache/bazel/_bazel_roach/c5a4e7d36696d9cd970af2045211a7df/sandbox/processwrapper-sandbox/4301/execroot/com_github_cockroachdb_cockroach/bazel-out/k8-fastbuild/bin/pkg/ccl/streamingccl/streamingest/streamingest_test_/streamingest_test.runfiles/com_github_cockroachdb_cockroach/pkg/ccl/streamingccl/streamingest/testdata/alter_tenant:1:
        create-replication-clusters [0 args]
        <no input to command>
        ----
    datadriven_test.go:102: 
        /home/roach/.cache/bazel/_bazel_roach/c5a4e7d36696d9cd970af2045211a7df/sandbox/processwrapper-sandbox/4301/execroot/com_github_cockroachdb_cockroach/bazel-out/k8-fastbuild/bin/pkg/ccl/streamingccl/streamingest/streamingest_test_/streamingest_test.runfiles/com_github_cockroachdb_cockroach/pkg/ccl/streamingccl/streamingest/testdata/alter_tenant:4:
        start-replication-stream [0 args]
        <no input to command>
        ----
    datadriven_test.go:102: 
        /home/roach/.cache/bazel/_bazel_roach/c5a4e7d36696d9cd970af2045211a7df/sandbox/processwrapper-sandbox/4301/execroot/com_github_cockroachdb_cockroach/bazel-out/k8-fastbuild/bin/pkg/ccl/streamingccl/streamingest/streamingest_test_/streamingest_test.runfiles/com_github_cockroachdb_cockroach/pkg/ccl/streamingccl/streamingest/testdata/alter_tenant:7:
        exec-sql [1 args]
        ALTER TENANT "destination" SET REPLICATION RETENTION = '42s'
        ----
    datadriven_test.go:102: 
        /home/roach/.cache/bazel/_bazel_roach/c5a4e7d36696d9cd970af2045211a7df/sandbox/processwrapper-sandbox/4301/execroot/com_github_cockroachdb_cockroach/bazel-out/k8-fastbuild/bin/pkg/ccl/streamingccl/streamingest/streamingest_test_/streamingest_test.runfiles/com_github_cockroachdb_cockroach/pkg/ccl/streamingccl/streamingest/testdata/alter_tenant:11:
        query-sql [1 args]
        SELECT crdb_internal.pb_to_json('payload', payload)->'streamIngestion'->'replicationTtlSeconds' as retention_ttl_seconds
        FROM crdb_internal.system_jobs
        WHERE id = (SELECT replication_job_id FROM [SHOW TENANT "destination" WITH REPLICATION STATUS])
        ----
        42

Parameters: TAGS=bazel,gss,race

Help

See also: [How To Investigate a Go Test Failure \(internal\)](https://cockroachlabs.atlassian.net/l/c/HgfXfJgM)

/cc @cockroachdb/disaster-recovery

Jira issue: CRDB-30261

stevendanna commented 1 year ago

@lidorcarmel I'm throwing this your way since you've thought about this a bit. Happy to look into it with you though.

(cc @knz for visibility)

lidorcarmel commented 1 year ago

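See the stack below: startControlledServer writes serverStateUsingChannels.server in one goroutine while getServer reads it from the SQL serving goroutine. I'm thinking of just putting a mutex around serverStateUsingChannels.server. I'll send a PR tomorrow.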

    ==================
07:39:31     WARNING: DATA RACE
07:39:31     Write at 0x00c003a18828 by goroutine 147262:
07:39:31       github.com/cockroachdb/cockroach/pkg/server.(*channelOrchestrator).startControlledServer.func5()
07:39:31           github.com/cockroachdb/cockroach/pkg/server/server_controller_channel_orchestrator.go:416 +0xa64
07:39:31       github.com/cockroachdb/cockroach/pkg/util/stop.(*Stopper).RunAsyncTaskEx.func2()
07:39:31           github.com/cockroachdb/cockroach/pkg/util/stop/stopper.go:470 +0x1f6
07:39:31     
07:39:31     Previous read at 0x00c003a18828 by goroutine 147620:
07:39:31       github.com/cockroachdb/cockroach/pkg/server.(*serverStateUsingChannels).getServer()
07:39:31           github.com/cockroachdb/cockroach/pkg/server/server_controller_channel_orchestrator.go:109 +0x4e
07:39:31       github.com/cockroachdb/cockroach/pkg/server.(*serverController).getServer()
07:39:31           github.com/cockroachdb/cockroach/pkg/server/server_controller_accessors.go:28 +0x138
07:39:31       github.com/cockroachdb/cockroach/pkg/server.(*serverController).sqlMux()
07:39:31           github.com/cockroachdb/cockroach/pkg/server/server_controller_sql.go:67 +0x2ac
07:39:31       github.com/cockroachdb/cockroach/pkg/server.(*serverController).sqlMux-fm()
07:39:31           <autogenerated>:1 +0xc7
07:39:31       github.com/cockroachdb/cockroach/pkg/server.startServeSQL.func1.1()
07:39:31           github.com/cockroachdb/cockroach/pkg/server/server_sql.go:1766 +0x2eb
07:39:31       github.com/cockroachdb/cockroach/pkg/util/netutil.(*TCPServer).ServeWith.func1()
07:39:31           github.com/cockroachdb/cockroach/pkg/util/netutil/net.go:188 +0x111
07:39:31       github.com/cockroachdb/cockroach/pkg/util/stop.(*Stopper).RunAsyncTaskEx.func2()
07:39:31           github.com/cockroachdb/cockroach/pkg/util/stop/stopper.go:470 +0x1f6
07:39:31     
07:39:31     Goroutine 147262 (running) created at:
07:39:31       github.com/cockroachdb/cockroach/pkg/util/stop.(*Stopper).RunAsyncTaskEx()
07:39:31           github.com/cockroachdb/cockroach/pkg/util/stop/stopper.go:461 +0x619
07:39:31       github.com/cockroachdb/cockroach/pkg/util/stop.(*Stopper).RunAsyncTask()
07:39:31           github.com/cockroachdb/cockroach/pkg/util/stop/stopper.go:332 +0xf4c
07:39:31       github.com/cockroachdb/cockroach/pkg/server.(*channelOrchestrator).startControlledServer()
07:39:31           github.com/cockroachdb/cockroach/pkg/server/server_controller_channel_orchestrator.go:292 +0x29
07:39:31       github.com/cockroachdb/cockroach/pkg/server.(*serverController).createServerEntryLocked()
07:39:31           github.com/cockroachdb/cockroach/pkg/server/server_controller_orchestration.go:173 +0x2b0
07:39:31       github.com/cockroachdb/cockroach/pkg/server.(*serverController).scanTenantsForRunnableServices()
07:39:31           github.com/cockroachdb/cockroach/pkg/server/server_controller_orchestration.go:134 +0x538
07:39:31       github.com/cockroachdb/cockroach/pkg/server.(*serverController).start.func1()
07:39:31           github.com/cockroachdb/cockroach/pkg/server/server_controller_orchestration.go:60 +0x21a
07:39:31       github.com/cockroachdb/cockroach/pkg/util/stop.(*Stopper).RunAsyncTaskEx.func2()
07:39:31           github.com/cockroachdb/cockroach/pkg/util/stop/stopper.go:470 +0x1f6
07:39:31     
07:39:31     Goroutine 147620 (running) created at:
07:39:31       github.com/cockroachdb/cockroach/pkg/util/stop.(*Stopper).RunAsyncTaskEx()
07:39:31           github.com/cockroachdb/cockroach/pkg/util/stop/stopper.go:461 +0x619
07:39:31       github.com/cockroachdb/cockroach/pkg/util/stop.(*Stopper).RunAsyncTask()
07:39:31           github.com/cockroachdb/cockroach/pkg/util/stop/stopper.go:332 +0x404
07:39:31       github.com/cockroachdb/cockroach/pkg/util/netutil.(*TCPServer).ServeWith()
07:39:31           github.com/cockroachdb/cockroach/pkg/util/netutil/net.go:185 +0x36
07:39:31       github.com/cockroachdb/cockroach/pkg/server.startServeSQL.func1()
07:39:31           github.com/cockroachdb/cockroach/pkg/server/server_sql.go:1756 +0x17b
07:39:31       github.com/cockroachdb/cockroach/pkg/util/stop.(*Stopper).RunAsyncTaskEx.func2()
07:39:31           github.com/cockroachdb/cockroach/pkg/util/stop/stopper.go:470 +0x1f6
07:39:31     ==================
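
For illustration, the mutex idea proposed above could look roughly like the following. This is only a minimal sketch and not the actual patch: `serverState`, `tenantServer`, and `setServer` are hypothetical stand-ins for `serverStateUsingChannels`, its `server` field, and the write in `startControlledServer`, and the real types and signatures in `pkg/server` may differ. It assumes one goroutine publishes the server once it has started while other goroutines (such as the SQL connection path) read it concurrently.

```go
package main

import (
	"fmt"
	"sync"
)

// tenantServer is a hypothetical placeholder for the real server object.
type tenantServer struct {
	name string
}

// serverState is a hypothetical, simplified analogue of
// serverStateUsingChannels. The mutex guards the server field.
type serverState struct {
	mu     sync.RWMutex
	server *tenantServer // guarded by mu
}

// setServer publishes the server once startup has finished. In the race
// report this corresponds to the unsynchronized write.
func (s *serverState) setServer(srv *tenantServer) {
	s.mu.Lock()
	defer s.mu.Unlock()
	s.server = srv
}

// getServer returns the server and whether it is available yet. In the race
// report this corresponds to the unsynchronized read.
func (s *serverState) getServer() (*tenantServer, bool) {
	s.mu.RLock()
	defer s.mu.RUnlock()
	return s.server, s.server != nil
}

func main() {
	state := &serverState{}
	var wg sync.WaitGroup

	// Writer: simulates the startup goroutine storing the server.
	wg.Add(1)
	go func() {
		defer wg.Done()
		state.setServer(&tenantServer{name: "destination"})
	}()

	// Readers: simulate concurrent connection handlers looking up the server.
	for i := 0; i < 4; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			_, _ = state.getServer() // may or may not see the server yet
		}()
	}

	wg.Wait()

	// After all goroutines finish, the server is guaranteed to be set.
	srv, ok := state.getServer()
	fmt.Println(ok, srv.name) // prints "true destination"
}
```

Running a sketch like this with `go run -race` should not produce a data race report, because every access to the field goes through the mutex. An `RWMutex` is used here on the assumption that reads are far more frequent than the single write. A plain `sync.Mutex` would also work.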