pingcap / tidb

DATA RACE session.(*LazyTxn).Valid() #54513

Closed tiancaiamao closed 2 months ago

tiancaiamao commented 2 months ago

Bug Report

Please answer these questions before submitting your issue. Thanks!

1. Minimal reproduce step (Required)

Found in our CI https://do.pingcap.net/jenkins/blue/organizations/jenkins/pingcap%2Ftidb%2Fghpr_check2/detail/ghpr_check2/12427/pipeline/514/

Here is the DATA RACE stack:

WARNING: DATA RACE
Read at 0x00c00577ec88 by goroutine 60342:
  github.com/pingcap/tidb/pkg/session.(*LazyTxn).Valid()
      pkg/session/txn.go:227 +0x39
  github.com/pingcap/tidb/pkg/testkit.(*MockSessionManager).GetInternalSessionStartTSList()
      pkg/testkit/mocksessionmanager.go:164 +0x2e2
  github.com/pingcap/tidb/pkg/domain/infosync.(*InfoSyncer).ReportMinStartTS()
      pkg/domain/infosync/info.go:756 +0xb9
  github.com/pingcap/tidb/pkg/domain.(*Domain).infoSyncerKeeper()
      pkg/domain/domain.go:793 +0x53a
  github.com/pingcap/tidb/pkg/domain.(*Domain).infoSyncerKeeper-fm()
      <autogenerated>:1 +0x33
  github.com/pingcap/tidb/pkg/util.(*WaitGroupEnhancedWrapper).Run.func1()
      pkg/util/wait_group_wrapper.go:99 +0xf4

Previous write at 0x00c00577ec88 by goroutine 61371:
  github.com/pingcap/tidb/pkg/session.(*LazyTxn).changeToPending()
      pkg/session/txn.go:284 +0x524
  github.com/pingcap/tidb/pkg/session.(*session).PrepareTSFuture()
      pkg/session/session.go:3924 +0x388
  github.com/pingcap/tidb/pkg/sessiontxn/isolation.(*baseTxnContextProvider).replaceTxnTsFuture()
      pkg/sessiontxn/isolation/base.go:367 +0x1c7
  github.com/pingcap/tidb/pkg/sessiontxn/isolation.(*baseTxnContextProvider).prepareTxn()
      pkg/sessiontxn/isolation/base.go:332 +0x217
  github.com/pingcap/tidb/pkg/sessiontxn/isolation.(*baseTxnContextProvider).AdviseWarmup()
      pkg/sessiontxn/isolation/base.go:392 +0x97
  github.com/pingcap/tidb/pkg/sessiontxn/isolation.(*OptimisticTxnContextProvider).AdviseWarmup()
      <autogenerated>:1 +0x31
  github.com/pingcap/tidb/pkg/session.(*txnManager).AdviseWarmup()
      pkg/session/txnmanager.go:317 +0xe6
  github.com/pingcap/tidb/pkg/planner/contextimpl.(*PlanCtxExtendedImpl).AdviseTxnWarmup()
      pkg/planner/contextimpl/impl.go:55 +0x85
  github.com/pingcap/tidb/pkg/session.(*planContextImpl).AdviseTxnWarmup()
      <autogenerated>:1 +0x46
  github.com/pingcap/tidb/pkg/planner.Optimize()
      pkg/planner/optimize.go:230 +0x1025
  github.com/pingcap/tidb/pkg/executor.(*Compiler).Compile()
      pkg/executor/compiler.go:99 +0x884
  github.com/pingcap/tidb/pkg/session.(*session).ExecuteStmt()
      pkg/session/session.go:2105 +0x11dd
  github.com/pingcap/tidb/pkg/session.(*session).ExecuteInternal()
      pkg/session/session.go:1530 +0x3f7
  github.com/pingcap/tidb/pkg/util/sqlexec.ExecSQL()
      pkg/util/sqlexec/restricted_sql_executor.go:252 +0xee
  github.com/pingcap/tidb/pkg/disttask/framework/storage.(*TaskManager).ExecuteSQLWithNewSession.func1()
      pkg/disttask/framework/storage/task_table.go:183 +0xf8
  github.com/pingcap/tidb/pkg/disttask/framework/storage.(*TaskManager).WithNewSession()
      pkg/disttask/framework/storage/task_table.go:147 +0x1a1
  github.com/pingcap/tidb/pkg/disttask/framework/storage.(*TaskManager).ExecuteSQLWithNewSession()
      pkg/disttask/framework/storage/task_table.go:182 +0x184
  github.com/pingcap/tidb/pkg/disttask/framework/storage.(*TaskManager).GetTaskExecInfoByExecID()
      pkg/disttask/framework/storage/task_table.go:270 +0x259
  github.com/pingcap/tidb/pkg/disttask/framework/taskexecutor.(*Manager).handleTasks()
      pkg/disttask/framework/taskexecutor/manager.go:193 +0xd5
  github.com/pingcap/tidb/pkg/disttask/framework/taskexecutor.(*Manager).handleTasksLoop()
      pkg/disttask/framework/taskexecutor/manager.go:184 +0x2c4
  github.com/pingcap/tidb/pkg/disttask/framework/taskexecutor.(*Manager).handleTasksLoop-fm()
      <autogenerated>:1 +0x33
  github.com/pingcap/tidb/pkg/util.(*WaitGroupWrapper).Run.func1()
      pkg/util/wait_group_wrapper.go:157 +0x9d

Goroutine 60342 (running) created at:
  github.com/pingcap/tidb/pkg/util.(*WaitGroupEnhancedWrapper).Run()
      pkg/util/wait_group_wrapper.go:94 +0x164
  github.com/pingcap/tidb/pkg/domain.(*Domain).Init()
      pkg/domain/domain.go:1366 +0x1c05
  github.com/pingcap/tidb/pkg/session.(*domainMap).Get.func1()
      pkg/session/tidb.go:92 +0x4c6
  github.com/pingcap/tidb/pkg/util.RunWithRetry()
      pkg/util/misc.go:70 +0xb1
  github.com/pingcap/tidb/pkg/session.(*domainMap).Get()
      pkg/session/tidb.go:79 +0x319
  github.com/pingcap/tidb/pkg/session.createSessionWithOpt()
      pkg/session/session.go:3696 +0x84
  github.com/pingcap/tidb/pkg/session.createSession()
      pkg/session/session.go:3688 +0x46
  github.com/pingcap/tidb/pkg/session.createSessionsImpl()
      pkg/session/session.go:3673 +0xa4
  github.com/pingcap/tidb/pkg/session.createSessions()
      pkg/session/session.go:3660 +0x56
  github.com/pingcap/tidb/pkg/session.bootstrapSessionImpl()
      pkg/session/session.go:3444 +0x5d6
  github.com/pingcap/tidb/pkg/session.BootstrapSession()
      pkg/session/session.go:3376 +0x4b
  github.com/pingcap/tidb/tests/realtikvtest.CreateMockStoreAndDomainAndSetup()
      tests/realtikvtest/testkit.go:124 +0x1d9
  github.com/pingcap/tidb/tests/realtikvtest.CreateMockStoreAndSetup()
      tests/realtikvtest/testkit.go:99 +0x253
  tests/realtikvtest/importintotest/importintotest_test.(*mockGCSSuite).SetupSuite()
      tests/realtikvtest/importintotest/util_test.go:65 +0x227
  github.com/stretchr/testify/suite.Run()
      external/com_github_stretchr_testify/suite/suite.go:157 +0x63e
  tests/realtikvtest/importintotest/importintotest_test.TestImportInto()
      tests/realtikvtest/importintotest/util_test.go:50 +0x90
  testing.tRunner()
      GOROOT/src/testing/testing.go:1595 +0x261
  testing.(*T).Run.func1()
      GOROOT/src/testing/testing.go:1648 +0x44

Goroutine 61371 (running) created at:
  github.com/pingcap/tidb/pkg/util.(*WaitGroupWrapper).Run()
      pkg/util/wait_group_wrapper.go:155 +0xf0
  github.com/pingcap/tidb/pkg/disttask/framework/taskexecutor.(*Manager).Start()
      pkg/disttask/framework/taskexecutor/manager.go:147 +0xe8
  github.com/pingcap/tidb/pkg/domain.(*Domain).distTaskFrameworkLoop()
      pkg/domain/domain.go:1594 +0x87
  github.com/pingcap/tidb/pkg/domain.(*Domain).InitDistTaskLoop.func1()
      pkg/domain/domain.go:1588 +0xeb
  github.com/pingcap/tidb/pkg/util.(*WaitGroupEnhancedWrapper).Run.func1()
      pkg/util/wait_group_wrapper.go:99 +0xf4

==================

2. What did you expect to see? (Required)

no data race

3. What did you see instead (Required)

4. What is your TiDB version? (Required)

master

tiancaiamao commented 2 months ago

OK, this is caused by the mock session manager. We have a background infosync worker (goroutine) checking the status of all the internal sessions:

// ReportMinStartTS reports self server min start timestamp to ETCD.
func (is *InfoSyncer) ReportMinStartTS(store kv.Storage) {
    sm := is.GetSessionManager()
    if sm == nil {
        return
    }
    pl := sm.ShowProcessList()
    innerSessionStartTSList := sm.GetInternalSessionStartTSList()
    // ......

When this worker tries to read a session's txn status (via LazyTxn.Valid()), it races with another caller that has taken an internal session out of the session pool and is using it (writing the same field in LazyTxn.changeToPending()).
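
To illustrate the access pattern in the trace, here is a minimal sketch, not TiDB code and not necessarily how the linked PR fixes it. It assumes a hypothetical guardedTxn type whose state field stands in for whatever Valid() reads and changeToPending() writes. A background reader and a session owner touch the same field, and a sync.RWMutex (an assumption made for this example) is one common way to make that access race-free:

```go
package main

import (
	"fmt"
	"sync"
)

// txnState is a hypothetical stand-in for the field that the race detector
// flagged; the real LazyTxn layout is not shown in this issue.
type txnState int

const (
	stateInvalid txnState = iota
	statePending
)

// guardedTxn is an illustrative type, not TiDB's LazyTxn.
type guardedTxn struct {
	mu    sync.RWMutex
	state txnState
}

// Valid mimics the read side (the background infosync worker).
func (t *guardedTxn) Valid() bool {
	t.mu.RLock()
	defer t.mu.RUnlock()
	return t.state != stateInvalid
}

// changeToPending mimics the write side (the caller using the session).
func (t *guardedTxn) changeToPending() {
	t.mu.Lock()
	defer t.mu.Unlock()
	t.state = statePending
}

// exercise runs one reader and one writer concurrently for the given number
// of rounds and returns the final state.
func exercise(rounds int) txnState {
	txn := &guardedTxn{}
	var wg sync.WaitGroup
	wg.Add(2)
	go func() { // plays the role of the background status reporter
		defer wg.Done()
		for i := 0; i < rounds; i++ {
			_ = txn.Valid()
		}
	}()
	go func() { // plays the role of the session owner
		defer wg.Done()
		for i := 0; i < rounds; i++ {
			txn.changeToPending()
		}
	}()
	wg.Wait()
	txn.mu.RLock()
	defer txn.mu.RUnlock()
	return txn.state
}

func main() {
	fmt.Println(exercise(1000) == statePending) // prints "true"
}
```

Removing the lock calls from this sketch and running it with `go run -race` should produce a report of the same shape as the one above (a read in Valid() against a previous write in changeToPending()). Another way to avoid such a race is to keep the background reader from inspecting sessions that are currently checked out of the pool, which needs no lock on the txn itself.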

tiancaiamao commented 2 months ago

Fixed already by https://github.com/pingcap/tidb/pull/54403