pingcap / tiflow

This repo maintains DM (a data migration platform) and TiCDC (change data capture for TiDB)
Apache License 2.0

CDC panic: unlocking a not locked range #10151

Open fubinzh opened 12 months ago

fubinzh commented 12 months ago

What did you do?

  1. Deploy a TiDB cluster with 6 TiKV nodes and 2 CDC nodes
  2. Stop GC
  3. Create a blackhole changefeed and pause it
  4. Run a workload for several days to generate a huge amount of incremental data for CDC to sync
  5. Resume the changefeed so that it performs the initial scan

What did you expect to see?

CDC does not panic.

What did you see instead?

CDC panicked with the following stack trace:

panic: unlocking a not locked range

goroutine 38499263 [running]:
go.uber.org/zap/zapcore.CheckWriteAction.OnWrite(0x6?, 0x6?, {0x0?, 0x0?, 0xc2508c6320?})
        go.uber.org/zap@v1.23.0/zapcore/entry.go:198 +0x65
go.uber.org/zap/zapcore.(*CheckedEntry).Write(0xc1acd7a5b0, {0xc161f0a900, 0x6, 0x6})
        go.uber.org/zap@v1.23.0/zapcore/entry.go:264 +0x3ec
go.uber.org/zap.(*Logger).Panic(0xc161f0a900?, {0x3a2c917?, 0x456809?}, {0xc161f0a900, 0x6, 0x6})
        go.uber.org/zap@v1.23.0/logger.go:251 +0x59
github.com/pingcap/log.Panic({0x3a2c917?, 0xc161f32460?}, {0xc161f0a900?, 0x35a1c80?, 0xc1fcc157d0?})
        github.com/pingcap/log@v1.1.1-0.20221116035753-734d527bc87c/global.go:54 +0x8b
github.com/pingcap/tiflow/pkg/regionspan.(*RegionRangeLock).UnlockRange(0xc0cbb6e740, {0xc1758e08c0, 0x36, 0x40}, {0xc1758e09c0, 0x36, 0x40}, 0xd793a, 0x82d, 0x62fc0ff82000000)
        github.com/pingcap/tiflow/pkg/regionspan/region_range_lock.go:344 +0x74b
github.com/pingcap/tiflow/cdc/kv.(*eventFeedSession).onRegionFail(0xc0328ba9a0, {0x42f9b68, 0xc0cbb6e880}, {{{0xd793a, 0x11, 0x82d}, {{0xc1758e08c0, 0x36, 0x40}, {0xc1758e09c0, ...}}, ...}, ...})
        github.com/pingcap/tiflow/cdc/kv/client.go:553 +0x9f
github.com/pingcap/tiflow/cdc/kv.(*regionWorker).handleSingleRegionError(0xc232ebb9e0, {0x42c9b80?, 0xc1fcc157d0}, 0xc29e8d9560)
        github.com/pingcap/tiflow/cdc/kv/region_worker.go:234 +0x858
github.com/pingcap/tiflow/cdc/kv.(*regionWorker).processEvent(0xc232ebb9e0, {0x42f9b68, 0xc2309a0000}, 0xc17ea17bf0)
        github.com/pingcap/tiflow/cdc/kv/region_worker.go:376 +0x497
github.com/pingcap/tiflow/cdc/kv.(*regionWorker).eventHandler(0xc232ebb9e0, {0x42f9b68, 0xc2309a0000})
        github.com/pingcap/tiflow/cdc/kv/region_worker.go:512 +0x5ed
github.com/pingcap/tiflow/cdc/kv.(*regionWorker).run.func4()
        github.com/pingcap/tiflow/cdc/kv/region_worker.go:603 +0x2e
golang.org/x/sync/errgroup.(*Group).Go.func1()
        golang.org/x/sync@v0.1.0/errgroup/errgroup.go:75 +0x64
created by golang.org/x/sync/errgroup.(*Group).Go
        golang.org/x/sync@v0.1.0/errgroup/errgroup.go:72 +0xa5
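
For readers unfamiliar with this class of panic, here is a minimal sketch of the invariant the trace points at: releasing a range that the lock does not currently hold is treated as a fatal programming error. This is not the tiflow implementation. The rangeLock type, its exact-match bookkeeping, and the tryUnlock helper are hypothetical names made up for illustration, and the double unlock shown is only one assumed way to hit the condition, not a diagnosis of this issue.

```go
package main

import "fmt"

// span is a half-open key range [start, end).
type span struct{ start, end string }

// rangeLock is a hypothetical, simplified stand-in for a region range lock.
// It tracks locked ranges by exact span only and ignores overlaps.
type rangeLock struct {
	locked map[span]bool
}

func newRangeLock() *rangeLock {
	return &rangeLock{locked: make(map[span]bool)}
}

// LockRange marks [start, end) as locked and reports whether it succeeded.
func (l *rangeLock) LockRange(start, end string) bool {
	s := span{start, end}
	if l.locked[s] {
		return false
	}
	l.locked[s] = true
	return true
}

// UnlockRange releases [start, end) and panics if the range is not locked.
func (l *rangeLock) UnlockRange(start, end string) {
	s := span{start, end}
	if !l.locked[s] {
		panic("unlocking a not locked range")
	}
	delete(l.locked, s)
}

// tryUnlock calls UnlockRange and turns a panic into a returned message,
// so the demonstration can continue after the failure.
func tryUnlock(l *rangeLock, start, end string) (msg string) {
	defer func() {
		if r := recover(); r != nil {
			msg = fmt.Sprint(r)
		}
	}()
	l.UnlockRange(start, end)
	return ""
}

func main() {
	l := newRangeLock()
	l.LockRange("a", "c")
	fmt.Printf("first unlock: %q\n", tryUnlock(l, "a", "c"))
	fmt.Printf("second unlock: %q\n", tryUnlock(l, "a", "c"))
}
```

In this sketch the first unlock succeeds and the second one panics with the same message as in the report, because the range was already released. Panicking rather than returning an error is a design choice that surfaces bookkeeping bugs immediately, at the cost of taking the process down.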

Versions of the cluster

Upstream TiKV version (execute tikv-server --version):

/ # /tikv-server -V
TiKV
Release Version:   6.5.3
Edition:           Community
Git Commit Hash:   0578b41fca52aeee5ea5708efd41275165242988
Git Commit Branch: heads/refs/tags/v6.5.3-pr16054-0578b4
UTC Build Time:    2023-11-23 08:19:34
Rust Version:      rustc 1.67.0-nightly (96ddd32c4 2022-11-14)
Enable Features:   pprof-fp jemalloc mem-profiling portable sse test-engine-kv-rocksdb test-engine-raft-raft-engine cloud-aws cloud-gcp cloud-azure
Profile:           dist_release

TiCDC version (execute cdc version):

bash-5.1# /cdc version
Release Version: v6.5.3-20231116-255d810
Git Commit Hash: 255d8104c8b9b4f8af191ee69d301ced723ccc3d
Git Branch: heads/refs/tags/v6.5.3-20231116-255d810
UTC Build Time: 2023-11-16 06:35:28
Go Version: go version go1.19.12 linux/amd64
Failpoint Build: false

fubinzh commented 12 months ago

/severity major

nongfushanquan commented 11 months ago

/assign @sdojjy

3AceShowHand commented 3 months ago

This does not affect 8.1, since the code in question was deprecated and has been removed.