D3Hunter closed this issue 2 months ago
Hopefully this can be closed for the same reason as https://github.com/pingcap/tidb/issues/52884
Met again on the current master branch; see also https://github.com/pingcap/tidb/issues/55374
goroutine 268459959 [chan receive, 793 minutes]:
github.com/pingcap/tidb/br/pkg/membuf.(*Limiter).Acquire(0xc01fbf19f0, 0x100000)
/home/jenkins/agent/workspace/build-common/go/src/github.com/pingcap/tidb/br/pkg/membuf/limiter.go:56 +0x1ab
github.com/pingcap/tidb/br/pkg/membuf.(*Pool).acquire(0xc021209200)
/home/jenkins/agent/workspace/build-common/go/src/github.com/pingcap/tidb/br/pkg/membuf/buffer.go:100 +0x28
github.com/pingcap/tidb/br/pkg/membuf.(*Buffer).addBlock(0xc0194121e0)
/home/jenkins/agent/workspace/build-common/go/src/github.com/pingcap/tidb/br/pkg/membuf/buffer.go:302 +0x8b
github.com/pingcap/tidb/br/pkg/membuf.(*Buffer).allocBytesWithSliceLocation(0xc0194121e0, 0x19426)
/home/jenkins/agent/workspace/build-common/go/src/github.com/pingcap/tidb/br/pkg/membuf/buffer.go:272 +0x65
github.com/pingcap/tidb/br/pkg/membuf.(*Buffer).AllocBytes(0xc0194121e0, 0xc04aca9130?)
/home/jenkins/agent/workspace/build-common/go/src/github.com/pingcap/tidb/br/pkg/membuf/buffer.go:245 +0x29
github.com/pingcap/tidb/br/pkg/membuf.(*Buffer).AddBytes(...)
/home/jenkins/agent/workspace/build-common/go/src/github.com/pingcap/tidb/br/pkg/membuf/buffer.go:317
github.com/pingcap/tidb/pkg/lightning/backend/external.readOneFile({0x6ce5700, 0xc04aca9130}, {0x6d08a50?, 0xc13346a420?}, {0xc075094b00, 0x38}, {0xc219c8cd68, 0x13, 0x18}, {0xc219c8cd80, ...}, ...)
/home/jenkins/agent/workspace/build-common/go/src/github.com/pingcap/tidb/pkg/lightning/backend/external/reader.go:186 +0x570
github.com/pingcap/tidb/pkg/lightning/backend/external.readAllData.func2()
/home/jenkins/agent/workspace/build-common/go/src/github.com/pingcap/tidb/pkg/lightning/backend/external/reader.go:98 +0x3b8
github.com/pingcap/tidb/pkg/lightning/backend/external.readAllData.(*ErrorGroupWithRecover).Go.func3()
/home/jenkins/agent/workspace/build-common/go/src/github.com/pingcap/tidb/pkg/util/wait_group_wrapper.go:250 +0x58
golang.org/x/sync/errgroup.(*Group).Go.func1()
/go/pkg/mod/golang.org/x/sync@v0.7.0/errgroup/errgroup.go:78 +0x56
created by golang.org/x/sync/errgroup.(*Group).Go in goroutine 268311633
/go/pkg/mod/golang.org/x/sync@v0.7.0/errgroup/errgroup.go:75 +0x96
For the hotspot files, the data occupies a large amount of memory, which exceeds the memLimiter threshold (12G):
[2024/08/12 21:56:31.315 +08:00] [Info] [engine.go:248] ["found hotspot file in getFilesReadConcurrency"] [filename=60004/120054/data/515082f8-c85a-48a2-90b3-aa7536db2d78_stat/1] [startOffset=85385025] [endOffset=702434139] [expectedConc=74] [concurrency=74]
[2024/08/12 21:56:31.315 +08:00] [Info] [engine.go:248] ["found hotspot file in getFilesReadConcurrency"] [filename=60004/120054/data/99287c3d-91f0-4535-8e62-93bac8286d79_stat/1] [startOffset=85385025] [endOffset=496371612] [expectedConc=49] [concurrency=49]
[2024/08/12 21:56:31.315 +08:00] [Info] [engine.go:248] ["found hotspot file in getFilesReadConcurrency"] [filename=60004/120054/data/0109bbcc-08c5-4310-b601-b63649ccddf6_stat/1] [startOffset=85385025] [endOffset=496371612] [expectedConc=49] [concurrency=49]
[2024/08/12 21:56:31.315 +08:00] [Info] [engine.go:248] ["found hotspot file in getFilesReadConcurrency"] [filename=60004/120054/data/a5b9f1ec-c2b0-490c-8a42-82e53dca3265_stat/1] [startOffset=85385025] [endOffset=496371612] [expectedConc=49] [concurrency=49]
[2024/08/12 21:56:31.315 +08:00] [Info] [engine.go:248] ["found hotspot file in getFilesReadConcurrency"] [filename=60004/120054/data/5295178c-09d1-47a6-bd9c-85336a7bfd38_stat/1] [startOffset=85385025] [endOffset=496371612] [expectedConc=49] [concurrency=49]
[2024/08/12 21:56:31.315 +08:00] [Info] [engine.go:248] ["found hotspot file in getFilesReadConcurrency"] [filename=60004/120054/data/6c3c2c39-3e07-4768-8a19-60dfebd49a39_stat/0] [startOffset=617049114] [endOffset=736588149] [expectedConc=15] [concurrency=15]
[2024/08/12 21:56:31.315 +08:00] [Info] [engine.go:248] ["found hotspot file in getFilesReadConcurrency"] [filename=60004/120054/data/6c3c2c39-3e07-4768-8a19-60dfebd49a39_stat/1] [startOffset=0] [endOffset=496371612] [expectedConc=60] [concurrency=60]
[2024/08/12 21:56:31.315 +08:00] [Info] [engine.go:248] ["found hotspot file in getFilesReadConcurrency"] [filename=60004/120054/data/e195166c-c860-4790-b240-d6ed5cdcb9f0_stat/1] [startOffset=85385025] [endOffset=736588149] [expectedConc=78] [concurrency=78]
[2024/08/12 21:56:31.315 +08:00] [Info] [engine.go:248] ["found hotspot file in getFilesReadConcurrency"] [filename=60004/120054/data/e195166c-c860-4790-b240-d6ed5cdcb9f0_stat/2] [startOffset=0] [endOffset=170770050] [expectedConc=21] [concurrency=21]
[2024/08/12 21:56:31.315 +08:00] [Info] [engine.go:248] ["found hotspot file in getFilesReadConcurrency"] [filename=60004/120054/data/d5bdedc6-1616-42fd-882c-066c61a590c3_stat/1] [startOffset=85385025] [endOffset=702434139] [expectedConc=74] [concurrency=74]
[2024/08/12 21:56:31.315 +08:00] [Info] [engine.go:248] ["found hotspot file in getFilesReadConcurrency"] [filename=60004/120054/data/e8df2a22-26c2-4be9-884f-4b1a9b1d506c_stat/0] [startOffset=658033926] [endOffset=736588149] [expectedConc=10] [concurrency=10]
[2024/08/12 21:56:31.315 +08:00] [Info] [engine.go:248] ["found hotspot file in getFilesReadConcurrency"] [filename=60004/120054/data/e8df2a22-26c2-4be9-884f-4b1a9b1d506c_stat/1] [startOffset=0] [endOffset=496371612] [expectedConc=60] [concurrency=60]
[2024/08/12 21:56:31.315 +08:00] [Info] [engine.go:248] ["found hotspot file in getFilesReadConcurrency"] [filename=60004/120054/data/3d5e396d-21ec-4848-9395-7d3336731afa_stat/1] [startOffset=85385025] [endOffset=496371612] [expectedConc=49] [concurrency=49]
[2024/08/12 21:56:31.315 +08:00] [Info] [engine.go:248] ["found hotspot file in getFilesReadConcurrency"] [filename=60004/120054/data/cb7032c9-762f-4761-96a8-d8cd3fc7609e_stat/1] [startOffset=85385025] [endOffset=508894749] [expectedConc=51] [concurrency=51]
[2024/08/12 21:56:31.316 +08:00] [Info] [engine.go:248] ["found hotspot file in getFilesReadConcurrency"] [filename=60004/120054/data/169d1aff-9061-4dbb-ad82-8cfff2f86d72_stat/1] [startOffset=85385025] [endOffset=496371612] [expectedConc=49] [concurrency=49]
[2024/08/12 21:56:31.316 +08:00] [Info] [engine.go:248] ["found hotspot file in getFilesReadConcurrency"] [filename=60004/120054/data/8ab7b236-e767-4d0c-8c38-0c38bf24ffcb_stat/0] [startOffset=617049114] [endOffset=736588149] [expectedConc=15] [concurrency=15]
[2024/08/12 21:56:31.316 +08:00] [Info] [engine.go:248] ["found hotspot file in getFilesReadConcurrency"] [filename=60004/120054/data/8ab7b236-e767-4d0c-8c38-0c38bf24ffcb_stat/1] [startOffset=0] [endOffset=291447552] [expectedConc=35] [concurrency=35]
[2024/08/12 21:56:31.316 +08:00] [Info] [engine.go:248] ["found hotspot file in getFilesReadConcurrency"] [filename=60004/120054/data/b2b78ac1-2b12-416d-8692-e86e84465cc7_stat/1] [startOffset=85385025] [endOffset=496371612] [expectedConc=49] [concurrency=49]
[2024/08/12 21:56:31.316 +08:00] [Info] [engine.go:248] ["found hotspot file in getFilesReadConcurrency"] [filename=60004/120054/data/ce3779a2-33de-4ff1-856a-72569111a18d_stat/1] [startOffset=85385025] [endOffset=496371612] [expectedConc=49] [concurrency=49]
[2024/08/12 21:56:31.316 +08:00] [Info] [engine.go:248] ["found hotspot file in getFilesReadConcurrency"] [filename=60004/120054/data/0c7db861-6164-4bb1-badd-660805e7256a_stat/1] [startOffset=85385025] [endOffset=736588149] [expectedConc=78] [concurrency=78]
[2024/08/12 21:56:31.316 +08:00] [Info] [engine.go:248] ["found hotspot file in getFilesReadConcurrency"] [filename=60004/120054/data/0c7db861-6164-4bb1-badd-660805e7256a_stat/2] [startOffset=0] [endOffset=170770050] [expectedConc=21] [concurrency=21]
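As a reading aid for the log lines above: every logged `expectedConc` matches the byte range (`endOffset - startOffset`) divided by 8 MiB, rounded up. The 8 MiB figure is only inferred from these numbers, it is an assumption and is not taken from the TiDB source. A small sketch of that arithmetic, with the hypothetical names `assumedBytesPerConc` and `expectedConc`:

```go
package main

import "fmt"

// assumedBytesPerConc is an assumption inferred from the logged values,
// not a constant copied from the TiDB code base.
const assumedBytesPerConc uint64 = 8 << 20 // 8 MiB

// expectedConc returns ceil((endOffset - startOffset) / assumedBytesPerConc).
func expectedConc(startOffset, endOffset uint64) uint64 {
	size := endOffset - startOffset
	return (size + assumedBytesPerConc - 1) / assumedBytesPerConc
}

func main() {
	// Offsets copied from three of the log lines above.
	fmt.Println(expectedConc(85385025, 702434139))  // 74
	fmt.Println(expectedConc(617049114, 736588149)) // 15
	fmt.Println(expectedConc(0, 496371612))         // 60
}
```

Under the same assumption, the `concurrency` values in the 21 log lines above add up to 984.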
Bug Report
Please answer these questions before submitting your issue. Thanks!
1. Minimal reproduce step (Required)
Run an import with global sort and 32 threads on a 32c64g node. During the ingest step, some subtasks get stuck at the pool limiter. Stack: stuck-stack.log
2. What did you expect to see? (Required)
The import either succeeds or fails.
3. What did you see instead (Required)
The subtask is stuck.
4. What is your TiDB version? (Required)
master