Closed JaySon-Huang closed 1 month ago
[APPROVALNOTIFIER] This PR is APPROVED
This pull-request has been approved by: Lloyd-Pottiger, xzhangxian1008
Timeline:
2024-05-28 07:09:20.493150242 +0000 UTC: :ballot_box_with_check: agreed by xzhangxian1008.
2024-05-28 08:55:50.739988071 +0000 UTC: :ballot_box_with_check: agreed by Lloyd-Pottiger.
What problem does this PR solve?
Issue Number: close https://github.com/pingcap/tiflash/issues/9092, close https://github.com/pingcap/tiflash/issues/9097
Problem Summary:
For #9092
TMTContext starts a thread "MPPTask-Moniter" that runs checkLongLiveMPPTasks: https://github.com/pingcap/tiflash/blob/38ab3f912220b94962043bde7cc341017d569994/dbms/src/Storages/KVStore/TMTContext.cpp#L121-L126
When TiFlash is shutting down, the thread is not explicitly stopped, so it may outlive the TiFlashMetrics instance. If the TiFlashMetrics instance is released before checkLongLiveMPPTasks runs, then checkLongLiveMPPTasks accesses memory that has already been freed, which causes a use-after-free data race during shutdown.
For #9097
The race appears to be reported in backtrace-rs. There is nothing we can do about it in the TiFlash code, so we just ignore it.
What is changed and how it works?
For #9092
In TMTContext::shutdown, set MPPTaskMonitor->is_shutdown = true, so the thread is expected to stop after TMTContext::shutdown is called and before TiFlashMetrics is released. When monitor->is_shutdown == true, the thread doesn't report the metrics to TiFlashMetrics.
For #9097
Add race:StackTrace::toString and race:DB::SyncPointCtl::sync to tsan.suppression. They will be ignored when running with TSAN_OPTIONS="suppressions=/tests/sanitize/tsan.suppression" ./dbms/gtests_dbms --gtest_filter=...
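The shutdown ordering for #9092 can be illustrated with a minimal, self-contained sketch. It is not the TiFlash code: FakeMetrics, FakeMonitor, FakeContext and checkLoop are hypothetical stand-ins for TiFlashMetrics, MPPTaskMonitor, TMTContext and checkLongLiveMPPTasks. The sketch also assumes that shutdown joins the thread, which the description above does not state; it is done here only to make the "stopped before the metrics object is released" ordering explicit.

```cpp
#include <atomic>
#include <cassert>
#include <chrono>
#include <memory>
#include <thread>

// Hypothetical stand-in for TiFlashMetrics: the object the monitor thread reports to.
struct FakeMetrics
{
    std::atomic<int> reports{0};
};

// Hypothetical stand-in for MPPTaskMonitor; only the shutdown flag is modeled.
struct FakeMonitor
{
    std::atomic<bool> is_shutdown{false};
};

// Hypothetical stand-in for TMTContext: owns the background monitor thread.
class FakeContext
{
public:
    explicit FakeContext(FakeMetrics & metrics_)
        : metrics(metrics_)
        , monitor(std::make_shared<FakeMonitor>())
        , worker([this] { checkLoop(); }) // started last, after the members it uses
    {}

    // Set the flag, then wait for the thread (assumption of this sketch),
    // so the thread can no longer touch `metrics` once shutdown returns.
    void shutdown()
    {
        monitor->is_shutdown.store(true);
        if (worker.joinable())
            worker.join();
        assert(!worker.joinable());
    }

    ~FakeContext() { shutdown(); }

private:
    // Stand-in for checkLongLiveMPPTasks: report until shutdown is requested.
    void checkLoop()
    {
        while (!monitor->is_shutdown.load())
        {
            metrics.reports.fetch_add(1);
            std::this_thread::sleep_for(std::chrono::milliseconds(1));
        }
    }

    FakeMetrics & metrics;
    std::shared_ptr<FakeMonitor> monitor;
    std::thread worker;
};
```

With this ordering, the owner of the metrics object can call shutdown on the context first and release the metrics afterwards; no report can arrive after shutdown returns.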
Check List
Tests
./dbms/gtests_dbms