pingcap / tiflash

The analytical engine for TiDB and TiDB Cloud. Try free: https://tidbcloud.com/free-trial
https://docs.pingcap.com/tidb/stable/tiflash-overview
Apache License 2.0

multi-thread safe getGCSafePointWithRetry #4928

Open breezewish opened 2 years ago

breezewish commented 2 years ago

Enhancement

Currently there is a race condition in getGCSafePointWithRetry, which makes it buggy. getGCSafePointWithRetry is now called from at least 3 kinds of threads:

breezewish commented 2 years ago

We could have two stages to improve this behavior:

Stage 1. Use atomics. This avoids the data race in memory, but parallel calls to this function will still lead to multiple parallel PD requests.
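A minimal sketch of what Stage 1 could look like, assuming the safe point is cached together with an expiry time. The class name AtomicSafePointCache, the fetch_from_pd callback, and the TTL parameter are all hypothetical and are not taken from the TiFlash code base; the real function may be structured differently.

```cpp
#include <atomic>
#include <cassert>
#include <chrono>
#include <cstdint>
#include <functional>
#include <limits>
#include <utility>

// Hypothetical cache, not TiFlash code: both cached fields are atomics,
// so concurrent readers and writers never form a data race.
class AtomicSafePointCache
{
public:
    using Clock = std::chrono::steady_clock;
    using Fetcher = std::function<std::uint64_t()>;

    // `fetcher` stands in for the PD request (an assumption of this sketch).
    AtomicSafePointCache(Fetcher fetcher, Clock::duration ttl_)
        : fetch_from_pd(std::move(fetcher)), ttl(ttl_)
    {
        assert(fetch_from_pd);
    }

    std::uint64_t get()
    {
        const Clock::rep now = Clock::now().time_since_epoch().count();
        // Fast path: the cached value has not expired yet.
        if (now < expires_at.load())
            return safe_point.load();

        // Slow path: every caller that sees an expired cache sends its own
        // request. This is the limitation that Stage 2 is meant to remove.
        const std::uint64_t fresh = fetch_from_pd();
        // Publish the value before the expiry, so a reader that observes the
        // new expiry also observes a value at least as new.
        safe_point.store(fresh);
        expires_at.store(now + ttl.count());
        return fresh;
    }

private:
    Fetcher fetch_from_pd;
    Clock::duration ttl;
    std::atomic<std::uint64_t> safe_point{0};
    // Starts at the minimum tick value, so the first call always fetches.
    std::atomic<Clock::rep> expires_at{std::numeric_limits<Clock::rep>::min()};
};
```

With a 60 second TTL, two back-to-back calls from one thread trigger a single fetch. Several threads that hit an expired cache at the same time would each call fetch_from_pd.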

Stage 2. To be discussed. This may be a typical "Single Flight" scenario. We can use a similar pattern so that parallel calls (when the cache has expired) do not cause multiple PD requests, but wait for the first request instead.
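One possible shape for the Stage 2 idea, offered only as a sketch for discussion. It assumes a mutex-protected cache plus a std::shared_future that represents the request in flight. The names SingleFlightSafePoint, fetch_from_pd, and countFetchesForParallelCalls are hypothetical and do not come from TiFlash.

```cpp
#include <atomic>
#include <cassert>
#include <chrono>
#include <cstdint>
#include <exception>
#include <functional>
#include <future>
#include <mutex>
#include <thread>
#include <utility>
#include <vector>

// Hypothetical single-flight cache, not TiFlash code.
class SingleFlightSafePoint
{
public:
    using Clock = std::chrono::steady_clock;
    using Fetcher = std::function<std::uint64_t()>;

    SingleFlightSafePoint(Fetcher fetcher, Clock::duration ttl_)
        : fetch_from_pd(std::move(fetcher)), ttl(ttl_)
    {
        assert(fetch_from_pd);
    }

    std::uint64_t get()
    {
        std::promise<std::uint64_t> promise;
        std::shared_future<std::uint64_t> pending;
        bool leader = false;
        {
            std::lock_guard<std::mutex> lock(mu);
            if (Clock::now() < expires_at)
                return cached; // cache still valid
            if (!in_flight.valid())
            {
                // First caller after expiry becomes the leader.
                in_flight = promise.get_future().share();
                leader = true;
            }
            pending = in_flight; // followers share the leader's future
        }
        if (leader)
        {
            try
            {
                // The request runs outside the lock.
                const std::uint64_t fresh = fetch_from_pd();
                {
                    std::lock_guard<std::mutex> lock(mu);
                    cached = fresh;
                    expires_at = Clock::now() + ttl;
                    in_flight = std::shared_future<std::uint64_t>();
                }
                promise.set_value(fresh);
            }
            catch (...)
            {
                {
                    std::lock_guard<std::mutex> lock(mu);
                    in_flight = std::shared_future<std::uint64_t>();
                }
                // Waiters (and the leader) see the same failure.
                promise.set_exception(std::current_exception());
            }
        }
        return pending.get();
    }

private:
    Fetcher fetch_from_pd;
    Clock::duration ttl;
    std::mutex mu;
    std::uint64_t cached = 0;
    Clock::time_point expires_at = Clock::time_point::min();
    std::shared_future<std::uint64_t> in_flight;
};

// Usage sketch: n threads hit a cold cache; returns how many fetches ran.
int countFetchesForParallelCalls(int n)
{
    std::atomic<int> fetches{0};
    SingleFlightSafePoint cache(
        [&] {
            ++fetches;
            std::this_thread::sleep_for(std::chrono::milliseconds(50));
            return std::uint64_t{42};
        },
        std::chrono::seconds(60));
    std::vector<std::thread> threads;
    for (int i = 0; i < n; ++i)
        threads.emplace_back([&] { cache.get(); });
    for (auto & t : threads)
        t.join();
    return fetches.load();
}
```

In this sketch the retry policy is left to the fetch callback. If the leader's request fails, every waiter receives the same exception, and the next call starts a new request.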