pingcap / tidb

TiDB - the open-source, cloud-native, distributed SQL database designed for modern applications.
https://pingcap.com
Apache License 2.0

PiTR restore failed due to "failed: EpochNotMatch current epoch of region 18972 is conf_ver: 11 version: 4142, but you sent conf_ver: 11 version: 4184" #34973

Closed: fubinzh closed this issue 2 years ago

fubinzh commented 2 years ago

Bug Report

Please answer these questions before submitting your issue. Thanks!

1. Minimal reproduce step (Required)

  1. Enable log backup for a cluster with workloads
  2. Run a PiTR restore
    tiup br:v6.1.0-alpha-nightly-20220525 restore point --pd 172.16.6.46:33379 \
      --full-backup-storage "s3://nfs/fubin/pitr/pitr2_full_0524?access-key=minioadmin&secret-access-key=minioadmin&endpoint=http%3a%2f%2fminio.pingcap.net%3a9000&force-path-style=true" \
      --storage "s3://nfs/fubin/pitr/pitr2_log_1?access-key=minioadmin&secret-access-key=minioadmin&endpoint=http%3a%2f%2fminio.pingcap.net%3a9000&force-path-style=true" \
      --restored-ts "2022-05-25 11:00:00.000 +0800"

2. What did you expect to see? (Required)

  1. PiTR restore should succeed

3. What did you see instead? (Required)

  1. PiTR restore failed with an EpochNotMatch error after the full restore had completed:
    11:34:14 root@172 fubin → tiup br:v6.1.0-alpha-nightly-20220525 restore point --pd 172.16.6.46:33379 --full-backup-storage "s3://nfs/fubin/pitr/pitr2_full_0524?access-key=minioadmin&secret-access-key=minioadmin&endpoint=http%3a%2f%2fminio.pingcap.net%3a9000&force-path-style=true" --storage "s3://nfs/fubin/pitr/pitr2_log_1?access-key=minioadmin&secret-access-key=minioadmin&endpoint=http%3a%2f%2fminio.pingcap.net%3a9000&force-path-style=true" --restored-ts "2022-05-25 11:00:00.000 +0800"
    Starting component `br`: /root/.tiup/components/br/v6.1.0-alpha-nightly-20220525/br /root/.tiup/components/br/v6.1.0-alpha-nightly-20220525/br restore point --pd 172.16.6.46:33379 --full-backup-storage s3://nfs/fubin/pitr/pitr2_full_0524?access-key=minioadmin&secret-access-key=minioadmin&endpoint=http%3a%2f%2fminio.pingcap.net%3a9000&force-path-style=true --storage s3://nfs/fubin/pitr/pitr2_log_1?access-key=minioadmin&secret-access-key=minioadmin&endpoint=http%3a%2f%2fminio.pingcap.net%3a9000&force-path-style=true --restored-ts 2022-05-25 11:00:00.000 +0800
    Detail BR log in /tmp/br.log.2022-05-25T11.34.22+0800
    Full Restore <----------------------------------------------------------------------------------------> 100.00%
    [2022/05/25 11:58:42.328 +08:00] [INFO] [collector.go:69] ["Full Restore success summary"] [total-ranges=8654] [ranges-succeed=8654] [ranges-failed=0] [split-region=2m13.967262593s] [restore-ranges=4894] [total-take=24m19.622580762s] [total-kv-size=392.4GB] [average-speed=268.8MB/s] [restore-data-size(after-compressed)=214.7GB] [Size=214729041595] [BackupTS=433441911262937089] [total-kv=1576148672]
    Restore Meta Files <----------------------------------------------------------------------------------> 100.00%
    Restore KV Files <------------------------------------------------------------------------------------> 100.00%
    Error: failed to restore kv files: execute over region id:18972 start_key:"t\200\000\000\000\000\000\000\377V_i\200\000\000\000\000\377\000\000\002\003\200\000\000\000\377&\003\0315\000\000\000\000\373" end_key:"t\200\000\000\000\000\000\000\377V_i\200\000\000\000\000\377\000\000\002\373\000\000\000\000\373" region_epoch:<conf_ver:11 version:4184 > peers:<id:18973 store_id:1 > peers:<id:18974 store_id:5 > peers:<id:18975 store_id:6 >  failed: peer is not leader for region 18972, leader may Some(id: 18973 store_id: 1);
    execute over region id:18972 start_key:"t\200\000\000\000\000\000\000\377V_i\200\000\000\000\000\377\000\000\002\003\200\000\000\000\377&\003\0315\000\000\000\000\373" end_key:"t\200\000\000\000\000\000\000\377V_i\200\000\000\000\000\377\000\000\002\373\000\000\000\000\373" region_epoch:<conf_ver:11 version:4184 > peers:<id:18973 store_id:1 > peers:<id:18974 store_id:5 > peers:<id:18975 store_id:6 >  failed: EpochNotMatch current epoch of region 18972 is conf_ver: 11 version: 4142, but you sent conf_ver: 11 version: 4184;
    execute over region id:18972 start_key:"t\200\000\000\000\000\000\000\377V_i\200\000\000\000\000\377\000\000\002\003\200\000\000\000\377&\003\0315\000\000\000\000\373" end_key:"t\200\000\000\000\000\000\000\377V_i\200\000\000\000\000\377\000\000\002\373\000\000\000\000\373" region_epoch:<conf_ver:11 version:4184 > peers:<id:18973 store_id:1 > peers:<id:18974 store_id:5 > peers:<id:18975 store_id:6 >  failed: peer is not leader for region 18972, leader may Some(id: 18973 store_id: 1);
    execute over region id:18972 start_key:"t\200\000\000\000\000\000\000\377V_i\200\000\000\000\000\377\000\000\002\003\200\000\000\000\377&\003\0315\000\000\000\000\373" end_key:"t\200\000\000\000\000\000\000\377V_i\200\000\000\000\000\377\000\000\002\373\000\000\000\000\373" region_epoch:<conf_ver:11 version:4184 > peers:<id:18973 store_id:1 > peers:<id:18974 store_id:5 > peers:<id:18975 store_id:6 >  
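
For triage, the two epochs in the EpochNotMatch message above can be pulled out with a small script. This is only a minimal sketch and not part of BR or TiKV: the function name `parse_epoch_mismatch` and the regular expression are hypothetical, and the pattern assumes the message wording matches what appears in this report.

```python
import re

# Pattern for the message text as it appears in this report. The exact wording
# is an assumption and may differ between TiKV versions.
EPOCH_RE = re.compile(
    r"EpochNotMatch current epoch of region (\d+) is "
    r"conf_ver: (\d+) version: (\d+), but you sent "
    r"conf_ver: (\d+) version: (\d+)"
)


def parse_epoch_mismatch(message):
    """Return the region id and both epochs, or None if no mismatch is found."""
    m = EPOCH_RE.search(message)
    if m is None:
        return None
    region, cur_conf, cur_ver, sent_conf, sent_ver = map(int, m.groups())
    return {
        "region": region,
        "current": (cur_conf, cur_ver),   # epoch reported by the responding peer
        "sent": (sent_conf, sent_ver),    # epoch carried by the request
        "version_gap": sent_ver - cur_ver,
    }


line = (
    "failed: EpochNotMatch current epoch of region 18972 is conf_ver: 11 "
    "version: 4142, but you sent conf_ver: 11 version: 4184"
)
info = parse_epoch_mismatch(line)
print(info)
```

For the message in this report, the request carried version 4184 while the responding peer reported version 4142, with `conf_ver` identical at 11 on both sides.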

4. What is your TiDB version? (Required)

v6.1.0-alpha-nightly-20220525

joccau commented 2 years ago

/assign