Closed (cosven closed this issue 3 years ago)
pd.log goroutine-pd.txt goroutine-tidb.txt
TiDB and TiKV log files are too big to upload.
These versions look completely different, and I am not sure whether they can work together. However, I found a related log entry in PD:
[WARN] [util.go:79] ["PD version less than cluster version, please upgrade PD"] [PD-version=4.0.0-rc.2-411-gb6d036bf] [cluster-version=4.1.0-alpha]
I think PD may be failing because of this: PD cannot campaign for leader because the new leader's information is the same as that of the old, deleted leader. In this situation, with only one PD, we shouldn't campaign again. Related code: https://github.com/tikv/pd/blob/master/server/member/member.go#L171
[2020/12/29 12:42:51.242 +00:00] [ERROR] [member.go:176] ["deleting pd leader key meets error"] [error="[PD:etcd:ErrEtcdTxn]etcd Txn failed"]
[2020/12/29 12:42:51.443 +00:00] [WARN] [member.go:174] ["the pd leader has not changed, delete and campaign again"] [old-pd-leader="name:\"pd-10.0.2.36-2379\" member_id:14280389838364704182 peer_urls:\"http://10.0.2.36:2380\" client_urls:\"http://10.0.2.36:2379\" "]
[2020/12/29 12:42:51.444 +00:00] [ERROR] [member.go:176] ["deleting pd leader key meets error"] [error="[PD:etcd:ErrEtcdTxn]etcd Txn failed"]
That is not the key cause; please disregard it. Sorry.
After https://github.com/tikv/pd/pull/3305, the ID allocator allocates duplicate IDs on restart, which leaves holes in the regions' key ranges. A key that falls into such a hole cannot be matched to any region.
Regions have duplicate IDs.
master after https://github.com/tikv/pd/pull/3305
master after https://github.com/tikv/pd/pull/3322
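To illustrate how a batch-based ID allocator can hand out duplicate IDs after a restart, here is a minimal toy sketch. It is not PD's actual code: the class `WindowedIDAllocator`, the `alloc_id` key, the dict standing in for etcd, and the `resume_from_persisted_end` flag are all hypothetical, and the real defect introduced by the pull request above may differ in detail. The sketch only assumes that the allocator reserves IDs in windows and persists the window's upper bound.

```python
class WindowedIDAllocator:
    """Toy allocator: reserves IDs in windows and persists only the window end."""

    def __init__(self, store, step=10, resume_from_persisted_end=True):
        self.store = store  # plain dict standing in for a durable store
        self.step = step
        persisted_end = store.get("alloc_id", 0)
        if resume_from_persisted_end:
            # Safe: skip everything the previous process may have handed out.
            start = persisted_end
        else:
            # Hypothetical bug: resume from the start of the last window,
            # which the previous process may already have used.
            start = max(persisted_end - step, 0)
        self.base = start
        self.end = start

    def alloc(self):
        if self.base == self.end:
            # Window exhausted: reserve a new one and persist its end first.
            self.end += self.step
            self.store["alloc_id"] = self.end
        self.base += 1
        return self.base


def ids_across_restart(resume_from_persisted_end):
    """Allocate three IDs, simulate a restart, then allocate three more."""
    store = {}
    first = WindowedIDAllocator(store, resume_from_persisted_end=resume_from_persisted_end)
    before = [first.alloc() for _ in range(3)]
    restarted = WindowedIDAllocator(store, resume_from_persisted_end=resume_from_persisted_end)
    after = [restarted.alloc() for _ in range(3)]
    return before, after


if __name__ == "__main__":
    print("safe: ", ids_across_restart(True))
    print("buggy:", ids_across_restart(False))
```

In the safe variant the restarted allocator continues past the persisted window end, so the two batches never overlap. In the hypothetical buggy variant the same IDs are issued twice, which is the kind of condition that would give two regions the same ID.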
Bug Report
Please answer these questions before submitting your issue. Thanks!
1. Minimal reproduce step (Required)
pd.toml
tidb.toml
tikv.toml
2. What did you expect to see? (Required)
The cluster works normally after it recovers from the chaos.
3. What did you see instead (Required)
The cluster does not work anymore.
DDL hangs
DML reports PD server timeout
mysql> select * from v limit 1;
ERROR 9001 (HY000): PD server timeout
root@ce61d5126d28:/disk1/deploy# ~/.tiup/components/ctl/v4.0.9/pd-ctl region check miss-peer | jq .count
512
root@ce61d5126d28:/disk1/deploy# /disk1/deploy/pd-2379/bin/pd-server -V
Release Version: v4.0.0-rc.2-411-gb6d036bf
Edition: Community
Git Commit Hash: b6d036bfda6ff000d8072163284bc198175477cc
Git Branch: master
UTC Build Time: 2020-12-28 05:55:51
root@ce61d5126d28:/disk1/deploy# /disk1/deploy/tidb-4000/bin/tidb-server -V
Release Version: v4.0.0-beta.2-1921-g5e67a597c
Edition: Community
Git Commit Hash: 5e67a597ccdd8220f40d69bc601f1b664949f885
Git Branch: master
UTC Build Time: 2020-12-28 07:22:38
GoVersion: go1.13
Race Enabled: false
TiKV Min Version: v3.0.0-60965b006877ca7234adaced7890d7b029ed1306
Check Table Before Drop: false
root@ce61d5126d28:/disk1/deploy# /disk1/deploy/tikv-20160/bin/tikv-server -V
TiKV
Release Version: 4.1.0-alpha
Edition: Community
Git Commit Hash: bd5a30bb0356edde71972e0f33316be4ac1973de
Git Commit Branch: master
UTC Build Time: 2020-12-28 13:21:51
Rust Version: rustc 1.49.0-nightly (b1496c6e6 2020-10-18)
Enable Features: jemalloc mem-profiling portable sse protobuf-codec test-engines-rocksdb
Profile: dist_release