tikv / pd

Placement driver for TiKV
Apache License 2.0

redirect but server is not leader? #4573

Open lddlww opened 2 years ago

lddlww commented 2 years ago

Upgrading a TiDB cluster from v4.0.1 to v5.3.0 with tiup failed. The PD servers logged errors like the following:

[2022/01/12 16:16:30.727 +08:00] [ERROR] [middleware.go:104] ["redirect but server is not leader"] [from=pd-10.59.111.226-2379] [server=pd-10.59.111.225-2379] [error="[PD:apiutil:ErrRedirect]redirect failed"]
[2022/01/12 16:16:30.732 +08:00] [ERROR] [middleware.go:104] ["redirect but server is not leader"] [from=pd-10.59.111.227-2379] [server=pd-10.59.111.225-2379] [error="[PD:apiutil:ErrRedirect]redirect failed"]
[2022/01/12 16:16:35.714 +08:00] [ERROR] [middleware.go:104] ["redirect but server is not leader"] [from=pd-10.59.111.226-2379] [server=pd-10.59.111.225-2379] [error="[PD:apiutil:ErrRedirect]redirect failed"]
[2022/01/12 16:16:35.720 +08:00] [ERROR] [middleware.go:104] ["redirect but server is not leader"] [from=pd-10.59.111.227-2379] [server=pd-10.59.111.225-2379] [error="[PD:apiutil:ErrRedirect]redirect failed"]
[2022/01/12 16:19:00.361 +08:00] [ERROR] [middleware.go:104] ["redirect but server is not leader"] [from=pd-10.59.111.226-2379] [server=pd-10.59.111.225-2379] [error="[PD:apiutil:ErrRedirect]redirect failed"]
[2022/01/12 16:19:00.364 +08:00] [ERROR] [middleware.go:104] ["redirect but server is not leader"] [from=pd-10.59.111.226-2379] [server=pd-10.59.111.225-2379] [error="[PD:apiutil:ErrRedirect]redirect failed"]
[2022/01/12 16:19:00.368 +08:00] [ERROR] [middleware.go:104] ["redirect but server is not leader"] [from=pd-10.59.111.227-2379] [server=pd-10.59.111.225-2379] [error="[PD:apiutil:ErrRedirect]redirect failed"]
[2022/01/12 16:19:00.371 +08:00] [ERROR] [middleware.go:104] ["redirect but server is not leader"] [from=pd-10.59.111.227-2379] [server=pd-10.59.111.225-2379] [error="[PD:apiutil:ErrRedirect]redirect failed"]
[2022/01/12 16:19:00.394 +08:00] [ERROR] [middleware.go:104] ["redirect but server is not leader"] [from=pd-10.59.111.226-2379] [server=pd-10.59.111.225-2379] [error="[PD:apiutil:ErrRedirect]redirect failed"]
[2022/01/12 16:19:00.397 +08:00] [ERROR] [middleware.go:104] ["redirect but server is not leader"] [from=pd-10.59.111.227-2379] [server=pd-10.59.111.225-2379] [error="[PD:apiutil:ErrRedirect]redirect failed"]
[2022/01/12 16:19:15.850 +08:00] [ERROR] [server.go:1275] ["failed to create raft cluster"] [error="context canceled"]

I then restarted the PD servers with tiup, but their status stayed Down.

What went wrong?
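
In the excerpt above, every "redirect but server is not leader" entry names pd-10.59.111.225-2379 as `server`, with pd-10.59.111.226-2379 or pd-10.59.111.227-2379 as `from`. For a longer log, a small script can tally these pairs. This is only a sketch: the regular expressions are written against the line layout shown in this issue and are an assumption, not a general PD log parser, and `summarize` and `SAMPLE` are names made up for the example.

```python
import re
from collections import Counter

# Leading fields of a log line as they appear in this issue:
# [timestamp] [LEVEL] [file:line] ["message"] ...
LINE = re.compile(
    r'^\[(?P<ts>[^\]]+)\] \[(?P<level>[A-Z]+)\] \[(?P<src>[^\]]+)\] \["(?P<msg>[^"]*)"\]'
)
# Only the two key=value fields we care about.
KV = re.compile(r'\[(from|server)=([^\]]+)\]')

REDIRECT_MSG = "redirect but server is not leader"


def summarize(lines):
    """Count redirect errors per (from, server) pair."""
    counts = Counter()
    for line in lines:
        m = LINE.match(line)
        if not m or m.group("msg") != REDIRECT_MSG:
            continue  # skip unrelated or unparseable lines
        fields = dict(KV.findall(line))
        counts[(fields.get("from"), fields.get("server"))] += 1
    return counts


# Three lines copied from the excerpt above.
SAMPLE = [
    '[2022/01/12 16:16:30.727 +08:00] [ERROR] [middleware.go:104] ["redirect but server is not leader"] [from=pd-10.59.111.226-2379] [server=pd-10.59.111.225-2379] [error="[PD:apiutil:ErrRedirect]redirect failed"]',
    '[2022/01/12 16:16:30.732 +08:00] [ERROR] [middleware.go:104] ["redirect but server is not leader"] [from=pd-10.59.111.227-2379] [server=pd-10.59.111.225-2379] [error="[PD:apiutil:ErrRedirect]redirect failed"]',
    '[2022/01/12 16:19:15.850 +08:00] [ERROR] [server.go:1275] ["failed to create raft cluster"] [error="context canceled"]',
]

if __name__ == "__main__":
    for (src, dst), n in summarize(SAMPLE).items():
        print(f"{src} -> {dst}: {n}")
```

To run it over a real file, pass an open file object to `summarize` instead of `SAMPLE`.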

rleungx commented 2 years ago

Can you show the member output from pd-ctl?
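
If pd-ctl is not at hand, the same membership information can usually be read over HTTP from any PD endpoint. The sketch below assumes the path `/pd/api/v1/members` and a response with `members` and `leader` keys; both are assumptions to check against your PD version. `fetch_members`, `summarize_members`, and the payload in the usage note are made up for illustration.

```python
import json
from urllib.request import urlopen


def fetch_members(pd_addr, timeout=5):
    """GET the members document from one PD endpoint.

    pd_addr is something like "http://10.59.111.225:2379".
    The path below is an assumption about PD's HTTP API.
    """
    with urlopen(f"{pd_addr}/pd/api/v1/members", timeout=timeout) as resp:
        return json.load(resp)


def summarize_members(doc):
    """Reduce the members document to the leader name and member names."""
    leader = (doc.get("leader") or {}).get("name")
    names = [m.get("name") for m in doc.get("members", [])]
    return {"leader": leader, "members": names}
```

For example, `summarize_members({"members": [{"name": "pd-a"}, {"name": "pd-b"}], "leader": {"name": "pd-a"}})` gives `{"leader": "pd-a", "members": ["pd-a", "pd-b"]}`. `urlopen` raises `urllib.error.HTTPError` when the endpoint answers with an error status, so a failing PD shows up as an exception rather than an empty result.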

lddlww commented 2 years ago

No, it returned a 502 error.

lddlww commented 2 years ago

Here are the detailed PD server logs, covering 2022/1/12 15:50 to 2022/1/12 16:20: logs.zip

rleungx commented 2 years ago

Did PD work well before you upgraded the cluster? And can you provide more logs?

lddlww commented 2 years ago

Yes, all PD nodes were Up. The following logs cover 2022/01/12 00:00:00 to 2022/01/12 16:30:00, so they may include a little more than needed: logs.zip

It seems tiup threw an evict leader timeout error when upgrading the PD servers.

rleungx commented 2 years ago

When did you start the upgrade process? I didn't find any restart log or transfer leader operation.

lddlww commented 2 years ago

About 2022/01/12 15:56 (see the attached image).

rleungx commented 2 years ago

Is your log level set to warn?

lddlww commented 2 years ago

Oh, yes, the log level is warn.

lddlww commented 2 years ago

Is warn the default log level in 4.0.2?

rleungx commented 2 years ago

No, the default is info.
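
The log level matters here because a warn-level log drops info-level entries, which is likely why the restart and transfer leader records could not be found. A minimal sketch of that threshold rule, assuming the usual debug < info < warn < error < fatal ordering and the `[timestamp] [LEVEL]` line prefix seen in this issue. `is_written` is a made-up helper and the INFO line is a hypothetical entry, not taken from the attached logs.

```python
import re

# Assumed severity ordering, lowest to highest.
RANK = {"DEBUG": 0, "INFO": 1, "WARN": 2, "ERROR": 3, "FATAL": 4}

# Pulls the level out of the "[timestamp] [LEVEL]" prefix.
LEVEL_RE = re.compile(r'^\[[^\]]+\] \[([A-Z]+)\]')


def is_written(line, configured_level):
    """Return True if a line of this level survives the configured threshold."""
    m = LEVEL_RE.match(line)
    if not m or m.group(1) not in RANK:
        return False  # unknown layout or level: treat as not written
    return RANK[m.group(1)] >= RANK[configured_level.upper()]


# Hypothetical info-level entry, for illustration only.
INFO_LINE = '[2022/01/12 15:56:00.000 +08:00] [INFO] [example.go:1] ["hypothetical info-level entry"]'
# Error-level entry copied from the issue.
ERROR_LINE = '[2022/01/12 16:19:15.850 +08:00] [ERROR] [server.go:1275] ["failed to create raft cluster"] [error="context canceled"]'

if __name__ == "__main__":
    for level in ("warn", "info"):
        print(level, is_written(INFO_LINE, level), is_written(ERROR_LINE, level))
```

With the threshold at warn, only the error line is kept. At info, both are, so rerunning the upgrade with the default level should leave the restart and leader transfer steps visible in the log.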