tikv / pd

Placement driver for TiKV
Apache License 2.0

pd panic during br restore data and run workload #7133

Closed Lily2025 closed 11 months ago

Lily2025 commented 11 months ago

Bug Report

What did you do?

1.
export AWS_ACCESS_KEY_ID=minioadmin
export AWS_SECRET_ACCESS_KEY=minioadmin
/br restore db --send-credentials-to-tikv=true --db tpcc --pd http://tc-pd.endless-ha-test-htap-tps-2730651-1-24:2379 --storage s3://nfs/tiflash/ch-2k-flash2-collation --s3.endpoint http://minio.pingcap.net:9000 --check-requirements=false

2.
go-tpc ch run -D tpcc --host tc-tidb.endless-ha-test-htap-tps-2730651-1-24 -P4000 --warehouses 2000 -T 32 --acThreads 1 --queries q23 --ignore-error '2013,1213,1105,1205,8022,8028,9004,9007,1062' --time 36000m --user root --password '' --interval '10s'

What did you expect to see?

PD should not panic.

What did you see instead?

The PD leader panicked multiple times. The log records below are in reverse chronological order (newest first):

"panic: runtime error: invalid memory address or nil pointer dereference\n"
{"container":"pd","pod":"tc-pd-0","time":"2023-09-21T22:59:09.976312334Z","namespace":"endless-ha-test-htap-tps-2730651-1-24","log":"github.com/grpc-ecosystem/go-grpc-prometheus.init.(*ServerMetrics).UnaryServerInterceptor.func3({0x39cac18, 0xc00916edb0}, {0x2d2aea0, 0xc00b10a800}, 0xc009dd29c0?, 0xc004890050)\n","stream":"stderr"}
{"container":"pd","pod":"tc-pd-0","time":"2023-09-21T22:59:09.976308617Z","namespace":"endless-ha-test-htap-tps-2730651-1-24","log":"\t/go/pkg/mod/github.com/grpc-ecosystem/go-grpc-middleware@v1.0.1-0.20190118093823-f849b5445de4/chain.go:31 +0x7a\n","stream":"stderr"}
{"container":"pd","pod":"tc-pd-0","time":"2023-09-21T22:59:09.976305646Z","namespace":"endless-ha-test-htap-tps-2730651-1-24","log":"go.etcd.io/etcd/etcdserver/api/v3rpc.Server.ChainUnaryServer.func5.1({0x39cac18?, 0xc00916edb0?}, {0x2d2aea0?, 0xc00b10a800?})\n","stream":"stderr"}
{"container":"pd","pod":"tc-pd-0","time":"2023-09-21T22:59:09.976295715Z","namespace":"endless-ha-test-htap-tps-2730651-1-24","log":"\t/go/pkg/mod/github.com/pingcap/kvproto@v0.0.0-20230920042517-db656f45023b/pkg/pdpb/pdpb.pb.go:9910 +0x75\n","stream":"stderr"}
{"container":"pd","pod":"tc-pd-0","time":"2023-09-21T22:59:09.976292682Z","namespace":"endless-ha-test-htap-tps-2730651-1-24","log":"github.com/pingcap/kvproto/pkg/pdpb._PD_ReportBatchSplit_Handler.func1({0x39cac18, 0xc00916edb0}, {0x2d2aea0?, 0xc00b10a800})\n","stream":"stderr"}
{"container":"pd","pod":"tc-pd-0","time":"2023-09-21T22:59:09.976289825Z","namespace":"endless-ha-test-htap-tps-2730651-1-24","log":"\t/home/jenkins/agent/workspace/build-common/go/src/github.com/pingcap/pd/server/grpc_service.go:1734 +0x105\n","stream":"stderr"}
{"container":"pd","pod":"tc-pd-0","time":"2023-09-21T22:59:09.976286823Z","namespace":"endless-ha-test-htap-tps-2730651-1-24","log":"github.com/tikv/pd/server.(*GrpcServer).ReportBatchSplit(0xc001e0e140, {0x39cac18?, 0xc00916edb0?}, 0xc00b10a800)\n","stream":"stderr"}
{"container":"pd","pod":"tc-pd-0","time":"2023-09-21T22:59:09.976283913Z","namespace":"endless-ha-test-htap-tps-2730651-1-24","log":"\t/home/jenkins/agent/workspace/build-common/go/src/github.com/pingcap/pd/server/cluster/cluster_worker.go:267 +0x85\n","stream":"stderr"}
{"container":"pd","pod":"tc-pd-0","time":"2023-09-21T22:59:09.976280935Z","namespace":"endless-ha-test-htap-tps-2730651-1-24","log":"github.com/tikv/pd/server/cluster.(*RaftCluster).HandleBatchReportSplit(0xc001e0e140?, 0x39cac18?)\n","stream":"stderr"}
{"container":"pd","pod":"tc-pd-0","time":"2023-09-21T22:59:09.976277921Z","namespace":"endless-ha-test-htap-tps-2730651-1-24","log":"\t/home/jenkins/agent/workspace/build-common/go/src/github.com/pingcap/pd/server/cluster/cluster_worker.go:88 +0x665\n","stream":"stderr"}
{"container":"pd","pod":"tc-pd-0","time":"2023-09-21T22:59:09.97627468Z","namespace":"endless-ha-test-htap-tps-2730651-1-24","log":"github.com/tikv/pd/server/cluster.(*RaftCluster).ProcessRegionSplit(0xc000292500, {0xc00b395cb0?, 0x2, 0xc00b395cb0?})\n","stream":"stderr"}
{"container":"pd","pod":"tc-pd-0","time":"2023-09-21T22:59:09.9762714Z","namespace":"endless-ha-test-htap-tps-2730651-1-24","log":"\t/home/jenkins/agent/workspace/build-common/go/src/github.com/pingcap/pd/server/cluster/cluster.go:1148 +0x85\n","stream":"stderr"}
{"container":"pd","pod":"tc-pd-0","time":"2023-09-21T22:59:09.976268454Z","namespace":"endless-ha-test-htap-tps-2730651-1-24","log":"github.com/tikv/pd/server/cluster.(*RaftCluster).SaveRegion(0xc000292500, 0xc00675e400, 0xc00421f384)\n","stream":"stderr"}
{"container":"pd","pod":"tc-pd-0","time":"2023-09-21T22:59:09.976265566Z","namespace":"endless-ha-test-htap-tps-2730651-1-24","log":"\t/home/jenkins/agent/workspace/build-common/go/src/github.com/pingcap/pd/pkg/core/region.go:905 +0x254\n","stream":"stderr"}
{"container":"pd","pod":"tc-pd-0","time":"2023-09-21T22:59:09.976262515Z","namespace":"endless-ha-test-htap-tps-2730651-1-24","log":"github.com/tikv/pd/pkg/core.(*RegionsInfo).AtomicCheckAndPutRegion(0xc0003b4a80, 0xc00675e400)\n","stream":"stderr"}
{"container":"pd","pod":"tc-pd-0","time":"2023-09-21T22:59:09.976258991Z","namespace":"endless-ha-test-htap-tps-2730651-1-24","log":"\t/home/jenkins/agent/workspace/build-common/go/src/github.com/pingcap/pd/pkg/core/region.go:1012 +0x14a\n","stream":"stderr"}
{"container":"pd","pod":"tc-pd-0","time":"2023-09-21T22:59:09.976255623Z","namespace":"endless-ha-test-htap-tps-2730651-1-24","log":"github.com/tikv/pd/pkg/core.(*RegionsInfo).UpdateSubTree(0xc0003b4a80, 0xc00675e400, 0xc001471e00, {0x0, 0x0, 0x0?}, 0x0)\n","stream":"stderr"}
{"container":"pd","pod":"tc-pd-0","time":"2023-09-21T22:59:09.976252344Z","namespace":"endless-ha-test-htap-tps-2730651-1-24","log":"goroutine 434887 [running]:\n","stream":"stderr"}
{"container":"pd","pod":"tc-pd-0","time":"2023-09-21T22:59:09.976248772Z","namespace":"endless-ha-test-htap-tps-2730651-1-24","log":"\n","stream":"stderr"}
{"container":"pd","pod":"tc-pd-0","time":"2023-09-21T22:59:09.976244024Z","namespace":"endless-ha-test-htap-tps-2730651-1-24","log":"[signal SIGSEGV: segmentation violation code=0x1 addr=0x0 pc=0x177eb4a]\n","stream":"stderr"}
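The JSON records above carry decreasing timestamps, so the Go stack trace reads bottom-up, and each frame is split across two records. A minimal sketch for restoring the usual top-down trace, assuming only the "time" and "log" fields seen in the records above; the function names `sort_key` and `reconstruct_trace` are hypothetical helpers, not part of PD or any log tooling. The fractional seconds have varying lengths (for example `.9762714Z` versus `.976268454Z`), so the sketch pads them to nanoseconds rather than comparing the strings directly.

```python
import json


def sort_key(timestamp: str):
    """Sortable key for a UTC timestamp like 2023-09-21T22:59:09.9762714Z."""
    seconds, _, fraction = timestamp.rstrip("Z").partition(".")
    # Pad the fraction to nine digits so values with different
    # precision compare numerically instead of lexicographically.
    nanos = int(fraction.ljust(9, "0")) if fraction else 0
    return (seconds, nanos)


def reconstruct_trace(raw: str) -> str:
    """Decode whitespace-separated JSON records and join "log" in time order."""
    decoder = json.JSONDecoder()
    records = []
    pos = 0
    while pos < len(raw):
        # raw_decode does not accept leading whitespace, so skip it here.
        if raw[pos].isspace():
            pos += 1
            continue
        record, pos = decoder.raw_decode(raw, pos)
        records.append(record)
    records.sort(key=lambda r: sort_key(r["time"]))
    # Each "log" value already ends with its own newline.
    return "".join(r["log"] for r in records)
```

Passing the pasted records (without the leading quoted `panic:` line, which is not a JSON record) to `reconstruct_trace` should yield the trace in the order Go printed it, starting at the `[signal SIGSEGV ...]` line and the `UpdateSubTree` frame.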

What version of PD are you using (pd-server -V)?

./pd-server -V
Release Version: v7.4.0-alpha
Edition: Community
Git Commit Hash: 96ace89decdc0b5e0a050aa17ba4356057ec3b88
Git Branch: heads/refs/tags/v7.4.0-alpha
UTC Build Time: 2023-09-21 11:36:23
2023-09-22T00:06:12.873+0800

Lily2025 commented 11 months ago

/severity critical