tikv / migration

Migration tools for TiKV, e.g. online bulk load.
Apache License 2.0
36 stars 26 forks source link

integration-tests: G10 (sink_hang + sigstop) run more than 25 minutes #334

Closed pingyu closed 1 year ago

pingyu commented 1 year ago

Bug Report

1. Describe the bug

Integration tests G10 (sink_hang + sigstop) run more than 25 minutes

See: https://do.pingcap.net/jenkins/blue/organizations/jenkins/tikv%2Fmigration%2Fpull_integration_test/detail/pull_integration_test/75/pipeline/292/ https://do.pingcap.net/jenkins/blue/rest/organizations/jenkins/pipelines/tikv/pipelines/migration/pipelines/pull_integration_test/runs/75/nodes/292/steps/370/log/?start=0 G10-log.txt

2. Minimal reproduce step (Required)

Run CI in PR.

3. What did you see instead (Required)

CI success.

4. What did you expect to see? (Required)

CI failed.

5. What is your migration tool and TiKV version? (Required)

pingyu commented 1 year ago

The logs show that after kill -19 pd-server here, ETCD can not recover and block the pd-ctl health.

{"level":"warn","ts":"2023-02-09T07:56:24.178+0800","logger":"etcd-client","caller":"v3@v3.5.2/retry_interceptor.go:62","msg":"retrying of unary invoker failed","target":"etcd-endpoints://0xc000958a80/127.0.0.1:2379","attempt":0,"error":"rpc error: code = DeadlineExceeded desc = context deadline exceeded"}
{"level":"warn","ts":"2023-02-09T07:56:28.179+0800","logger":"etcd-client","caller":"v3@v3.5.2/retry_interceptor.go:62","msg":"retrying of unary invoker failed","target":"etcd-endpoints://0xc000958a80/127.0.0.1:2379","attempt":0,"error":"rpc error: code = Unavailable desc = keepalive ping failed to receive ACK within timeout"}
{"level":"warn","ts":"2023-02-09T07:56:30.179+0800","logger":"etcd-client","caller":"v3@v3.5.2/retry_interceptor.go:62","msg":"retrying of unary invoker failed","target":"etcd-endpoints://0xc000958a80/127.0.0.1:2379","attempt":1,"error":"rpc error: code = DeadlineExceeded desc = context deadline exceeded"}
{"level":"warn","ts":"2023-02-09T07:56:34.176+0800","logger":"etcd-client","caller":"v3@v3.5.2/retry_interceptor.go:62","msg":"retrying of unary invoker failed","target":"etcd-endpoints://0xc000958a80/127.0.0.1:2379","attempt":0,"error":"rpc error: code = DeadlineExceeded desc = context deadline exceeded"}
{"level":"info","ts":"2023-02-09T07:56:34.176+0800","logger":"etcd-client","caller":"v3@v3.5.2/client.go:207","msg":"Auto sync endpoints failed.","error":"context deadline exceeded"}
{"level":"warn","ts":"2023-02-09T07:56:36.179+0800","logger":"etcd-client","caller":"v3@v3.5.2/retry_interceptor.go:62","msg":"retrying of unary invoker failed","target":"etcd-endpoints://0xc000958a80/127.0.0.1:2379","attempt":0,"error":"rpc error: code = DeadlineExceeded desc = context deadline exceeded"}
{"level":"warn","ts":"2023-02-09T07:56:42.181+0800","logger":"etcd-client","caller":"v3@v3.5.2/retry_interceptor.go:62","msg":"retrying of unary invoker failed","target":"etcd-endpoints://0xc000958a80/127.0.0.1:2379","attempt":0,"error":"rpc error: code = DeadlineExceeded desc = context deadline exceeded"}
{"level":"warn","ts":"2023-02-09T07:56:48.182+0800","logger":"etcd-client","caller":"v3@v3.5.2/retry_interceptor.go:62","msg":"retrying of unary invoker failed","target":"etcd-endpoints://0xc000958a80/127.0.0.1:2379","attempt":0,"error":"rpc error: code = DeadlineExceeded desc = latest balancer error: last connection error: connection error: desc = \"error reading server preface: read tcp 127.0.0.1:59284->127.0.0.1:2379: use of closed network connection\""}
{"level":"warn","ts":"2023-02-09T07:56:54.182+0800","logger":"etcd-client","caller":"v3@v3.5.2/retry_interceptor.go:62","msg":"retrying of unary invoker failed","target":"etcd-endpoints://0xc000958a80/127.0.0.1:2379","attempt":0,"error":"rpc error: code = DeadlineExceeded desc = latest balancer error: last connection error: connection error: desc = \"error reading server preface: read tcp 127.0.0.1:59284->127.0.0.1:2379: use of closed network connection\""}
{"level":"warn","ts":"2023-02-09T07:57:00.183+0800","logger":"etcd-client","caller":"v3@v3.5.2/retry_interceptor.go:62","msg":"retrying of unary invoker failed","target":"etcd-endpoints://0xc000958a80/127.0.0.1:2379","attempt":0,"error":"rpc error: code = DeadlineExceeded desc = latest balancer error: last connection error: connection error: desc = \"error reading server preface: read tcp 127.0.0.1:59284->127.0.0.1:2379: use of closed network connection\""}
{"level":"warn","ts":"2023-02-09T07:57:06.184+0800","logger":"etcd-client","caller":"v3@v3.5.2/retry_interceptor.go:62","msg":"retrying of unary invoker failed","target":"etcd-endpoints://0xc000958a80/127.0.0.1:2379","attempt":0,"error":"rpc error: code = DeadlineExceeded desc = latest balancer error: last connection error: connection error: desc = \"error reading server preface: read tcp 127.0.0.1:59284->127.0.0.1:2379: use of closed network connection\""}
{"level":"warn","ts":"2023-02-09T07:57:09.177+0800","logger":"etcd-client","caller":"v3@v3.5.2/retry_interceptor.go:62","msg":"retrying of unary invoker failed","target":"etcd-endpoints://0xc000958a80/127.0.0.1:2379","attempt":0,"error":"rpc error: code = DeadlineExceeded desc = latest balancer error: last connection error: connection error: desc = \"error reading server preface: read tcp 127.0.0.1:59284->127.0.0.1:2379: use of closed network connection\""}
{"level":"info","ts":"2023-02-09T07:57:09.177+0800","logger":"etcd-client","caller":"v3@v3.5.2/client.go:207","msg":"Auto sync endpoints failed.","error":"context deadline exceeded"}
{"level":"warn","ts":"2023-02-09T07:57:12.185+0800","logger":"etcd-client","caller":"v3@v3.5.2/retry_interceptor.go:62","msg":"retrying of unary invoker failed","target":"etcd-endpoints://0xc000958a80/127.0.0.1:2379","attempt":0,"error":"rpc error: code = DeadlineExceeded desc = latest balancer error: last connection error: connection error: desc = \"error reading server preface: read tcp 127.0.0.1:59582->127.0.0.1:2379: use of closed network connection\""}
{"level":"warn","ts":"2023-02-09T07:57:18.185+0800","logger":"etcd-client","caller":"v3@v3.5.2/retry_interceptor.go:62","msg":"retrying of unary invoker failed","target":"etcd-endpoints://0xc000958a80/127.0.0.1:2379","attempt":0,"error":"rpc error: code = DeadlineExceeded desc = latest balancer error: last connection error: connection error: desc = \"error reading server preface: read tcp 127.0.0.1:59582->127.0.0.1:2379: use of closed network connection\""}
{"level":"warn","ts":"2023-02-09T07:57:24.187+0800","logger":"etcd-client","caller":"v3@v3.5.2/retry_interceptor.go:62","msg":"retrying of unary invoker failed","target":"etcd-endpoints://0xc000958a80/127.0.0.1:2379","attempt":0,"error":"rpc error: code = DeadlineExceeded desc = latest balancer error: last connection error: connection error: desc = \"error reading server preface: read tcp 127.0.0.1:59582->127.0.0.1:2379: use of closed network connection\""}
{"level":"warn","ts":"2023-02-09T07:57:30.188+0800","logger":"etcd-client","caller":"v3@v3.5.2/retry_interceptor.go:62","msg":"retrying of unary invoker failed","target":"etcd-endpoints://0xc000958a80/127.0.0.1:2379","attempt":0,"error":"rpc error: code = DeadlineExceeded desc = latest balancer error: last connection error: connection error: desc = \"error reading server preface: read tcp 127.0.0.1:59582->127.0.0.1:2379: use of closed network connection\""}
{"level":"warn","ts":"2023-02-09T07:57:36.189+0800","logger":"etcd-client","caller":"v3@v3.5.2/retry_interceptor.go:62","msg":"retrying of unary invoker failed","target":"etcd-endpoints://0xc000958a80/127.0.0.1:2379","attempt":0,"error":"rpc error: code = DeadlineExceeded desc = latest balancer error: last connection error: connection error: desc = \"error reading server preface: read tcp 127.0.0.1:59878->127.0.0.1:2379: use of closed network connection\""}
{"level":"warn","ts":"2023-02-09T07:57:42.190+0800","logger":"etcd-client","caller":"v3@v3.5.2/retry_interceptor.go:62","msg":"retrying of unary invoker failed","target":"etcd-endpoints://0xc000958a80/127.0.0.1:2379","attempt":0,"error":"rpc error: code = DeadlineExceeded desc = latest balancer error: last connection error: connection error: desc = \"error reading server preface: read tcp 127.0.0.1:59878->127.0.0.1:2379: use of closed network connection\""}
{"level":"warn","ts":"2023-02-09T07:57:44.177+0800","logger":"etcd-client","caller":"v3@v3.5.2/retry_interceptor.go:62","msg":"retrying of unary invoker failed","target":"etcd-endpoints://0xc000958a80/127.0.0.1:2379","attempt":0,"error":"rpc error: code = DeadlineExceeded desc = latest balancer error: last connection error: connection error: desc = \"error reading server preface: read tcp 127.0.0.1:59878->127.0.0.1:2379: use of closed network connection\""}
{"level":"info","ts":"2023-02-09T07:57:44.177+0800","logger":"etcd-client","caller":"v3@v3.5.2/client.go:207","msg":"Auto sync endpoints failed.","error":"context deadline exceeded"}
{"level":"warn","ts":"2023-02-09T07:57:48.192+0800","logger":"etcd-client","caller":"v3@v3.5.2/retry_interceptor.go:62","msg":"retrying of unary invoker failed","target":"etcd-endpoints://0xc000958a80/127.0.0.1:2379","attempt":0,"error":"rpc error: code = DeadlineExceeded desc = latest balancer error: last connection error: connection error: desc = \"error reading server preface: read tcp 127.0.0.1:59878->127.0.0.1:2379: use of closed network connection\""}

As it's not an issue of TiKV-CDC, I think we can just skip the check_count 2 "pd".