pingcap / tidb

TiDB is an open-source, cloud-native, distributed, MySQL-compatible database for elastic scale and real-time analytics.

TiDB panics after injecting a network partition between two AZs #54335

Closed Lily2025 closed 19 hours ago

Lily2025 commented 3 days ago

Bug Report

Please answer these questions before submitting your issue. Thanks!

1. Minimal reproduce step (Required)

1. Run the mussel workload.
2. Inject a network partition between two AZs.

2. What did you expect to see? (Required)

no panic

3. What did you see instead (Required)

TiDB panicked. Log, newest entry first:
2024-06-30 10:15:40 log="\n"
2024-06-30 10:15:40 log="/tidb-server --store=tikv --advertise-address=tc-tidb-1.tc-tidb-peer.endless-ha-test-airbnb-tps-7510461-1-362.svc --host=0.0.0.0 --path=tc-pd:2379 --config=/etc/tidb/tidb.toml\n"
2024-06-30 10:15:40 log="start tidb-server ...\n"
2024-06-30 10:15:37 log="\t/home/jenkins/agent/workspace/build-common/go/src/github.com/pingcap/tidb/cmd/tidb-server/main.go:905 +0x37c\n"
2024-06-30 10:15:37 log="created by main.createServer in goroutine 1\n"
2024-06-30 10:15:37 log="\t/home/jenkins/agent/workspace/build-common/go/src/github.com/pingcap/tidb/pkg/util/expensivequery/expensivequery.go:98 +0xaa8\n"
2024-06-30 10:15:37 log="github.com/pingcap/tidb/pkg/util/expensivequery.(*Handle).Run(0xc003868408)\n"
2024-06-30 10:15:37 log="\t/home/jenkins/agent/workspace/build-common/go/src/github.com/pingcap/tidb/pkg/server/server.go:918 +0x515\n"
2024-06-30 10:15:37 log="github.com/pingcap/tidb/pkg/server.(*Server).Kill(0xc003cd3b00, 0x1241bfee, 0x1, 0x0?)\n"
2024-06-30 10:15:37 log="\t/home/jenkins/agent/workspace/build-common/go/src/github.com/pingcap/tidb/pkg/server/server.go:950 +0x2ca\n"
2024-06-30 10:15:37 log="github.com/pingcap/tidb/pkg/server.killQuery(0xc1a069e680, 0x1)\n"
2024-06-30 10:15:37 log="\t/home/jenkins/agent/workspace/build-common/go/src/github.com/pingcap/tidb/pkg/util/sqlkiller/sqlkiller.go:84\n"
2024-06-30 10:15:37 log="github.com/pingcap/tidb/pkg/util/sqlkiller.(*SQLKiller).FinishResultSet(...)\n"
2024-06-30 10:15:37 log="\t/home/jenkins/agent/workspace/build-common/go/src/github.com/pingcap/tidb/pkg/server/conn.go:2064 +0x1c\n"
2024-06-30 10:15:37 log="github.com/pingcap/tidb/pkg/server.(*clientConn).handleStmt.func1()\n"
2024-06-30 10:15:37 log="\t/home/jenkins/agent/workspace/build-common/go/src/github.com/pingcap/tidb/pkg/server/internal/resultset/resultset.go:69 +0x33\n"
2024-06-30 10:15:37 log="github.com/pingcap/tidb/pkg/server/internal/resultset.(*tidbResultSet).Finish(0xa2b2e20?)\n"
2024-06-30 10:15:37 log="\t/home/jenkins/agent/workspace/build-common/go/src/github.com/pingcap/tidb/pkg/session/session.go:2358 +0x14\n"
2024-06-30 10:15:37 log="github.com/pingcap/tidb/pkg/session.(*execStmtResult).Finish(0xc0032f59a0?)\n"
2024-06-30 10:15:37 log="goroutine 12862 [running]:\n"
2024-06-30 10:15:37 log="\n"
2024-06-30 10:15:37 log="[signal SIGSEGV: segmentation violation code=0x1 addr=0x0 pc=0x540d174]\n"
2024-06-30 10:15:37 log="panic: runtime error: invalid memory address or nil pointer dereference\n"
2024-06-30 10:15:37 log="[conn.go:1162] [\"command dispatched failed\"] [conn=306298908] [session_alias=] [connInfo=\"id:306298908, addr:10.233.108.10:51888 status:10, collation:utf8mb4_general_ci, user:root\"] [command=Query] [status=\"inTxn:0, autocommit:1\"] [sql=\"select /*+ max_execution_time(400), set_var(tikv_client_read_timeout=100) */ pk, sk, ts, v from t1 as of timestamp now() - interval 10 second where pk = '151832271' and sk = 'y4_16' and ts >= '2024-06-30 02:14:36.647382' and ts < '2024-06-30 02:15:26.647382' order by ts desc limit 5\"] [txn_mode=PESSIMISTIC] [timestamp=450812634988544000] [err=\"context canceled\\ngithub.com/pingcap/errors.AddStack\\n\\t/go/pkg/mod/github.com/pingcap/errors@v0.11.5-0.20240318064555-6bd07397691f/errors.go:178\\ngithub.com/pingcap/errors.Trace\\n\\t/go/pkg/mod/github.com/pingcap/errors@v0.11.5-0.20240318064555-6bd07397691f/juju_adaptor.go:15\\ngithub.com/pingcap/tidb/pkg/store/copr.(*copIterator).Next\\n\\t/home/jenkins/agent/workspace/build-common/go/src/github.com/pingcap/tidb/pkg/store/copr/coprocessor.go:1095\\ngithub.com/pingcap/tidb/pkg/distsql.(*selectResult).fetchResp\\n\\t/home/jenkins/agent/workspace/build-common/go/src/github.com/pingcap/tidb/pkg/distsql/select_result.go:318\\ngithub.com/pingcap/tidb/pkg/distsql.(*selectResult).Next\\n\\t/home/jenkins/agent/workspace/build-common/go/src/github.com/pingcap/tidb/pkg/distsql/select_result.go:384\\ngithub.com/pingcap/tidb/pkg/executor.(*tableResultHandler).nextChunk\\n\\t/home/jenkins/agent/workspace/build-common/go/src/github.com/pingcap/tidb/pkg/executor/table_reader.go:607\\ngithub.com/pingcap/tidb/pkg/executor.(*TableReaderExecutor).Next\\n\\t/home/jenkins/agent/workspace/build-common/go/src/github.com/pingcap/tidb/pkg/executor/table_reader.go:330\\ngithub.com/pingcap/tidb/pkg/executor/internal/exec.Next\\n\\t/home/jenkins/agent/workspace/build-common/go/src/github.com/pingcap/tidb/pkg/executor/internal/exec/executor.go:410\\ngithub.com/pingcap/tidb/pkg/executor.(*LimitExec).Next\\n\\t/home/jenkins/agent/workspace/build-common/go/src/github.com/pingcap/tidb/pkg/executor/executor.go:1366\\ngithub.com/pingcap/tidb/pkg/executor/internal/exec.Next\\n\\t/home/jenkins/agent/workspace/build-common/go/src/github.com/pingcap/tidb/pkg/executor/internal/exec/executor.go:410\\ngithub.com/pingcap/tidb/pkg/executor.(*ExecStmt).next\\n\\t/home/jenkins/agent/workspace/build-common/go/src/github.com/pingcap/tidb/pkg/executor/adapter.go:1250\\ngithub.com/pingcap/tidb/pkg/executor.(*recordSet).Next\\n\\t/home/jenkins/agent/workspace/build-common/go/src/github.com/pingcap/tidb/pkg/executor/adapter.go:175\\ngithub.com/pingcap/tidb/pkg/server/internal/resultset.(*tidbResultSet).Next\\n\\t/home/jenkins/agent/workspace/build-common/go/src/github.com/pingcap/tidb/pkg/server/internal/resultset/resultset.go:64\\ngithub.com/pingcap/tidb/pkg/server.(*clientConn).writeChunks\\n\\t/home/jenkins/agent/workspace/build-common/go/src/github.com/pingcap/tidb/pkg/server/conn.go:2332\\ngithub.com/pingcap/tidb/pkg/server.(*clientConn).writeResultSet\\n\\t/home/jenkins/agent/workspace/build-common/go/src/github.com/pingcap/tidb/pkg/server/conn.go:2275\\ngithub.com/pingcap/tidb/pkg/server.(*clientConn).handleStmt\\n\\t/home/jenkins/agent/workspace/build-common/go/src/github.com/pingcap/tidb/pkg/server/conn.go:2068\\ngithub.com/pingcap/tidb/pkg/server.(*clientConn).handleQuery\\n\\t/home/jenkins/agent/workspace/build-common/go/src/github.com/pingcap/tidb/pkg/server/conn.go:1785\\ngithub.com/pingcap/tidb/pkg/server.(*clientConn).dispatch\\n\\t/home/jenkins/agent/workspace/build-common/go/src/github.com/pingcap/tidb/pkg/server/conn.go:1359\\ngithub.com/pingcap/tidb/pkg/server.(*clientConn).Run\\n\\t/home/jenkins/agent/workspace/build-common/go/src/github.com/pingcap/tidb/pkg/server/conn.go:1125\\ngithub.com/pingcap/tidb/pkg/server.(*Server).onConn\\n\\t/home/jenkins/agent/workspace/build-common/go/src/github.com/pingcap/tidb/pkg/server/server.go:739\\nruntime.goexit\\n\\t/usr/local/go/src/runtime/asm_amd64.s:1650\"]"
2024-06-30 10:15:37 log="[conn.go:1162] [\"command dispatched failed\"] [conn=306298862] [session_alias=] [connInfo=\"id:306298862, addr:10.233.119.153:60770 status:10, collation:utf8mb4_general_ci, user:root\"] [command=Query] [status=\"inTxn:0, autocommit:1\"] [sql=\"select /*+ max_execution_time(400), set_var(tikv_client_read_timeout=100) */ pk, sk, ts, v from t1 as of timestamp now() - interval 10 second where pk = '151832271' and sk = 'y4_16' and ts >= '2024-06-30 02:14:36.648815' and ts < '2024-06-30 02:15:26.648815' order by ts desc limit 5\"] [txn_mode=PESSIMISTIC] [timestamp=450812634988544000] [err=\"context canceled\\ngithub.com/pingcap/errors.AddStack\\n\\t/go/pkg/mod/github.com/pingcap/errors@v0.11.5-0.20240318064555-6bd07397691f/errors.go:178\\ngithub.com/pingcap/errors.Trace\\n\\t/go/pkg/mod/github.com/pingcap/errors@v0.11.5-0.20240318064555-6bd07397691f/juju_adaptor.go:15\\ngithub.com/pingcap/tidb/pkg/store/copr.(*copIterator).Next\\n\\t/home/jenkins/agent/workspace/build-common/go/src/github.com/pingcap/tidb/pkg/store/copr/coprocessor.go:1095\\ngithub.com/pingcap/tidb/pkg/distsql.(*selectResult).fetchResp\\n\\t/home/jenkins/agent/workspace/build-common/go/src/github.com/pingcap/tidb/pkg/distsql/select_result.go:318\\ngithub.com/pingcap/tidb/pkg/distsql.(*selectResult).Next\\n\\t/home/jenkins/agent/workspace/build-common/go/src/github.com/pingcap/tidb/pkg/distsql/select_result.go:384\\ngithub.com/pingcap/tidb/pkg/executor.(*tableResultHandler).nextChunk\\n\\t/home/jenkins/agent/workspace/build-common/go/src/github.com/pingcap/tidb/pkg/executor/table_reader.go:607\\ngithub.com/pingcap/tidb/pkg/executor.(*TableReaderExecutor).Next\\n\\t/home/jenkins/agent/workspace/build-common/go/src/github.com/pingcap/tidb/pkg/executor/table_reader.go:330\\ngithub.com/pingcap/tidb/pkg/executor/internal/exec.Next\\n\\t/home/jenkins/agent/workspace/build-common/go/src/github.com/pingcap/tidb/pkg/executor/internal/exec/executor.go:410\\ngithub.com/pingcap/tidb/pkg/executor.(*LimitExec).Next\\n\\t/home/jenkins/agent/workspace/build-common/go/src/github.com/pingcap/tidb/pkg/executor/executor.go:1366\\ngithub.com/pingcap/tidb/pkg/executor/internal/exec.Next\\n\\t/home/jenkins/agent/workspace/build-common/go/src/github.com/pingcap/tidb/pkg/executor/internal/exec/executor.go:410\\ngithub.com/pingcap/tidb/pkg/executor.(*ExecStmt).next\\n\\t/home/jenkins/agent/workspace/build-common/go/src/github.com/pingcap/tidb/pkg/executor/adapter.go:1250\\ngithub.com/pingcap/tidb/pkg/executor.(*recordSet).Next\\n\\t/home/jenkins/agent/workspace/build-common/go/src/github.com/pingcap/tidb/pkg/executor/adapter.go:175\\ngithub.com/pingcap/tidb/pkg/server/internal/resultset.(*tidbResultSet).Next\\n\\t/home/jenkins/agent/workspace/build-common/go/src/github.com/pingcap/tidb/pkg/server/internal/resultset/resultset.go:64\\ngithub.com/pingcap/tidb/pkg/server.(*clientConn).writeChunks\\n\\t/home/jenkins/agent/workspace/build-common/go/src/github.com/pingcap/tidb/pkg/server/conn.go:2332\\ngithub.com/pingcap/tidb/pkg/server.(*clientConn).writeResultSet\\n\\t/home/jenkins/agent/workspace/build-common/go/src/github.com/pingcap/tidb/pkg/server/conn.go:2275\\ngithub.com/pingcap/tidb/pkg/server.(*clientConn).handleStmt\\n\\t/home/jenkins/agent/workspace/build-common/go/src/github.com/pingcap/tidb/pkg/server/conn.go:2068\\ngithub.com/pingcap/tidb/pkg/server.(*clientConn).handleQuery\\n\\t/home/jenkins/agent/workspace/build-common/go/src/github.com/pingcap/tidb/pkg/server/conn.go:1785\\ngithub.com/pingcap/tidb/pkg/server.(*clientConn).dispatch\\n\\t/home/jenkins/agent/workspace/build-common/go/src/github.com/pingcap/tidb/pkg/server/conn.go:1359\\ngithub.com/pingcap/tidb/pkg/server.(*clientConn).Run\\n\\t/home/jenkins/agent/workspace/build-common/go/src/github.com/pingcap/tidb/pkg/server/conn.go:1125\\ngithub.com/pingcap/tidb/pkg/server.(*Server).onConn\\n\\t/home/jenkins/agent/workspace/build-common/go/src/github.com/pingcap/tidb/pkg/server/server.go:739\\nruntime.goexit\\n\\t/usr/local/go/src/runtime/asm_amd64.s:1650\"]"
2024-06-30 10:15:37 log="[sqlkiller.go:112] [\"kill finished\"] [conn=306298908]"

4. What is your TiDB version? (Required)

./tidb-server -V
Release Version: v8.2.0-alpha
Edition: Community
Git Commit Hash: 7df4f66324905dfe2bf9e0f5288a0b2ce089098c
Git Branch: heads/refs/tags/v8.2.0-alpha
UTC Build Time: 2024-06-29 11:47:18
GoVersion: go1.21.10
Race Enabled: false
Check Table Before Drop: false
Store: unistore
2024-06-30T09:57:02.114+0800

Lily2025 commented 3 days ago

/severity major
/assign zyguan

zyguan commented 2 days ago

This might be related to this PR: it is possible to call tidbResultSet.Finish after the trs has been closed (trs.recordSet is nil after close). @wshwsh12 Could you PTAL?
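
For readers following along, here is a minimal Go sketch of the failure mode described in this comment. Everything in it is a simplified, hypothetical stand-in and not TiDB's actual code: `finisher`, `recordSet`, `resultSet`, `FinishUnguarded`, and the `panics` helper are names invented for illustration. It only shows that calling through a field that a close has set to nil panics with a nil pointer dereference, and that a nil check under a lock turns the late call into a no-op. Whether the real fix takes this shape is an assumption.

```go
package main

import (
	"fmt"
	"sync"
)

// finisher is a hypothetical stand-in for the executor-side record set.
type finisher interface {
	Finish() error
}

// recordSet is a trivial finisher used only for this demonstration.
type recordSet struct{}

func (recordSet) Finish() error { return nil }

// resultSet is a hypothetical wrapper owned by a connection handler and
// also reachable from a separate killer goroutine through a callback.
type resultSet struct {
	mu sync.Mutex
	rs finisher // set to nil by Close
}

// Close drops the underlying record set, leaving the field nil.
func (t *resultSet) Close() {
	t.mu.Lock()
	defer t.mu.Unlock()
	t.rs = nil
}

// FinishUnguarded calls through without a nil check, so it panics with a
// nil pointer dereference once Close has run.
func (t *resultSet) FinishUnguarded() error {
	return t.rs.Finish()
}

// Finish treats an already-closed result set as a no-op (assumed guard).
func (t *resultSet) Finish() error {
	t.mu.Lock()
	defer t.mu.Unlock()
	if t.rs == nil {
		return nil
	}
	return t.rs.Finish()
}

// panics reports whether f panicked.
func panics(f func()) (panicked bool) {
	defer func() {
		if r := recover(); r != nil {
			panicked = true
		}
	}()
	f()
	return false
}

func main() {
	rs := &resultSet{rs: recordSet{}}
	rs.Close() // the handler has already closed the result set

	// A late call from the killer side arrives after Close.
	fmt.Println("unguarded Finish panicked:", panics(func() { _ = rs.FinishUnguarded() }))
	fmt.Println("guarded Finish panicked:", panics(func() { _ = rs.Finish() }))
}
```

The mutex in the sketch is there because the close and the late Finish would come from different goroutines; a nil check alone would still leave a window between the check and the call.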

panic: runtime error: invalid memory address or nil pointer dereference
[signal SIGSEGV: segmentation violation code=0x1 addr=0x0 pc=0x540d174]

goroutine 12862 [running]:
github.com/pingcap/tidb/pkg/session.(*execStmtResult).Finish(0xc0032f59a0?)
        /home/jenkins/agent/workspace/build-common/go/src/github.com/pingcap/tidb/pkg/session/session.go:2358 +0x14
github.com/pingcap/tidb/pkg/server/internal/resultset.(*tidbResultSet).Finish(0xa2b2e20?)
        /home/jenkins/agent/workspace/build-common/go/src/github.com/pingcap/tidb/pkg/server/internal/resultset/resultset.go:69 +0x33
github.com/pingcap/tidb/pkg/server.(*clientConn).handleStmt.func1()
        /home/jenkins/agent/workspace/build-common/go/src/github.com/pingcap/tidb/pkg/server/conn.go:2064 +0x1c
github.com/pingcap/tidb/pkg/util/sqlkiller.(*SQLKiller).FinishResultSet(...)
        /home/jenkins/agent/workspace/build-common/go/src/github.com/pingcap/tidb/pkg/util/sqlkiller/sqlkiller.go:84
github.com/pingcap/tidb/pkg/server.killQuery(0xc1a069e680, 0x1)