pingcap / tidb

TiDB - the open-source, cloud-native, distributed SQL database designed for modern applications.
https://pingcap.com
Apache License 2.0
37.2k stars 5.84k forks source link

tiflash schrodinger test failed by `[tikv:1205]Lock wait timeout exceeded`. #28040

Closed LittleFall closed 2 years ago

LittleFall commented 3 years ago

Bug Report

Please answer these questions before submitting your issue. Thanks!

1. scene

https://ci.pingcap.net/blue/organizations/jenkins/tiflash_schrodinger_test/detail/tiflash_schrodinger_test/1097/pipeline

log: http://fileserver.pingcap.net/download/logs/pingcap/tiflash/tiflash-jenkins-test-log-tiflash_schrodinger_test-1097.tar.gz

2. What did you expect to see? (Required)

test passed.

3. What did you see instead (Required)

bank.log

[case-id: ] 2021/09/14 06:58:56 bank.go:315: ESC[0;31m[error] [bank] accounts verify failed, query uuid is %!s(int=1000000000)%!(EXTRA string=30723232-e250-40c2-b26a-fab939f8f25a)ESC[0m

tidb.log

[2021/09/14 06:58:50.713 +00:00] [WARN] [pd.go:109] ["get timestamp too slow"] [conn=387] ["cost time"=213.331336ms]
[2021/09/14 06:58:53.007 +00:00] [INFO] [2pc.go:951] ["send TxnHeartBeat"] [startTS=427714816811466762] [newTTL=178881]
[2021/09/14 06:58:53.666 +00:00] [WARN] [pd.go:109] ["get timestamp too slow"] [conn=497] ["cost time"=317.758671ms]
[2021/09/14 06:58:53.777 +00:00] [WARN] [pd.go:109] ["get timestamp too slow"] ["cost time"=81.58648ms]
[2021/09/14 06:58:54.139 +00:00] [WARN] [session.go:1384] ["run statement failed"] [conn=6] [schemaVersion=28] [error="[tikv:1205]Lock wait timeout exceeded; try restarting transaction"] [session="{\n  \"currDBName\": \"test\",\n  \"id\": 6,\n  \"status\": 3,\n  \"strictMode\": true,\n  \"txn\": \"427714852306550816\",\n  \"user\": {\n    \"Username\": \"root\",\n    \"Hostname\": \"127.0.0.1\",\n    \"CurrentUser\": false,\n    \"AuthUsername\": \"root\",\n    \"AuthHostname\": \"%\"\n  }\n}"]
[2021/09/14 06:58:54.748 +00:00] [WARN] [pd.go:109] ["get timestamp too slow"] [conn=492] ["cost time"=166.882143ms]
[2021/09/14 06:58:55.467 +00:00] [INFO] [conn.go:864] ["command dispatched failed"] [conn=397] [connInfo="id:397, addr:127.0.0.1:58215 status:11, collation:utf8_general_ci, user:root"] [command=Query] [status="inTxn:1, autocommit:1"] [sql="set @@session.tidb_isolation_read_engines='tiflash'; select sum(balance) as total, '30723232-e250-40c2-b26a-fab939f8f25a' from accounts"] [txn_mode=PESSIMISTIC] [err="[tikv:2]get region failed: 2 : rpc error: code = Unavailable desc = not leader"]
[2021/09/14 06:58:55.957 +00:00] [INFO] [conn.go:864] ["command dispatched failed"] [conn=6] [connInfo="id:6, addr:127.0.0.1:50068 status:11, collation:utf8_general_ci, user:root"] [command=Query] [status="inTxn:1, autocommit:1"] [sql="set @@session.tidb_isolation_read_engines='tikv';SELECT id, balance FROM accounts WHERE id IN (808043, 299020) FOR UPDATE"] [txn_mode=PESSIMISTIC] [err="[tikv:1205]Lock wait timeout exceeded; try restarting transaction\ngithub.com/pingcap/errors.AddStack\n\t/go/pkg/mod/github.com/pingcap/errors@v0.11.5-0.20201126102027-b0a155152ca3/errors.go:174\ngithub.com/pingcap/errors.Trace\n\t/go/pkg/mod/github.com/pingcap/errors@v0.11.5-0.20201126102027-b0a155152ca3/juju_adaptor.go:15\ngithub.com/pingcap/tidb/store/tikv.actionPessimisticLock.handleSingleBatch\n\t/home/jenkins/agent/workspace/ld_tidb_multi_branch_release-4.0/go/src/github.com/pingcap/tidb/store/tikv/2pc.go:1087\ngithub.com/pingcap/tidb/store/tikv.(*batchExecutor).startWorker.func1\n\t/home/jenkins/agent/workspace/ld_tidb_multi_branch_release-4.0/go/src/github.com/pingcap/tidb/store/tikv/2pc.go:1868\nruntime.goexit\n\t/usr/local/go/src/runtime/asm_amd64.s:1357"]
[2021/09/14 06:58:56.116 +00:00] [WARN] [pd.go:109] ["get timestamp too slow"] [conn=361] ["cost time"=158.755305ms]
[2021/09/14 06:58:56.118 +00:00] [WARN] [pd.go:109] ["get timestamp too slow"] [conn=494] ["cost time"=160.018724ms]
[2021/09/14 06:58:56.118 +00:00] [WARN] [pd.go:109] ["get timestamp too slow"] ["cost time"=160.350345ms]

4. What is your TiDB version? (Required)

v4.0.15-pre

LittleFall commented 3 years ago

there are many tidb_slow.log in this test, maybe we should reduce the pressure of test environment and retry.

solotzg commented 3 years ago

https://github.com/pingcap/tiflash/pull/1772 We should add more CPU/MEM resources for schrodinger-test & daily-test.

solotzg commented 3 years ago

However, we have no idea whether IO resource is enough.

XuHuaiyu commented 2 years ago

not reproducible