matrixorigin / matrixone

Hyperconverged cloud-edge native database
https://docs.matrixorigin.cn/en
Apache License 2.0

[Bug]: restore snapshot fail, msg: Error 20618 (HY000): r-w conflict #18782

Open PHK-20 opened 6 days ago

PHK-20 commented 6 days ago

Is there an existing issue for the same bug?

Branch Name

f01f07a2a

Commit ID

f01f07a2a

Other Environment Information

- Hardware parameters:
- OS type: qa
- Others:

Actual Behavior

restore snapshot fail
snapshot_name: 0191e97b-81b1-7cf0-af25-db994498c017_1726216234854
instance_id: 0191e97b-81b1-7cf0-af25-db994498c017
ErrorMsg: Error 20618 (HY000): r-w conflict

Expected Behavior

No response

Steps to Reproduce

restore account `0191e97b-81b1-7cf0-af25-db994498c017` from snapshot `0191e97b-81b1-7cf0-af25-db994498c017_1726216234854`

Additional information

No response

YANGGMM commented 6 days ago

https://grafana.cn-qa.matrixone.tech/explore?panes=%7B%22Qqa%22:%7B%22datasource%22:%22d28a3d73-a58a-4bd1-95bd-4ad0f7be2a4a%22,%22queries%22:%5B%7B%22refId%22:%22A%22,%22expr%22:%22%7Bmatrixone_cloud_component%3D%5C%22cn%5C%22%7D%20%7C%3D%20%60conflict%60%22,%22queryType%22:%22range%22,%22datasource%22:%7B%22type%22:%22loki%22,%22uid%22:%22d28a3d73-a58a-4bd1-95bd-4ad0f7be2a4a%22%7D,%22editorMode%22:%22builder%22%7D%5D,%22range%22:%7B%22from%22:%22now-6h%22,%22to%22:%22now%22%7D%7D%7D&schemaVersion=1&orgId=1

{"level":"ERROR","time":"2024/09/13 08:34:38.479238 +0000","name":"cn-service.frontend","caller":"frontend/util.go:511","msg":"query trace status","uuid":"61383333-3930-6632-3764-366465616366","statement":"commit","status":"fail","error":"r-w conflict","txn_info":"0191ea82d8d47e24bb5acc284d88d13f:1726216477924590384-1","session_info":"connectionId 1972155737||{account sys:dump:moadmin -- 0:1:0}|goRoutineId 40160427|migrate-goRoutineId 0|0191ea82-d87b-729b-9e5f-3a8b3c3b531b","background":true,"session_id":"0191ea82-d8a4-73da-a993-fe53df39b6c9","span":{"trace_id":"dbf23b28-2aac-a00a-a3a0-6ffcbc10f1d2","span_id":"d11c0100db94fbc6"}}
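For anyone triaging similar records, here is a minimal Python sketch that pulls the useful fields out of the log line above. It is not part of MatrixOne. The helper names `parse_txn_info` and `summarize` are made up for illustration, and reading `txn_info` as `<txn id>:<physical ns>-<logical>` is an assumption based only on how this one value looks.

```python
import json
from datetime import datetime, timezone

# Abbreviated copy of the record above; only the fields used here are kept.
LOG_LINE = (
    '{"level":"ERROR","time":"2024/09/13 08:34:38.479238 +0000",'
    '"statement":"commit","status":"fail","error":"r-w conflict",'
    '"txn_info":"0191ea82d8d47e24bb5acc284d88d13f:1726216477924590384-1"}'
)


def parse_txn_info(txn_info):
    """Split txn_info into (txn_id, physical_ns, logical).

    ASSUMPTION: the layout is "<txn id>:<physical ns>-<logical>".
    """
    txn_id, _, ts = txn_info.partition(":")
    physical, _, logical = ts.partition("-")
    return txn_id, int(physical), int(logical)


def summarize(line):
    """Return the fields that matter when chasing a failed commit."""
    rec = json.loads(line)
    txn_id, physical_ns, logical = parse_txn_info(rec["txn_info"])
    # Integer division keeps the conversion exact; sub-second digits are dropped.
    txn_time = datetime.fromtimestamp(physical_ns // 1_000_000_000, tz=timezone.utc)
    return {
        "statement": rec["statement"],
        "status": rec["status"],
        "error": rec["error"],
        "txn_id": txn_id,
        "physical_ns": physical_ns,
        "logical": logical,
        "txn_time_utc": txn_time.isoformat(),
    }


if __name__ == "__main__":
    for key, value in summarize(LOG_LINE).items():
        print(f"{key}: {value}")
```

If the assumed layout holds, the timestamp in `txn_info` falls about one second before the time the failure was logged, and about four minutes after the millisecond timestamp embedded in the snapshot name.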

XuPeng-SH commented 6 days ago

@triump2020

triump2020 commented 5 days ago

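There are two possible causes of the r-w conflict:

  1. The restore generated a large number of deletes, which were written to S3. Logging has been added for this, but the issue has not been reproduced yet. However, restore only performs delete operations on mo_tables, so in theory the deletes should not be written to S3.
  2. After TN received the transaction commit and performed the transfer in the freeze phase, another merge transaction was committed before prepare commit.

The logs are needed to determine which of the two cases applies.

Note: the related optimization has already been made on main, but not on 1.2.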
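To make the second possibility above concrete, here is a toy Python model of the window between the transfer in the freeze phase and prepare commit. It is a sketch under heavy assumptions, not MatrixOne's TN code. `ToyTN`, `commit`, and the object names are hypothetical, and it keeps row offsets unchanged across a merge, which a real merge would not do.

```python
class ToyTN:
    """Toy model of the commit path; NOT MatrixOne's implementation."""

    def __init__(self):
        self.live = {"obj-1"}  # objects currently visible
        self.moved = {}        # old object -> object that replaced it in a merge

    def merge(self, old, new):
        """A merge transaction commits: `old` is replaced by `new`."""
        self.live.discard(old)
        self.live.add(new)
        self.moved[old] = new

    def transfer(self, deletes):
        """Follow merge redirects so each delete targets a live object."""
        out = []
        for obj, row in deletes:
            while obj in self.moved:
                obj = self.moved[obj]
            out.append((obj, row))
        return out

    def prepare_commit(self, deletes):
        """Reject deletes that point at an object merged away in the meantime."""
        for obj, _ in deletes:
            if obj not in self.live:
                raise RuntimeError("r-w conflict")


def commit(tn, deletes, racing_merge=None):
    """Run transfer (freeze phase), an optional racing merge, then prepare commit."""
    deletes = tn.transfer(deletes)
    if racing_merge is not None:
        tn.merge(*racing_merge)  # lands inside the window
    tn.prepare_commit(deletes)
    return deletes


if __name__ == "__main__":
    # Merge committed before the transfer: the delete is redirected and commits.
    tn = ToyTN()
    tn.merge("obj-1", "obj-2")
    print(commit(tn, [("obj-1", 7)]))

    # Merge committed after the transfer: the delete is stale at prepare commit.
    try:
        commit(ToyTN(), [("obj-1", 7)], racing_merge=("obj-1", "obj-2"))
    except RuntimeError as err:
        print("commit failed:", err)
```

In this model, a merge that commits before the transfer is harmless because the delete gets redirected. A merge that commits after the transfer leaves the delete pointing at a stale object, and the conflict only shows up at prepare commit.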

triump2020 commented 21 hours ago

Added logging to determine the specific cause; the issue has not been reproduced yet.