zettadb / cluster_mgr

Cluster_mgr is an important component of KunlunBase. It provides an HTTP API for KunlunBase users to do cluster management, provisioning and monitoring work, so that users can install a cluster, a kunlun-server node, a storage shard or a kunlun-storage node by calling these APIs. This capability lets users integrate KunlunBase management and provisioning into their existing applications or GUIs. Cluster_mgr also performs other important background cluster maintenance work to make sure the KunlunBase clusters it serves work efficiently and reliably.
http://www.kunlunbase.com
Apache License 2.0

The cluster manager must write a commit log record before cleaning up a distributed transaction #9

Open jd-zhang opened 2 years ago

jd-zhang commented 2 years ago

Issue migrated from trac ticket # 533

component: cluster manager | priority: major

2022-03-31 11:32:38: smith created the issue


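The current two-phase commit logic for distributed transactions:

1. The compute node sends xa prepare 'xxx' to the participant shards.
2. The compute node writes the commit log.
3. If step 2 succeeds, the compute node sends xa commit 'xxx' to commit the transaction.

If the compute node has already failed and restarted by the time it reaches step 2, transactions left in the prepared state remain on the shards. In the current design, the cluster manager is responsible for periodically cleaning up these leftover transactions.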

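The following scenario can occur:

1. The compute node sends xa prepare to participants 1 and 2. Participant 1 returns quickly, while participant 2 takes a long time.
2. While the compute node is waiting for participant 2's response, the connection between participant 1 and the compute node is dropped (for example because it was idle too long, or because of a restart).
3. The cluster manager sees the leftover prepared transaction on participant 1 and rolls it back.
4. The compute node finally receives participant 2's response and then writes the commit log.
5. Only when it sends XA commit to the participants does the compute node discover that the connection to participant 1 has been dropped.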

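This scenario leads to inconsistent data:

(1) The decision in the commit log disagrees with the cluster manager's decision.
(2) Participant 1 and participant 2 disagree (one committed, the other rolled back).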
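
The five steps can be replayed in a small in-memory model to make the outcome concrete. This is only an illustrative sketch: `Shard`, `replay_race` and the dictionary standing in for the commit log are hypothetical names, not cluster_mgr or KunlunBase code, and the sketch assumes the cluster manager rolls back a prepared transaction whenever it finds no commit log entry for it at the time of the check.

```python
# Toy in-memory model of the race; all names here are hypothetical.

class Shard:
    def __init__(self, name):
        self.name = name
        self.state = {}  # xid -> "prepared" | "committed" | "rolled_back"

    def xa_prepare(self, xid):
        self.state[xid] = "prepared"

    def xa_commit(self, xid):
        if self.state.get(xid) != "prepared":
            raise RuntimeError(f"{self.name}: {xid} is not in prepared state")
        self.state[xid] = "committed"

    def xa_rollback(self, xid):
        if self.state.get(xid) == "prepared":
            self.state[xid] = "rolled_back"


def replay_race():
    xid = "xxx"
    commit_log = {}  # xid -> decision, written only by the compute node
    p1, p2 = Shard("participant1"), Shard("participant2")

    # Step 1: participant 1 answers the prepare quickly.
    p1.xa_prepare(xid)
    # Step 2: the connection to participant 1 drops while participant 2 is still busy.
    # Step 3: the cluster manager finds no commit log entry yet, so it rolls back
    # (assumed cleanup rule, see the lead-in).
    if xid not in commit_log:
        p1.xa_rollback(xid)
    # Step 4: participant 2 finally answers, and the compute node logs "commit".
    p2.xa_prepare(xid)
    commit_log[xid] = "commit"
    # Step 5: the commit reaches participant 2 but can no longer apply on participant 1.
    p2.xa_commit(xid)
    try:
        p1.xa_commit(xid)
    except RuntimeError:
        pass

    return commit_log[xid], p1.state[xid], p2.state[xid]
```

In this model `replay_race()` ends with the commit log saying `commit`, participant 1 rolled back and participant 2 committed, which are the two inconsistencies listed above.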

A similar problem can also occur when the compute node hangs.

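To prevent the cluster manager and the compute node from reaching inconsistent decisions on a distributed transaction, the cluster manager should also write its decision record to the commit log, relying on primary-key uniqueness to ensure that each distributed transaction has one and only one decision.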
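
A minimal sketch of the "first writer wins" idea, using Python's built-in `sqlite3` module as a stand-in for the real commit log store. The table name `commit_log`, its columns and the helper `decide` are assumptions made for illustration; the actual KunlunBase commit log schema is not described in this issue.

```python
import sqlite3


def open_commit_log():
    # In-memory stand-in for the commit log; the schema is hypothetical.
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE commit_log (xid TEXT PRIMARY KEY, decision TEXT NOT NULL)"
    )
    return conn


def decide(conn, xid, wanted):
    """Try to record `wanted` for `xid`; return the decision that actually stands."""
    try:
        with conn:  # commits on success, rolls back if the insert fails
            conn.execute(
                "INSERT INTO commit_log (xid, decision) VALUES (?, ?)",
                (xid, wanted),
            )
        return wanted
    except sqlite3.IntegrityError:
        # Another party already decided for this xid; follow its decision.
        row = conn.execute(
            "SELECT decision FROM commit_log WHERE xid = ?", (xid,)
        ).fetchone()
        return row[0]
```

With this rule, if the cluster manager calls `decide(conn, "xxx", "rollback")` first, a later `decide(conn, "xxx", "commit")` from the compute node returns `"rollback"`, so the compute node learns it must roll back participant 2 instead of committing it. Whichever side inserts first fixes the outcome for both.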

jd-zhang commented 2 years ago

2022-04-18 19:24:13: zhangjindong@zettadb.com

jd-zhang commented 2 years ago

2022-04-18 19:24:13: zhangjindong@zettadb.com changed owner from all to barney