lni / dragonboat

A feature complete and high performance multi-group Raft library in Go.
Apache License 2.0

what's the limit of the shard number #304

Closed wh-afra closed 1 year ago

wh-afra commented 1 year ago

I have two questions: 1) How many clusters can dragonboat support? 2) How can I add or remove clusters/shards dynamically (that is, after the cluster has been bootstrapped)? I tried the latest version with the on-disk module.

Thanks for your suggestions.

lni commented 1 year ago
  1. It is limited by the RAM/CPU/network bandwidth resources on your host machine. I typically use a few hundred and have also tried a few thousand on some beefy servers; all work fine for me.

  2. If by removing shards you mean getting rid of the shard permanently, please have a look at the SyncRemoveData API and its godocs. Once called and completed successfully, it will have all data related to the specified replica removed from the calling machine. If you do this for all replicas, the shard is removed.

wh-afra commented 1 year ago

Thanks for your reply. In my situation, machines may be added on demand (e.g. some machines will be replaced, and so on). I tried this scenario and ran into a problem that confuses me. These are the steps:

1. The initial member list is set to {A,B,C} and the initial shards are {1,2,3}. StartOnDiskReplica is called and everything is OK.
2. I want to add another machine D to the cluster. SyncRequestAddReplica is called and everything looks fine.
3. I add the new machine D to the initial member list, so it is now {A,B,C,D}, and restart every NodeHost with StartOnDiskReplica. An error occurs: "panic: shard settings are invalid".

It seems that I cannot modify the initial member list even after adding a machine. When I leave the initial member list unchanged, everything is OK.

I cannot understand this. It looks like calling SyncRequestAddReplica is not enough; maybe some data in the LogDB must be stored as well?

PS: On machines A, B and C, I call StartOnDiskReplica with join=false and initialmemberlist={A,B,C}. On machine D, the key parameters are join=true and initialmemberlist={}. Can you give me some pointers? Thank you. I look forward to your reply.

wh-afra commented 1 year ago

@lni could you give me some advice? Thank you.

lni commented 1 year ago

As suggested by its name, the initial member list is the list of replicas that initially formed the shard when it was created. D is not a part of that.

Please have a look at the example (the link is in the README file); it contains a concrete example of how to add new replicas.
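
The rule above can be summarised as a small decision table. The sketch below is not dragonboat code: `startMode`, `startArgs` and the addresses are made-up names for illustration. The bootstrap and joining cases follow the parameters described in this thread. The restart case (join=false with an empty member list) is an assumption taken from a reading of the godocs and is worth verifying for your version; the thread itself only shows that restarting with the unchanged original list works.

```go
package main

import "fmt"

// startMode describes why a replica is being started (hypothetical type).
type startMode int

const (
	bootstrap startMode = iota // first start of an original member (A, B, C)
	joining                    // first start of a replica added later (D)
	restart                    // any replica that has already run once
)

// startArgs returns the (join, initialMembers) pair to pass when starting
// a replica. The original member list is only ever used for bootstrap and
// is never extended with replicas that were added later.
func startArgs(mode startMode, original map[uint64]string) (bool, map[uint64]string) {
	switch mode {
	case bootstrap:
		return false, original
	case joining:
		// The replica must already have been added to the shard's membership.
		return true, map[uint64]string{}
	default:
		// Assumption: membership is recovered from persisted state on restart.
		return false, map[uint64]string{}
	}
}

func main() {
	// Placeholder addresses, not taken from the thread.
	original := map[uint64]string{1: "hostA:63001", 2: "hostB:63001", 3: "hostC:63001"}
	for _, m := range []startMode{bootstrap, joining, restart} {
		join, members := startArgs(m, original)
		fmt.Printf("mode=%d join=%v members=%d\n", m, join, len(members))
	}
}
```

The point of the sketch is that D never appears in the initial member list in any of the three cases.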

wh-afra commented 1 year ago

@lni thanks for your reply. Yes, I have joined the new node successfully. However, when I add a new cluster, I find that the initial nodes can synchronize data normally, while the joined node cannot. In my situation, both the nodes and the clusters may change (be added or removed) after the cluster is set up. Can dragonboat support this? Maybe I am overlooking something important? Thank you very much.