lni / dragonboat

A feature complete and high performance multi-group Raft library in Go.
Apache License 2.0

some problem between raft groups #104

Closed caiwk closed 5 years ago

caiwk commented 5 years ago

I created 40 Raft groups that handle write operations, plus one RocksDB Raft group that records each write operation.

After each write operation, every write Raft group calls rocksdb.DoOp() in its *OnDiskStateMachine.Update() method to record the op.

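package rocksdb

import (
    "context"
    "encoding/json"
    "fmt"
    "os"
    "time"

    "github.com/lni/dragonboat/v3" // import path assumed; match your dragonboat version
)

// DoOp proposes kv to the recording Raft group identified by cluster.
// The op type, the Put constant, and KVData are defined elsewhere in this package.
func DoOp(nh *dragonboat.NodeHost, cluster uint64, op op, kv KVData) interface{} {
    cs := nh.GetNoOPSession(cluster)
    ctx, cancel := context.WithTimeout(context.Background(), 3*time.Second)
    defer cancel()
    if op == Put {
        data, err := json.Marshal(kv)
        if err != nil {
            panic(err)
        }
        // Blocks until the entry has been applied or ctx expires.
        _, err = nh.SyncPropose(ctx, cs, data)
        if err != nil {
            _, _ = fmt.Fprintf(os.Stderr, "SyncPropose returned error %v\n", err)
        }
    }
    return nil
}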

My RocksDB code is essentially the same as ondisk.go in dragonboat-example.

The problem is:

Every write Raft group blocks in DoOp at _, err = nh.SyncPropose(ctx, cs, data) and times out after 3s. When I set the timeout to 30s, the call still only returns after the full 30s.

I found that diskKV(SM).Update() receives nothing and is never called after nh.SyncPropose(ctx, cs, data) is invoked.

Strangely, if I reduce the number of write Raft groups from 40 to 10, everything works fine!

I created a test project for this problem where you can see the details. It is almost the same as your examples, so I think it wouldn't take too much of your time.

lni commented 5 years ago

You are not allowed to call SyncPropose inside your SM's Update() method. For SyncPropose to complete, the SM has to have its Update method invoked in its worker goroutine. In your program, that Update call can never happen, because your call to SyncPropose is preventing the previous call to Update from completing.
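The blocking pattern described here can be reproduced without dragonboat. The following is a minimal sketch using only the Go standard library: the `worker`, `task`, and `syncPropose` names are hypothetical stand-ins for an apply goroutine and a blocking proposal, not dragonboat APIs. An outer apply step synchronously proposes to the same single worker, so the inner proposal can only time out.

```go
package main

import (
	"context"
	"errors"
	"fmt"
	"time"
)

// task is a hypothetical unit of work: one committed entry to apply.
type task struct {
	apply func()
	done  chan struct{}
}

// worker applies queued tasks strictly one at a time on a single goroutine.
type worker struct{ queue chan task }

func newWorker() *worker {
	w := &worker{queue: make(chan task, 16)}
	go func() {
		for t := range w.queue {
			t.apply()
			close(t.done)
		}
	}()
	return w
}

// syncPropose enqueues apply and blocks until it has run or ctx expires.
func (w *worker) syncPropose(ctx context.Context, apply func()) error {
	t := task{apply: apply, done: make(chan struct{})}
	select {
	case w.queue <- t:
	case <-ctx.Done():
		return ctx.Err()
	}
	select {
	case <-t.done:
		return nil
	case <-ctx.Done():
		return ctx.Err()
	}
}

// innerProposalTimesOut runs an outer apply step that synchronously proposes
// to the same worker, and reports whether that inner proposal timed out.
func innerProposalTimesOut() bool {
	w := newWorker()
	var innerErr error
	outerCtx, cancelOuter := context.WithTimeout(context.Background(), time.Second)
	defer cancelOuter()
	_ = w.syncPropose(outerCtx, func() {
		ctx, cancel := context.WithTimeout(context.Background(), 100*time.Millisecond)
		defer cancel()
		// The worker is busy running this very function, so the inner task
		// stays queued until the deadline passes.
		innerErr = w.syncPropose(ctx, func() {})
	})
	return errors.Is(innerErr, context.DeadlineExceeded)
}

func main() {
	fmt.Println("inner proposal timed out:", innerProposalTimesOut())
}
```

Raising the inner timeout only makes the call wait longer before it fails, which is consistent with the 3s and 30s observations reported above.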

lni commented 5 years ago

@caiwk I've also updated #103 and am waiting for your response. Please feel free to re-open #103 when you are ready to provide more info on that issue.

caiwk commented 5 years ago

Even though these are two different SMs, one SM's Update() still cannot call SyncPropose on another SM? Why? I'm confused about it. And if that is true, why doesn't it block when I change the number of write Raft groups from 40 to 10?

lni commented 5 years ago

Each worker goroutine used for executing an SM's Update method is assigned to handle multiple Raft groups.
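To see why the group count can matter, here is a small sketch of how groups might end up sharing a worker. Everything in it is an assumption for illustration: the modulo assignment rule, the worker count of 16, and the group IDs (write groups numbered from 1, recording group 128) are hypothetical and are not taken from the thread or from dragonboat's documentation.

```go
package main

import "fmt"

// workerFor maps a Raft group ID to a worker index.
// The modulo rule is an assumed, illustrative assignment scheme.
func workerFor(clusterID, workers uint64) uint64 {
	return clusterID % workers
}

// sharesWorker reports whether any write group with an ID in 1..writeGroups
// is assigned to the same worker as recordGroup.
func sharesWorker(writeGroups, recordGroup, workers uint64) bool {
	for id := uint64(1); id <= writeGroups; id++ {
		if workerFor(id, workers) == workerFor(recordGroup, workers) {
			return true
		}
	}
	return false
}

func main() {
	const workers = 16      // hypothetical number of apply workers
	const recordGroup = 128 // hypothetical ID of the recording group

	fmt.Println(sharesWorker(10, recordGroup, workers)) // false
	fmt.Println(sharesWorker(40, recordGroup, workers)) // true
}
```

Under these assumptions, 10 write groups never share a worker with the recording group, while 40 write groups do. Once a write group's Update blocks on the worker that also serves the recording group, the recording group's Update cannot run, so the proposals sent to it never complete.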

caiwk commented 5 years ago

thx