wepay / waltz

Waltz is a quorum-based distributed write-ahead log for replicating transactions
https://wepay.github.io/waltz/
Apache License 2.0

StoreSession should detect when not enough replicas are available #158

Closed hrdlotom closed 2 years ago

hrdlotom commented 2 years ago

"Not enough replicas" exception condition checked against numReplicas instead quorum. Quorum is a non zero variable this.quorum = this.numReplicas / 2 + 1; and the condition doesn't get triggered when no partition is available as it is supposed to. As a result during recovery process a thread is stuck in a recovery completion process holding lock on StoreSessionManager object blocking another thread from adding replica after assign-partition Cli is called.

This happens when a new cluster is created and the server nodes are deployed first.

"Thread-2-Append-P0" #14 daemon prio=5 os_prio=0 tid=0x00007fa324012800 nid=0x2a in Object.wait() [0x00007fa36cf4c000]
   java.lang.Thread.State: WAITING (on object monitor)
    at java.lang.Object.wait(Native Method)
    - waiting on <0x00000000a7d16a98> (a java.lang.Object)
    at java.lang.Object.wait(Object.java:502)
    at com.wepay.waltz.store.internal.RecoveryManagerImpl.awaitCompletion(RecoveryManagerImpl.java:430)
    - locked <0x00000000a7d16a98> (a java.lang.Object)
    at com.wepay.waltz.store.internal.RecoveryManagerImpl.highWaterMark(RecoveryManagerImpl.java:400)
    at com.wepay.waltz.store.internal.StoreSessionImpl.open(StoreSessionImpl.java:100)
    at com.wepay.waltz.store.internal.StoreSessionManager.createSession(StoreSessionManager.java:203)
    at com.wepay.waltz.store.internal.StoreSessionManager.getStoreSession(StoreSessionManager.java:144)
    - locked <0x00000000a76ad680> (a com.wepay.waltz.store.internal.StoreSessionManager)
    at com.wepay.waltz.store.internal.StorePartitionImpl.highWaterMark(StorePartitionImpl.java:175)
    at com.wepay.waltz.server.internal.Partition$AppendTask.init(Partition.java:543)
    at com.wepay.riff.util.RepeatingTask.lambda$new$0(RepeatingTask.java:20)
    at com.wepay.riff.util.RepeatingTask$$Lambda$11/1905485420.run(Unknown Source)
    at java.lang.Thread.run(Thread.java:745)

"pool-2-thread-1" #28 daemon prio=5 os_prio=0 tid=0x00007fa324018000 nid=0x37 waiting for monitor entry [0x00007fa35c99a000]
   java.lang.Thread.State: BLOCKED (on object monitor)
    at com.wepay.waltz.store.internal.StoreSessionManager.getStoreSession(StoreSessionManager.java:132)
    - waiting to lock <0x00000000a76ad680> (a com.wepay.waltz.store.internal.StoreSessionManager)
    at com.wepay.waltz.store.internal.StoreImpl.getStoreSession(StoreImpl.java:170)
    at com.wepay.waltz.store.internal.StoreImpl.lambda$onReplicaAssignmentsUpdate$1(StoreImpl.java:152)
    at com.wepay.waltz.store.internal.StoreImpl$$Lambda$52/2053985767.run(Unknown Source)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
    at java.lang.Thread.run(Thread.java:745)

The fix: if the number of store replicas is 0, a StoreException is thrown and the store session is closed; it remains closed until assign-partition is called, which opens the store session. The partition is kept open throughout.
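A rough sketch of the fix described above; this is not the actual patch, and the names replicaIds and storeSession are illustrative assumptions.

    // Hypothetical sketch of the described fix, not the actual patch.
    if (replicaIds.isEmpty()) {
        // Fail fast instead of waiting on a quorum that can never form.
        storeSession.close();
        throw new StoreException("Not enough replicas: no replicas assigned to this partition");
    }
    // The store session stays closed until the assign-partition CLI adds a
    // replica, which opens a new store session; the Partition itself remains
    // open the whole time.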

How to reproduce the error:

1. Comment out everything after the line echo "----- assigning partitions to the storage -----" in add-storage.sh.
2. Create a new cluster: bin/test-cluster.sh start
3. Add a partition: bin/storage-cli.sh add-partition -c config/local-docker/waltz_cluster/waltz-tools.yml -s localhost:55281 -p 0
4. Assign the partition: bin/zookeeper-cli.sh assign-partition -c config/local-docker/waltz_cluster/waltz-tools.yml -s waltz_cluster_storage:55280 -p 0
5. See the error / verify the fix: bin/zookeeper-cli.sh list --cli-config-path ./config/local-docker/waltz-tools.yml

Without the fix we see

store [/waltz_cluster/store/partition/0] replica states:
  No node found

instead of

store [/waltz_cluster/store/partition/0] replica states:
  ReplicaId(0,waltz_cluster_storage:55280), SessionId: 2, closingHighWaterMark: UNRESOLVED