wepay / waltz

Waltz is a quorum-based distributed write-ahead log for replicating transactions
https://wepay.github.io/waltz/
Apache License 2.0

Health check of Storesession fix #162

Hans3q closed 2 years ago

Hans3q commented 2 years ago

Problem

The old method does not return the correct information when a partition is closed. In the underlying methods, StoreSessionManager.isHealthy() returns true even if the StoreSessionManager is not currently running (i.e. it was closed).
For example, after the cluster was deleted, the top-level healthy state remains true even though every partition reports false:
{"server-health-check":{"healthy":true,"zookeeper":true,"partitions":{"0":false,"1":false,"2":false,"3":false,"4":false}}}
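
The behavior described above can be illustrated with a minimal sketch. It is not the Waltz implementation: SketchSessionManager, its running and sessionHealthy fields, and isHealthyIgnoringClose() are hypothetical names chosen for illustration. It only shows why a health check that ignores the closed state keeps reporting true, and how also checking the running state changes the result.

```java
public class HealthCheckSketch {

    /** Hypothetical stand-in for a store session manager; not the Waltz class. */
    static class SketchSessionManager {
        // Assumption: the manager tracks whether it is still running.
        private volatile boolean running = true;
        // Assumption: the health of the underlying sessions is a single flag.
        private volatile boolean sessionHealthy = true;

        /** Marks the manager as closed; the session flag is left untouched. */
        void close() {
            running = false;
        }

        /** Behavior as described in the problem: the closed state is ignored. */
        boolean isHealthyIgnoringClose() {
            return sessionHealthy;
        }

        /** Closed-aware variant: a manager that is not running is never healthy. */
        boolean isHealthy() {
            return running && sessionHealthy;
        }
    }

    public static void main(String[] args) {
        SketchSessionManager manager = new SketchSessionManager();
        manager.close();

        // The check that ignores the closed state still reports healthy.
        System.out.println("ignoring close: " + manager.isHealthyIgnoringClose());
        // The closed-aware check reports unhealthy once the manager is closed.
        System.out.println("closed-aware:   " + manager.isHealthy());
    }
}
```

In this sketch the only difference between the two checks is the extra running condition, so a closed manager can no longer contribute a healthy result to whatever aggregates it.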

Testing

Steps to locally verify functionality of this PR:

  1. Start a new cluster with multiple partitions (delete old waltz containers if needed): export WALTZ_TEST_CLUSTER_NUM_PARTITIONS=5 && bin/test-cluster.sh start
  2. Stop and start the server node a couple of times to increase the generation value. To check the generation number: ./bin/zookeeper-cli.sh list -c config/local-docker/waltz-tools.yml
  3. Delete the waltz cluster: bin/zookeeper-cli.sh delete -n waltz_cluster --cli-config-path ./config/local-docker/waltz-tools.yml
  4. Create the waltz cluster: bin/zookeeper-cli.sh create -p 5 -n waltz_cluster --cli-config-path ./config/local-docker/waltz-tools.yml
  5. Stop and delete the waltz store node docker container.
  6. Start the store node: ./bin/docker/waltz-storage.sh start waltz_cluster 55280