Horusiath / Akka.Cluster.Discovery

Plugins for Akka.NET seed node management using 3rd party service discovery providers
Apache License 2.0

It takes Cluster Singleton 1 minute to move to another node #23

Open vasily-kirichenko opened 6 years ago

vasily-kirichenko commented 6 years ago
  1. Consul discovery, the settings are:
    akka.cluster {
      discovery {
        provider = akka.cluster.discovery.consul
        consul {
          listener-url = "http://127.0.0.1:8500"
          class = "Akka.Cluster.Discovery.Consul.ConsulDiscoveryService, Akka.Cluster.Discovery.Consul"
          dispatcher = "consul-dispatcher"
          alive-interval = 10s
          alive-timeout = 1m
          refresh-interval = 1m
          join-retries = 3
          lock-retry-interval = 250ms
          datacenter = "dc"
          token = ""
          wait-time = 30s
        }
      }
    }
  2. Three nodes cluster, a singleton is running on a node.
  3. Kill the node on which the singleton is running.
  4. A new singleton is launched only after a delay of about 1 minute, which is unacceptable; the docs promise that it should take a few seconds at most.

Horusiath commented 6 years ago

Cluster singleton migration depends on how long it takes to detect that a node is down. If a node is merely unreachable, we cannot assume it is dead: it may be a temporary network issue, and we don't want to end up with 2 singletons. Therefore we first need to determine whether the node is actually down.

The docs probably refer to the time required to migrate once a down node has been detected. With Consul cluster discovery, you can play with the alive-timeout and refresh-interval settings to try to shorten that time frame. However, if I'm right, Consul itself requires at least 30-60s to detect an unhealthy node.
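
As a rough sketch of that tuning, the fragment below overrides only the timing keys from the configuration posted above. The values are illustrative assumptions, not tested recommendations, and keeping alive-interval well below alive-timeout is also an assumption about how the two settings interact. Given the Consul-side detection delay mentioned above, lower values here may not bring the total failover time down to a few seconds.

```
akka.cluster.discovery.consul {
  # Illustrative values only (assumptions, not verified against the plugin).
  alive-interval = 5s      # original setting: 10s
  alive-timeout = 20s      # original setting: 1m
  refresh-interval = 10s   # original setting: 1m
}
```

All other keys (listener-url, class, dispatcher, and so on) would stay as in the original settings.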