mKeRix / room-assistant

Presence tracking and more for automation on the room-level
https://www.room-assistant.io
MIT License

Incorrect leader elected #196

Open domoritz opened 4 years ago

domoritz commented 4 years ago

Describe the bug

I have a cluster with two nodes and the wrong node gets elected as leader.

The main node is livingroom with weight 100. I also have a second node bedroom that should not be the leader. However, as you can see in the logs below, the bedroom got elected as the leader.

Relevant logs

[s6-init] making user provided files available at /var/run/s6/etc...exited 0.
[s6-init] ensuring user provided files have correct perms...exited 0.
[fix-attrs.d] applying ownership & permissions fixes...
[fix-attrs.d] done.
[cont-init.d] executing container initialization scripts...
[cont-init.d] done.
[services.d] starting services
[services.d] done.
[20:29:44] INFO: Setting up Home Assistant configuration
[20:29:44] INFO: Starting room-assistant
*** WARNING *** The program 'node' uses the Apple Bonjour compatibility layer of Avahi.
*** WARNING *** Please fix your application to use the native API of Avahi!
*** WARNING *** For more information see <http://0pointer.de/blog/projects/avahi-compat.html>
*** WARNING *** The program 'node' called 'DNSServiceRegister()' which is not supported (or only supported partially) in the Apple Bonjour compatibility layer of Avahi.
*** WARNING *** Please fix your application to use the native API of Avahi!
*** WARNING *** For more information see <http://0pointer.de/blog/projects/avahi-compat.html>
5/11/2020, 8:29:44 PM - info - IntegrationsModule: Loading integrations: home-assistant, bluetooth-classic
5/11/2020, 8:29:44 PM - info - NestFactory: Starting Nest application...
5/11/2020, 8:29:44 PM - info - InstanceLoader: AppModule dependencies initialized
5/11/2020, 8:29:44 PM - info - InstanceLoader: ConfigModule dependencies initialized
5/11/2020, 8:29:44 PM - info - InstanceLoader: NestEmitterModule dependencies initialized
5/11/2020, 8:29:44 PM - info - InstanceLoader: IntegrationsModule dependencies initialized
5/11/2020, 8:29:44 PM - info - InstanceLoader: DiscoveryModule dependencies initialized
5/11/2020, 8:29:44 PM - info - InstanceLoader: HomeAssistantModule dependencies initialized
5/11/2020, 8:29:44 PM - info - InstanceLoader: ClusterModule dependencies initialized
5/11/2020, 8:29:44 PM - info - InstanceLoader: ScheduleModule dependencies initialized
5/11/2020, 8:29:44 PM - info - InstanceLoader: BluetoothClassicModule dependencies initialized
5/11/2020, 8:29:44 PM - info - InstanceLoader: EntitiesModule dependencies initialized
5/11/2020, 8:29:44 PM - info - InstanceLoader: StatusModule dependencies initialized
5/11/2020, 8:29:44 PM - info - RoutesResolver: EntitiesController {/entities}:
5/11/2020, 8:29:44 PM - info - RouterExplorer: Mapped {/, GET} route
5/11/2020, 8:29:44 PM - info - RoutesResolver: StatusController {/status}:
5/11/2020, 8:29:44 PM - info - RouterExplorer: Mapped {/, GET} route
5/11/2020, 8:29:44 PM - info - HomeAssistantService: Successfully connected to MQTT broker at mqtt://core-mosquitto:1883
5/11/2020, 8:29:44 PM - info - ConfigService: Loading configuration from /usr/lib/node_modules/room-assistant/dist/config/definitions/default.js, config/default.json, config/local.json
5/11/2020, 8:29:44 PM - info - ClusterService: Starting mDNS advertisements and discovery
5/11/2020, 8:29:44 PM - info - NestApplication: Nest application successfully started
5/11/2020, 8:29:45 PM - info - ClusterService: Added 192.168.0.???:6425 to the cluster with id bedroom
5/11/2020, 8:29:46 PM - info - EntitiesService: Refreshing entity states
5/11/2020, 8:30:45 PM - info - HomeAssistantService: Device tracker requires manual setup in Home Assistant with topic: room-assistant/device_tracker/bluetooth-classic-??????????-tracker/state
5/11/2020, 8:30:45 PM - info - HomeAssistantService: Device tracker requires manual setup in Home Assistant with topic: room-assistant/device_tracker/bluetooth-classic-??????????-tracker/state
5/11/2020, 8:32:45 PM - info - HomeAssistantService: Device tracker requires manual setup in Home Assistant with topic: room-assistant/device_tracker/bluetooth-classic-??????????-tracker/state
5/11/2020, 8:36:50 PM - info - HomeAssistantService: Device tracker requires manual setup in Home Assistant with topic: room-assistant/device_tracker/bluetooth-classic-??????????-tracker/state
5/11/2020, 8:51:13 PM - info - ClusterService: bedroom has been elected as leader
5/11/2020, 9:34:45 PM - info - HomeAssistantService: Device tracker requires manual setup in Home Assistant with topic: room-assistant/device_tracker/bluetooth-classic-??????????-tracker/state
5/12/2020, 1:35:30 AM - info - ClusterService: bedroom has been elected as leader
5/12/2020, 2:28:01 AM - info - ClusterService: Removed 192.168.0.???:6425 from the cluster with id bedroom
5/12/2020, 2:28:04 AM - info - ClusterService: Added 192.168.0.???:6425 to the cluster with id bedroom
5/12/2020, 3:59:41 AM - info - ClusterService: bedroom has been elected as leader
5/12/2020, 4:06:40 AM - info - ClusterService: bedroom has been elected as leader
5/12/2020, 9:04:09 AM - info - ClusterService: bedroom has been elected as leader
5/12/2020, 9:39:30 AM - info - ClusterService: bedroom has been elected as leader

Relevant configuration

living room

global:
  instanceName: livingroom
  integrations:
    - homeAssistant
    - bluetoothClassic
  cluster:
    weight: 100
bluetoothClassic:
  interval: 60
  addresses:
    ??
    ??
    ??
    ??
    ??

bedroom

global:
  instanceName: bedroom
  integrations:
    - homeAssistant
    - bluetoothClassic
  cluster:
    quorum: 2
    weight: 1
homeAssistant:
  mqttUrl: 'mqtt://??????:1883'
  mqttOptions:
    username: mqtt
    password: ?????????
bluetoothClassic:
  interval: 60
  addresses:
    ??
    ??
    ??
    ??
    ??

Expected behavior

I would expect the livingroom node to be the leader.

Environment

mKeRix commented 4 years ago

The weights for the leader election are more guidelines than hard rules. When connecting instances together the leader is chosen by the following logic:

During an election, each instance simply votes for the highest-weight node among those it knows of locally. Applying this to your scenario, I suspect that your bedroom node was already running and had elected itself as leader by the time livingroom connected. As a quick fix, you can try shutting down both instances and then starting livingroom first. Once that's done, you can start bedroom. Both should now have livingroom as the leader.
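The start-order effect described above can be illustrated with a small simulation. This is only a sketch of the explanation given in this comment, not room-assistant's actual election code: the `Node`, `elect`, `join`, and `start_in_order` names are hypothetical, and the rule that an existing leader is kept when a new node joins is an assumption inferred from the suggested quick fix.

```python
from collections import Counter
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class Node:
    name: str
    weight: int
    # Peers this node knows about locally: name -> weight
    known: Dict[str, int] = field(default_factory=dict)
    leader: Optional[str] = None

    def __post_init__(self) -> None:
        # A node always knows about itself
        self.known[self.name] = self.weight

    def vote(self) -> str:
        # Vote for the highest-weight node this instance knows of
        return max(self.known, key=lambda name: self.known[name])


def elect(cluster: List[Node]) -> str:
    # Tally one vote per node and return the most-voted candidate
    votes = Counter(node.vote() for node in cluster)
    return votes.most_common(1)[0][0]


def join(cluster: List[Node], newcomer: Node) -> str:
    # Exchange knowledge between the newcomer and the existing nodes
    for node in cluster:
        node.known[newcomer.name] = newcomer.weight
        newcomer.known[node.name] = node.weight
    cluster.append(newcomer)
    # Assumption: an already-elected leader is kept; only a
    # leaderless cluster holds an election
    current = next((n.leader for n in cluster if n.leader), None)
    if current is None:
        current = elect(cluster)
    for node in cluster:
        node.leader = current
    return current


def start_in_order(*nodes: Node) -> str:
    cluster: List[Node] = []
    leader = ""
    for node in nodes:
        leader = join(cluster, node)
    return leader


# bedroom starts first, elects itself, and stays leader
print(start_in_order(Node("bedroom", 1), Node("livingroom", 100)))  # bedroom
# livingroom starts first and stays leader
print(start_in_order(Node("livingroom", 100), Node("bedroom", 1)))  # livingroom
```

Under these assumptions the weights only matter at the moment an election actually takes place, which is why the order in which the instances start decides the outcome.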

domoritz commented 4 years ago

Ahh, thanks for the explanation. The issue for me is that the Raspberry Pi doesn't have the best connection (SSH is really slow), so I suspect that I'm not getting updates when it's the leader. I can force livingroom to be the leader by restarting the Pi, but my livingroom server also restarts from time to time (updates), so it would be nice if it didn't lose its leadership position because of that.

Would it make sense to change the protocol and elect a leader every time a node joins a cluster?

mKeRix commented 4 years ago

The issue with that would likely be random state changes, as an instance starts with an empty state. Once an instance is elected as leader, it forces the entities to match its own local state. If that state hasn't been rebuilt yet on the instance, you will see random blips of wrong states after restarts.

Aside from that, if your instance reconnects within cluster.timeout the cluster should not change leaders.
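As a rough illustration of the timeout rule, the sketch below models a peer as still present as long as it was last heard from within the timeout window. The `PeerTracker` class, the `leader_survives` helper, and the use of seconds are hypothetical; the real semantics and units of cluster.timeout should be checked against the room-assistant documentation.

```python
class PeerTracker:
    """Remembers when each peer was last heard from (times in seconds)."""

    def __init__(self, timeout: float) -> None:
        self.timeout = timeout
        self.last_seen = {}

    def heartbeat(self, name: str, now: float) -> None:
        # Record the most recent sign of life from a peer
        self.last_seen[name] = now

    def is_present(self, name: str, now: float) -> bool:
        # A peer counts as present until it has been silent
        # for longer than the timeout
        seen = self.last_seen.get(name)
        return seen is not None and now - seen <= self.timeout


def leader_survives(tracker: PeerTracker, leader: str, now: float) -> bool:
    # Assumption: the cluster only looks for a new leader once the
    # current one is no longer considered present
    return tracker.is_present(leader, now)


tracker = PeerTracker(timeout=30)
tracker.heartbeat("livingroom", now=0)
print(leader_survives(tracker, "livingroom", now=20))  # True: back within the window
print(leader_survives(tracker, "livingroom", now=45))  # False: silent for too long
```

In this model a restart of livingroom that completes within the window leaves its leadership untouched, while a longer outage opens the door to a new election.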

domoritz commented 4 years ago

Couldn't there be some initialization protocol where a new node that becomes the new leader initializes its state before taking over as the leader?

The issue in my case is that the bedroom node has bad wifi and so I don't want it to be the leader.

I understand that my use case is maybe not the target use case, so feel free to close this issue as wontfix, but maybe my feedback is useful for future versions of room-assistant.

mKeRix commented 4 years ago

There probably could be - and at the very least we should handle these kinda scenarios better. I'll keep this open for tracking. Maybe I can think of a good solution!

github-actions[bot] commented 3 years ago

There hasn't been any activity on this issue recently. In an effort to provide a better overview of current issues we automatically clean some of the old ones. Many of them may already be resolved in newer versions of room-assistant. This issue has now been marked as stale and will be closed if no further activity occurs. Thank you for your contributions.

ghzgod commented 3 years ago

Please don't close this issue.

Nathan-Schwartz commented 3 years ago

I believe the enableStrictWeightMode option introduced in https://github.com/goldfire/democracy.js/pull/18 could resolve this issue