Open sshipway opened 5 years ago
Workaround - restart both nodes immediately. Since the uptime is less than the required, sync transfer does not happen and a fresh database is created.
I would say this is a feature request rather than a bug report. Given that a multitude of things will fail if all the wforce instances have different configs, (like replication) then this really a bad idea. I might just use a hash based on the wforce.conf file.
Just to stress, and I might make this more explicit in the docs: wforce clusters expect all cluster nodes to have to same config. I'm not going to spend a huge amount of time checking that people haven't followed that rule.
I agree that all nodes should have the same config, and you don't want to waste a lot of time checking everything -- though some simple verification might help prevent mistakes.
A hash based on the entire config file might not work because there are some differences (synchost and siblings might not list themselves)
The situation affecting us was during a config upgrade. To roll out a new config to all hosts, it was done progressively to keep the service up, but the new synchost feature has caused this process to break. Something like a config db fingerprint comparison before download would make progressive changes less error prone.
Maybe a better workaround is to recommend that the cluster key (setKey()) be changed at the same time as any changes to database structure? I haven't tested this, but would that result in a rolling update of nodes failing to connect nodes until the sibling was also updated?
The situation affecting us was during a config upgrade. To roll out a new config to all hosts, it was done progressively to keep the service up, but the new synchost feature has caused this process to break. Something like a config db fingerprint comparison before download would make progressive changes less error prone.
Yes I will look into possible ways to fingerprint the StatsDBs configuration at the very least, as this is the most critical.
Neil
Describe the bug On startup, the synchost feature can download the in-memory data from a sister node. This should not happen if the data has a different structure, such as if HLLBits is changed.
Package: wforce-2.0.1-2.x86_64
To Reproduce Steps to reproduce the behavior:
Expected behavior Since the stringstatsDB is now a different format, the restarting node should spot this, and not request the data from the existing node butinstead start with an empty database.
Actual behaviour Data is downloaded into memory from other node even though it does not match the expected format. When the data is actually accessed in operation, it produces an error.
{"status":"failure", "reason":"number of registers doesn't match: 256 != 64"}
OS:
Additional context This problem may also occur if new fields are added or other fields changed. Maybe have a version code created by a hash of the database definition checked as part of the sync handshake?