PowerDNS / weakforced

Anti-Abuse for servers at authentication time
GNU General Public License v3.0
124 stars 33 forks source link

Weakforced synchost should not download data if data structure differs [BUG] #267

Open sshipway opened 5 years ago

sshipway commented 5 years ago

Describe the bug On startup, the synchost feature can download the in-memory data from a sister node. This should not happen if the data has a different structure, such as if HLLBits is changed.

Package: wforce-2.0.1-2.x86_64

To Reproduce Steps to reproduce the behavior:

  1. Set up 2-node cluster with stringstatsDB that has an HLL field, and synchost definitions with minimal required uptime.
  2. Cause records to be added to the HLL stringstatsDB
  3. Modify one node to have setHLLBits(8)
  4. Restart that node.
  5. Attempt to have records updated in the HLL stringstatsDB

Expected behavior Since the stringstatsDB is now a different format, the restarting node should spot this, and not request the data from the existing node butinstead start with an empty database.

Actual behaviour Data is downloaded into memory from other node even though it does not match the expected format. When the data is actually accessed in operation, it produces an error.

{"status":"failure", "reason":"number of registers doesn't match: 256 != 64"}

OS:

Additional context This problem may also occur if new fields are added or other fields changed. Maybe have a version code created by a hash of the database definition checked as part of the sync handshake?

sshipway commented 5 years ago

Workaround - restart both nodes immediately. Since the uptime is less than the required, sync transfer does not happen and a fresh database is created.

neilcook commented 5 years ago

I would say this is a feature request rather than a bug report. Given that a multitude of things will fail if all the wforce instances have different configs, (like replication) then this really a bad idea. I might just use a hash based on the wforce.conf file.

neilcook commented 5 years ago

Just to stress, and I might make this more explicit in the docs: wforce clusters expect all cluster nodes to have to same config. I'm not going to spend a huge amount of time checking that people haven't followed that rule.

sshipway commented 5 years ago

I agree that all nodes should have the same config, and you don't want to waste a lot of time checking everything -- though some simple verification might help prevent mistakes.

A hash based on the entire config file might not work because there are some differences (synchost and siblings might not list themselves)

The situation affecting us was during a config upgrade. To roll out a new config to all hosts, it was done progressively to keep the service up, but the new synchost feature has caused this process to break. Something like a config db fingerprint comparison before download would make progressive changes less error prone.

Maybe a better workaround is to recommend that the cluster key (setKey()) be changed at the same time as any changes to database structure? I haven't tested this, but would that result in a rolling update of nodes failing to connect nodes until the sibling was also updated?

neilcook commented 5 years ago

The situation affecting us was during a config upgrade. To roll out a new config to all hosts, it was done progressively to keep the service up, but the new synchost feature has caused this process to break. Something like a config db fingerprint comparison before download would make progressive changes less error prone.

Yes I will look into possible ways to fingerprint the StatsDBs configuration at the very least, as this is the most critical.

Neil