Concurrent updates can create duplicate rows, which causes a minor loss of data. This adds a unique index on name and month and catches the resulting unique key errors.
The same problem occurs with updates when fuzzycount is low, but realistically it isn't a big concern: these are estimates, and rows with a low fuzzycount get updated quickly.
We don't want any transactions or locking here, to be sure we do no harm, so it's best to ignore those cases.
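For reviewers, here is a minimal sketch of the insert path described above. It is not the code in this PR: it uses Python with an in-memory SQLite database as a stand-in, and the table name `page_groups`, the index name, and the helper `record_page_group` are all hypothetical. It only shows the idea of letting the unique index reject a duplicate and swallowing that error, with no explicit transaction or lock.

```python
import sqlite3


def record_page_group(conn, name, month, fuzzycount):
    """Try to insert a new page group row (hypothetical helper).

    Returns True if the row was inserted, False if the unique index on
    (name, month) rejected it because another writer got there first.
    """
    try:
        conn.execute(
            "INSERT INTO page_groups (name, month, fuzzycount) VALUES (?, ?, ?)",
            (name, month, fuzzycount),
        )
        return True
    except sqlite3.IntegrityError:
        # Duplicate (name, month): ignore it. The counts are estimates,
        # so losing this one insert is acceptable.
        return False


# isolation_level=None puts the connection in autocommit mode, so no
# explicit transaction is opened around the insert.
conn = sqlite3.connect(":memory:", isolation_level=None)
conn.execute(
    "CREATE TABLE page_groups ("
    "name VARCHAR(255) NOT NULL, month TEXT NOT NULL, fuzzycount INTEGER NOT NULL)"
)
conn.execute(
    "CREATE UNIQUE INDEX idx_page_groups_name_month ON page_groups (name, month)"
)

# Two writers racing to create the same page group for the same month.
first = record_page_group(conn, "/checkout", "2024-01", 1)
second = record_page_group(conn, "/checkout", "2024-01", 1)
row_count = conn.execute("SELECT COUNT(*) FROM page_groups").fetchone()[0]
```

The second call hits the unique index and is dropped, so only one row exists for that name and month.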
For upgrade script logic:
I've kept the page group with the higher fuzzycount. In most cases the counts are lopsided, but even when they are close the higher one is still a reasonable estimate.
I also had to change the precision of name to 255 to be able to add it to a unique index. We shouldn't have real page groups with names longer than that, but bad URLs can produce them, so I've deleted those rows.
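The upgrade steps can be sketched roughly as follows. This is an illustration under assumptions, not the actual upgrade script: it uses Python with SQLite, the table `page_groups`, its `id` column, and the function `upgrade` are hypothetical, and ties on fuzzycount are broken by lowest `id`, which is my own choice. SQLite does not enforce column lengths, so the column change itself is left out.

```python
import sqlite3


def upgrade(conn):
    """Hypothetical upgrade: clean up rows, then add the unique index."""
    # 1. Drop rows whose name will not fit in 255 characters (bad URLs).
    conn.execute("DELETE FROM page_groups WHERE LENGTH(name) > 255")

    # 2. For each (name, month), delete every row that has a "better"
    #    sibling: one with a higher fuzzycount, or the same fuzzycount
    #    and a lower id (assumed tie-break).
    conn.execute(
        """
        DELETE FROM page_groups
        WHERE EXISTS (
            SELECT 1 FROM page_groups AS other
            WHERE other.name = page_groups.name
              AND other.month = page_groups.month
              AND (other.fuzzycount > page_groups.fuzzycount
                   OR (other.fuzzycount = page_groups.fuzzycount
                       AND other.id < page_groups.id))
        )
        """
    )

    # 3. With duplicates gone, the unique index can be created.
    conn.execute(
        "CREATE UNIQUE INDEX idx_page_groups_name_month "
        "ON page_groups (name, month)"
    )


conn = sqlite3.connect(":memory:", isolation_level=None)
conn.execute(
    "CREATE TABLE page_groups ("
    "id INTEGER PRIMARY KEY, name TEXT NOT NULL, "
    "month TEXT NOT NULL, fuzzycount INTEGER NOT NULL)"
)
conn.executemany(
    "INSERT INTO page_groups (name, month, fuzzycount) VALUES (?, ?, ?)",
    [
        ("/home", "2024-01", 3),    # duplicate, lower count: removed
        ("/home", "2024-01", 40),   # duplicate, higher count: kept
        ("/home", "2024-02", 7),    # unique: kept
        ("x" * 300, "2024-01", 1),  # name too long: removed
    ],
)

upgrade(conn)
remaining = conn.execute(
    "SELECT name, month, fuzzycount FROM page_groups ORDER BY month"
).fetchall()
```

The long-name delete runs first so that those rows never take part in the duplicate comparison.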
Closes #348