brave / sync

deprecated Brave sync server. (sync now uses a fork of the Chromium sync protocol.)
Mozilla Public License 2.0
204 stars 41 forks source link

Ignore records from SQS which where seen in S3 #306

Closed AlexeyBarabash closed 5 years ago

AlexeyBarabash commented 5 years ago

There is a situation when sync lib sends duplicating records, first from S3 and then from SQS, it may happens if syncing bookmarks early after establishing a sync chain:

[26515:26515:0426/170809.283612:INFO:CONSOLE(5175)] "listObjects (1) S3 content.timestamp=1556287688729", source: chrome-extension://nomlkjnggnifocmealianaaiobmebgil/extension/brave-sync/bundles/bundle.js (5175)
[26515:26515:0426/170809.286318:INFO:CONSOLE(5175)] "listObjects (1) S3 content.timestamp=1556287688737", source: chrome-extension://nomlkjnggnifocmealianaaiobmebgil/extension/brave-sync/bundles/bundle.js (5175)

[26515:26515:0426/170908.000107:INFO:CONSOLE(5116)] "SQS record timestamp=1556287706300", source: chrome-extension://nomlkjnggnifocmealianaaiobmebgil/extension/brave-sync/bundles/bundle.js (5116)
[26515:26515:0426/170908.288370:INFO:CONSOLE(5116)] "SQS record timestamp=1556287688729", source: chrome-extension://nomlkjnggnifocmealianaaiobmebgil/extension/brave-sync/bundles/bundle.js (5116)
[26515:26515:0426/170908.588513:INFO:CONSOLE(5116)] "SQS record timestamp=1556287688737", source: chrome-extension://nomlkjnggnifocmealianaaiobmebgil/extension/brave-sync/bundles/bundle.js (5116)

This breaks merging. On brave-browser this was workarounded with https://github.com/brave/brave-core/pull/2016 (Don't send bookmarks) , but this seems not enough.

This PR does ignores records from SQS which were seen as from S3 by unique timestampt. If these records are processed twice in between "legal" records, this may cause mess.

STR for issue using brave-syncer branch:

  1. device a create folder A and bookmark sync
  2. device b connect to it and move sync under A
  3. device a resolves records to 0, no changes reflected
SergeyZhukovsky commented 5 years ago

Could it be that 2 records have exactly the same timestamp? Maybe we should replace it to use an objectId instead?

diracdeltas commented 5 years ago

it would be good to have test coverage for this. https://github.com/brave/sync/blob/staging/test/s3Helper.js

AlexeyBarabash commented 5 years ago

Could it be that 2 records have exactly the same timestamp? Maybe we should replace it to use an objectId instead?

@SergeyZhukovsky timestamp is the most close field by purpose to unique identify the record. I hadn't seen the case when timestampts are exactly the same, but in theory that can rare happen.

objectId is not suitable for this because it identifies an object and not the operation.

I take timestampt from Key, which looks like Key=0/XVlVvDI3dlb7Dvk6CkMWRN2piY3I7CJrlPUta7seG/s=/0/1556309475075/0/NF5Ra/Cs8CNRKja74XBGnqcLSog1yVoCTwGbg6sxhqzb15jt1pEVeF7flavfdBqqBFn38AMtqyxD/xQks7yikLA2DLc7VwVe8z+wK8sJ9q0CCCIgO2SwV1kGfMFMH8lS22HX6Kc5QCnj5Js3S1BKcdgeX+uIa8GEUCGHErbQEDRqbqoy596j7uboLXMmCC/Yo80EurmqX5Ga5CxRM4PlIAUSUj39s3k3oy8YBWcWBb+hRF118kuao0b5sKnOg+fkIf68X5AZIIhozp7DHG1zoMypFy9YJZI3YSwmA/xp2cE0YmWFWOiAuJD+kOzFA4UyMVtEsnFpEyu8RhMpzowmCCDq8S0kztI+iq/IunpIF5itOkAlbXxeNREHgUWOgAlqTtRZWKmIwh8SwliJJwc+X2iFEFqGVm9FQP6/xAwIV8umOe0yc9c7E9DIwU+sTPySaVg54QCCYaGAAAExmUma+2EiKKNs6gvhO7GBz94JYAAA== And this is exactly what is unique, but it is too big to use it and store without erase policy.

AlexeyBarabash commented 5 years ago

it would be good to have test coverage for this. https://github.com/brave/sync/blob/staging/test/s3Helper.js

@diracdeltas thanks, looking how to make a testcase

AlexeyBarabash commented 5 years ago

closed this pr because there is a better way

AlexeyBarabash commented 5 years ago

I had removed commit https://github.com/brave/sync/pull/306/commits/a8fa6a4a4c9dc38aabef7ac40cabbe84d22d4d10 Ignore messages from ourself because clients use this info to ensure the record was successfully arrived to Amazon storage. But this should be filtered out on the client, client should not answer resolve-sync-records when record has a device_id equal to the device_id of the client.

AlexeyBarabash commented 5 years ago

@SergeyZhukovsky @diracdeltas @darkdh Can I get another review?

Fixed according to the previous notices: I use timeStamp+crc as a key; a testcase was added.