Open yrliou opened 4 years ago
PR reverted in https://github.com/brave/go-sync/pull/23
As pointed out in the reverted PR, we cannot simply delete these items directly, because that would break users who have a device that has been offline for more than 90 days. When that device comes back online, it cannot receive updates for items that were deleted more than 90 days ago, so it will hit a conflict it cannot resolve whenever it tries to change any of those items, and it will keep backing off. The device can be reset by leaving and rejoining the sync chain, but the deleted items still present locally will then be committed back to the server and synced to other devices, which might not be what the user wants. We need to revisit this topic with a more advanced solution, such as moving data into S3 when the TTL expires and including those S3 objects when a client asks for updates from more than 90 days ago.
The good news is that when a client commits a delete, it removes the specifics ~(except bookmarks, which are currently an exception and still send full specifics)~, so deleted records are not that large; only their metadata is left in storage.
Another thought: we could always set a TTL when creating or updating an item, and move the item into cold storage when that TTL expires. Of course, our server would need to be able to fetch those items back when needed.
We could also consider storing the specifics binary in S3 instead of directly in DynamoDB.
To save storage cost, we are going to set the TTL attribute to 90 days in the future when we soft-delete an item, so that DynamoDB deletes it automatically at no extra cost. cc @jsecretan
ref: https://aws.amazon.com/blogs/aws/new-manage-dynamodb-items-using-time-to-live-ttl/