brave / go-sync

Brave sync server v2
Mozilla Public License 2.0

Introduce cold storage for saving storage costs #21

Open yrliou opened 4 years ago

yrliou commented 4 years ago

To save storage costs, we are going to set the TTL attribute to 90 days in the future when we soft-delete an item, so that DynamoDB deletes it automatically at no extra cost. cc @jsecretan
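For illustration, a minimal sketch of what setting the TTL on soft-delete could look like. The `SyncItem` type, its field names, and `MarkSoftDeleted` are hypothetical and not taken from the go-sync code; the only DynamoDB-specific assumption is that the TTL attribute is a number holding Unix epoch seconds.

```go
package main

import (
	"fmt"
	"time"
)

// softDeleteTTLSeconds is the 90-day retention window, in seconds.
const softDeleteTTLSeconds = int64(90 * 24 * time.Hour / time.Second)

// SyncItem is a hypothetical, trimmed-down stand-in for a stored sync entity.
type SyncItem struct {
	ID      string
	Deleted bool
	// ExpirationTime is a Unix epoch timestamp in seconds, the format
	// DynamoDB TTL expects. Nil means the item never expires.
	ExpirationTime *int64
}

// MarkSoftDeleted flags the item as deleted and sets its TTL to 90 days
// after nowUnix (Unix seconds). The caller passes the time in so the
// function stays deterministic.
func MarkSoftDeleted(item *SyncItem, nowUnix int64) {
	item.Deleted = true
	exp := nowUnix + softDeleteTTLSeconds
	item.ExpirationTime = &exp
}

func main() {
	item := SyncItem{ID: "bookmark-1"}
	MarkSoftDeleted(&item, time.Now().Unix())
	fmt.Printf("deleted=%v expires=%s\n", item.Deleted,
		time.Unix(*item.ExpirationTime, 0).UTC().Format(time.RFC3339))
}
```

DynamoDB removes expired items asynchronously rather than at the exact expiry time, so readers of the table would likely still need to tolerate items whose TTL has already passed.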

ref: https://aws.amazon.com/blogs/aws/new-manage-dynamodb-items-using-time-to-live-ttl/

yrliou commented 4 years ago

PR reverted in https://github.com/brave/go-sync/pull/23

yrliou commented 4 years ago

As pointed out in the reverted PR, we cannot simply delete these items, because doing so would break users with a device that has been offline for more than 90 days. When that device comes back online, it cannot receive updates for items that were deleted more than 90 days ago, so it hits a conflict it cannot resolve whenever it tries to change any of those items, and it keeps backing off. The device can be reset by leaving and rejoining the sync chain, but the deleted items still present locally would then be committed back to the server and synced to other devices, which might not be what the user wants. We need to revisit this topic with a more advanced solution, such as moving data to S3 when the TTL expires and including those S3 objects when a client asks for updates from more than 90 days ago.

The good news is that when a client commits a delete, it removes the specifics ~(except bookmarks, which are currently an exception and still send full specifics)~, so deleted records are not that large; only their metadata is left in storage.
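A rough sketch of the "include data from S3 for clients that are more than 90 days behind" idea, using in-memory stand-ins for both tiers. Everything here (`Update`, `Store`, `memStore`, `GetUpdates`, millisecond timestamps) is a hypothetical illustration, not the go-sync API.

```go
package main

import (
	"fmt"
	"sort"
)

// retentionMillis is the 90-day TTL window, in milliseconds.
const retentionMillis int64 = 90 * 24 * 60 * 60 * 1000

// Update is a hypothetical minimal record: an item ID plus its
// modification time in Unix milliseconds.
type Update struct {
	ID    string
	Mtime int64
}

// Store is a hypothetical interface over any tier that can list the
// updates modified after a given time.
type Store interface {
	UpdatesSince(mtime int64) []Update
}

// memStore is an in-memory Store used only for this sketch.
type memStore []Update

func (m memStore) UpdatesSince(mtime int64) []Update {
	var out []Update
	for _, u := range m {
		if u.Mtime > mtime {
			out = append(out, u)
		}
	}
	return out
}

// GetUpdates always reads the hot tier, and additionally reads the cold
// tier when the client's last sync is older than the retention window,
// since only such clients can be missing items that already expired.
func GetUpdates(hot, cold Store, lastSync, now int64) []Update {
	updates := hot.UpdatesSince(lastSync)
	if now-lastSync > retentionMillis {
		updates = append(updates, cold.UpdatesSince(lastSync)...)
	}
	sort.Slice(updates, func(i, j int) bool {
		return updates[i].Mtime < updates[j].Mtime
	})
	return updates
}

func main() {
	const day int64 = 24 * 60 * 60 * 1000
	hot := memStore{{ID: "recent-delete", Mtime: 150 * day}}
	cold := memStore{{ID: "expired-delete", Mtime: 50 * day}}

	// A device that last synced 160 days before "now" also needs cold data.
	fmt.Println(GetUpdates(hot, cold, 40*day, 200*day))
}
```

The point of the age check is that devices syncing regularly never pay for the slower cold-tier read.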

yrliou commented 4 years ago

Another thought: we could always set a TTL when creating or updating an item, and move the item to cold storage when the TTL expires. Of course, our server would need to be able to get those items back when needed.
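One way to picture "get those items back when needed" is a read-through lookup that restores an item from the cold tier on a hot-tier miss. This is only a sketch with maps standing in for DynamoDB and S3; `Tiered`, `KV`, `Archive`, and `Get` are hypothetical names.

```go
package main

import (
	"errors"
	"fmt"
)

// ErrNotFound is returned when an item exists in neither tier.
var ErrNotFound = errors.New("item not found")

// KV is a hypothetical key-value view of one storage tier.
type KV map[string][]byte

// Tiered pairs a hot tier (standing in for DynamoDB) with a cold tier
// (standing in for S3).
type Tiered struct {
	Hot, Cold KV
}

// Archive moves an item from the hot tier to the cold tier. In a real
// system this would be triggered by TTL expiry rather than called directly.
func (t *Tiered) Archive(id string) {
	if v, ok := t.Hot[id]; ok {
		t.Cold[id] = v
		delete(t.Hot, id)
	}
}

// Get returns the item, restoring it into the hot tier if it was archived.
func (t *Tiered) Get(id string) ([]byte, error) {
	if v, ok := t.Hot[id]; ok {
		return v, nil
	}
	v, ok := t.Cold[id]
	if !ok {
		return nil, ErrNotFound
	}
	t.Hot[id] = v
	delete(t.Cold, id)
	return v, nil
}

func main() {
	store := &Tiered{Hot: KV{"item-1": []byte("metadata")}, Cold: KV{}}
	store.Archive("item-1")
	v, err := store.Get("item-1")
	fmt.Println(string(v), err)
}
```

Whether a restored item should move back to the hot tier or simply be served from cold storage is a design choice this sketch does not settle.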

yrliou commented 4 years ago

We could also consider storing the specifics binary in S3 instead of directly in DynamoDB.
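A minimal sketch of that split, assuming the metadata row keeps only a key pointing at the specifics blob. `BlobStore`, `memBlobs`, `Record`, `SaveItem`, and `LoadSpecifics` are hypothetical, and deriving the key from a SHA-256 of the content is an assumption made for this example, not something proposed in the thread.

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"fmt"
)

// BlobStore is a hypothetical interface standing in for an S3 bucket.
type BlobStore interface {
	Put(key string, data []byte)
	Get(key string) ([]byte, bool)
}

// memBlobs is an in-memory BlobStore used only for this sketch.
type memBlobs map[string][]byte

func (m memBlobs) Put(key string, data []byte) { m[key] = data }

func (m memBlobs) Get(key string) ([]byte, bool) {
	v, ok := m[key]
	return v, ok
}

// Record is the small metadata row that would stay in DynamoDB.
type Record struct {
	ID           string
	SpecificsKey string // key of the specifics blob in the blob store
}

// SaveItem writes the specifics to the blob store under a key derived
// from their content and returns the metadata row that references it.
func SaveItem(blobs BlobStore, id string, specifics []byte) Record {
	sum := sha256.Sum256(specifics)
	key := "specifics/" + hex.EncodeToString(sum[:])
	blobs.Put(key, specifics)
	return Record{ID: id, SpecificsKey: key}
}

// LoadSpecifics fetches the blob referenced by a metadata row.
func LoadSpecifics(blobs BlobStore, r Record) ([]byte, bool) {
	return blobs.Get(r.SpecificsKey)
}

func main() {
	blobs := memBlobs{}
	rec := SaveItem(blobs, "item-1", []byte("serialized specifics"))
	data, ok := LoadSpecifics(blobs, rec)
	fmt.Println(rec.SpecificsKey, string(data), ok)
}
```

With this layout a soft delete only touches the small metadata row, at the cost of an extra fetch whenever a client needs the full specifics.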