Open chillu opened 7 years ago
OK, Data Pipeline is crazy expensive in its default config: 1 hour of EC2 c4.large plus EMR comes to ~0.12 USD per run, and each pipeline is configured to run for one table only. https://aws.amazon.com/emr/pricing/
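Rough arithmetic for how that per-run price adds up, as a sketch only. The ~0.12 USD per run is the estimate above; the table count and the once-a-day schedule are made-up placeholders, not our actual setup.

```python
def monthly_backup_cost(cost_per_run_usd, tables, runs_per_day, days=30):
    """Estimate monthly cost when every table needs its own pipeline run."""
    return cost_per_run_usd * tables * runs_per_day * days


if __name__ == "__main__":
    # Hypothetical: 5 tables, one backup run per table per day.
    estimate = monthly_backup_cost(0.12, tables=5, runs_per_day=1)
    print(f"~{estimate:.2f} USD per month")
```

The cost scales linearly with both the number of tables and the backup frequency, which is why the one-table-per-pipeline default hurts.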
Looking into alternatives: http://stackoverflow.com/questions/18896329/export-data-from-dynamodb
https://github.com/Purple-Unicorns/DynamoDbBackUp looks like a winner. We'll need to enable streams on the tables and versioning on the S3 bucket, since DynamoDB itself doesn't have version tracking.
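Enabling a stream on an existing table could look roughly like this with boto3's `update_table`. This is a sketch: the table name is a placeholder, and `NEW_AND_OLD_IMAGES` is an assumption about which view type the backup tool wants, so check its README before using it.

```python
def enable_table_stream(dynamodb_client, table_name,
                        view_type="NEW_AND_OLD_IMAGES"):
    """Turn on a DynamoDB stream for an existing table.

    `dynamodb_client` is expected to be a boto3 DynamoDB client,
    e.g. boto3.client("dynamodb"). The view type is an assumption.
    """
    return dynamodb_client.update_table(
        TableName=table_name,
        StreamSpecification={
            "StreamEnabled": True,
            "StreamViewType": view_type,
        },
    )


if __name__ == "__main__":
    import boto3  # only needed when actually calling AWS

    # "photos" is a hypothetical table name.
    enable_table_stream(boto3.client("dynamodb"), "photos")
```

Passing the client in keeps the function easy to exercise against a stub before touching a real table.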
While DynamoDB replicates into at least three nodes, it doesn't protect us from logic errors. Particularly with "delete photo" and "unfriend" actions, a logic error could overwrite or wipe out data accidentally.
DynamoDB: https://aws.amazon.com/blogs/aws/cross-region-import-and-export-of-dynamodb-tables/
S3: Probably just a matter of enabling bucket versioning.
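For the S3 side, a minimal sketch using boto3's `put_bucket_versioning`. The bucket name is a placeholder, and this assumes the credentials in use are allowed to change the bucket's versioning configuration.

```python
def enable_bucket_versioning(s3_client, bucket_name):
    """Switch on object versioning for an existing S3 bucket.

    `s3_client` is expected to be a boto3 S3 client,
    e.g. boto3.client("s3").
    """
    return s3_client.put_bucket_versioning(
        Bucket=bucket_name,
        VersioningConfiguration={"Status": "Enabled"},
    )


if __name__ == "__main__":
    import boto3  # only needed when actually calling AWS

    # "my-backup-bucket" is a hypothetical bucket name.
    enable_bucket_versioning(boto3.client("s3"), "my-backup-bucket")
```

Once versioning is on, overwrites and deletes keep the previous object versions around, which is the protection against logic errors we're after.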