shevchenkos / DynamoDbBackUp


process changes from a dynamodb stream sequentially #29

Closed smelchior closed 7 years ago

smelchior commented 7 years ago

We ran into an issue with the way the dynamodb streams are handled in the backup application.

In our setup we apply many changes to a single DynamoDB record within a short period of time, and usually all of these changes arrive in a single Lambda invocation. When the stream is processed, the stream records are placed in the allRecords[keyid] map in fromDbStream. This map is insertion-ordered, so the current version of a record is the last element. However, dbRecord.backup, which the map is passed to, writes the records as individual promises and waits on them with Promise.all(), which does not preserve the order of the map. What we saw in our version-enabled S3 bucket was that all the intermediate states of a record were saved, but the latest S3 version did not necessarily match the current item in the DynamoDB table. As our records are only written once and usually not modified later, the stale backup file is never overwritten by a later change.
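For illustration, here is a minimal Node.js sketch of that pattern; putVersion is a stand-in for the per-record S3 put that dbRecord.backup performs, not the project's actual code. With Promise.all the writes for the same key all start concurrently, so the newest S3 version of the object is whichever put happens to finish last:

```javascript
// Hypothetical stand-in for the S3 put done per stream record.
const putVersion = (key, image) =>
  new Promise(resolve =>
    // Simulated S3 latency: a later image can easily finish before an earlier one.
    setTimeout(() => { console.log(`wrote ${key} -> ${image}`); resolve(); },
               Math.random() * 100));

// Three successive images of the same record, in stream order.
const images = ['v1', 'v2', 'v3'];

// Promise.all only waits until every put has settled; it does not serialize
// them, so on a version-enabled bucket the latest version of "key-1" may end
// up being v1 or v2 instead of v3.
Promise.all(images.map(image => putVersion('key-1', image)))
  .then(() => console.log('backup finished, but write order was not guaranteed'));
```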

I modified dbRecord.backup so that it can process the elements sequentially, which guarantees that the last element in the stream for a given keyId is the one that ends up in S3.

I preserved the "old" behaviour of dbRecord.backup, as it is also used for the full backups. In that case the order of processing does not matter, since each keyid is passed only once.
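As a rough sketch of the idea (backupSequentially, backupInParallel and putVersion are illustrative names, not the actual dbRecord.backup signature): awaiting each put before starting the next one preserves the stream order for a key, while the parallel path can stay as it is for full backups.

```javascript
// Hypothetical stand-in for the per-record S3 put.
const putVersion = (key, image) =>
  new Promise(resolve =>
    setTimeout(() => { console.log(`wrote ${key} -> ${image}`); resolve(); },
               Math.random() * 100));

// Stream path: await each put before starting the next, so the last image in
// the stream for a key id is also the newest S3 version of the backup object.
async function backupSequentially(key, images) {
  for (const image of images) {
    await putVersion(key, image);
  }
}

// Full-backup path: each key id appears only once, so ordering between puts
// is irrelevant and they can run in parallel as before.
function backupInParallel(records) {
  return Promise.all(records.map(r => putVersion(r.key, r.image)));
}

backupSequentially('key-1', ['v1', 'v2', 'v3'])
  .then(() => console.log('latest S3 version of key-1 is now v3'));
```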

I hope this makes our issue clear. If there are any questions, let me know :)

shevchenkos commented 7 years ago

Thanks, great work! I'll merge in the morning (CET).

shevchenkos commented 7 years ago

Merged and published.