naschorr / dynamodb-table-copier

Copies a table's data from one DynamoDB instance to another

Optimize the `batch_write` operation #1

Open naschorr opened 2 years ago

naschorr commented 2 years ago

Writing in batches of 25 items (the maximum DynamoDB allows per batch write request) is super slow. However, the docs mention that issuing batch operations from multiple threads is totally fine.
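For reference, here is a rough sketch of what the multithreaded approach could look like. It is not code from this repo: `chunked`, `write_batch`, `parallel_batch_write`, and the `max_workers` value are hypothetical names and settings for illustration. It assumes `client` is a boto3 low-level DynamoDB client and that items are already in DynamoDB attribute-value format.

```python
from concurrent.futures import ThreadPoolExecutor
from itertools import islice

MAX_BATCH_SIZE = 25  # BatchWriteItem accepts at most 25 put/delete requests per call


def chunked(items, size=MAX_BATCH_SIZE):
    """Yield successive lists of at most `size` items."""
    iterator = iter(items)
    while batch := list(islice(iterator, size)):
        yield batch


def write_batch(client, table_name, batch):
    """Write one batch, resubmitting anything DynamoDB reports as unprocessed."""
    # Assumption: each item is already in attribute-value form, e.g. {"id": {"S": "1"}}.
    request = {table_name: [{"PutRequest": {"Item": item}} for item in batch]}
    while request:
        response = client.batch_write_item(RequestItems=request)
        # Throttled writes come back as UnprocessedItems; real code should
        # back off (sleep) before retrying instead of looping immediately.
        request = response.get("UnprocessedItems") or {}
    return len(batch)


def parallel_batch_write(client, table_name, items, max_workers=8):
    """Send 25-item batches concurrently and return the number of items written."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return sum(pool.map(lambda b: write_batch(client, table_name, b), chunked(items)))
```

The thread count is a guess and would need tuning against the destination table's write capacity; too many workers will probably just trade speed for throttling.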

Alternatively, boto3's native `batch_writer` seems to be able to handle many more than 25 items at once. More experimentation is needed.
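A minimal sketch of the `batch_writer` route, for comparison. The `copy_items` helper is a hypothetical name, and `table` is assumed to be a boto3 DynamoDB `Table` resource; the writer buffers the items it is given and takes care of sending them in batches and resending unprocessed ones.

```python
def copy_items(table, items):
    """Write every item through the table's batch writer and return the count.

    Assumption: `table` is a boto3 Table resource, for example
    boto3.resource("dynamodb").Table("destination-table"), where the
    table name is only a placeholder.
    """
    count = 0
    # The context manager flushes any buffered items when the block exits.
    with table.batch_writer() as writer:
        for item in items:
            # Items here are plain Python dicts, not attribute-value maps.
            writer.put_item(Item=item)
            count += 1
    return count
```

This keeps the calling code simple, since any number of items can be passed in without manual chunking.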

naschorr commented 2 years ago

As a follow-up, it looks like even if you pass `batch_writer` more than 25 items, it still processes them in chunks of 25. Anecdotally, though, data copying did seem to be happening slightly quicker: maybe a couple of minutes faster per chunk of ~7000 items from the AWS instance.

That said, I didn't robustly profile the operation, so it could just be a fluke.
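To rule out a fluke, something like the following timing helper could be used to compare the two approaches over several runs. This is only a sketch: `time_copy` and its `repeats` parameter are hypothetical, and `copy_fn` stands for whatever zero-argument function performs one copy of a chunk.

```python
import statistics
import time


def time_copy(copy_fn, repeats=5):
    """Run `copy_fn` several times and return (mean, stdev) of the durations in seconds."""
    if repeats < 2:
        raise ValueError("repeats must be at least 2 to compute a standard deviation")
    durations = []
    for _ in range(repeats):
        start = time.perf_counter()
        copy_fn()
        durations.append(time.perf_counter() - start)
    return statistics.mean(durations), statistics.stdev(durations)
```

If the difference between the two means is small relative to the standard deviations, the speedup is likely noise.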