amazon-archives / dynamodb-import-export-tool

Exports DynamoDB items via parallel scan into a blocking queue, then consumes the queue and imports the items into a replica table using asynchronous writes.
Apache License 2.0
90 stars 38 forks

Fix multinode export. #4

Open wickman opened 8 years ago

wickman commented 8 years ago

Multi-node export using --section/--totalSections is currently broken. ParallelScanExecutor creates a BitSet to track which segments have completed, and terminates when finished.cardinality() == workers.length, which can only be true if totalSections is 1. If totalSections > 1, finished.cardinality() will always be less than workers.length, so while the code does eventually copy the entire table, it never terminates.

This change is sort of a hacky way to unblock me. Probably the right thing is for ParallelScanExecutor to keep an expected BitSet (the segments that should be completed) and a finished BitSet (the segments that have completed), and to compare the two.
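
A minimal sketch of that expected-vs-finished idea, for illustration only. SegmentTracker and its methods are hypothetical names, not part of the tool, and the sketch assumes each node is handed a contiguous range of segment indices; how the tool actually maps --section/--totalSections to segments is not shown here.

```java
import java.util.BitSet;

/**
 * Hypothetical helper (not part of the tool): tracks which scan segments
 * this node is expected to complete and which ones have completed.
 */
class SegmentTracker {
    private final BitSet expected = new BitSet();
    private final BitSet finished = new BitSet();

    /**
     * Assumption: this node owns the contiguous segment range
     * [firstSegment, endSegmentExclusive).
     */
    SegmentTracker(int firstSegment, int endSegmentExclusive) {
        expected.set(firstSegment, endSegmentExclusive);
    }

    /** Record that a segment's scan has completed. */
    synchronized void markFinished(int segment) {
        finished.set(segment);
    }

    /** Done once every expected segment is also marked finished. */
    synchronized boolean isDone() {
        BitSet remaining = (BitSet) expected.clone();
        remaining.andNot(finished); // expected segments not yet finished
        return remaining.isEmpty();
    }
}
```

Comparing the two sets rather than comparing a count against workers.length means the termination check depends only on the segments this node owns, so it would behave the same whether totalSections is 1 or larger.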