amazon-archives / dynamodb-import-export-tool

Exports DynamoDB items via parallel scan into a blocking queue, then consumes the queue and imports the DynamoDB items into a replica table using asynchronous writes.
Apache License 2.0

ExecutorCompletionService produces OOM #6

Open marcosnils opened 8 years ago

marcosnils commented 8 years ago

If consumers process items more slowly than producers supply them, you'll end up with a memory leak, because ExecutorCompletionService creates an unbounded LinkedBlockingQueue by default.

https://github.com/awslabs/dynamodb-import-export-tool/blob/623c333a7726cb80b01f2723a44a00951ec3cc64/src/main/java/com/amazonaws/dynamodb/bootstrap/BlockingQueueConsumer.java#L42
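To illustrate the accumulation, here is a minimal standalone sketch, not the tool's code. The class `CompletionQueueDemo` and the method `retainedFutures` are hypothetical names, and the tasks are no-ops. It passes its own `LinkedBlockingQueue` to the `ExecutorCompletionService` constructor only so the number of retained futures can be inspected, then compares a run that never calls `poll()` with one that drains the queue.

```java
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.ExecutorCompletionService;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.TimeUnit;

// Hypothetical demo class; it is not part of dynamodb-import-export-tool.
final class CompletionQueueDemo {

    /** Submits no-op tasks and returns how many completed futures are still held afterwards. */
    static int retainedFutures(int tasks, boolean drain) {
        ExecutorService pool = Executors.newFixedThreadPool(4);
        // Same kind of queue the one-argument constructor creates internally (unbounded).
        BlockingQueue<Future<Void>> completed = new LinkedBlockingQueue<>();
        ExecutorCompletionService<Void> service =
                new ExecutorCompletionService<>(pool, completed);
        try {
            for (int i = 0; i < tasks; i++) {
                service.submit(() -> null);
                if (drain) {
                    // Discard whatever has finished so far; poll() never blocks.
                    while (service.poll() != null) { /* drop the completed future */ }
                }
            }
            pool.shutdown();
            pool.awaitTermination(1, TimeUnit.MINUTES);
            if (drain) {
                while (service.poll() != null) { /* drop the remaining futures */ }
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return completed.size();
    }

    public static void main(String[] args) {
        System.out.println("without draining: " + retainedFutures(1000, false)); // prints 1000
        System.out.println("with draining: " + retainedFutures(1000, true));     // prints 0
    }
}
```

Without draining, every completed future stays reachable until the service itself is garbage collected, so memory use grows with the number of submitted tasks. If the results are never read, submitting directly to the executor avoids retaining them at all.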

Also, the underlying ThreadPoolExecutor doesn't have a bounded blocking queue, which also causes OOM, as an unlimited number of tasks can be queued in the executor:

https://github.com/awslabs/dynamodb-import-export-tool/blob/623c333a7726cb80b01f2723a44a00951ec3cc64/src/main/java/com/amazonaws/dynamodb/bootstrap/BlockingQueueConsumer.java#L41
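One possible way to bound the work queue, sketched here under assumptions rather than taken from the project: build the `ThreadPoolExecutor` with an `ArrayBlockingQueue` and `ThreadPoolExecutor.CallerRunsPolicy`, so that a full queue makes the submitting thread run the task itself and thereby slows the producer down. The class `BoundedExecutorDemo`, its methods, and the thread and capacity values are all hypothetical.

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical demo class; it is not part of dynamodb-import-export-tool.
final class BoundedExecutorDemo {

    /** Fixed-size pool whose work queue holds at most queueCapacity pending tasks. */
    static ThreadPoolExecutor newBoundedExecutor(int threads, int queueCapacity) {
        return new ThreadPoolExecutor(
                threads, threads,
                0L, TimeUnit.MILLISECONDS,
                new ArrayBlockingQueue<>(queueCapacity),
                // When the queue is full, the submitting thread runs the task itself,
                // which throttles the producer instead of queueing without limit.
                new ThreadPoolExecutor.CallerRunsPolicy());
    }

    /** Returns {tasks completed, largest queue size observed while submitting}. */
    static int[] run(int tasks, int threads, int queueCapacity) {
        ThreadPoolExecutor executor = newBoundedExecutor(threads, queueCapacity);
        AtomicInteger done = new AtomicInteger();
        int maxQueued = 0;
        try {
            for (int i = 0; i < tasks; i++) {
                executor.execute(done::incrementAndGet);
                maxQueued = Math.max(maxQueued, executor.getQueue().size());
            }
            executor.shutdown();
            executor.awaitTermination(1, TimeUnit.MINUTES);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return new int[] {done.get(), maxQueued};
    }

    public static void main(String[] args) {
        int[] result = run(10_000, 4, 100);
        System.out.println("completed=" + result[0] + ", max queued=" + result[1]);
    }
}
```

With this policy no task is dropped, and the number of pending tasks never exceeds the queue capacity. The trade-off is that the producer thread occasionally does consumer work.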

Final question:

What's the purpose of having an ExecutorCompletionService here (https://github.com/awslabs/dynamodb-import-export-tool/blob/623c333a7726cb80b01f2723a44a00951ec3cc64/src/main/java/com/amazonaws/dynamodb/bootstrap/BlockingQueueConsumer.java#L42) if take or poll is never called?

jseed commented 8 years ago

+1 to this issue. Using this tool on any of our larger tables generates an OutOfMemoryError, rendering it pretty much useless.