streamingfast / merger

Apache License 2.0

merger should upload blocks in background and work on next block #26

Closed: matthewdarwin closed this issue 4 months ago

matthewdarwin commented 1 year ago

Uploading blocks to S3 is sometimes slow. The merger appears to do nothing until the merged block has been uploaded, and only then merges the next block.

The merger would run much faster if it merged the next block while the previous block was still uploading, i.e. there should be an upload queue of merged blocks.

This issue occurs when a large number (thousands) of one-block files have accumulated and the merger could otherwise run at 100% CPU.

matthewdarwin commented 1 year ago

Workaround for this issue: have the merger write files to local disk and manually send them to S3 later.