Description of changes:
`tokio::spawn` does not guarantee that tasks run in the order they were spawned, which can result in us downloading the first `num_concurrency` parts in random order. For a workload of 100 files of 5 GB each, this can lead to very high memory usage, as seen in the diagram below. This PR refactors the download path so that the exact part a task fetches is determined only once the task has been scheduled, rather than when it is spawned.
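To illustrate the idea, here is a minimal sketch, not the code in this PR. It uses plain `std::thread` workers as a stand-in for Tokio tasks so that it needs no external crates, and the names `PartCounter`, `claim`, and `run_workers` are hypothetical. Each worker claims its part number from a shared atomic counter only once it is actually running, so parts are handed out in increasing order no matter which worker the runtime starts first.

```rust
use std::sync::atomic::{AtomicU64, Ordering};
use std::sync::Arc;
use std::thread;

/// Hypothetical helper: hands out part numbers in strictly increasing order.
struct PartCounter {
    next: AtomicU64,
    total: u64,
}

impl PartCounter {
    fn new(total: u64) -> Self {
        Self { next: AtomicU64::new(0), total }
    }

    /// Claims the next unclaimed part, or returns `None` once every part is taken.
    fn claim(&self) -> Option<u64> {
        let part = self.next.fetch_add(1, Ordering::Relaxed);
        (part < self.total).then_some(part)
    }
}

/// Starts `workers` threads (assumed stand-ins for spawned tasks). The part a
/// worker handles is decided when the worker runs, not when it is spawned.
/// Returns the parts claimed by each worker.
fn run_workers(total_parts: u64, workers: usize) -> Vec<Vec<u64>> {
    let counter = Arc::new(PartCounter::new(total_parts));
    let handles: Vec<_> = (0..workers)
        .map(|_| {
            let counter = Arc::clone(&counter);
            thread::spawn(move || {
                let mut claimed = Vec::new();
                while let Some(part) = counter.claim() {
                    // A real worker would download `part` here.
                    claimed.push(part);
                }
                claimed
            })
        })
        .collect();
    handles.into_iter().map(|h| h.join().unwrap()).collect()
}
```

Calling `run_workers(16, 4)` returns one list of claimed parts per worker. Which worker gets which part varies between runs, but every part is claimed exactly once and the counter never skips ahead of the parts that running workers have already claimed.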
Uploads can have a similar issue, where we read too many parts into memory. To fix that, we will need to make our scheduler smarter so that we read a part only once we hold the permit for it. (Created: https://github.com/awslabs/aws-s3-transfer-manager-rs/issues/60)
By submitting this pull request, I confirm that you can use, modify, copy, and redistribute this contribution, under the terms of your choice.