Introduce a new API, a parallel input stream, that supports reading from the underlying source in parallel.
Added an implementation that creates a parallel input stream from a file.
Changed the S3 client to create a parallel input stream if the upload source is a file.
Use all the event loops in the I/O event loop group to prepare requests if a parallel input stream is available.
Increased the upload performance when the client is handling fewer meta requests than the number of threads we created for I/O events.
For a single upload of a 30 GiB file, this PR reduces the time to read the whole file from 14 secs to around 6 secs.
Quick thoughts:
Why does it take 6 secs to read from disk (cached)? Pure reading from a cached file with 8 threads can go faster.
Our client doesn't read the file as fast as possible; reading from the file is probably NOT the bottleneck once we read in parallel and the file is cached.
Why use 8 threads to read?
From my testing, 8 threads can read the file at 236 Gbps, faster than the network bandwidth we have, so it's a reasonable number to keep file reads from becoming the bottleneck.
I also tested with 32 read threads and saw no improvement from our client; the tracing shows the I/O threads still took around 6 secs. Again, I think with parallel reads and a cached file, reading is no longer the bottleneck.
From my testing with just reading, around 32 threads reaches the best throughput. However, even with 1 thread, the read throughput is higher than what we currently get.
```
[ec2-user@ip-172-31-14-35 build]$ ./s3-benchrunner-c 1
Total execution time: 4.76 seconds
total read_Gib 240
Read through_put: 50.38 Gbps
[ec2-user@ip-172-31-14-35 build]$ ./s3-benchrunner-c 8
Total execution time: 1.01 seconds
total read_Gib 240
Read through_put: 236.49 Gbps
[ec2-user@ip-172-31-14-35 build]$ ./s3-benchrunner-c 16
Total execution time: 0.56 seconds
total read_Gib 240
Read through_put: 424.87 Gbps
[ec2-user@ip-172-31-14-35 build]$ ./s3-benchrunner-c 32
Total execution time: 0.51 seconds
total read_Gib 240
Read through_put: 471.96 Gbps
```
Will parallel read affect multiple meta requests?
I tested 24 concurrent uploads of the same 30 GiB file. The throughput doesn't change much: from the main branch I got ~54 Gbps, and from this branch I got ~54 Gbps as well. Note: the max-throughput benchmark got ~55 Gbps.
TODO:
~mmap instead? Will it help to improve the throughput?~ -> https://github.com/awslabs/aws-c-s3/pull/354

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.