Here's output from EMRFS's implementation:
2018-10-09T16:06:35.038Z INFO s3n-worker-0 com.amazon.ws.emr.hadoop.fs.s3n.MultipartUploadOutputStream uploadPart /mnt/s3/emrfs-4740676305924555881/0000000000 134217728 bytes md5: sE/RxCYZHamxtLjCxVaG8g== md5hex: b04fd1c426191da9b1b4b8c2c55686f2
2018-10-09T16:07:22.323Z INFO s3n-worker-2 com.amazon.ws.emr.hadoop.fs.s3n.MultipartUploadOutputStream uploadPart /mnt/s3/emrfs-4740676305924555881/0000000001 134217728 bytes md5: 6cMghOZcHT6t5qTSF0V7Tg== md5hex: e9c32084e65c1d3eade6a4d217457b4e
2018-10-09T16:08:12.407Z INFO s3n-worker-4 com.amazon.ws.emr.hadoop.fs.s3n.MultipartUploadOutputStream uploadPart /mnt/s3/emrfs-4740676305924555881/0000000002 134217728 bytes md5: 5YOpDGHWcKmqdCBhcn06Fw== md5hex: e583a90c61d670a9aa742061727d3a17
2018-10-09T16:08:58.357Z INFO s3n-worker-6 com.amazon.ws.emr.hadoop.fs.s3n.MultipartUploadOutputStream uploadPart /mnt/s3/emrfs-4740676305924555881/0000000003 134217728 bytes md5: JDZ5w9aCVHt2Bg6mRsTS7g== md5hex: 243679c3d682547b76060ea646c4d2ee
2018-10-09T16:09:09.584Z INFO 20181009_160258_00005_gfa7w.1.0-16-49 com.amazon.ws.emr.hadoop.fs.s3n.MultipartUploadOutputStream close closed:false s3://bucket/tmp/presto-user/guid/date=2018-10-08/20181009_160258_00005_gfa7w_bucket-00000
2018-10-09T16:09:09.585Z INFO s3n-worker-8 com.amazon.ws.emr.hadoop.fs.s3n.MultipartUploadOutputStream uploadPart /mnt/s3/emrfs-4740676305924555881/0000000004 42345186 bytes md5: 1XJ2pc6Te5iBjPXhcYwXYw== md5hex: d57276a5ce937b98818cf5e1718c1763
2018-10-09T16:09:10.679Z INFO 20181009_160258_00005_gfa7w.1.0-16-49 com.amazon.ws.emr.hadoop.fs.s3.upload.dispatch.DefaultMultipartUploadDispatcher Completed multipart upload of 5 parts 579216098 bytes
It looks like they're behaving exactly as you described, segmenting the final output into chunks and uploading each as a multipart part.
For what it's worth, there was no measurable wall clock time difference between Presto's current S3 implementation and EMRFS. For files this size, the upload time is pretty negligible, at least when running in AWS.
@ddrinka Thanks for the confirmation. Even if the upload at the end is fast, we might use a considerable amount of local disk (or even run out of it). The streaming approach would only require up to 5GB (the maximum S3 part size) of local disk per open writer.
We could add streaming support to PrestoS3OutputStream by changing it to manually perform a multipart upload that runs for the duration of the write operation. Rather than having a single temporary file that is uploaded at the end, it would create multiple files, each up to the multipart part size, and initiate the upload of each one as soon as it is ready. Users would probably want to set the AbortIncompleteMultipartUpload bucket lifecycle policy to clean up uploads in cases where Presto workers crash or otherwise cannot abort them on failure.
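To make the shape of that change concrete, here is a minimal sketch of a streaming writer against the AWS SDK for Java v1 (the SDK Presto's Hive connector uses). The class name, constructor, and error handling are illustrative, not the actual PrestoS3OutputStream code: it initiates a multipart upload up front, buffers each part to a local temp file, uploads and deletes that file as soon as it reaches the part size, and completes the upload on close, so local disk usage per open writer stays bounded by the part size rather than the full object size.

import com.amazonaws.services.s3.AmazonS3;
import com.amazonaws.services.s3.model.*;

import java.io.File;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.nio.file.Files;
import java.util.ArrayList;
import java.util.List;

// Illustrative sketch only: stream data to S3 as a long-running multipart upload,
// buffering at most one part on local disk at a time.
public class StreamingS3OutputStream extends OutputStream {
    private final AmazonS3 s3;
    private final String bucket;
    private final String key;
    private final long partSize;   // e.g. 128 MB; S3 requires >= 5 MB for all but the last part
    private final String uploadId;
    private final List<PartETag> partETags = new ArrayList<>();

    private File currentPart;
    private OutputStream currentOut;
    private long currentSize;
    private int nextPartNumber = 1;

    public StreamingS3OutputStream(AmazonS3 s3, String bucket, String key, long partSize) throws IOException {
        this.s3 = s3;
        this.bucket = bucket;
        this.key = key;
        this.partSize = partSize;
        // Start the multipart upload immediately; it stays open for the life of the writer.
        this.uploadId = s3.initiateMultipartUpload(new InitiateMultipartUploadRequest(bucket, key)).getUploadId();
        openNewPartFile();
    }

    @Override
    public void write(int b) throws IOException {
        write(new byte[] {(byte) b}, 0, 1);
    }

    @Override
    public void write(byte[] buffer, int offset, int length) throws IOException {
        while (length > 0) {
            int chunk = (int) Math.min(length, partSize - currentSize);
            currentOut.write(buffer, offset, chunk);
            currentSize += chunk;
            offset += chunk;
            length -= chunk;
            if (currentSize == partSize) {
                uploadCurrentPart();   // part is full: ship it while the query keeps writing
                openNewPartFile();
            }
        }
    }

    @Override
    public void close() throws IOException {
        if (currentSize > 0 || partETags.isEmpty()) {
            uploadCurrentPart();       // upload the final (possibly short) part
        } else {
            currentOut.close();
            Files.deleteIfExists(currentPart.toPath());
        }
        s3.completeMultipartUpload(new CompleteMultipartUploadRequest(bucket, key, uploadId, partETags));
    }

    public void abort() {
        // Called on failure; the bucket lifecycle rule is the backstop if this never runs.
        s3.abortMultipartUpload(new AbortMultipartUploadRequest(bucket, key, uploadId));
    }

    private void uploadCurrentPart() throws IOException {
        currentOut.close();
        UploadPartResult result = s3.uploadPart(new UploadPartRequest()
                .withBucketName(bucket)
                .withKey(key)
                .withUploadId(uploadId)
                .withPartNumber(nextPartNumber++)
                .withFile(currentPart)
                .withPartSize(currentPart.length()));
        partETags.add(result.getPartETag());
        Files.deleteIfExists(currentPart.toPath());   // free local disk as soon as the part is durable
    }

    private void openNewPartFile() throws IOException {
        currentPart = File.createTempFile("presto-s3-part", ".tmp");
        currentOut = new FileOutputStream(currentPart);
        currentSize = 0;
    }
}

On failure a caller would invoke abort() so S3 discards the buffered parts; the AbortIncompleteMultipartUpload lifecycle rule remains the backstop for workers that die before they can do so.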