prestodb / presto

The official home of the Presto distributed SQL query engine for big data
http://prestodb.io
Apache License 2.0

Support streaming for S3 writes #11675

Closed: electrum closed this issue 3 years ago

electrum commented 6 years ago

We could add streaming support to PrestoS3OutputStream by changing it to manually perform a multipart upload that runs for the duration of the write operation. Rather than buffering everything into a single temporary file that is uploaded at the end, it would create multiple temporary files, each up to the multipart part size limit, and initiate the upload of each one as soon as it is ready. Users would probably want to set an AbortIncompleteMultipartUpload bucket lifecycle policy to clean up uploads in cases where Presto workers crash or otherwise cannot abort them on failure.
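
For illustration, here is a minimal sketch of that scheme using the low-level multipart calls in the AWS SDK for Java v1 (initiateMultipartUpload, uploadPart, completeMultipartUpload). The class name StreamingS3OutputStream and the partSize parameter are hypothetical, and this is not the actual PrestoS3OutputStream; retries, and the background upload threads an implementation would likely want, are omitted.

```java
import com.amazonaws.services.s3.AmazonS3;
import com.amazonaws.services.s3.model.*;

import java.io.File;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch: buffer writes into part-sized temporary files and
// upload each one as soon as it fills, so local disk never holds more than
// one part per open writer.
public class StreamingS3OutputStream
        extends OutputStream
{
    private final AmazonS3 s3;
    private final String bucket;
    private final String key;
    private final long partSize; // e.g. 128 MB; S3 allows up to 5 GB per part
    private final String uploadId;
    private final List<PartETag> partETags = new ArrayList<>();

    private File currentPart;
    private OutputStream currentOut;
    private long currentSize;
    private int partNumber = 1;

    public StreamingS3OutputStream(AmazonS3 s3, String bucket, String key, long partSize)
            throws IOException
    {
        this.s3 = s3;
        this.bucket = bucket;
        this.key = key;
        this.partSize = partSize;
        // Start the multipart upload up front; it stays open for the whole write
        this.uploadId = s3.initiateMultipartUpload(new InitiateMultipartUploadRequest(bucket, key))
                .getUploadId();
        openNewPartFile();
    }

    @Override
    public void write(int b)
            throws IOException
    {
        write(new byte[] {(byte) b}, 0, 1);
    }

    @Override
    public void write(byte[] buffer, int offset, int length)
            throws IOException
    {
        currentOut.write(buffer, offset, length);
        currentSize += length;
        if (currentSize >= partSize) {
            uploadCurrentPart();
            openNewPartFile();
        }
    }

    @Override
    public void close()
            throws IOException
    {
        try {
            // Upload the final (possibly short) part, then complete the upload
            uploadCurrentPart();
            s3.completeMultipartUpload(new CompleteMultipartUploadRequest(bucket, key, uploadId, partETags));
        }
        catch (RuntimeException | IOException e) {
            s3.abortMultipartUpload(new AbortMultipartUploadRequest(bucket, key, uploadId));
            throw e;
        }
    }

    private void openNewPartFile()
            throws IOException
    {
        currentPart = File.createTempFile("s3-part-", ".tmp");
        currentOut = new FileOutputStream(currentPart);
        currentSize = 0;
    }

    private void uploadCurrentPart()
            throws IOException
    {
        currentOut.close();
        UploadPartResult result = s3.uploadPart(new UploadPartRequest()
                .withBucketName(bucket)
                .withKey(key)
                .withUploadId(uploadId)
                .withPartNumber(partNumber++)
                .withFile(currentPart)
                .withPartSize(currentPart.length()));
        partETags.add(result.getPartETag());
        currentPart.delete();
    }
}
```

In practice the part uploads would probably run on a background thread pool so that writing the next part overlaps with uploading the previous one, which is what the s3n-worker threads in the EMRFS log below appear to be doing.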

ddrinka commented 6 years ago

Here's output from EMRFS's implementation:

2018-10-09T16:06:35.038Z        INFO    s3n-worker-0    com.amazon.ws.emr.hadoop.fs.s3n.MultipartUploadOutputStream     uploadPart /mnt/s3/emrfs-4740676305924555881/0000000000 134217728 bytes md5: sE/RxCYZHamxtLjCxVaG8g== md5hex: b04fd1c426191da9b1b4b8c2c55686f2
2018-10-09T16:07:22.323Z        INFO    s3n-worker-2    com.amazon.ws.emr.hadoop.fs.s3n.MultipartUploadOutputStream     uploadPart /mnt/s3/emrfs-4740676305924555881/0000000001 134217728 bytes md5: 6cMghOZcHT6t5qTSF0V7Tg== md5hex: e9c32084e65c1d3eade6a4d217457b4e
2018-10-09T16:08:12.407Z        INFO    s3n-worker-4    com.amazon.ws.emr.hadoop.fs.s3n.MultipartUploadOutputStream     uploadPart /mnt/s3/emrfs-4740676305924555881/0000000002 134217728 bytes md5: 5YOpDGHWcKmqdCBhcn06Fw== md5hex: e583a90c61d670a9aa742061727d3a17
2018-10-09T16:08:58.357Z        INFO    s3n-worker-6    com.amazon.ws.emr.hadoop.fs.s3n.MultipartUploadOutputStream     uploadPart /mnt/s3/emrfs-4740676305924555881/0000000003 134217728 bytes md5: JDZ5w9aCVHt2Bg6mRsTS7g== md5hex: 243679c3d682547b76060ea646c4d2ee
2018-10-09T16:09:09.584Z        INFO    20181009_160258_00005_gfa7w.1.0-16-49   com.amazon.ws.emr.hadoop.fs.s3n.MultipartUploadOutputStream     close closed:false s3://bucket/tmp/presto-user/guid/date=2018-10-08/20181009_160258_00005_gfa7w_bucket-00000
2018-10-09T16:09:09.585Z        INFO    s3n-worker-8    com.amazon.ws.emr.hadoop.fs.s3n.MultipartUploadOutputStream     uploadPart /mnt/s3/emrfs-4740676305924555881/0000000004 42345186 bytes md5: 1XJ2pc6Te5iBjPXhcYwXYw== md5hex: d57276a5ce937b98818cf5e1718c1763
2018-10-09T16:09:10.679Z        INFO    20181009_160258_00005_gfa7w.1.0-16-49   com.amazon.ws.emr.hadoop.fs.s3.upload.dispatch.DefaultMultipartUploadDispatcher Completed multipart upload of 5 parts 579216098 bytes

It looks like they're behaving exactly as you described, segmenting the final output into chunks and uploading each as a multipart part.
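
Doing the arithmetic on the log above: four full parts of 134217728 bytes (128 MB) plus a final part of 42345186 bytes gives 4 × 134217728 + 42345186 = 579216098 bytes, which matches the total reported on the "Completed multipart upload of 5 parts" line.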

For what it's worth, there was no measurable wall-clock time difference between Presto's current S3 implementation and EMRFS. For files of this size, the upload time is fairly negligible, at least when running in AWS.

electrum commented 5 years ago

@ddrinka Thanks for the confirmation. Even if the upload at the end is fast, buffering the whole file locally can use a considerable amount of local disk (or even run out). The streaming approach would only require up to 5 GB (the S3 maximum part size) per open writer.
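
For the crash-cleanup concern mentioned earlier, here is a sketch of setting an AbortIncompleteMultipartUpload lifecycle rule programmatically, assuming the AWS SDK for Java v1. The rule ID, bucket name, prefix, and seven-day window are arbitrary illustrative choices, not recommendations; the same rule can of course be created in the S3 console or via the CLI instead.

```java
import com.amazonaws.services.s3.AmazonS3;
import com.amazonaws.services.s3.AmazonS3ClientBuilder;
import com.amazonaws.services.s3.model.AbortIncompleteMultipartUpload;
import com.amazonaws.services.s3.model.BucketLifecycleConfiguration;
import com.amazonaws.services.s3.model.lifecycle.LifecycleFilter;
import com.amazonaws.services.s3.model.lifecycle.LifecyclePrefixPredicate;

import java.util.Collections;

public class AbortStaleUploadsRule
{
    public static void main(String[] args)
    {
        AmazonS3 s3 = AmazonS3ClientBuilder.defaultClient();

        // Abort any multipart upload that has not completed within 7 days,
        // so parts left behind by crashed workers do not accumulate charges
        BucketLifecycleConfiguration.Rule rule = new BucketLifecycleConfiguration.Rule()
                .withId("abort-stale-presto-uploads") // hypothetical rule name
                .withFilter(new LifecycleFilter(new LifecyclePrefixPredicate("tmp/presto-"))) // hypothetical prefix
                .withAbortIncompleteMultipartUpload(
                        new AbortIncompleteMultipartUpload().withDaysAfterInitiation(7))
                .withStatus(BucketLifecycleConfiguration.ENABLED);

        s3.setBucketLifecycleConfiguration("my-bucket", // hypothetical bucket
                new BucketLifecycleConfiguration(Collections.singletonList(rule)));
    }
}
```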

stale[bot] commented 3 years ago

This issue has been automatically marked as stale because it has not had any activity in the last 2 years. If you feel that this issue is important, just comment and the stale tag will be removed; otherwise it will be closed in 7 days. This is an attempt to ensure that our open issues remain valuable and relevant so that we can keep track of what needs to be done and prioritize the right things.