boto / s3transfer

Amazon S3 Transfer Manager for Python
Apache License 2.0
209 stars 133 forks source link

Specify VersionId when possible on multipart downloads #254

Open mspdt22 opened 1 year ago

mspdt22 commented 1 year ago

This addresses #86 for versioned buckets

cfxegbert commented 1 year ago

How is this different than using VersionId in the extra_args parameter? VersionId is already in the ALLOWED_DOWNLOAD_ARGS list.

One useful feature is grabbing the version id at the beginning (if available) and using that version id throughout the multipart download.

mspdt22 commented 1 year ago

Thank you for taking a look!

How is this different than using VersionId in the extra_args parameter?

To clarify, are you proposing that the s3transfer library plumbs the VersionId from the HEAD request it already makes to get object size? Or that the s3transfer users themselves are responsible for making the HEAD request outside the s3transfer library and passing it to download_file()?

One useful feature is grabbing the version id at the beginning (if available) and using that version id throughout the multipart download.

Yes, agreed. This PR is adding the feature.

cfxegbert commented 1 year ago

Thank you for taking a look!

FYI, I do not have any permissions to do any sort of approval on this repo.

How is this different than using VersionId in the extra_args parameter?

To clarify, are you proposing that the s3transfer library plumbs the VersionId from the HEAD request it already makes to get object size? Or that the s3transfer users themselves are responsible for making the HEAD request outside the s3transfer library and passing it to download_file()?

Looking at the code the real bug is that extra_args are not passed along in MultipartDownload. If that is fixed, then VersionId can be passed as an extra_arg.

One useful feature is grabbing the version id at the beginning (if available) and using that version id throughout the multipart download.

Yes, agreed. This PR is adding the feature.

Looking at the documentation for VersionId this would be a breaking change because GetObjectVersion permission is required. A non-breaking change would be to extract the ETag from the head_object and use it with the IfMatch header.