storj / roadmap

Storj Public Roadmap

S3 Compatibility - UploadPartCopy #40

Open ferristocrat opened 2 years ago

ferristocrat commented 2 years ago

Background

What is the problem/pain point?

Many S3 libraries, such as boto3 for Python, set an object size threshold above which uploads and copies default to multipart operations. Because we do not currently support multipart copy (link to that endpoint), any copy above this default threshold returns an error.
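
To make the threshold behavior concrete, here is a minimal sketch of the decision a managed copy makes. It assumes boto3's documented `TransferConfig` default of an 8 MiB `multipart_threshold`; the function name `copy_operation` is hypothetical and only illustrates the size check, it is not part of boto3.

```python
MIB = 1024 * 1024

# Assumption: mirrors boto3's documented TransferConfig default
# (multipart_threshold = 8 MiB). Other libraries may use other values.
DEFAULT_MULTIPART_THRESHOLD = 8 * MIB


def copy_operation(object_size: int,
                   multipart_threshold: int = DEFAULT_MULTIPART_THRESHOLD) -> str:
    """Return the S3 operation a managed copy would likely use for this size."""
    if object_size < multipart_threshold:
        # Small objects are copied with a single CopyObject request.
        return "CopyObject"
    # Objects at or above the threshold are copied part by part.
    return "UploadPartCopy"
```

Under this assumption, any copy of an object of 8 MiB or more through a default-configured client would reach the unsupported endpoint unless the caller raises the threshold.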

What is the impact?

Customers expect their integrations to “just work” when Storj advertises S3 Compatibility. This feature gap frustrates customers during onboarding and, if not resolved, will cost us business.

Why now?

Customers are asking for this and it helps round out our compatibility with the “core” features of the S3 API.

Requirements

Assumptions

Out of Scope

User Story

As a Storj DCS user, I want to use UploadPartCopy to copy large objects already within a Storj DCS bucket, so that I can switch from my existing S3-compatible storage to Storj without having to change my multipart copy configuration.

S3 Method: UploadPartCopy - Uploads a part by copying data from an existing object as data source.
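
For reference, the client-side flow that depends on this method can be sketched as follows. This is an illustration under assumptions, not a description of any particular integration: `client` is expected to behave like a boto3 S3 client, the helper name `multipart_copy` and the 64 MiB default part size are hypothetical, and the source object is assumed to be non-empty.

```python
def multipart_copy(client, src_bucket, src_key, dst_bucket, dst_key,
                   part_size=64 * 1024 * 1024):
    """Copy an object server-side, one UploadPartCopy request per byte range."""
    size = client.head_object(Bucket=src_bucket, Key=src_key)["ContentLength"]
    upload_id = client.create_multipart_upload(
        Bucket=dst_bucket, Key=dst_key)["UploadId"]
    parts = []
    try:
        for number, first in enumerate(range(0, size, part_size), start=1):
            # Byte offsets are zero-based and inclusive on both ends.
            last = min(first + part_size, size) - 1
            result = client.upload_part_copy(
                Bucket=dst_bucket,
                Key=dst_key,
                PartNumber=number,
                UploadId=upload_id,
                CopySource={"Bucket": src_bucket, "Key": src_key},
                CopySourceRange=f"bytes={first}-{last}",
            )
            parts.append({"ETag": result["CopyPartResult"]["ETag"],
                          "PartNumber": number})
    except Exception:
        # Abort so the unfinished upload does not linger on the server.
        client.abort_multipart_upload(
            Bucket=dst_bucket, Key=dst_key, UploadId=upload_id)
        raise
    return client.complete_multipart_upload(
        Bucket=dst_bucket,
        Key=dst_key,
        UploadId=upload_id,
        MultipartUpload={"Parts": parts},
    )
```

The `upload_part_copy` call in the loop is the request this issue asks the gateway to support; the surrounding create, complete, and abort calls are the standard multipart upload operations.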

Acceptance Criteria

Request Params

Param Description Required? Support Needed?
Bucket The bucket name Yes
Key Object key for which the multipart upload was initiated. Yes
partNumber Part number of part being copied. This is a positive integer between 1 and 10,000. Yes
uploadId Upload ID identifying the multipart upload whose part is being copied. Yes
x-amz-copy-source Specifies the source object for the copy operation. Yes
x-amz-copy-source-if-match Copies the object if its entity tag (ETag) matches the specified tag. No
x-amz-copy-source-if-modified-since Copies the object if it has been modified since the specified time. No
x-amz-copy-source-if-none-match Copies the object if its entity tag (ETag) is different than the specified ETag. No
x-amz-copy-source-if-unmodified-since Copies the object if it hasn't been modified since the specified time. No
x-amz-copy-source-range The range of bytes to copy from the source object. The range value must use the form bytes=first-last, where the first and last are the zero-based byte offsets to copy. For example, bytes=0-9 indicates that you want to copy the first 10 bytes of the source. You can copy a range only if the source object is greater than 5 MB. Yes
x-amz-copy-source-server-side-encryption-customer-algorithm Specifies the algorithm to use when decrypting the source object (for example, AES256). No
x-amz-copy-source-server-side-encryption-customer-key Specifies the customer-provided encryption key for Amazon S3 to use to decrypt the source object. No
x-amz-copy-source-server-side-encryption-customer-key-MD5 Specifies the 128-bit MD5 digest of the encryption key according to RFC 1321. No
x-amz-expected-bucket-owner The account ID of the expected destination bucket owner. No
x-amz-request-payer Confirms that the requester knows that they will be charged for the request. No
x-amz-server-side-encryption-customer-algorithm Specifies the algorithm to use when encrypting the object (for example, AES256). No
x-amz-server-side-encryption-customer-key Specifies the customer-provided encryption key for Amazon S3 to use in encrypting data. No
x-amz-server-side-encryption-customer-key-MD5 Specifies the 128-bit MD5 digest of the encryption key according to RFC 1321. No
x-amz-source-expected-bucket-owner The account ID of the expected source bucket owner. No
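
The partNumber and x-amz-copy-source-range rows interact: the ranges must cover the whole source while staying within 10,000 parts. A minimal sketch of one way a client might plan this is shown below; the helper name `plan_copy_ranges` and the idea of growing the part size to fit the limit are assumptions for illustration, not behavior taken from any specific library.

```python
MAX_PARTS = 10_000  # partNumber must be a positive integer between 1 and 10,000


def plan_copy_ranges(object_size, preferred_part_size):
    """Return [(part_number, 'bytes=first-last'), ...] covering the object."""
    if object_size <= 0 or preferred_part_size <= 0:
        raise ValueError("object_size and preferred_part_size must be positive")
    # Smallest part size that keeps the part count within MAX_PARTS
    # (integer ceiling division, so large sizes stay exact).
    minimum_part_size = -(-object_size // MAX_PARTS)
    part_size = max(preferred_part_size, minimum_part_size)
    plan = []
    for number, first in enumerate(range(0, object_size, part_size), start=1):
        # 'last' is inclusive, so bytes=0-9 selects the first 10 bytes.
        last = min(first + part_size, object_size) - 1
        plan.append((number, f"bytes={first}-{last}"))
    return plan
```

For example, `plan_copy_ranges(25, 10)` yields three parts whose final range, `bytes=20-24`, is shorter than the others.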

Response Elements


Element Description Required? Support Needed? Notes
CopyPartResult Root level tag for the CopyPartResult parameters. Yes  
ETag Entity tag of the object. Maybe Maybe  
LastModified Date and time at which the object was uploaded. Maybe Maybe  
x-amz-copy-source-version-id The version of the source object that was copied, if you have enabled versioning on the source bucket. No Do not yet support object versioning
x-amz-request-charged If present, indicates that the requester was successfully charged for the request. No  
x-amz-server-side-encryption The server-side encryption algorithm used when storing this object in Amazon S3 (for example, AES256, aws:kms) No  
x-amz-server-side-encryption-aws-kms-key-id If present, specifies the ID of the AWS Key Management Service (AWS KMS) symmetric encryption customer managed key that was used for the object. No  
x-amz-server-side-encryption-bucket-key-enabled Indicates whether the multipart upload uses an S3 Bucket Key for server-side encryption with AWS KMS (SSE-KMS). No  
x-amz-server-side-encryption-customer-algorithm If server-side encryption with a customer-provided encryption key was requested, the response will include this header confirming the encryption algorithm used. No  
x-amz-server-side-encryption-customer-key-MD5 If server-side encryption with a customer-provided encryption key was requested, the response will include this header to provide round-trip message integrity verification of the customer-provided encryption key. No  
ChecksumCRC32 The base64-encoded, 32-bit CRC32 checksum of the object. This will only be present if it was uploaded with the object. With multipart uploads, this may not be a checksum value of the object. No  
ChecksumCRC32C The base64-encoded, 32-bit CRC32C checksum of the object. This will only be present if it was uploaded with the object. With multipart uploads, this may not be a checksum value of the object. No  
ChecksumSHA1 The base64-encoded, 160-bit SHA-1 digest of the object. This will only be present if it was uploaded with the object. With multipart uploads, this may not be a checksum value of the object. No  
ChecksumSHA256 The base64-encoded, 256-bit SHA-256 digest of the object. This will only be present if it was uploaded with the object. With multipart uploads, this may not be a checksum value of the object. No  
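
Clients typically read the ETag from the CopyPartResult body so they can list the part when completing the upload. The sketch below shows how such a body might be parsed with the Python standard library; the function name `parse_copy_part_result` is hypothetical, and the XML in the usage note is an illustrative sample rather than output captured from Storj or Amazon S3.

```python
import xml.etree.ElementTree as ET


def parse_copy_part_result(body: str) -> dict:
    """Return the child elements of a CopyPartResult document as a dict."""
    root = ET.fromstring(body)

    def local(tag: str) -> str:
        # Drop an XML namespace prefix such as '{http://...}ETag' if present.
        return tag.rsplit("}", 1)[-1]

    if local(root.tag) != "CopyPartResult":
        raise ValueError("unexpected root element: " + local(root.tag))
    return {local(child.tag): (child.text or "") for child in root}
```

A usage example with a hypothetical response body:

```python
sample = (
    '<CopyPartResult xmlns="http://s3.amazonaws.com/doc/2006-03-01/">'
    "<ETag>\"9b2cf535f27731c974343645a3985328\"</ETag>"
    "<LastModified>2011-04-11T20:34:56.000Z</LastModified>"
    "</CopyPartResult>"
)
result = parse_copy_part_result(sample)
# result contains the keys "ETag" and "LastModified"
```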

Measures of Success

Useful Links

shaupt131 commented 2 years ago

Week 25 update

Completed:

In Progress:

ferristocrat commented 2 years ago

Estimation ticket: https://github.com/storj/storj/issues/4875

dominickmarino commented 3 months ago

This issue appears when a file larger than 5 GB is moved on the source. In the following reports, the source was a TrueNAS system using the TrueNAS Storj integration via the sync option. Multipart uploads are enabled and a server-side copy is triggered.

Convo https://storj.slack.com/archives/C03CD69JF6J/p1682359306889639

Old Report (April 23') "3>ERROR : Backups/Backup/Adobe File Backups/Shawn Adobe Data1.zip: Failed to set modification time: NotImplemented: A header you provided implies functionality that is not implemented status code: 501, request id: 17577FA900427860, host id:"

New Report https://supportdcs.zendesk.com/agent/tickets/29781