tus / tusd

Reference server implementation in Go of tus: the open protocol for resumable file uploads
https://tus.github.io/tusd
MIT License

Question: Is the S3 store implementation sending an MD5 hash? #1215

Closed jerebear12 closed 1 week ago

jerebear12 commented 2 weeks ago

Question

Is the S3Store implementation setting the ContentMD5 header so that S3 can validate the integrity of uploaded parts?

To me, it looks like the putPartForUpload function in the s3store.go file is not setting the ContentMD5 property on the uploadPartInput parameter.

In the Go S3 SDK docs I see this here:

"Data integrity General purpose bucket - To ensure that data is not corrupted traversing the network, specify the Content-MD5 header in the upload part request. Amazon S3 checks the part data against the provided MD5 value. If they do not match, Amazon S3 returns an error. If the upload request is signed with Signature Version 4, then Amazon Web Services S3 uses the x-amz-content-sha256 header as a checksum instead of Content-MD5. For more information see Authenticating Requests: Using the Authorization Header (Amazon Web Services Signature Version 4)."

I am interpreting "specify the Content-MD5 header" to mean that the digest has to be calculated and set on the request object.
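For reference, calculating that value by hand would look roughly like the sketch below. It uses only the Go standard library and assumes the usual Content-MD5 format (the base64 encoding of the raw 16-byte MD5 digest). The helper name contentMD5 is hypothetical; it is not a function in tusd or the AWS SDK.

```go
package main

import (
	"crypto/md5"
	"encoding/base64"
)

// contentMD5 is a hypothetical helper, not part of tusd or the AWS SDK.
// It returns the base64-encoded MD5 digest of data, which is the
// format expected in a Content-MD5 header.
func contentMD5(data []byte) string {
	sum := md5.Sum(data) // raw 16-byte digest
	return base64.StdEncoding.EncodeToString(sum[:])
}
```

The returned string is what would be assigned to the ContentMD5 property on the upload part input if the header were set manually.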

Am I misunderstanding something here?

Setup details

Eyeballing the code using VS Code.

jerebear12 commented 2 weeks ago

Just saw https://github.com/tus/tusd/issues/1187#issuecomment-2355633486. It appears as if this is not implemented for any of the stores.

Acconut commented 2 weeks ago

The S3Store uses the AWS SDK for Go v2, which adds these integrity headers by default to requests sent to AWS. Optionally, you can disable the digests via -s3-disable-content-hashes, although this is not recommended.
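To illustrate what such a digest looks like: under Signature Version 4, the x-amz-content-sha256 header mentioned in the quoted docs carries the lowercase hex SHA-256 of the request payload. The sketch below only shows that calculation with the Go standard library. The SDK does this internally, and payloadSHA256 is a hypothetical name, not an SDK or tusd function.

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
)

// payloadSHA256 is a hypothetical helper for illustration only.
// It returns the lowercase hex SHA-256 digest of data, the form used
// for a signed payload hash such as x-amz-content-sha256.
func payloadSHA256(data []byte) string {
	sum := sha256.Sum256(data) // raw 32-byte digest
	return hex.EncodeToString(sum[:])
}
```

Because the SDK attaches the digest itself, tusd does not need code like this in putPartForUpload.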

"Just saw #1187 (comment). It appears as if this is not implemented for any of the stores."

That issue is unrelated to your question. It discusses users supplying their own digests, which tusd should then check. It's not about the checksums used in communication with storage services.