Open Krobar opened 4 years ago
Since we use multipart upload, an object's ETag changes if the user changes the part size used to upload the file. Relevant package: https://github.com/peak/s3hash/
It's not as safe as a hash check, but cp -n -s does practically the same job for use cases like this.
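To illustrate why the part size matters, here is a minimal sketch of the commonly observed multipart ETag format: the MD5 of the concatenated per-part MD5 digests, followed by a dash and the part count. This is an assumption based on widely reported behaviour for objects uploaded without SSE-KMS or SSE-C, not a documented guarantee, and the function name `multipart_etag` is made up for this example.

```python
import hashlib


def multipart_etag(data: bytes, part_size: int) -> str:
    """Sketch of the ETag usually seen on multipart uploads (assumed format)."""
    if part_size <= 0:
        raise ValueError("part_size must be positive")
    if not data:
        raise ValueError("this sketch only handles non-empty data")
    # Split the payload the same way a multipart upload would.
    parts = [data[i:i + part_size] for i in range(0, len(data), part_size)]
    # MD5 each part, then MD5 the concatenation of the binary digests.
    digests = b"".join(hashlib.md5(part).digest() for part in parts)
    return f"{hashlib.md5(digests).hexdigest()}-{len(parts)}"


MIB = 1024 * 1024
payload = b"x" * (20 * MIB)  # identical content for both uploads

etag_5mib = multipart_etag(payload, 5 * MIB)  # 4 parts
etag_8mib = multipart_etag(payload, 8 * MIB)  # 3 parts
print(etag_5mib)
print(etag_8mib)
```

The two values differ even though the content is byte-for-byte identical, so an ETag comparison is only meaningful when the local side knows the part size that was used for the original upload.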
Duplicate of #43
Thank you for the reply. I tried -s and it doesn't quite work for this use case. The reason is that if I make a minor change to the page output (e.g. capitalise a letter), the size does not change and the file is not uploaded. -n is not appropriate for this use case either, as the generated files always have a newer modified date than the previous files.
I don't think the ETag is reliable these days, as it no longer contains an MD5 hash of the upload. Some other (much slower) S3 utilities add a custom MD5 tag and check for this; this is not perfect but would work well for this use case. Would be good if it could be considered.
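A rough sketch of the custom-MD5 idea, for illustration only: the uploader stores the file's MD5 in the object's user metadata, and a later run skips the upload when the stored value matches the local hash. The metadata key `md5` and the helper names are hypothetical, and the remote metadata is modelled as a plain dict rather than fetched with a real S3 client.

```python
import hashlib
import os
import tempfile

MD5_METADATA_KEY = "md5"  # hypothetical key name; any key the tool controls would do


def file_md5(path, chunk_size=1024 * 1024):
    """Hash the file in chunks so large files are not read into memory at once."""
    digest = hashlib.md5()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def needs_upload(path, remote_metadata):
    """Upload if the object is missing or its stored MD5 differs from the local one."""
    if remote_metadata is None:  # object does not exist remotely
        return True
    return remote_metadata.get(MD5_METADATA_KEY) != file_md5(path)


with tempfile.TemporaryDirectory() as tmp:
    page = os.path.join(tmp, "index.html")
    with open(page, "w", encoding="utf-8") as fh:
        fh.write("<h1>hello</h1>")
    # Pretend this metadata was stored with the object on the previous upload.
    remote = {MD5_METADATA_KEY: file_md5(page)}
    size_before = os.path.getsize(page)
    unchanged = needs_upload(page, remote)

    # Regenerate the page with one letter capitalised: same size, new content.
    with open(page, "w", encoding="utf-8") as fh:
        fh.write("<h1>Hello</h1>")
    size_after = os.path.getsize(page)
    changed = needs_upload(page, remote)

print(unchanged, changed, size_before == size_after)
```

Here the size stays the same after the edit, so a size-only check would skip the file, while the stored-hash comparison flags it for upload. The cost is reading every local file to hash it, plus one metadata lookup per object.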
ETag isn't reliable. aws s3 sync has been reportedly broken for years as it doesn't guarantee an actual sync. See https://github.com/aws/aws-cli/issues/3273
Can the approach taken by s4cmd not be used here?
https://github.com/bloomreach/s4cmd#additional-technical-notes
Would be great if this option could be added. I know it requires a custom metadata addition but it would be really useful.
Use case: copying static site generator output to S3. s5cmd is way faster than the alternatives, but unfortunately it copies files that don't need updating, which makes it more expensive.