Summary

Using amazon.aws.s3_object copy mode with objects that were uploaded in multiple parts (e.g. as happens with uploads via the web UI) results in the objects being copied every time the module is used, including when the corresponding object already exists in the target bucket with the same content.

I suspect that the issue is due to how the ETag is generated for a multipart-uploaded source object versus how it is generated for the copy.
Steps to Reproduce

Create a file where size(file) < 5GB (the copy limit), e.g. head -c 64MB < /dev/zero > 64MB-zero.bin.

Perform a multipart upload of the file to an S3 bucket. If you use the web UI, it appears to upload in 16MB parts.
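If you want to script the multipart upload instead of using the web UI, here is a minimal boto3 sketch; the 16MB part size mimics what the web UI appears to use, and the bucket name is an assumption:

import boto3
from boto3.s3.transfer import TransferConfig

# Force a multipart upload with 16MB parts; the threshold and
# chunk size are assumptions chosen to mimic the web UI.
config = TransferConfig(
    multipart_threshold=16 * 1024 * 1024,
    multipart_chunksize=16 * 1024 * 1024,
)
s3 = boto3.client("s3")
s3.upload_file("64MB-zero.bin", "source-bucket", "64MB-zero.bin", Config=config)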
Observe the ETag; for the 64MB zero file, mine is 05c46bd967d2892191397a04e43821b9-4. According to Amazon:

"Amazon S3 calculates the MD5 digest of each individual part. MD5 digests are used to determine the ETag for the final object. Amazon S3 concatenates the bytes for the MD5 digests together and then calculates the MD5 digest of these concatenated values. The final step in creating the ETag is when Amazon S3 adds a dash with the total number of parts to the end."
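For reference, a minimal sketch of that calculation (assuming a fixed 16MB part size; a real ETag depends on the part sizes the uploader actually chose):

import hashlib

def multipart_etag(path, part_size=16 * 1024 * 1024):
    # MD5 each part, MD5 the concatenation of those digests,
    # then append "-<part count>". The 16MB part size is an
    # assumption matching what the web UI appears to use.
    part_digests = []
    with open(path, "rb") as f:
        while chunk := f.read(part_size):
            part_digests.append(hashlib.md5(chunk).digest())
    combined = hashlib.md5(b"".join(part_digests)).hexdigest()
    return f"{combined}-{len(part_digests)}"

print(multipart_etag("64MB-zero.bin"))

This is why a multipart ETag carries a suffix like -4 and can never equal a plain MD5 of the same bytes.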
Use amazon.aws.s3_object to copy the file to another bucket:

- name: copy file that was uploaded in parts
  amazon.aws.s3_object:
    bucket: target-bucket
    mode: copy
    copy_src:
      bucket: source-bucket
      prefix: 64MB-zero.bin
Repeat the above and observe that the task reports a change each time (i.e. it is not idempotent). The timestamp on the target object is updated but the contents are not, suggesting a copy operation occurred needlessly.

Observe that the ETag on the target object does not match that of the source; in particular, it is not in multipart form. For the zeros file the target ETag is e78585b8bfda6036cfd818710a210f23 (the plain MD5 of 64MB of zeros).
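The mismatch is easy to confirm directly; a small boto3 sketch (bucket and key names follow the task above):

import boto3

s3 = boto3.client("s3")
# Fetch the ETags S3 reports for the source and target objects.
src = s3.head_object(Bucket="source-bucket", Key="64MB-zero.bin")
dst = s3.head_object(Bucket="target-bucket", Key="64MB-zero.bin")
print("source:", src["ETag"])  # multipart form, e.g. "...-4"
print("target:", dst["ETag"])  # plain MD5, no part-count suffix

If the module's change detection compares these two ETags, the copy will always look necessary, which would match the behaviour above.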
Expected Results

The module is idempotent, and does not repeatedly copy identical files.
Actual Results

A copy operation is performed on files that were uploaded in multiple parts, regardless of their state in the target bucket.
Issue Type
Bug Report
Component Name
s3_object
Ansible Version
Collection Versions
AWS SDK versions
Configuration
OS / Environment
macOS 14.3 (23D56)
Code of Conduct