apache / beam

Apache Beam is a unified programming model for Batch and Streaming data processing.
https://beam.apache.org/
Apache License 2.0

[Bug]: Cannot use S3 with Cloudflare R2. #29005

Open BlakeB415 opened 11 months ago

BlakeB415 commented 11 months ago

What happened?

The Cloudflare R2 documentation (https://developers.cloudflare.com/r2/objects/multipart-objects/) states: "All parts except the last one must be the same size. The last part has no minimum size, but must be the same or smaller than the other parts."

I get the following error when using Beam with Cloudflare R2's S3-compatible API. There seems to be no way to tell Beam to use a consistent part size for multipart uploads.

apache_beam.io.aws.clients.s3.messages.S3ClientError: ("An error occurred (InvalidPart) when calling the CompleteMultipartUpload operation: All non-trailing parts must have the same length. [while running 'WriteSplit[train]/Write/Write/WriteImpl/WriteBundles']", 400) 
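
To make the constraint concrete, here is a minimal sketch in plain Python of the rule R2 enforces and of one way writes could be re-chunked to satisfy it. It is an illustration only: `parts_satisfy_r2_rule` and `uniform_parts` are hypothetical names, not part of Beam or of the R2 API, and the sketch makes no claim about how Beam's S3 uploader is actually implemented.

```python
from typing import Iterable, Iterator, List


def parts_satisfy_r2_rule(part_sizes: List[int]) -> bool:
    """Check the rule quoted from the R2 docs (hypothetical helper).

    All parts except the last must be the same size, and the last part
    must be the same size or smaller.
    """
    if len(part_sizes) <= 1:
        return True
    first = part_sizes[0]
    non_trailing_equal = all(size == first for size in part_sizes[:-1])
    return non_trailing_equal and part_sizes[-1] <= first


def uniform_parts(writes: Iterable[bytes], part_size: int) -> Iterator[bytes]:
    """Re-chunk arbitrarily sized writes into parts of exactly part_size bytes.

    Only the final part may be shorter than part_size.
    """
    if part_size <= 0:
        raise ValueError("part_size must be positive")
    buffer = bytearray()
    for data in writes:
        buffer.extend(data)
        # Emit as many full-size parts as the buffer currently holds.
        while len(buffer) >= part_size:
            yield bytes(buffer[:part_size])
            del buffer[:part_size]
    # Whatever remains becomes the (possibly shorter) trailing part.
    if buffer:
        yield bytes(buffer)


# Three uneven writes (7, 4 and 6 bytes) re-chunked into 5-byte parts.
writes = [b"a" * 7, b"b" * 4, b"c" * 6]
sizes = [len(part) for part in uniform_parts(writes, part_size=5)]
print(sizes)                         # [5, 5, 5, 2]
print(parts_satisfy_r2_rule(sizes))  # True
print(parts_satisfy_r2_rule([7, 4, 6]))  # False: non-trailing parts differ
```

Uploading each write as its own part (sizes 7, 4, 6) would fail the check, which matches the "All non-trailing parts must have the same length" message above; buffering to a fixed part size before uploading would pass it.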

Issue Priority

Priority: 2 (default / most bugs should be filed as P2)

Issue Components

tvalentyn commented 10 months ago

Thanks! It might help to leave a minimal example that reproduces this error.