Open Niedzwiedzw opened 1 year ago
switching to "normal" upload (not-multipart) fixes the issue and it's fine for me
@Niedzwiedzw I've ran into this with tests, I'm not sure if R2 supports multipart upload, it might also be "too eventually consistent", so the upload parts don't get created by the time we check for them all
it happens every time, and also I've asked on cloudflare's discord - they suggested it's to do with concurrency (max 2 uploads at a time)
Interesting so do you think something like throttling R2 uploads would help?
Did some investigation into this:
I modified the rust-s3 source code to panic after initiating the multipart upload:
dbg!(s3_path);
let msg = self
.initiate_multipart_upload(s3_path, content_type)
.await?;
let path = msg.key;
let upload_id = &msg.upload_id;
dbg!(&upload_id);
dbg!(self.list_multiparts_uploads(Some(""), Some("")).await?);
panic!("aborting here");
This way I can have a look at the ongoing multipart uploads being produced by the code.
The multipart uploads definitely exist & they're immediately visible in that dbg! call at the end of that code snippet. R2 is strongly consistent. Once you get confirmation of the MultipartUpload creation back, that multipart upload is immediately visible everywhere.
I don't think this has anything to do with concurrency limits since this also happens when doing a streaming upload with just 2 parts. You can initiate such an upload by just creating a file of 15 MiB, which automatically gets split into 2 chunks by the implementation. With just 2 parts in a multipart upload, it should be impossible to run into R2's multipartUpload concurrency limits (which will be lifted soon).
This is as far as I got with my investigation, but I hope that helps. Inspecting the raw http requests you're sending with reqwest might be of help here. My best guess is that you might be altering the key or uploadID when sending subsequent requests interacting with the ongoing multipart upload.
I hope that's helpful in figuring this out. If you think R2 is incompatible with S3 as documented here, instead of rust-s3 relying on an implementation detail, I'd love to hear about it.
P.s. Is it of note that i needed self.list_multiparts_uploads(Some(""), Some(""))
instead of the self.list_multiparts_uploads(Some("/"), Some("/")).await?);
from the docs example to list all multipart uploads? Might that point to mismatched expectations between rust-s3 and R2?
@Frederik-Baetens thanks for digging in, I can’t say I have much inside, the only argument I have is that multipart uploads seem to work on s3, but not in r2, so there has to be some inconsistency. I’ll try and find some time to dig into this…
I also noticed that the ongoing multipart uploads strip off the / at the beginning of the key. E.g. if i upload /hi, the multipart upload will just be for "hi". Could it be that some part of your code is stripping the /, and another is expecting it to always be there, with the expectation that the S3 compatible object storage implementation maps both versions to the same file?
That’s an interesting observation, encoding of slashes has been a contentious topic at one point, there even was a dedicated feature for handling it, I think you might be on to something here
Describe the bug A clear and concise description of what the bug is.
To Reproduce
Environment
rustc 1.66.0-nightly (9062b780b 2022-09-21)
Additional context It used to work for DigitalOcean, but I've migrated so not sure if it's that or if it's some change in the library itself... I'll keep debugging