nspcc-dev / neofs-s3-gw

NeoFS S3 Protocol Gateway

Parallel uploads for multipart #1020

Open smallhive opened 2 weeks ago

smallhive commented 2 weeks ago

Multipart uploads don't work in the general case

Current Behavior

The AWS SDK uploads multipart parts in 5 parallel threads by default. The gateway expects parts to arrive sequentially, one by one.
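
For context, a minimal reproducer sketch using aws-sdk-go's s3manager (endpoint, bucket, key and file name are placeholders; credentials come from the environment): the uploader cuts the file into parts and keeps up to `Concurrency` of them in flight, so the gateway receives them out of order.

```go
package main

import (
	"log"
	"os"

	"github.com/aws/aws-sdk-go/aws"
	"github.com/aws/aws-sdk-go/aws/session"
	"github.com/aws/aws-sdk-go/service/s3/s3manager"
)

func main() {
	// Endpoint is a placeholder for a local gateway instance.
	sess := session.Must(session.NewSession(&aws.Config{
		Region:           aws.String("us-east-1"),
		Endpoint:         aws.String("http://localhost:8080"),
		S3ForcePathStyle: aws.Bool(true),
	}))

	f, err := os.Open("big.bin")
	if err != nil {
		log.Fatal(err)
	}
	defer f.Close()

	// Five parts in flight at once is the SDK default; parts therefore
	// reach the gateway in arbitrary order.
	uploader := s3manager.NewUploader(sess, func(u *s3manager.Uploader) {
		u.Concurrency = 5
		u.PartSize = 8 << 20 // 8 MiB parts
	})

	if _, err = uploader.Upload(&s3manager.UploadInput{
		Bucket: aws.String("test-bucket"),
		Key:    aws.String("big.bin"),
		Body:   f,
	}); err != nil {
		log.Fatal(err)
	}
}
```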

Expected Behavior

Parts uploaded in any order should be accepted

Possible Solution

Compute the final object hash in a way that does not depend on the order in which parts arrive

Steps to Reproduce

Context

Related to #1016

Your Environment

roman-khimov commented 1 week ago

The problem is not the hash, really. The problem is the split chain itself, and v1/v2 are the same here; pre-#957 code didn't support this either. Chaining is the "previous" reference we have: it's designed for streams and it's really good for that use case, but S3 multipart is not about streaming one part after another. It treats a multipart upload as a number of independent slots and, most importantly, real applications do use this property.
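
To make the mismatch concrete, here is a minimal Go sketch (the types and the storage callback are hypothetical, not gateway or protocol code) of why a "previous" reference suits streams but not independently uploaded parts:

```go
// splitChunk is an illustrative stand-in for a NeoFS split-chain member:
// each chunk points back at the one written before it.
type splitChunk struct {
	Previous string // ID of the preceding chunk; empty for the first one
	Payload  []byte
}

// sliceStream shows why chaining fits streaming: chunks are produced in
// order, so Previous is always known. store persists a chunk and returns
// its ID (hypothetical helper).
func sliceStream(chunks [][]byte, store func(splitChunk) string) {
	prev := ""
	for _, p := range chunks {
		prev = store(splitChunk{Previous: prev, Payload: p})
	}
}

// With S3 multipart, part 4 may arrive before part 2, so at write time
// there is simply no Previous ID to point at yet.
```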

Potential solutions:

Temporary objects require additional logic on the S3 side and can leave garbage that is harder to trace (additional attributes?). They can be optional (we can try pushing the next split chunk when possible and resort to additional objects only when a part is out of sequence). And they will seriously affect multipart completion: reslicing everything will take quite some time (hashing alone would be much easier, but that's not the problem we have).
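
A rough sketch of that optional fallback, with hypothetical state and helpers (this is not gateway code): in-sequence parts extend the chain directly, out-of-sequence parts are parked as temporary objects and drained once their turn comes.

```go
// uploadState is illustrative only: chainTail is the ID of the last chunk
// in the split chain, parked holds out-of-sequence parts stored as
// temporary objects (modelled here as plain byte slices).
type uploadState struct {
	nextExpected  int
	chainTail     string
	parked        map[int][]byte
	appendToChain func(tail string, payload []byte) string // returns new tail ID
}

func handlePart(st *uploadState, partNum int, payload []byte) {
	if partNum != st.nextExpected {
		// Out of sequence: park it. The real thing would be a regular
		// object with an attribute marking it as multipart leftovers.
		st.parked[partNum] = payload
		return
	}
	st.chainTail = st.appendToChain(st.chainTail, payload)
	st.nextExpected++
	// Drain any parked parts that are now in sequence.
	for p, ok := st.parked[st.nextExpected]; ok; p, ok = st.parked[st.nextExpected] {
		st.chainTail = st.appendToChain(st.chainTail, p)
		delete(st.parked, st.nextExpected)
		st.nextExpected++
	}
}
```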

Supporting "slots" can be some additional "part number" attribute that is used instead of "previous". It completely breaks backward walking assembly logic and makes link objects more important, but it's still a possibility and we still can find all related objects this way. It can also simplify part reuploading. At the same time it's a protocol change. Can this be useful for standalone NeoFS? Not sure.
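
The slot idea, again as a hypothetical sketch (the types are made up, this is not a protocol definition): each part object carries a part-number attribute, and assembly sorts search results by it instead of walking "previous" references.

```go
import "sort"

// slotPart is an illustrative view of a part object found via search:
// PartNum would come from a dedicated attribute replacing "previous".
type slotPart struct {
	ID      string
	PartNum int
}

// assembleOrder turns an unordered search result into read order; a link
// object could pin this list so assembly needs no search at all.
func assembleOrder(parts []slotPart) []string {
	sort.Slice(parts, func(i, j int) bool {
		return parts[i].PartNum < parts[j].PartNum
	})
	ids := make([]string, 0, len(parts))
	for _, p := range parts {
		ids = append(ids, p.ID)
	}
	return ids
}
```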

@carpawell?

carpawell commented 1 week ago

From the NeoFS side I see some questions, the main one being: if we can solve them successfully, why have we needed this backward-chained logic from the very beginning, and for so long, if we can accept a simpler scheme (albeit one based on agreements that have to be taken as truth)?

  1. "makes link objects more important": who builds this object? It seems like uploading parts to different nodes is an obvious application, so who completes the link object and when?
  2. "It can also simplify part reuploading.": how do we "update" a link object then? Also reupload it? So it should be dynamic then and the chain objects should not fix their relation to a link object?
  3. "it treats multipart upload as a number of independent slots": how real is this case? Is it so much required to change storage nodes but not gateways?

I don't mind considering protocol changes, but for now this looks to me more like playing against NeoFS and inventing kludges around it.

roman-khimov commented 6 days ago

Chained objects are more robust, and they're very good for streams of data. The typical NeoFS slicing pattern is exactly that: you know the previous object's hash, you know all the hashes, you can build the link and index objects efficiently, and you can always follow the chain exactly.
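
For contrast with the slot scheme above, the chain walk itself needs no searches; a sketch with a hypothetical lookup callback:

```go
// chainOrder recovers first-to-last chunk order by walking "previous"
// references backward from the last chunk. previousOf is a hypothetical
// lookup returning the predecessor's ID, or "" for the first chunk.
func chainOrder(lastID string, previousOf func(id string) string) []string {
	var rev []string
	for id := lastID; id != ""; id = previousOf(id) {
		rev = append(rev, id)
	}
	// We walked back-to-front; reverse into natural order.
	for i, j := 0, len(rev)-1; i < j; i, j = i+1, j-1 {
		rev[i], rev[j] = rev[j], rev[i]
	}
	return rev
}
```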

A slot-like structure is more fragile, and it's not simpler: without an index object it requires searches to find the other parts. Also, regarding its use for S3, one thing to keep in mind is that we probably can't ensure a 1:1 slot mapping between NeoFS and S3, since S3 parts range from 5 MB to 5 GB, and a 5 GB part is already a big (split) object in NeoFS. Split hierarchies are something we've long tried to avoid, and I'd still try to.
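
Back-of-the-envelope on that 1:1 mapping point, assuming a 64 MiB MaxObjectSize (an assumption; the real value is a network parameter): a maximum-size part is itself a split object of dozens of chunks.

```go
const (
	maxObjectSize = 64 << 20 // assumed MaxObjectSize network parameter, 64 MiB
	maxS3PartSize = 5 << 30  // S3 part-size ceiling, 5 GiB
)

// chunksPerMaxPart = ceil(5 GiB / 64 MiB) = 80: one S3 "slot" can map to
// a whole split hierarchy of NeoFS objects, not to a single object.
const chunksPerMaxPart = (maxS3PartSize + maxObjectSize - 1) / maxObjectSize
```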

Unfortunately, it looks like this limits us to some S3-specific scheme with regular objects that are then reassembled upon upload completion. That totally destroys the optimization we have now (almost-free multipart upload completion). I'm all ears for other ideas.
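
A sketch of what such an S3-specific completion would look like, with hypothetical stand-ins for the part reader and the NeoFS slicer (neither is a real gateway or SDK signature): every byte gets re-read and re-written, which is exactly the cost being lamented.

```go
import "io"

// completeMultipart re-slices independently stored parts into one regular
// NeoFS object on CompleteMultipartUpload. getPart and slicer are
// hypothetical; the point is the full re-read/re-write of the payload.
func completeMultipart(partIDs []string, getPart func(id string) io.Reader, slicer io.Writer) error {
	for _, id := range partIDs {
		if _, err := io.Copy(slicer, getPart(id)); err != nil {
			return err
		}
	}
	return nil
}
```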