irods / irods_client_s3_api

C++ S3 API for iRODS
BSD 3-Clause "New" or "Revised" License

Multipart enhancements for efficiency #114

Open JustinKyleJames opened 1 month ago

JustinKyleJames commented 1 month ago

Enhance multipart uploads by doing the following:

putobject.cpp

  1. When a part upload request comes in, store the part size in a data structure that looks like the following:
{
  "1234abcd-1234-1234-123456789abc": {
    0: 4096000,
    1: 4096000,
    4: 4096000
  },
  "01234abc-0123-0123-0123456789ab": {
    0: 5192000,
    3: 5192000
  }
}

The outer map is keyed by upload_id. The inner map is keyed by part number, with the part size in bytes as the value.
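In C++ the tracking structure above might look like the following sketch. (The real implementation would presumably live in shared memory so all request handlers can see it; the types and helper here are purely illustrative.)

```cpp
#include <cstdint>
#include <map>
#include <string>
#include <unordered_map>

// upload_id -> (part number -> part size in bytes)
using part_size_map = std::map<int, std::int64_t>;
using upload_table  = std::unordered_map<std::string, part_size_map>;

// Record the size of a part as its upload request arrives.
inline void record_part_size(upload_table& t,
                             const std::string& upload_id,
                             int part_number,
                             std::int64_t size)
{
    t[upload_id][part_number] = size;
}
```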

  2. Prior to streaming the data for a part, determine whether all previous part sizes are known. If so, we know the offset for this part and can stream directly to iRODS instead of creating a local part file. If not, write the data to a local part file.
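The offset check in step 2 could be sketched like this (a minimal illustration, assuming the 0-based part numbering used in the example above; `std::nullopt` signals that a previous part size is unknown and the data must go to a local part file):

```cpp
#include <cstdint>
#include <map>
#include <optional>

// If every part before part_number has a recorded size, return the
// byte offset at which this part starts; otherwise return nullopt.
inline std::optional<std::int64_t>
offset_for_part(const std::map<int, std::int64_t>& part_sizes, int part_number)
{
    std::int64_t offset = 0;
    for (int p = 0; p < part_number; ++p) {
        auto it = part_sizes.find(p);
        if (it == part_sizes.end()) {
            return std::nullopt; // a previous part size is unknown
        }
        offset += it->second;
    }
    return offset;
}
```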

completemultipartupload.cpp

Only read the parts that have corresponding part files and write them to iRODS. The other parts should have already been written.
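A sketch of that selection step (names are illustrative; the real code would then read each listed part file and write it into iRODS at the part's offset):

```cpp
#include <cstdint>
#include <map>
#include <set>
#include <vector>

// Return the part numbers that still have a local part file and
// therefore need to be copied into iRODS during completion; the
// remaining parts were already streamed directly.
inline std::vector<int>
parts_needing_flush(const std::map<int, std::int64_t>& recorded_parts,
                    const std::set<int>& parts_with_local_files)
{
    std::vector<int> to_flush;
    for (const auto& [part, size] : recorded_parts) {
        (void) size;
        if (parts_with_local_files.count(part) > 0) {
            to_flush.push_back(part);
        }
    }
    return to_flush;
}
```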

Delete the entry for the upload_id in the shared memory once CompleteMultipartUpload returns. (This should also be done when CancelMultipartUpload is implemented.)

JustinKyleJames commented 1 month ago

A couple of notes on the implementation.

  1. If someone resends a part with a different size before CompleteMultipartUpload is called, this could cause issues. I think we just have to say that this will not be supported. I believe I can detect the part size change and reject it.

  2. The entry in shared memory is cleaned up when CompleteMultipartUpload returns. It will also be cleaned up if CancelMultipartUpload is called once that endpoint is implemented. However, if neither is called it won't be cleaned up until the server restarts. A new call to InitiateMultipartUpload would generate a new upload_id so that upload should behave correctly.
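The size-change detection mentioned in point 1 could look roughly like this (a sketch only; the error type and message are placeholders for whatever the API uses to produce an S3 error response):

```cpp
#include <cstdint>
#include <map>
#include <stdexcept>
#include <string>

// If the part was seen before with a different size, reject the
// request; otherwise record (or re-record) the size.
inline void check_or_record_part(std::map<int, std::int64_t>& part_sizes,
                                 int part_number,
                                 std::int64_t size)
{
    auto it = part_sizes.find(part_number);
    if (it != part_sizes.end() && it->second != size) {
        throw std::invalid_argument{
            "part " + std::to_string(part_number) +
            " resent with a different size; not supported"};
    }
    part_sizes[part_number] = size;
}
```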

korydraughn commented 3 weeks ago

> If someone resends a part with a different size before CompleteMultipartUpload is called, this could cause issues. I think we just have to say that this will not be supported. I believe I can detect the part size change and reject it.

Hmm, there may be cases where someone wants to support resending of parts. Perhaps we should introduce a config option that allows the admin to change the behavior of the S3 API. This would be similar to the 4.2 compatibility option.

Do you know if the clients expose options for disabling the resending of parts?

> The entry in shared memory is cleaned up when CompleteMultipartUpload returns. It will also be cleaned up if CancelMultipartUpload is called once that endpoint is implemented. However, if neither is called it won't be cleaned up until the server restarts. A new call to InitiateMultipartUpload would generate a new upload_id so that upload should behave correctly.

If CompleteMultipartUpload and CancelMultipartUpload are never called, can we determine if the upload id is dead (i.e. not going to be used)? My guess is no. We'd probably have to rely on some amount of inactivity, which means we'd have to track time.

If we don't clean up, then we've effectively lost that memory for the lifetime of the S3 API process.

The worst-case scenario is an unstable network where multiple users have to restart transfers, which leads to the shared memory being exhausted.

I think we need a sweeper task to handle the clean up.
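A minimal sketch of such a sweeper, assuming we track a last-activity timestamp per upload_id (all names and the idle threshold are illustrative):

```cpp
#include <chrono>
#include <cstddef>
#include <string>
#include <unordered_map>

using clock_type = std::chrono::steady_clock;

struct upload_entry
{
    clock_type::time_point last_activity;
    // ... part sizes, etc.
};

// Erase entries idle for longer than max_idle; returns the number swept.
inline std::size_t sweep(std::unordered_map<std::string, upload_entry>& uploads,
                         clock_type::time_point now,
                         clock_type::duration max_idle)
{
    std::size_t swept = 0;
    for (auto it = uploads.begin(); it != uploads.end();) {
        if (now - it->second.last_activity > max_idle) {
            it = uploads.erase(it);
            ++swept;
        }
        else {
            ++it;
        }
    }
    return swept;
}
```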

Thoughts?

trel commented 3 weeks ago

1) Not sure we've ever seen this... it's a client-side choice... and we have little/no experience in the wild, for now. Shipping a 'reject' seems okay at the beginning, along with a big fat log message that explains the situation to an admin, and tells them to complain to us.

2) If we can't come up with a better metric/indicator... a sweeper with a timing configuration for the admin seems... reasonable as a backstop.

korydraughn commented 3 weeks ago

> Not sure we've ever seen this... it's a client-side choice... and we have little/no experience in the wild, for now. Shipping a 'reject' seems okay at the beginning, along with a big fat log message that explains the situation to an admin, and tells them to complain to us.

Yeah, I'm betting the clients don't support such a feature. However, if we start rejecting all resends of parts, that means we're choosing performance over more coverage of the protocol. Some users may prefer having resend capability over improved performance.

Having an option in the config for controlling rejection of parts would cover both use cases.
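Such an option might wire in like the following sketch. (The option name `allow_part_resends` is hypothetical, not an existing setting; when resends are allowed, a changed size simply overwrites the recorded one, trading the direct-streaming optimization for protocol coverage.)

```cpp
#include <cstdint>
#include <map>

struct multipart_config
{
    bool allow_part_resends = false; // hypothetical config option
};

// Returns true if the part upload should be accepted.
inline bool accept_part(const multipart_config& cfg,
                        std::map<int, std::int64_t>& part_sizes,
                        int part_number,
                        std::int64_t size)
{
    auto it = part_sizes.find(part_number);
    if (it != part_sizes.end() && it->second != size && !cfg.allow_part_resends) {
        return false; // reject: resend with a different size
    }
    part_sizes[part_number] = size;
    return true;
}
```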