peak / s5cmd

Parallel S3 and local filesystem execution tool.
MIT License

sync command --stat does not report correct rm command quantity #649

Open arriterx opened 1 year ago

arriterx commented 1 year ago

[image]

There were 133 rm commands executed, but the stats show only 1, as you can see. This does not happen with the cp command (it correctly shows 133 cp commands executed).

ahmethakanbesel commented 1 year ago

Hi,

s5cmd uses the Multi-Object Delete API of S3 to delete multiple objects (up to 1000 per request) in a single request. So all of your files were deleted in a single request, but it should be possible to show the number of deleted objects. Thanks for your suggestion.
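To illustrate why a batched delete can surface as a single operation, here is a minimal Go sketch that splits object keys into batches of at most 1000, the per-request limit mentioned above. The names `batchKeys` and `maxKeysPerRequest` are hypothetical and are not taken from the s5cmd source; the sketch only counts requests and does not call S3.

```go
package main

import "fmt"

// maxKeysPerRequest mirrors the 1000-object limit of a single
// S3 Multi-Object Delete request (illustrative constant, not s5cmd's).
const maxKeysPerRequest = 1000

// batchKeys splits keys into consecutive batches of at most size keys.
func batchKeys(keys []string, size int) [][]string {
	var batches [][]string
	for start := 0; start < len(keys); start += size {
		end := start + size
		if end > len(keys) {
			end = len(keys)
		}
		batches = append(batches, keys[start:end])
	}
	return batches
}

func main() {
	// 133 objects, as in the report above.
	keys := make([]string, 133)
	for i := range keys {
		keys[i] = fmt.Sprintf("FILE%d", i+1)
	}
	batches := batchKeys(keys, maxKeysPerRequest)
	// Prints: objects: 133, delete requests: 1
	fmt.Printf("objects: %d, delete requests: %d\n", len(keys), len(batches))
}
```

With 133 keys everything fits in one batch, so a counter that tracks requests reports 1 while the number of objects is 133.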

arriterx commented 1 year ago

Hey @ahmethakanbesel ,

Yes, but the sync direction was s3 => local, so multiple rm operations were performed on the local side, not on the S3 side.

BEFORE the sync (S3 => local):

- local: FILE1 FILE2 FILE50 FILE60 FILE100
- remote S3: FILE1 FILE50

AFTER the sync (with --delete):

- local: FILE1 FILE50
- remote S3: FILE1 FILE50

There were 3 rm operations performed on the local side, not 1.

Best regards
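The file listing in this comment can be checked with a small Go sketch that computes which destination files a sync with --delete would remove, namely those present locally but absent from the source. `extraLocal` is a hypothetical helper for illustration only and does not reflect how s5cmd compares listings internally.

```go
package main

import "fmt"

// extraLocal returns the names present in local but absent from remote,
// in the order they appear in local. These are the files a sync with
// --delete would remove from the local destination.
func extraLocal(local, remote []string) []string {
	inRemote := make(map[string]struct{}, len(remote))
	for _, name := range remote {
		inRemote[name] = struct{}{}
	}
	var extra []string
	for _, name := range local {
		if _, ok := inRemote[name]; !ok {
			extra = append(extra, name)
		}
	}
	return extra
}

func main() {
	local := []string{"FILE1", "FILE2", "FILE50", "FILE60", "FILE100"}
	remote := []string{"FILE1", "FILE50"}
	extra := extraLocal(local, remote)
	// Prints: 3 [FILE2 FILE60 FILE100]
	fmt.Println(len(extra), extra)
}
```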

LeafmanZ commented 1 year ago

Is there a way to do s3 to s3 sync?

arriterx commented 1 year ago

> Is there a way to do s3 to s3 sync?

With s5cmd, no. The only way is to download s3 old => local, and then upload local => s3 new. Or, if possible, mount the old S3 storage as a network drive (there are utilities out there such as s3fs) and then sync local => s3 new, where "local" is actually the old S3 storage mounted as a drive. I have done some s3 to s3 migrations like this.

ahmethakanbesel commented 1 year ago

> Hey @ahmethakanbesel ,
>
> Yes, but the sync direction was s3 => local, so multiple rm operations were performed on the local side, not on the S3 side.
>
> BEFORE the sync (S3 => local):
>
> - local: FILE1 FILE2 FILE50 FILE60 FILE100
> - remote S3: FILE1 FILE50
>
> AFTER the sync (with --delete):
>
> - local: FILE1 FILE50
> - remote S3: FILE1 FILE50
>
> There were 3 rm operations performed on the local side, not 1.
>
> Best regards

The client.MultiDelete function is called only once, regardless of whether the deletion is local or remote. This is why you only see one rm command being executed. We may consider showing the exact number of deleted objects. Thanks for your feedback.
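One way to picture the difference between counting calls and counting objects is the Go sketch below. `opStats` and `recordDelete` are hypothetical names made up for this illustration; they are not s5cmd's actual stat types and this is not a proposed patch.

```go
package main

import "fmt"

// opStats is a hypothetical counter, not s5cmd's actual stat type.
type opStats struct {
	calls   int // how many times the batched delete was invoked
	objects int // how many objects those invocations removed
}

// recordDelete registers one batched delete call that covered n objects.
func (s *opStats) recordDelete(n int) {
	s.calls++
	s.objects += n
}

func main() {
	var s opStats
	// A single batched call removing the 133 files from the report.
	s.recordDelete(133)
	// Prints: rm calls: 1, rm objects: 133
	fmt.Printf("rm calls: %d, rm objects: %d\n", s.calls, s.objects)
}
```

Reporting `objects` instead of `calls` would give the per-file count the reporter expected, whether the delete ran locally or remotely.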

ilkinulas commented 1 month ago

> > Is there a way to do s3 to s3 sync?
>
> With s5cmd, no. The only way is to download s3 old => local, and then upload local => s3 new. Or, if possible, mount the old S3 storage as a network drive (there are utilities out there such as s3fs) and then sync local => s3 new, where "local" is actually the old S3 storage mounted as a drive. I have done some s3 to s3 migrations like this.

s5cmd can sync between s3 buckets. Here is an example from the s5cmd sync --help output:

  03. Sync S3 bucket objects under prefix to S3 bucket.
     > s5cmd sync "s3://sourcebucket/prefix/*" s3://destbucket/