aws / aws-cli

Universal Command Line Interface for Amazon Web Services

Add --no-overwrite option to aws s3 cp/mv #2874

Open alexjurkiewicz opened 7 years ago

alexjurkiewicz commented 7 years ago

It would be nice to have a convenience option, --no-overwrite, for the aws s3 cp/mv commands, which would check that the target destination doesn't already exist before putting a file into an S3 bucket.

Of course this logic couldn't be guaranteed by the AWS API (afaik...) and is vulnerable to race conditions, etc. But it would be helpful to prevent unintentional mistakes!
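The check-then-upload behaviour described above can be sketched in Python. This is only an illustration, not how the CLI works: key_exists and upload_no_overwrite are hypothetical helpers, and client is assumed to be a boto3 S3 client (for example boto3.client("s3")), of which only list_objects_v2 and upload_file are used. The gap between the check and the upload is exactly the race condition mentioned above.

```python
def key_exists(client, bucket, key):
    """Return True if an object with exactly this key is listed in the bucket."""
    # Listing is by prefix, so compare the returned key for an exact match.
    # The exact key, if present, sorts first among keys sharing the prefix.
    response = client.list_objects_v2(Bucket=bucket, Prefix=key, MaxKeys=1)
    return any(obj["Key"] == key for obj in response.get("Contents", []))


def upload_no_overwrite(client, filename, bucket, key):
    """Upload filename to bucket/key unless the key already exists.

    Returns True if the file was uploaded, False if it was skipped.
    Not atomic: another writer can create the key between check and upload.
    """
    if key_exists(client, bucket, key):
        return False
    client.upload_file(filename, bucket, key)
    return True
```

Passing the client in as an argument keeps the helper easy to try against a stub before pointing it at a real bucket.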

kyleknap commented 7 years ago

Marking as a feature request. The tricky part if we did it in cp or mv is that the CLI may have to query S3 to see if the file exists before trying to upload it. So it may make more sense to add it to sync as it already does that.

sgrimm-sg commented 7 years ago

I'd like to see this in cp and/or mv as well.

The reason I don't use sync for this right now is that sync has major performance problems if the destination bucket has a lot of existing files under the target directory.

When you run aws s3 cp --recursive newdir s3://bucket/parentdir/, it only visits each of the files it's actually copying.

When you run aws s3 sync newdir s3://bucket/parentdir/, it visits the files it's copying, but also walks the entire list of files in s3://bucket/parentdir (which may already contain thousands or millions of files) and gets metadata for each existing file.

On a sufficiently large destination bucket, aws s3 cp --recursive can take seconds and aws s3 sync can take hours to copy the same data.

Obviously fixing sync would be nice, but if adding a "check to see if the file already exists" query to cp is a more tractable problem than revamping the sync code to make it fast, it might make sense to do that instead.
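A back-of-the-envelope comparison illustrates the trade-off. The sketch below assumes one existence check per local file for a hypothetical cp --no-overwrite, and one list request per 1,000 destination keys (the maximum page size of an S3 list request) for a sync-style walk. It ignores the uploads themselves, which are the same in both cases, and any per-object work the CLI does on the listing. The function names are made up for illustration.

```python
import math

LIST_PAGE_SIZE = 1000  # maximum keys returned by one S3 list request


def per_file_check_requests(local_files):
    """One existence check per file being copied."""
    return local_files


def full_listing_requests(destination_objects):
    """One list request per page of existing destination keys."""
    return math.ceil(destination_objects / LIST_PAGE_SIZE)


# Copying 10 new files into a prefix that already holds 5 million objects:
print(per_file_check_requests(10))       # 10
print(full_listing_requests(5_000_000))  # 5000
```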

slavaaaaaaaaaa commented 7 years ago

I'm also very interested in this feature. An optional interactive prompt for overwriting files would also be nice to have.

shabeebk commented 6 years ago

Yes, @sgrimm-sg, that makes sense. I am also interested in seeing a CLI cp command that can actually handle these conditions.

jhoblitt commented 6 years ago

It would be extremely useful for this to be an option on aws s3 sync. rsync has this functionality available as --ignore-existing. My preference would be to use the same option names as rsync, as I suspect there are a lot of folks already familiar with rsync.

ASayre commented 6 years ago

Good Morning!

We're closing this issue here on GitHub, as part of our migration to UserVoice for feature requests involving the AWS CLI.

This will let us get the most important features to you, by making it easier to search for and show support for the features you care the most about, without diluting the conversation with bug reports.

As a quick UserVoice primer (if not already familiar): after an idea is posted, people can vote on the ideas, and the product team will be responding directly to the most popular suggestions.

We’ve imported existing feature requests from GitHub - Search for this issue there!

And don't worry, this issue will still exist on GitHub for posterity's sake. As it’s a text-only import of the original post into UserVoice, we’ll still be keeping in mind the comments and discussion that already exist here on the GitHub issue.

GitHub will remain the channel for reporting bugs.

Once again, this issue can now be found by searching for the title on: https://aws.uservoice.com/forums/598381-aws-command-line-interface

-The AWS SDKs & Tools Team

This entry can specifically be found on UserVoice at: https://aws.uservoice.com/forums/598381-aws-command-line-interface/suggestions/33168406-add-no-overwrite-option-to-aws-s3-cp-mv

kenorb commented 6 years ago

Related:

jamesls commented 6 years ago

Based on community feedback, we have decided to return feature requests to GitHub issues.

guyisra commented 6 years ago

@jamesls that's great! Can you please respond to the suggestion at hand? --no-overwrite would be a great addition, and it would avoid having to wrap the calls in scripts.

evanstucker-hates-2fa commented 5 years ago

+1 to this issue. I propose -n, --no-clobber to match existing Linux cp command options.

CaptainPalapa commented 5 years ago

Has there been any implementation of this request? I'm trying to work with Windows batch files to do local backup > S3, and the easiest method would be a simple no-overwrite or similar flag.

adiii717 commented 5 years ago

Any update regarding this feature?

julio75012 commented 5 years ago

Any update regarding this feature ? Thanks

noelnamai commented 5 years ago

Any update regarding this feature?

avanier commented 5 years ago

Any update regarding this feature?

pgolebiowski commented 5 years ago

sev3, +1

southpaw5271 commented 5 years ago

Really need this feature added as S3 sync does not seem to upload every file.

mehmetfazil commented 5 years ago

Any updates or workarounds?

southpaw5271 commented 5 years ago

> Any updates or workarounds?

I had to write a python script to load all of the items in the bucket into an array (list), then load all the items from the directory I want to sync, then compare the arrays and upload the local items not in the S3 array.
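A script along those lines might look like the sketch below. It is a reconstruction from the description above, not the original script. The helper names are made up, and client is assumed to be a boto3 S3 client, of which only get_paginator("list_objects_v2") and upload_file are used. Like any list-then-upload approach, it is not safe against concurrent writers.

```python
from pathlib import Path


def local_keys(root, prefix=""):
    """Map the S3 key each local file would get to its local path."""
    root = Path(root)
    return {
        prefix + path.relative_to(root).as_posix(): path
        for path in root.rglob("*")
        if path.is_file()
    }


def remote_keys(client, bucket, prefix=""):
    """Collect every key under prefix, following list pagination."""
    keys = set()
    paginator = client.get_paginator("list_objects_v2")
    for page in paginator.paginate(Bucket=bucket, Prefix=prefix):
        for obj in page.get("Contents", []):
            keys.add(obj["Key"])
    return keys


def upload_missing(client, bucket, root, prefix=""):
    """Upload local files whose key is not yet in the bucket; return those keys."""
    existing = remote_keys(client, bucket, prefix)
    uploaded = []
    for key, path in sorted(local_keys(root, prefix).items()):
        if key not in existing:
            client.upload_file(str(path), bucket, key)
            uploaded.append(key)
    return uploaded
```

Note that this still lists everything under the prefix, so it shares the cost of sync on very large destinations.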

kevb commented 5 years ago

> I had to write a python script to load all of the items in the bucket into an array (list), then load all the items from the directory I want to sync, then compare the arrays and upload the local items not in the S3 array.

@southpaw5271 - care to share your script and save me some time ? ; )

southpaw5271 commented 5 years ago

> > I had to write a python script to load all of the items in the bucket into an array (list), then load all the items from the directory I want to sync, then compare the arrays and upload the local items not in the S3 array.
>
> @southpaw5271 - care to share your script and save me some time ? ; )

I don't seem to have it anymore :( Sorry!

mpdude commented 5 years ago

This flag would also be valuable for the cp command, since sync does not allow copying a file while changing the destination name.

aws s3 cp --no-overwrite ./somefile s3://bucket/othername

RobElEmYew commented 5 years ago

We also need the --no-overwrite option from s3 to local. We've been burned by accidental overwrites from well-meaning individuals, and this would be a very much appreciated way to put up a "guardrail" for them. Thanks!
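For the S3-to-local direction, the local file system can enforce the guardrail itself. The sketch below is a hypothetical helper, not a CLI feature: it opens the destination in exclusive-create mode, so an existing file is never touched. client is assumed to be a boto3 S3 client, of which only download_fileobj is used.

```python
import os


def download_no_overwrite(client, bucket, key, dest):
    """Download bucket/key to dest unless dest already exists.

    Returns True if the object was downloaded, False if dest was left alone.
    """
    try:
        # Mode "x" creates the file and fails if it already exists.
        fh = open(dest, "xb")
    except FileExistsError:
        return False
    try:
        with fh:
            client.download_fileobj(bucket, key, fh)
    except BaseException:
        # Do not leave a partial file behind to block a later retry.
        os.remove(dest)
        raise
    return True
```

Because the existence check and the file creation happen in a single open call, this direction does not have the check-then-write gap that the upload case has.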

EralpB commented 4 years ago

any update?

cbachhuber commented 4 years ago

Any update regarding this feature? Thank you

danielwhatmuff commented 4 years ago

+1

dsovino-alma commented 4 years ago

+1

I'm migrating files from an old system. I have a Python script that generates different paths according to DB columns (i.e. each org now has its own folder in a specific bucket), so I can't rely on the sync command. I could run a script to do partial migrations if a -skip-duplicate or similar parameter were available.

plastic-abubakr commented 4 years ago

+1

rjurney commented 4 years ago

This is badly needed. Why is it not already there?

ghost commented 4 years ago

+1

curioustolearn commented 4 years ago

+10

jmvictoria commented 4 years ago

Any update regarding this feature? Thank you

BobPusateri commented 4 years ago

Also casting a vote for this feature please!

zanieb commented 4 years ago

+1

7404020 commented 4 years ago

+1

Jongy commented 4 years ago

+1

yuriw commented 4 years ago

++1

ssbagalkar commented 4 years ago

any updates on this?

alexandredavi commented 4 years ago

+1

molszanski commented 4 years ago

I can't believe it is still not a thing. It is next to impossible to create an immutable storage on S3 :(

sreekanthadari commented 4 years ago

We copied data from an S3 bucket to a local Linux mount, and the size at the destination is ~40 GB less than in the S3 bucket. We are looking for a way to copy only the missing files. It seems there is no easy way to do that short of copying everything again.

joquijada commented 4 years ago

One option of aws s3 sync that may accomplish something similar is the --size-only boolean flag, which overwrites an object in the destination only if its size differs from that of the source object:

 --size-only (boolean) Makes the size of each key the only criteria used
       to decide whether to sync from source to destination.

The command I used,

aws s3 sync s3://<source bucket> s3://<destination bucket> --size-only 

Important: I strongly suggest running the command above with the --dryrun boolean flag first, to see what it intends to do before actually running it, especially if it involves your production systems.
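As a rough mental model of what --size-only does (a simplified sketch, not the CLI's actual implementation), a key is copied when it is missing from the destination or when the sizes differ. The function name and the dictionary inputs are made up for illustration.

```python
def keys_to_sync_size_only(source_sizes, destination_sizes):
    """Return keys that a size-only comparison would copy.

    Both arguments map object key -> size in bytes.
    """
    return sorted(
        key
        for key, size in source_sizes.items()
        if destination_sizes.get(key) != size
    )


source = {"a.txt": 10, "b.txt": 20, "c.txt": 30}
destination = {"a.txt": 10, "b.txt": 25}
print(keys_to_sync_size_only(source, destination))  # ['b.txt', 'c.txt']
```

This also shows why the flag is not a true no-overwrite: b.txt exists in the destination but is still replaced because its size differs, while a changed object of the same size would be skipped.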

yuriw commented 4 years ago

does not help me, waiting for #5456 to be addressed by DH

jbaris commented 3 years ago

We need this feature too. In case it helps, we are doing something like this in a bash script:

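# Read the package version to use as the S3 key prefix.
version=$(node -e "console.log(require('./package.json').version);")
# List the target key; the output is empty when nothing matches.
# Note: `aws s3 ls` matches by prefix, and this check-then-copy is not atomic.
result=$(aws s3 ls "s3://somebucketname/${version}/something.zip")

if [ -z "$result" ] || [ "$1" == "--force" ]; then
    aws s3 cp ./deploy/something.zip "s3://somebucketname/${version}/something.zip"
else
    echo "ERROR: Already exists ${version}" >&2
    exit 1
fi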

cc @matiasagt

Xyncgas commented 3 years ago

Any update regarding this feature?

Xyncgas commented 3 years ago

My application's API would like to know whether a put-object operation is overwriting an existing file; that would help. It can only be done on your side, by implementing a mutex.

Xyncgas commented 3 years ago

It's been years; I am begging you to add it.

kdaily commented 3 years ago

Work ongoing on PR https://github.com/aws/aws-cli/pull/6095.

jfstephe commented 3 years ago

@kdaily - any news on this?

kdaily commented 3 years ago

@jfstephe, unfortunately not. I'll keep checking!