Open alexjurkiewicz opened 7 years ago
Marking as a feature request. The tricky part if we did it in cp or mv is that the CLI may have to query S3 to see if the file exists before trying to upload it. So it may make more sense to add it to sync, as it already does that.
I'd like to see this in cp and/or mv as well.
The reason I don't use sync for this right now is that sync has major performance problems if the destination bucket has a lot of existing files under the target directory.
When you run aws s3 cp --recursive newdir s3://bucket/parentdir/, it only visits each of the files it's actually copying.
When you run aws s3 sync newdir s3://bucket/parentdir/, it visits the files it's copying, but also walks the entire list of files in s3://bucket/parentdir (which may already contain thousands or millions of files) and gets metadata for each existing file.
On a sufficiently large destination bucket, aws s3 cp --recursive can take seconds and aws s3 sync can take hours to copy the same data.
Obviously fixing sync would be nice, but if adding a "check to see if the file already exists" query to cp is a more tractable problem than revamping the sync code to make it fast, it might make sense to do that instead.
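To make the trade-off above concrete, here is a minimal Python sketch of the per-file "does it already exist?" check that cp would need. This is not the AWS CLI's implementation: the helper names (copy_without_overwrite, s3_key_exists) are made up for illustration, and boto3 is assumed for the real S3 call. The point is that the number of requests grows with the number of files being copied, not with the number of objects already under the destination prefix.

```python
from typing import Callable, Iterable, List


def copy_without_overwrite(
    keys: Iterable[str],
    exists: Callable[[str], bool],
    upload: Callable[[str], None],
) -> List[str]:
    """Upload each key unless it already exists; return the skipped keys.

    One existence check per file being copied, so the cost does not depend
    on how many objects already sit under the destination prefix.
    Not atomic: another writer could create the key between check and upload.
    """
    skipped = []
    for key in keys:
        if exists(key):
            skipped.append(key)
        else:
            upload(key)
    return skipped


def s3_key_exists(bucket: str, key: str) -> bool:
    """Hypothetical helper: existence check via HeadObject (assumes boto3)."""
    import boto3  # imported lazily so the logic above runs without boto3
    from botocore.exceptions import ClientError

    try:
        boto3.client("s3").head_object(Bucket=bucket, Key=key)
        return True
    except ClientError as err:
        if err.response["Error"]["Code"] in ("404", "NoSuchKey", "NotFound"):
            return False
        raise


# Usage with in-memory stand-ins for S3:
existing = {"parentdir/a.txt"}
uploaded = []
skipped = copy_without_overwrite(
    ["parentdir/a.txt", "parentdir/b.txt"],
    exists=lambda k: k in existing,
    upload=uploaded.append,
)
```

With real S3, exists could be something like lambda k: s3_key_exists("bucket", k); the check and the upload are still two separate requests, so this only guards against accidents, not concurrent writers.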
I'm also very interested in this feature. An optional interactive prompt for overwriting files would also be nice to have.
Yes @sgrimm-sg, it makes sense. I am also interested in seeing a CLI cp command that can actually handle these conditions.
It would be extremely useful for this to be an option on aws s3 sync. rsync has this functionality available as --ignore-existing. My preference would be to try to use the same option names as rsync, as I suspect there are a lot of folks already familiar with rsync.
Good Morning!
We're closing this issue here on GitHub, as part of our migration to UserVoice for feature requests involving the AWS CLI.
This will let us get the most important features to you, by making it easier to search for and show support for the features you care the most about, without diluting the conversation with bug reports.
As a quick UserVoice primer (if not already familiar): after an idea is posted, people can vote on the ideas, and the product team will be responding directly to the most popular suggestions.
We’ve imported existing feature requests from GitHub - Search for this issue there!
And don't worry, this issue will still exist on GitHub for posterity's sake. As it’s a text-only import of the original post into UserVoice, we’ll still be keeping in mind the comments and discussion that already exist here on the GitHub issue.
GitHub will remain the channel for reporting bugs.
Once again, this issue can now be found by searching for the title on: https://aws.uservoice.com/forums/598381-aws-command-line-interface
-The AWS SDKs & Tools Team
This entry can specifically be found on UserVoice at: https://aws.uservoice.com/forums/598381-aws-command-line-interface/suggestions/33168406-add-no-overwrite-option-to-aws-s3-cp-mv
Based on community feedback, we have decided to return feature requests to GitHub issues.
@jamesls that's great! Can you please respond to the suggestion at hand? --no-overwrite would be a great addition, and it would avoid wrapping the calls with scripts.
+1 to this issue. I propose -n, --no-clobber to match existing Linux cp command options.
Has there been any implementation of this request? I'm trying to work with Windows batch files to do local backup > S3, and the easiest method would be a simple no-overwrite or similar flag.
Any update regarding this feature?
Any update regarding this feature ? Thanks
Any update regarding this feature?
Any update regarding this feature?
sev3, +1
Really need this feature added as S3 sync does not seem to upload every file.
Any updates or workarounds?
I had to write a python script to load all of the items in the bucket into an array (list), then load all the items from the directory I want to sync, then compare the arrays and upload the local items not in the S3 array.
@southpaw5271 - care to share your script and save me some time ? ; )
I don't seem to have it anymore :( Sorry!
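Since the original script is gone, here is a rough Python sketch of the approach described above: list the keys in the bucket, list the local files, and upload only the local files whose keys are not in S3. It is a reconstruction under assumptions, not the original author's code: the function names are invented, and boto3 is assumed for listing and uploading.

```python
import os
from typing import Iterable, List, Set


def local_keys(root: str, prefix: str = "") -> List[str]:
    """Walk root and return the S3 key each file would get under prefix."""
    keys = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            rel = os.path.relpath(os.path.join(dirpath, name), root)
            keys.append(prefix + rel.replace(os.sep, "/"))
    return sorted(keys)


def missing_keys(local: Iterable[str], remote: Set[str]) -> List[str]:
    """Keep only the local keys that are not already in the bucket."""
    return [key for key in local if key not in remote]


def remote_keys(bucket: str, prefix: str = "") -> Set[str]:
    """List every key under prefix (assumes boto3; paginated listing)."""
    import boto3  # imported lazily so the comparison logic runs without it

    paginator = boto3.client("s3").get_paginator("list_objects_v2")
    keys = set()
    for page in paginator.paginate(Bucket=bucket, Prefix=prefix):
        for obj in page.get("Contents", []):
            keys.add(obj["Key"])
    return keys


def upload_missing(root: str, bucket: str, prefix: str = "") -> List[str]:
    """Upload local files whose keys are absent from the bucket."""
    import boto3

    s3 = boto3.client("s3")
    todo = missing_keys(local_keys(root, prefix), remote_keys(bucket, prefix))
    for key in todo:
        local_path = os.path.join(root, *key[len(prefix):].split("/"))
        s3.upload_file(local_path, bucket, key)
    return todo
```

Note that remote_keys still walks everything under the prefix, so on a very large destination it has the same scaling problem reported for sync earlier in this thread; it only avoids overwriting existing objects.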
This flag would also be valuable for the cp command, since sync does not allow copying a file while changing the destination name.
aws s3 cp --no-overwrite ./somefile s3://bucket/othername
We also need the --no-overwrite option from s3 to local. We've been burned by accidental overwrites from well-meaning individuals, and this would be a very much appreciated way to put up a "guardrail" for them. Thanks!
any update?
Any update regarding this feature? Thank you
+1
+1
I'm migrating files from an old system. I have a python script that generates different paths according to db columns (i.e. each org now has its own folder on a specific bucket), so I can't rely on the sync command. I could run a script to do partial migrations if a -skip-duplicate or some similar parameter were available.
+1
This is badly needed. Why is it not already there?
+1
+10
Any update regarding this feature? Thank you
Also casting a vote for this feature please!
+1
+1
+1
++1
any updates on this?
+1
I can't believe it is still not a thing. It is next to impossible to create an immutable storage on S3 :(
We copied data from an S3 bucket to a local Linux mount, and the size at the destination is ~40 GB less than the S3 bucket. We need a way to copy only the missing files. It seems there is no easy way to do that short of copying everything again.
One option of aws s3 sync that may accomplish something similar is the --size-only boolean flag, which overwrites an object in the destination only if its size differs from the source object:
--size-only (boolean) Makes the size of each key the only criteria used
to decide whether to sync from source to destination.
The command I used,
aws s3 sync s3://<source bucket> s3://<destination bucket> --size-only
Important: I strongly suggest running the command above with the --dry-run boolean flag first, to see what this command intends to do before actually running it, especially if it involves your production systems. This is out of an abundance of caution.
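To illustrate what a size-only comparison does and does not protect against, here is a small Python sketch of the decision rule, assuming the behaviour described in the help text quoted above. The function name and the dictionary inputs are hypothetical; this is not how the CLI is implemented.

```python
from typing import Dict, List


def keys_to_sync_size_only(source: Dict[str, int], dest: Dict[str, int]) -> List[str]:
    """Return the keys a size-only comparison would copy.

    source and dest map key -> size in bytes. A key is copied when it is
    missing from dest or when the two sizes differ.
    """
    return [key for key, size in source.items() if dest.get(key) != size]


source = {"a.bin": 10, "b.bin": 20, "c.bin": 5}
dest = {"a.bin": 10, "b.bin": 7}
to_copy = keys_to_sync_size_only(source, dest)
```

Here b.bin is copied because its size differs and c.bin because it is missing, while a.bin is skipped. So an existing destination object with a different size is still overwritten, which is why this is only similar to a no-overwrite option and not a replacement for one.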
does not help me, waiting for #5456 to be addressed by DH
We need this feature too. Just in case it helps, we are doing something like this in a bash script:
version=$(node -e "console.log(require('./package.json').version);")
# aws s3 ls prints nothing when no key matches the given prefix
result=$(aws s3 ls "somebucketname/${version}/something.zip")
if [ -z "$result" ] || [ "$1" == "--force" ]; then
  aws s3 cp ./deploy/something.zip "s3://somebucketname/${version}/something.zip"
else
  echo "ERROR: Already exists ${version}"
fi
cc @matiasagt
Any update regarding this feature?
My application's API would like to know whether the put object operation is overwriting an existing file; that would help. It can only be done by you guys, by implementing a mutex.
it's been years I am begging you to add it
Work ongoing on PR https://github.com/aws/aws-cli/pull/6095.
@kdaily - any news on this?
@jfstephe, unfortunately not. I'll keep checking!
It would be nice to have a convenience function --no-overwrite for aws s3 cp/mv commands, which would check the target destination doesn't already exist before putting a file into an s3 bucket. Of course this logic couldn't be guaranteed by the AWS API (afaik...) and is vulnerable to race conditions, etc. But it would be helpful to prevent unintentional mistakes!