Open chrisspen opened 7 years ago
I am also having this issue.
s3cmd sync --requester-pays --skip-existing --no-check-md5 s3://mybucket/myprefix .
I expected s3cmd to ignore existing files, and continue syncing at file 51/100.
s3cmd does not ignore existing files in the download folder. It starts from the very first file, re-downloading and overwriting existing files.
pip
Thank you for your reports both of you, that helped me narrow down the 2 cases that produce such an issue.
I will look for a fix for these issues.
Any updates on this? Unfortunately s4cmd doesn't appear to support this either. :(
Still getting this issue with --skip-existing between two buckets.
This only happens between two buckets?
In my case, syncing from local storage to a bucket, --skip-existing looks like the default behavior: existing files are skipped automatically. So now I'm searching for the opposite option, one that overwrites existing files.
So, any clear explanations?
Also seeing this issue. Specifying --skip-existing and --no-check-md5 has no effect:
INFO: Running stat() and reading/calculating MD5 values on X files, this may take some time...
Bug still exists in 2024 with s3cmd version 2.4.0.
I am syncing an entire bucket from AWS S3 to local:
s3cmd sync --skip-existing --no-check-md5 s3://my-bucket /local/path/to/my-bucket
If I interrupt and restart, all existing files are downloaded again, every time.
After more research (i.e. comparing to a different sync utility, rclone) it appears that s3cmd is failing to account for the local system's timezone when setting the modification time.
For example, the AWS S3 website shows the file's modification time as: April 23, 2014, 21:15:52 (UTC-07:00)
That's a PDT display because I'm in California. The UTC equivalent is: 2014-04-24 04:15:52 UTC
When I copy that file with s3cmd sync, the resulting local file has this modification time: 2014-04-24 04:15:52 PDT
So my guess is that s3cmd is using a UTC datetime string to set the local mod time, which the local system (macOS in my case) interprets as being in its own timezone. The result is a completely wrong time, which is a bad thing in and of itself, apart from causing this particular sync issue.
It should probably set mod times using a Unix timestamp instead?
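To illustrate the guess above, here is a minimal Python sketch of the difference between interpreting a timestamp string as UTC and interpreting it as local time. This is not s3cmd's code; the function names and the TIME_FORMAT constant are made up for the example, and only standard library calls are used.

```python
import calendar
import os
import time

# Hypothetical format for this example; not taken from s3cmd.
TIME_FORMAT = "%Y-%m-%d %H:%M:%S"


def epoch_from_utc_string(stamp):
    """Interpret `stamp` as UTC and return seconds since the Unix epoch."""
    return calendar.timegm(time.strptime(stamp, TIME_FORMAT))


def epoch_from_string_as_local(stamp):
    """Interpret the same string as local time (the suspected mistake)."""
    return int(time.mktime(time.strptime(stamp, TIME_FORMAT)))


def set_mtime_from_utc_string(path, stamp):
    """Set atime and mtime of `path` from a UTC timestamp string."""
    epoch = epoch_from_utc_string(stamp)
    # os.utime takes epoch seconds, so the local timezone plays no role here.
    os.utime(path, (epoch, epoch))


if __name__ == "__main__":
    stamp = "2014-04-24 04:15:52"
    correct = epoch_from_utc_string(stamp)
    suspect = epoch_from_string_as_local(stamp)
    print("epoch when read as UTC:", correct)
    print("shift caused by reading it as local time (s):", suspect - correct)
```

On a machine set to PDT, the reported shift would be 25200 seconds, which is the seven-hour difference described above. On a machine already set to UTC the two values are the same, which may explain why not everyone sees the problem.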
I'm trying to sync two S3 buckets with the command:
and every time I run it, it reports:
Even if I let it complete and then re-run it, it reports that it has to transfer 1598 files all over again, indicating that it's ignoring either the --no-check-md5 option or the --skip-existing option.
The expected behavior, especially if --skip-existing is specified, is for s3cmd to completely ignore the files it has already transferred. Not doing this for buckets with a large number of files means that s3cmd wastes a huge amount of time re-transferring files unnecessarily. If the transfer completes and I re-run it with no changes having been made to any of the files, I would expect s3cmd to report "0 source files to copy".
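For reference, the expected behavior can be sketched in a few lines of Python. This is only an illustration of the existence-only check that --skip-existing is expected to perform, not s3cmd's actual implementation. The function name and parameters are hypothetical, and the sketch covers a local destination only, not the bucket-to-bucket case.

```python
import os


def keys_needing_download(remote_keys, local_root):
    """Return the remote keys that have no corresponding local file yet.

    Only existence is checked: no MD5, size, or modification time
    comparison is made.
    """
    pending = []
    for key in remote_keys:
        # Map the "/"-separated object key onto the local directory tree.
        local_path = os.path.join(local_root, *key.split("/"))
        if not os.path.exists(local_path):
            pending.append(key)
    return pending
```

With a check like this, a second run after a completed transfer would return an empty list, matching the expected "0 source files to copy".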