Open tatobi opened 10 years ago
This behavior is known. The reason the sync command behaves this way is that S3 does not physically use directories. There are only buckets and objects. Objects have prefixes that act like directories, but S3 does not designate a specific physical object to be a directory.
Therefore, when a sync occurs, only files are transferred to S3, because S3 has no physical directories. So when you try to sync up empty directories, nothing is uploaded, because there are no files in them. Once you put an item in the directory, that file (with the prefix representing the directory) will be uploaded.
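A local illustration of why this happens (a sketch, not CLI internals; `demo` is a hypothetical directory): the upload set for a sync is effectively the list of files under the source, so an empty directory never contributes an object.

```shell
# Hypothetical local layout: one directory with a file, one empty.
mkdir -p demo/full demo/empty
touch demo/full/a.txt
# Only files become S3 objects; an enumeration like this never visits
# demo/empty, so no key (prefix) is ever created for it.
find demo -type f
# → demo/full/a.txt
```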
Thank you Kyle, that is clear. I know how S3 stores files, but sometimes we need the same directory structure in several places, including the empty directories, and we need them removed when they are no longer used. A good example: you have a complex directory structure with a lot of content locally, which you sync to S3. An automated mechanism then syncs this structure periodically to several running instances. You keep it up to date by deleting most of the content from S3, and the automation re-syncs to the same targets as before. Unfortunately, you will find the original complex directory structure remains forever on the sync targets, which can cause confusion when you inspect it, or break programs that use these empty folders, because you need the structure to be the same everywhere. Moreover, people who use the --delete option may have used the "rsync" equivalent on Linux before, which keeps folders in sync, and will count on the same behavior. I think it would not be hard to implement a switch or option for the aws tool to detect somehow whether an S3 object is a file or a folder (list, size, etc.) and create/delete them locally or in an S3 bucket (e.g. list(bucket.list("", "/")))?
That makes sense. Will look into adding a feature for it.
This would be very useful for our situation as well. If it were added as an option (--sync-empty-directories) people could choose to use it when needed.
+1 Need this feature very badly
+1. Would like to use it.
+1
I also was surprised by this behavior, given that it is called "sync". I can work around this in my particular use case, but future users could be spared the pain :)
+1 on being able to sync directory structure! If you delete a folder it only removes the content, but it leaves the folder behind...
+1. I have the same needs.
+1 - surprised that hasn't been implemented yet. Sure, in my case it doesn't matter too much, and I can work around it (or just use placeholder files when creating structures), but it would be a benefit to just have it supported by either s3 sync or s3 cp.
+1
s3cmd sync does keep the folder structure, but it has some issues when granting access while syncing, so one needs to run another s3cmd setacl --recursive afterwards…
+1
+1
+1
Thanks for the feedback everyone. I think the best option I've seen is to add a --sync-empty-directories option. Let's do that.
@jamesls I'm expecting something like rsync functionality, but S3, as object storage, is definitely not the same.
+1
+1
Any timeline for this feature?
As a temporary workaround I added an empty .s3keep file to the empty directories, and it works for me. This is the same hack I usually use to trick git into tracking empty directories :)
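For anyone wanting to apply this workaround wholesale, a sketch (`.s3keep` is just an arbitrary placeholder name from the comment above, not an AWS convention; assumes a POSIX `find`):

```shell
# Drop a zero-byte .s3keep placeholder into every currently empty
# directory so that `aws s3 sync` has a file to upload for each one.
find ./s3.testfolder -type d -empty -exec touch {}/.s3keep \;
```

Note that this only marks directories that are empty at the time it runs; you would need to re-run it before each sync.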
Will this also allow removing/deleting empty directories on S3?
+1
+1
+1
+1
+1
+1
+1
+1
+1
+1
+1
+1
+1
+1
+1
+1
Makes a lot of sense during data migrations to S3.
+1
+1 Just got smashed by this... Arg....
+1
+10 It's possible to work around this with dummy files, but it would be cleaner if there were an option to force an empty prefix to synchronize.
+1. Use case: backing up an svn repository.
More generally:
aws s3 sync thing thing_copy
I expected thing_copy to match thing exactly.
+1
+1
+1
+1 need to delete empty directories
How's the progress on adding this --sync-empty-directories option? Any feedback from the AWS team? Thanks.
+1 would be a very useful feature for a very useful tool
+1
aws s3 sync does not fully synchronize the S3 folder structure locally, even when I use it with the --delete or --recursive arguments:
$ aws --version
aws-cli/1.4.3 Python/2.7.6 Linux/3.13.0-35-generic
$ aws s3 ls s3://s3.testbucket
$ aws s3 ls s3://s3.testbucket/
$ mkdir s3.testfolder
$ mkdir s3.testfolder/test1
$ aws s3 sync ./s3.testfolder s3://s3.testbucket/
$ aws s3 ls s3://s3.testbucket/
$ touch s3.testfolder/test1/1
$ aws s3 sync ./s3.testfolder/ s3://s3.testbucket/
upload: s3.testfolder/test1/1 to s3://s3.testbucket/test1/1
$ aws s3 sync ./s3.testfolder s3://s3.testbucket/
$ mkdir ./s3.testfolder/test-to-delete
$ aws s3 sync s3://s3.testbucket/ ./s3.testfolder/ --delete --recursive
$ aws s3 sync s3://s3.testbucket/ ./s3.testfolder/ --delete
$ ls -lah ./s3.testfolder/
total 60K
drwxrwxr-x  4 tobi tobi 4,0K szept 12 15:24 .
drwx------ 71 tobi tobi  44K szept 12 15:22 ..
drwxrwxr-x  2 tobi tobi 4,0K szept 12 15:23 test1
drwxrwxr-x  2 tobi tobi 4,0K szept 12 15:24 test-to-delete
$ aws s3 ls s3://s3.testbucket/
                           PRE test1/
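Until such an option exists, the leftover `test-to-delete` directory from the transcript above can be pruned locally after a download sync; a hedged sketch (assumes a POSIX `find` with `-delete`):

```shell
# `aws s3 sync ... --delete` removes files but leaves local directories
# behind. -depth processes the deepest entries first, so nested trees of
# empty directories are removed in a single pass.
find ./s3.testfolder -depth -type d -empty -delete
```

Caveat: this deletes every empty directory under the target, including ones that were intentionally created locally and never existed on S3.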