phrak closed this issue 3 years ago
Thanks @phrak for reaching out! We are aware of this scenario and it is being tracked here. It does sound like a valid scenario and we hope to support it soon.
To clarify, the blob service does not support setting an exact last-modified time (LMT); the time at which a blob is created or modified automatically becomes its LMT. To avoid re-transferring the same files in a download scenario, we can preserve the source LMT locally in a previous run (a capability that still needs to be added), then compare the LMTs during sync and transfer a source file only if the corresponding local LMT is different. However, in a copy scenario (blob -> blob), this does not work, since we cannot preserve the LMTs of blobs and therefore cannot rely on them as an accurate indicator of a file's "version".
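To make the download case concrete, here is a minimal sketch of the LMT comparison described above. It is not AzCopy's implementation; the function names are hypothetical, and it assumes the source LMT is available as epoch seconds and that "preserving the LMT locally" means stamping it onto the local file's modification time.

```python
import os
from pathlib import Path


def stamp_with_source_lmt(local_path, source_lmt):
    """After a download, set the local file's mtime to the source LMT (epoch seconds)."""
    os.utime(local_path, (source_lmt, source_lmt))


def needs_download(local_path, source_lmt):
    """Return True when the local copy is missing or its mtime differs from the source LMT."""
    path = Path(local_path)
    if not path.exists():
        return True
    # Compare whole seconds only, to sidestep sub-second filesystem differences.
    return int(path.stat().st_mtime) != int(source_lmt)
```

Under these assumptions, a file edited locally after the last run gets a new mtime, so it no longer matches the source LMT and would be transferred again.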
Thus, the best approach (that we can think of right now) for the mirroring scenario is to simply copy over all the source files, and delete all extra files at the destination.
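As a rough illustration of that mirroring approach, the sketch below builds a plan from two lists of relative file names. The function name and inputs are hypothetical; it only shows the set logic (copy everything from the source, delete whatever exists only at the destination), not how AzCopy would carry it out.

```python
def mirror_plan(source_names, destination_names):
    """Copy every source file unconditionally; delete destination files absent from the source."""
    source = set(source_names)
    to_copy = sorted(source)
    to_delete = sorted(set(destination_names) - source)
    return to_copy, to_delete
```

Note that files present on both sides are still copied, which is exactly the cost of this approach on large file sets.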
Please let us know if you have any other thoughts about this.
Hi @zezha-msft , thanks for replying. I hope a solution can be found soon too.
An option to force-overwrite the destination only if it's different from the source would be very useful to avoid re-copying identical files.
Another thought - Would it be possible to use the MD5 hash property to compare the objects and overwrite if different?
In the meantime, it sounds like the best work-around is to use azcopy sync to sync the blob to a safe, dedicated staging location, then use robocopy to properly mirror the staging directory Source -> final Destination.
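For anyone wanting to script that two-step work-around, a minimal sketch follows. The helper names and the staging/final paths are hypothetical; the only tool options used are azcopy sync with --delete-destination=true and robocopy with /MIR. It relies on robocopy exit codes below 8 meaning no failures.

```python
import subprocess


def build_mirror_commands(blob_url, staging_dir, final_dir):
    """Return the two command lines: sync blob -> staging, then mirror staging -> final."""
    sync_cmd = ["azcopy", "sync", blob_url, staging_dir, "--delete-destination=true"]
    mirror_cmd = ["robocopy", staging_dir, final_dir, "/MIR"]
    return sync_cmd, mirror_cmd


def run_mirror(blob_url, staging_dir, final_dir):
    """Run both steps; requires azcopy and robocopy on PATH (Windows)."""
    sync_cmd, mirror_cmd = build_mirror_commands(blob_url, staging_dir, final_dir)
    subprocess.run(sync_cmd, check=True)
    result = subprocess.run(mirror_cmd)
    # robocopy uses non-zero exit codes for success too; 8 and above indicate failures.
    if result.returncode >= 8:
        raise RuntimeError(f"robocopy failed with exit code {result.returncode}")
```

The staging directory must never be edited by hand, otherwise the same "destination is newer" problem reappears one step earlier.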
Hi @phrak, the MD5 value can be stored on the service side, but it is not validated against the data, so it'd be hard to use MD5 values as an indication of file content, since they could easily become out of date if the user changed the content but didn't update the MD5 value. More often, there could be files that got uploaded without a stored MD5.
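To illustrate what an MD5-based check would look like and where it falls short, here is a small sketch. The function names are hypothetical, and it assumes the stored MD5 is exposed as a base64 string (the Content-MD5 form) or as None when the blob was uploaded without one.

```python
import base64
import hashlib


def local_md5_b64(path, chunk_size=1 << 20):
    """Compute the base64-encoded MD5 of a local file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return base64.b64encode(digest.digest()).decode("ascii")


def should_overwrite(local_path, stored_md5_b64):
    """Decide whether to re-copy, given the stored MD5 (base64 string) or None."""
    if stored_md5_b64 is None:
        # No stored hash to compare against, so fall back to copying.
        return True
    return local_md5_b64(local_path) != stored_md5_b64
```

The comparison is only as trustworthy as the stored value: if the blob content changed without the MD5 being updated, a match here would wrongly skip the file.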
Sorry for the inconvenience. We are aware of this problem, and will look into solving it with a good UX.
Which version of the AzCopy was used?
10.7.0
Which platform are you using? (ex: Windows, Mac, Linux)
Windows
What command did you run?
azcopy sync [https://blob URL] [C:\LocalPath] --delete-destination
What problem was encountered?
Unable to truly sync a source -> target. Files modified in the destination are ignored without any notification. They are not overwritten, copied, updated or logged as being different. Business impact being that we cannot ensure the source matches the target, nor can we provide a list of files that are different.
How can we reproduce the problem in the simplest way?
Run azcopy sync to sync the blob to the local folder, modify a file in the local folder, then re-run the azcopy sync command.
Have you found a mitigation/solution?
The only semi-viable work-around is to run an azcopy copy [source] [destination] --overwrite=true command to re-copy the entire source directory to the target; however, this overwrites all files, not just the differences, which causes issues with very large file sets. The azcopy copy [source] [destination] --overwrite=ifSourceNewer switch is not viable because the Destination is newer than the Source.
The other potential work-around is to azcopy sync to a safe staging location, then use robocopy to truly mirror the staging Source -> final Destination.
The AZCopy Sync Wiki Page describes this behaviour as being by design. Unfortunately, this leaves us with very few options to properly sync a folder pair.