Open matevzg opened 4 years ago
This also does not work: azcopy sync "V:\Folder1" "https://storageaccount.blob.core.windows.net/folder1/?sv=2019-02-02&ss=bfqt&srt=sco&sp=rwdlacup&se=2020-01-01T16:07:51Z&st=2019-01-01T08:07:51Z&spr=https&sig=SIG" --exclude-path="f1;f2 f3;f4"
Nor does this: azcopy sync "V:\Folder1" "https://storageaccount.blob.core.windows.net/folder1/?sv=2019-02-02&ss=bfqt&srt=sco&sp=rwdlacup&se=2020-01-01T16:07:51Z&st=2019-01-01T08:07:51Z&spr=https&sig=SIG" --exclude-path=f1
Or this: azcopy sync "V:\Folder1" "https://storageaccount.blob.core.windows.net/folder1/?sv=2019-02-02&ss=bfqt&srt=sco&sp=rwdlacup&se=2020-01-01T16:07:51Z&st=2019-01-01T08:07:51Z&spr=https&sig=SIG" --exclude-path="f1"
Local structure: "V:\Folder1" "V:\Folder1\f1" "V:\Folder1\f1..." "V:\Folder1\f2 f3" "V:\Folder1\f2 f3..." "V:\Folder1\f4" "V:\Folder1\f4"...
Always scans complete tree structure. Strange, this is. ¯_(ツ)_/¯
@zezha-msft Did you see this one?
Hi @matevzg, sorry for the delayed reply.
Unfortunately, I wasn't able to repro this on my end, the exclude-path flag is working as expected.
Could you please clarify the observed behavior? Were the excluded folders still getting synced? They always get scanned, but they shouldn't be replicated to the destination.
Hi @matevzg,
I can see some syntax issue. Could you please try the below syntax: "https://storageaccount.blob.core.windows.net/folder1/?sv=2019-02-02&ss=bfqt&srt=sco&sp=rwdlacup&se=2020-01-01T16:07:51Z&st=2019-01-01T08:07:51Z&spr=https&sig=SIG" --exclude-path f1;f2 f3;f4
Command Executed on Linux: azcopy copy "/mnt/" "https://prodneuazcopyst.blob.core.windows.net/xxxxx/[SAS] --recursive=true --follow-symlinks=false --exclude-path /mnt/.snapshot/
Still scanning the .snapshot folder.
I think it will work as desired if you remove the trailing / from the exclude parameter. (Maybe we should automatically remove those).
I think that what you've used is being interpreted by the tool as "don't scan any directories inside the snapshot folder".
Tried --exclude-path "/mnt/.snapshot" and "--exclude-path=/mnt/.snapshot". Neither works. exclude-path is not working in azcopy.
Thanks for the test results @ramuadapa
Depending on whether we can reproduce this, and how it gets triaged if we do, we might be able to fix this in release 10.4. Maybe...
@ramuadapa just a thought, but could you try not having the root folder on that path?
ex. --exclude-path=.snapshot
IIRC we check for a prefix on a relative path for exclude path.
(that being said though, I've seen users do both ways, so this could arguably be a usability complaint)
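If the check really is a prefix match on the path relative to the source, as suggested above, an absolute exclude path would first need its source root removed. A minimal bash sketch of that conversion follows; `relative_to_source` is a hypothetical helper for illustration, not part of AzCopy.

```shell
#!/usr/bin/env bash
# Hypothetical helper, not an AzCopy feature.
# relative_to_source SOURCE_ROOT PATH
# Prints PATH relative to SOURCE_ROOT when PATH lies under it;
# otherwise prints PATH unchanged (minus any trailing slash).
relative_to_source() {
  local source_root=${1%/} path=${2%/}
  if [[ $path == "$source_root"/* ]]; then
    printf '%s\n' "${path#"$source_root"/}"
  else
    printf '%s\n' "$path"
  fi
}

# Example with the paths from this thread: prints .snapshot
relative_to_source /mnt/ /mnt/.snapshot/
```

The printed value is what would be passed to --exclude-path, assuming the relative-prefix behavior described above.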
Upvote for the relative & absolute path - I just saw this at a customer as well.
I even tried this with a relative path before updating the blog, and we are still seeing issues.
@adreed-msft, @zezha-msft , @nakulkar-msft Any thoughts?
./azcopy sync "I:\final\1001\1001001" "container?sv=tocken" --delete-destination=true --include-pattern=".dwg;.pdf" --exclude-path="1/Obsolete;1/Quality;1/Quote;2;3;4"
--exclude-path string Exclude these paths when copying. This option does not support wildcard characters (*). Checks relative path prefix(For example: myFolder;myFolder/subDirName/file.pdf). When used in combination with account traversal, paths do not include the container name.
Just curious if there's been any headway on this issue. I'm running into this same problem but I'm using the copy mode instead of sync.
My source is "H:\BackupRoot\SiteBackup" and I want to exclude "H:\BackupRoot\SiteBackup\SQLBackup"
I've tried the following combinations with no luck: --exclude-path="SQLBackup" --exclude-path="H:\BackupRoot\SiteBackup\SQLBackup\database.mdf" --exclude-path="SiteBackup\SQLBackup\database.mdf" --exclude-path="SiteBackup\SQLBackup"
No errors, and I see it's interrupted in the Job-Command line of the log. Using AzCopy 10.4.3 x64 on Windows.
I am using the Azcopy v10.5.0 on Windows x64 and it is still not working. I tried using both full path and prefixes with no luck.
We are planning to use Azcopy in a production environment and this feature is urgently needed. I would be grateful if you can fix this issue.
Hi @berguner, can you post the command you ran, and the AzCopy log file? Please make sure you redact any SAS tokens used in the command.
I am trying to exclude a subfolder called "Thumbnail_Images" and I tried:
azcopy.exe sync $relative_folder_path "$blob_container/$folder_name/$sas" --recursive --exclude-path $run_name/Thumbnail_Images
azcopy.exe sync $relative_folder_path "$blob_container/$folder_name$sas" --recursive --exclude-path $run_name/Thumbnail_Images
azcopy.exe sync $relative_folder_path "$blob_container/$folder_name/$sas" --recursive --exclude-path $run_name\Thumbnail_Images
azcopy.exe sync $relative_folder_path "$blob_container/$folder_name$sas" --recursive --exclude-path $run_name\Thumbnail_Images
azcopy.exe sync $relative_folder_path "$blob_container/$folder_name/$sas" --recursive --exclude-path Thumbnail_Images
azcopy.exe sync $relative_folder_path "$blob_container/$folder_name$sas" --recursive --exclude-path Thumbnail_Images
azcopy.exe sync $full_folder_path "$blob_container/$folder_name/$sas" --recursive --exclude-path $run_name/Thumbnail_Images
azcopy.exe sync $full_folder_path "$blob_container/$folder_name$sas" --recursive --exclude-path $run_name/Thumbnail_Images
azcopy.exe sync $full_folder_path "$blob_container/$folder_name/$sas" --recursive --exclude-path $run_name\Thumbnail_Images
azcopy.exe sync $full_folder_path "$blob_container/$folder_name$sas" --recursive --exclude-path $run_name\Thumbnail_Images
azcopy.exe sync $full_folder_path "$blob_container/$folder_name/$sas" --recursive --exclude-path Thumbnail_Images
azcopy.exe sync $full_folder_path "$blob_container/$folder_name$sas" --recursive --exclude-path Thumbnail_Images
And none of the above worked. The log files don't show much because the "Thumbnail_Images" were already uploaded and there is nothing to sync. I can tell that the "Thumbnail_Images" is still being scanned both on the source and destination based on the number of files being scanned.
I am using the cp command below for the time being, but it is not ideal because it only compares the timestamps. In the logs of the cp command, I can see that the number of scanned files doesn't include the files in the "Thumbnail_Images" folder.
azcopy.exe cp $full_folder_path "$blob_container/$sas" --recursive --overwrite isSourceNewer --exclude-path Thumbnail_Images
@berguner The exclude-path uses a relative path, and I'd expect 'azcopy cp src dst --recursive --exclude-path Thumbnail_Images' to work. Can you verify through the AzCopy logs that it is not enclosed in quotes when passed to AzCopy, as in here:
Job-Command copy /home/nakulkar https://myaccount.blob.core.windows.net/container?SAS --exclude-path="NoQuotesHere" --recursive
The AzCopy logs are in $HOME/.azcopy. I'll have a look after you post the logs here.
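To illustrate the quoting concern, here is a small bash sketch showing how the same flag can reach a program with or without literal quote characters in its value. `show_args` and `has_literal_quotes` are hypothetical helpers. This demonstrates bash behavior only; how PowerShell or cmd.exe passes the argument may differ.

```shell
#!/usr/bin/env bash
# Hypothetical helpers for inspecting arguments; not part of AzCopy.

# show_args prints each argument it receives on its own line, in brackets.
show_args() {
  local arg
  for arg in "$@"; do
    printf '[%s]\n' "$arg"
  done
}

# has_literal_quotes succeeds if the argument still contains a quote character.
has_literal_quotes() {
  [[ $1 == *\"* || $1 == *\'* ]]
}

# bash removes these quotes before the program sees the argument.
show_args --exclude-path="Thumbnail_Images"
# Here the double quotes are part of the value the program receives.
show_args '--exclude-path="Thumbnail_Images"'
```

If the value arrives with the quote characters still attached, a literal comparison against a folder name such as Thumbnail_Images would not match. That is one plausible way an exclusion could be silently ignored.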
azcopy version 10.9.0
Same issue here. Tried relative path, folder name. All the folders/files excluded are replicated. My structure is as follows in the container:
/site1/App_Data/ClientDependency /site2/App_Data/ClientDependency ...
I would like to exclude all "App_Data/ClientDependency" folders.
azcopy sync 'source' 'destination' --recursive --exclude-path='App_Data/ClientDependency'
> doesn't work
azcopy sync 'source' 'destination' --recursive --exclude-path='/App_Data/ClientDependency'
> doesn't work
azcopy sync 'source' 'destination' --recursive --exclude-path='ClientDependency'
> doesn't work
azcopy sync 'source' 'destination' --recursive --exclude-path='site1/App_Data/ClientDependency'
> works
It looks like the exclude-path must be specified starting from root.
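The results above are consistent with a prefix match against each path relative to the source root. A rough bash model of that rule is sketched below. `is_excluded` is a hypothetical function written for illustration; it is an assumption about the behavior and not AzCopy's actual code.

```shell
#!/usr/bin/env bash
# Hypothetical model of "relative path prefix" matching; not AzCopy's code.

# is_excluded REL_PATH EXCLUDE_LIST
# Succeeds if REL_PATH (relative to the sync source) equals one of the
# semicolon-separated entries in EXCLUDE_LIST or lies underneath one.
is_excluded() {
  local rel_path=$1 exclude_list=$2 entry
  local -a entries
  IFS=';' read -r -a entries <<< "$exclude_list"
  for entry in "${entries[@]}"; do
    [[ -z $entry ]] && continue
    if [[ $rel_path == "$entry" || $rel_path == "$entry"/* ]]; then
      return 0
    fi
  done
  return 1
}

# Try the three kinds of rule from the comment above against one file.
for rule in 'App_Data/ClientDependency' 'ClientDependency' 'site1/App_Data/ClientDependency'; do
  if is_excluded 'site1/App_Data/ClientDependency/a.xml' "$rule"; then
    echo "$rule: excluded"
  else
    echo "$rule: still synced"
  fi
done
```

Under this model only the rule that starts at the source root (site1/...) matches, which lines up with the observation that the other forms did not work.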
Source= site Destination: site
Folder Structure:-- site/site1/App_Data/ClientDependency site/site2/App_Data/ClientDependency site/site3/App_Data/ClientDependency
Try this command:-- ./azcopy sync Source "https://StorageaccountName.blob.core.windows.net/containerName/site/?sv=A....D" --put-md5 --recursive --exclude-path 'site1/App_Data/ClientDependency;site2/App_Data/ClientDependency'
This will only Sync Site3 Folder to Azure Container.
Thank you for your answer. Yes using relative path it's working fine. My issue is that I have an unknown number of sites and I would like to define a single exclude-path rule which would exclude a folder for all sites.
Wildcards are not supported, exclude-pattern applies only to files, and --list-of-files is not supported on sync, so I guess my only option would be to build a PowerShell script which goes through the structure and calls azcopy sync on each folder I want to sync.
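An alternative to one sync call per folder might be to generate the exclude list from the directory structure and pass it in a single call. The sketch below uses bash rather than PowerShell and assumes the source is a local directory. `build_exclude_list` is a hypothetical helper, and `$SOURCE` and `$DEST_URL` in the usage comment are placeholders.

```shell
#!/usr/bin/env bash
# Hypothetical helper, not an AzCopy feature.
# build_exclude_list SOURCE_ROOT FOLDER_SUFFIX
# Prints a semicolon-separated list of every "<site>/<FOLDER_SUFFIX>"
# directory found one level below SOURCE_ROOT, relative to SOURCE_ROOT.
build_exclude_list() {
  local source_root=${1%/} suffix=$2 dir list=''
  for dir in "$source_root"/*/"$suffix"; do
    [[ -d $dir ]] || continue   # skips the unexpanded glob and non-directories
    list+="${list:+;}${dir#"$source_root"/}"
  done
  printf '%s\n' "$list"
}

# Usage (placeholders, not run here):
#   azcopy sync "$SOURCE" "$DEST_URL" --recursive \
#     --exclude-path "$(build_exclude_list "$SOURCE" 'App_Data/ClientDependency')"
```

With site1 and site2 each containing App_Data/ClientDependency, this would print site1/App_Data/ClientDependency;site2/App_Data/ClientDependency. A very large number of sites could make the resulting argument unwieldy, so this is only a sketch of the idea.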
To clarify, exclude-path works on relative paths under the given source. And exclude-pattern is for file names only. We'll try to clarify the docs to avoid this confusion.
--exclude-path string Exclude these paths when copying. This option does not support wildcard characters (*). Checks relative path prefix(For example: myFolder;myFolder/subDirName/file.pdf). When used in combination with account traversal, paths do not include the container name.
--exclude-pattern string Exclude these files when copying. This option supports wildcard characters (*).
@gfaessler we understand that there's not enough flexibility here to accommodate scenarios like yours. We were thinking that perhaps providing some kind of include-regex and exclude-regex may help, it'd be used over the entire relative path (under the source root) of each file. Please let us know if you have any feedback about that idea. Thanks.
@zezha-msft Providing include/exclude path regex would definitely be a useful feature to handle this kind of scenario. Otherwise supporting wildcard characters in exclude-path would also do the job in my scenario.
Just tried using AzCopy as Storage Explorer wasn't flexible enough. After 30 minutes of struggling to make exclude-path work, I ended up here. My scenario is that I have a complex hierarchy several layers deep, and at the deepest levels there's a collection of folders. I want to exclude one of those folders (they share a common name) from a sync from the blob to a local drive. I'd hate to have to specify each and every folder explicitly to exclude. Something simpler like the glob syntax ( /Folder/ ), or just skipping any folder that matches the string in the exclude path, would be perfect (and eventually exposing that in Storage Explorer too).
Just ran into this issue after struggling with this option.
The documentation would really benefit from the additional text:
--exclude-path must be the full path without the container name
@zezha-msft, who would need to be convinced to prioritize improving this option so that it supports relative paths?
@adreed-msft ^
does --exclude-regex solve this?
exclude-path is not working properly. When will it be fixed?
@JohnRusk and @zezha-msft: feel free to contact me internally via teams if you need more details. @JohnRusk has my alias.
I have a consistent repro of this issue on Windows 10/11.
I've recently forced all my PowerShell scripts to use PowerShell Core (pwsh.exe) instead of PowerShell Desktop (powershell.exe) and suddenly I had this problem as well where excluded files and folders are being uploaded through AzCopy.
Interestingly, when I took the command line that pwsh prints out and use it in a cmd.exe or powershell.exe based shell, everything works as expected, excluded files and folders are still excluded. Checking my version history showed that the files started showing up in the storage account after I switched from powershell.exe to pwsh.exe.
Here is the code to reproduce the problem and a simplistic mitigation.
My scripts invoke AzCopy in PowerShell as follows:
& azcopy sync C:\somefolder "https://storageaccount/somefolder?sastoken" --exclude-pattern="web.config"
I can mitigate the problem in pwsh.exe if I pipe the invocation of AzCopy through cmd.exe, like so:
& cmd.exe /c "azcopy sync C:\somefolder `"https://storageaccount/somefolder?sastoken`" --exclude-pattern=`"web.config`""
Maybe others on the thread can also check which shell they are using and possibly corroborate that the difference in behavior is related to the type of shell that is being used.
The versions used for the testing:
Shell | Executable | Version |
---|---|---|
PowerShell Core | pwsh.exe | 7.4.2 |
PowerShell Desktop | powershell.exe | 5.1.19041.4291 |
I can answer my own comment. If you're using PowerShell Core with a version ≥ 7.3 the default behavior for escaping/encapsulating command line arguments has changed! It is described in PSNativeCommandArgumentPassing. The change became mainstream with version 7.3.
Calling azcopy.exe through cmd.exe works because, on Windows machines running PowerShell Core, the behavior is automatically switched to the legacy mode when cmd.exe is invoked through the call operator.
When azcopy.exe is invoked directly, the new behavior is used, which leads to the issues (in my examples above).
Wow, I did not know that!
Which version of the AzCopy was used?
azcopy 10.3.3
Which platform are you using? (ex: Windows, Mac, Linux)
Windows
What command did you run?
azcopy sync "V:\Folder1" "https://storageaccount.blob.core.windows.net/folder1/?sv=2019-02-02&ss=bfqt&srt=sco&sp=rwdlacup&se=2020-01-01T16:07:51Z&st=2019-01-01T08:07:51Z&spr=https&sig=SIG" --exclude-path="f1" --exclude-path="f2 f3" --exclude-path="f4" --recursive=true --cap-mbps=60 --delete-destination=true --log-level=DEBUG
What problem was encountered?
Folders "f1", "f2 f3", "f4" were not excluded from scanning and syncing.
How can we reproduce the problem in the simplest way?
Rerun the command above.
Have you found a mitigation/solution?
No.