Closed rmlopes closed 12 months ago
@rmlopes Could you elaborate, please? It feels like you are mixing up the run-cache with external outputs. The latter is not supported on Azure. You can still push the local run-cache to any remote (including Azure), though.
If you indeed meant the run-cache for the external-outputs scenario, then the run-cache doesn't support it, and there are no plans to add it until we redesign external scenarios in general; right now that feature is highly experimental and we generally don't recommend using it.
@efiop sure, indeed there seems to be some confusion on my side. I think I mean run-cache correctly though, as the docs talk about "setting an external cache location" so that we don't have to do a commit for every experiment. (I have set up remote storage for data and intended to do the same for the run-cache; I did come across the external outputs section and noticed the disclaimer, but for now we can live without it.) The documentation states that to keep the run-cache in external storage you add a remote and then configure the cache, such as cache.s3 or cache.gs, but azure is not mentioned.
Does this clarify my doubt?
In the documentation it states that for having the run-cache in external storage you add a remote and then configure cache, such as cache.s3 or cache.gs but azure is not mentioned.
Could you point out the doc that says that, please?
During the normal workflow, when you dvc push/pull data to/from a remote, you can specify the --run-cache option, which will also transfer the run-cache (and use --pull in dvc repro to automatically try to pull the results according to the run-cache). If you are using a shared cache dir (dvc config cache.dir /path/to/dir), the run-cache will be shared automatically between everyone using that cache dir, no extra actions needed.
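The workflow described above can be sketched with the following commands (this assumes DVC is installed with a default remote already configured; the shared cache path is a placeholder):

```shell
# Push tracked data plus the run-cache to the default remote
dvc push --run-cache

# On another machine/CI job: pull data plus the run-cache back
dvc pull --run-cache

# Reproduce the pipeline, automatically pulling cached stage
# results from the remote run-cache when they are available
dvc repro --pull

# Alternative: point everyone at one shared cache directory;
# the run-cache is then shared with no extra push/pull needed
dvc config cache.dir /path/to/shared/cache
```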
So I think you got confused by the "external data management" doc, as it has nothing to do with the run-cache really. It would be great if you could point out particular docs that confused you.
The quote is from the cache config docs. But it seems that I got there from the "external data management" doc, as you said. If I understood you correctly, when we use the --run-cache flag on a push it will add the run-cache to the remote blob storage (the same one that is used for data and models if we are not using a registry repository). A person or agent pulling afterwards can choose whether to include that run-cache as well.
I am still trying to put it all together for our specific use case, but I think what I am actually looking for is more of an external dependency. Still not sure. We need to manually label some videos; these and the corresponding annotations would be stored in a cloud storage account. I want to be able to run the CI/CD without having to download the data (or at least download video by video).
I have set up two repositories: one is a data registry (data-registry), the other is a project using data/models from that data registry (mlops-dvc), both using Azure Blob Storage as the remote. I have properly configured the data-registry and added files tracked by DVC.
Inside the data-registry project I can do dvc list; however, not with the -R option, as in that case I get an auth error:
ERROR: failed to list 'ssh://user@company-repo/mlops-dvc-registry.git' - Authentication to Azure Blob Storage via None failed.
Inside the mlops-dvc project I can list the registry (with the same caveat), but I cannot import from it, as it outputs the same connection error as posted above. I want to have different remotes (both Azure, but that should be irrelevant), yet even if I configure exactly the same remote for both projects I still get the connection error.
What am I missing here?
Seems like you are talking about a labeling scenario, which we don't have native support for yet, but we are looking into it right now. CC @volkfox
You could use external dependencies with dvc run -d, though. E.g. dvc run -d azure://bucket/path ... or dvc run -d remote://myremote/path. Those don't require an external cache.
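For illustration, a stage along the lines suggested above might look like this (the stage name, container, paths, and the processing script are placeholders, not anything from this thread):

```shell
# Stage that depends directly on a file in Azure Blob Storage;
# DVC tracks the remote object as a dependency, no external cache needed
dvc run -n process_video \
        -d azure://mycontainer/videos/video1.mp4 \
        -o out/ \
        "python process.py azure://mycontainer/videos/video1.mp4 out/"

# Equivalent form going through a configured DVC remote named "myremote"
dvc run -n process_video_alt \
        -d remote://myremote/videos/video1.mp4 \
        -o out_alt/ \
        "python process.py remote://myremote/videos/video1.mp4 out_alt/"
```

Because the dependency is the remote object itself, dvc repro only re-runs the stage when the remote file changes, which fits the "run CI/CD without downloading everything" goal video by video.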
Inside the data-registry project I can do dvc list, however not with the -R option, as in this case I will have auth error:
Could you add -v and post a verbose log, please? Also, please post the output of dvc doctor.
Sure.
(base) ➜ mlops-temp git:(master) dvc list ssh://user@bitbucket-repo/mlops-temp.git -v
2021-04-30 23:05:30,686 DEBUG: Creating external repo ssh://user@bitbucket-repo/mlops-temp.git@None
2021-04-30 23:05:30,688 DEBUG: erepo: git clone 'ssh://user@bitbucket-repo/mlops-temp.git' to a temporary dir
.dvcignore
data
(base) ➜ mlops-temp git:(master) dvc list -R ssh://user@bitbucket-repo/mlops-temp.git -v
2021-04-30 23:03:35,061 DEBUG: Creating external repo ssh://user@bitbucket-repo/mlops-temp.git@None
2021-04-30 23:03:35,062 DEBUG: erepo: git clone 'ssh://user@bitbucket-repo/mlops-temp.git' to a temporary dir
2021-04-30 23:03:36,318 DEBUG: Preparing to download data from 'azure://temp/temp'
2021-04-30 23:03:36,318 DEBUG: Preparing to collect status from azure://temp/temp
2021-04-30 23:03:36,319 DEBUG: Collecting information from local cache...
2021-04-30 23:03:36,319 DEBUG: Collecting information from remote cache...
2021-04-30 23:03:36,320 DEBUG: Matched '0' indexed hashes
2021-04-30 23:03:36,320 DEBUG: Querying 1 hashes via object_exists
2021-04-30 23:03:36,321 ERROR: failed to list 'ssh://user@bitbucket-repo/mlops-temp.git' - Authentication to Azure Blob Storage via None failed.
Learn more about configuration settings at <https://man.dvc.org/remote/modify>: unable to connect to account for Must provide either a connection_string or account_name with credentials!!
------------------------------------------------------------
Traceback (most recent call last):
File "/usr/local/Cellar/dvc/2.0.18/libexec/lib/python3.9/site-packages/adlfs/spec.py", line 505, in do_connect
raise ValueError(
ValueError: Must provide either a connection_string or account_name with credentials!!
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/usr/local/Cellar/dvc/2.0.18/libexec/lib/python3.9/site-packages/dvc/fs/azure.py", line 123, in fs
file_system = AzureBlobFileSystem(**self.fs_args)
File "/usr/local/Cellar/dvc/2.0.18/libexec/lib/python3.9/site-packages/fsspec/spec.py", line 69, in __call__
obj = super().__call__(*args, **kwargs)
File "/usr/local/Cellar/dvc/2.0.18/libexec/lib/python3.9/site-packages/adlfs/spec.py", line 411, in __init__
self.do_connect()
File "/usr/local/Cellar/dvc/2.0.18/libexec/lib/python3.9/site-packages/adlfs/spec.py", line 510, in do_connect
raise ValueError(f"unable to connect to account for {e}")
ValueError: unable to connect to account for Must provide either a connection_string or account_name with credentials!!
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/usr/local/Cellar/dvc/2.0.18/libexec/lib/python3.9/site-packages/dvc/command/ls/__init__.py", line 30, in run
entries = Repo.ls(
File "/usr/local/Cellar/dvc/2.0.18/libexec/lib/python3.9/site-packages/dvc/repo/ls.py", line 38, in ls
ret = _ls(repo.repo_fs, path_info, recursive, dvc_only)
File "/usr/local/Cellar/dvc/2.0.18/libexec/lib/python3.9/site-packages/dvc/repo/ls.py", line 57, in _ls
for root, dirs, files in fs.walk(
File "/usr/local/Cellar/dvc/2.0.18/libexec/lib/python3.9/site-packages/dvc/fs/repo.py", line 367, in walk
yield from self._walk(repo_walk, dvc_walk, dvcfiles=dvcfiles)
File "/usr/local/Cellar/dvc/2.0.18/libexec/lib/python3.9/site-packages/dvc/fs/repo.py", line 303, in _walk
yield from self._walk(repo_walk, dvc_walk, dvcfiles=dvcfiles)
File "/usr/local/Cellar/dvc/2.0.18/libexec/lib/python3.9/site-packages/dvc/fs/repo.py", line 305, in _walk
yield from self._dvc_walk(dvc_walk)
File "/usr/local/Cellar/dvc/2.0.18/libexec/lib/python3.9/site-packages/dvc/fs/repo.py", line 231, in _dvc_walk
root, dirs, files = next(walk)
File "/usr/local/Cellar/dvc/2.0.18/libexec/lib/python3.9/site-packages/dvc/fs/dvc.py", line 195, in walk
yield from self._walk(root, trie, topdown=topdown, **kwargs)
File "/usr/local/Cellar/dvc/2.0.18/libexec/lib/python3.9/site-packages/dvc/fs/dvc.py", line 169, in _walk
yield from self._walk(root / dname, trie)
File "/usr/local/Cellar/dvc/2.0.18/libexec/lib/python3.9/site-packages/dvc/fs/dvc.py", line 169, in _walk
yield from self._walk(root / dname, trie)
File "/usr/local/Cellar/dvc/2.0.18/libexec/lib/python3.9/site-packages/dvc/fs/dvc.py", line 150, in _walk
self._add_dir(trie, out, **kwargs)
File "/usr/local/Cellar/dvc/2.0.18/libexec/lib/python3.9/site-packages/dvc/fs/dvc.py", line 138, in _add_dir
self._fetch_dir(out, **kwargs)
File "/usr/local/Cellar/dvc/2.0.18/libexec/lib/python3.9/site-packages/dvc/fs/dvc.py", line 132, in _fetch_dir
out.get_dir_cache(**kwargs)
File "/usr/local/Cellar/dvc/2.0.18/libexec/lib/python3.9/site-packages/dvc/output/base.py", line 498, in get_dir_cache
self.repo.cloud.pull(
File "/usr/local/Cellar/dvc/2.0.18/libexec/lib/python3.9/site-packages/dvc/data_cloud.py", line 88, in pull
return remote.pull(
File "/usr/local/Cellar/dvc/2.0.18/libexec/lib/python3.9/site-packages/dvc/remote/base.py", line 56, in wrapper
return f(obj, *args, **kwargs)
File "/usr/local/Cellar/dvc/2.0.18/libexec/lib/python3.9/site-packages/dvc/remote/base.py", line 486, in pull
ret = self._process(
File "/usr/local/Cellar/dvc/2.0.18/libexec/lib/python3.9/site-packages/dvc/remote/base.py", line 323, in _process
dir_status, file_status, dir_contents = self._status(
File "/usr/local/Cellar/dvc/2.0.18/libexec/lib/python3.9/site-packages/dvc/remote/base.py", line 175, in _status
self.hashes_exist(
File "/usr/local/Cellar/dvc/2.0.18/libexec/lib/python3.9/site-packages/dvc/remote/base.py", line 132, in hashes_exist
return indexed_hashes + self.odb.hashes_exist(list(hashes), **kwargs)
File "/usr/local/Cellar/dvc/2.0.18/libexec/lib/python3.9/site-packages/dvc/objects/db/base.py", line 380, in hashes_exist
remote_hashes = self.list_hashes_exists(hashes, jobs, name)
File "/usr/local/Cellar/dvc/2.0.18/libexec/lib/python3.9/site-packages/dvc/objects/db/base.py", line 338, in list_hashes_exists
ret = list(itertools.compress(hashes, in_remote))
File "/usr/local/Cellar/python@3.9/3.9.4/Frameworks/Python.framework/Versions/3.9/lib/python3.9/concurrent/futures/_base.py", line 608, in result_iterator
yield fs.pop().result()
File "/usr/local/Cellar/python@3.9/3.9.4/Frameworks/Python.framework/Versions/3.9/lib/python3.9/concurrent/futures/_base.py", line 445, in result
return self.__get_result()
File "/usr/local/Cellar/python@3.9/3.9.4/Frameworks/Python.framework/Versions/3.9/lib/python3.9/concurrent/futures/_base.py", line 390, in __get_result
raise self._exception
File "/usr/local/Cellar/python@3.9/3.9.4/Frameworks/Python.framework/Versions/3.9/lib/python3.9/concurrent/futures/thread.py", line 52, in run
result = self.fn(*self.args, **self.kwargs)
File "/usr/local/Cellar/dvc/2.0.18/libexec/lib/python3.9/site-packages/dvc/objects/db/base.py", line 329, in exists_with_progress
ret = self.fs.exists(path_info)
File "/usr/local/Cellar/dvc/2.0.18/libexec/lib/python3.9/site-packages/dvc/fs/fsspec_wrapper.py", line 94, in exists
return self.fs.exists(self._with_bucket(path_info))
File "/usr/local/Cellar/dvc/2.0.18/libexec/lib/python3.9/site-packages/funcy/objects.py", line 50, in __get__
return prop.__get__(instance, type)
File "/usr/local/Cellar/dvc/2.0.18/libexec/lib/python3.9/site-packages/funcy/objects.py", line 28, in __get__
res = instance.__dict__[self.fget.__name__] = self.fget(instance)
File "/usr/local/Cellar/dvc/2.0.18/libexec/lib/python3.9/site-packages/dvc/fs/azure.py", line 129, in fs
raise AzureAuthError(
dvc.fs.azure.AzureAuthError: Authentication to Azure Blob Storage via None failed.
Learn more about configuration settings at <https://man.dvc.org/remote/modify>
------------------------------------------------------------
2021-04-30 23:03:36,335 DEBUG: Analytics is enabled.
2021-04-30 23:03:36,562 DEBUG: Trying to spawn '['daemon', '-q', 'analytics', '/var/folders/hr/rkq97pts6hl9t_qmfb1l7l880000gp/T/tmp6blhdltn']'
2021-04-30 23:03:36,565 DEBUG: Spawned '['daemon', '-q', 'analytics', '/var/folders/hr/rkq97pts6hl9t_qmfb1l7l880000gp/T/tmp6blhdltn']'
(base) ➜ mlops-temp git:(master) dvc doctor
DVC version: 2.0.18 (brew)
---------------------------------
Platform: Python 3.9.4 on macOS-11.2.3-x86_64-i386-64bit
Supports: azure, gdrive, gs, http, https, s3, ssh, oss, webdav, webdavs
Cache types: reflink, hardlink, symlink
Cache directory: apfs on /dev/disk1s1s1
Caches: local
Remotes: azure
Workspace directory: apfs on /dev/disk1s1s1
Repo: dvc, git
@rmlopes did you set any configuration variables for the Azure remote (via dvc remote modify)? Such as a connection string, or an account_name + account_key combination?
@isidentical I am using a connection string, and I have tried with both repos (the registry and the main repo) having the exact same config for the remote (including the connection string). Note that in the output above I am only doing it from the data-registry side, and the non-recursive list works as expected.
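For context, an Azure remote with a connection string is typically set up like this (the remote name, container, and connection string below are placeholders). One thing worth checking: credentials stored with --local go into .dvc/config.local, which is not committed to Git, so a repository cloned to a temporary directory (as dvc list/import does) would not see them:

```shell
# Remote definition, committed to .dvc/config
dvc remote add -d myremote azure://mycontainer/path

# Credential, kept out of Git in .dvc/config.local
dvc remote modify --local myremote connection_string \
    "DefaultEndpointsProtocol=https;AccountName=<account>;AccountKey=<key>;EndpointSuffix=core.windows.net"
```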
Hi,
I would like to set up an external Azure remote for the run-cache. The docs do not mention how to configure it (similarly to what is done with S3 or GCloud). Is this not supported yet?