Open hagenw opened 1 year ago
Yes, I hope there is a good way to do so. Cause in the old implementation we were only returning the files in the folder, but since we decided that the concept of folders must not exist on a backend, we now return all files that have a backend path that starts with the requested sub-path.
Ok, seems we should simply do:
backend = audbackend.access(
'artifactory',
'https://audeering.jfrog.io/artifactory',
'data-public',
)
for p in backend._repo:
print(p)
It returns all files in the repository and is really fast.
Unfortunately, it seems the solution I proposed is not working with anonymous access at the moment. They might change it in the future, though:
https://jfrog.atlassian.net/browse/RTFACT-7384
For the record, this is the commit where I replaced glob()
with simply listing everything in the repository:
https://github.com/audeering/audbackend/pull/130/commits/5f2ccc55bb7414be11eb47cbb61d7bf8898e7a01
Under the hood it does the following:
import audbackend
backend = audbackend.access(
'artifactory',
'https://audeering.jfrog.io/artifactory',
'data-public',
)
backend._use_legacy_file_structure()
query = [
'items.find',
{
'repo': {'$eq': 'data-public'},
'type': 'file',
},
'.include',
['name'],
]
backend._repo.aql(*query)
[{'name': '005d2b91-5317-0c80-d602-6d55f0323f8c-1.1.0.zip'}, {'name': '014f82d8-3491-fd00-7397-c3b2ac3b2875-1.1.0.zip'}, {'name': '0241da63-dc03-ccb3-623c-80020b913b0d-1.1.0.zip'}, {'name': '02ae4ee2-063a-b720-897e-12400ea54f89-1.1.0.zip'}, {'name': '03ac9ce7-461e-9991-b925-89c2697d7f38-1.1.0.zip'}, {'name': '040df0a2-bdbf-6590-f86a-947f94ae7e12-1.1.0.zip'}, {'name': '0415d83a-9d8e-2cc6-86b0-2b0649c4e67d-1.1.0.zip'},`...]
But returns 403 for an anonymous user (even if the repository allows anonymous access).
As can be seen in https://github.com/audeering/audb/commit/1e5a0cfb83613d4bebf31cf5bc33913905d61baf the following works:
backend = audbackend.access(
'artifactory',
'https://audeering.jfrog.io/artifactory',
'data-public',
)
for p in backend._repo.path:
print(p)
It's not recursive.
You can do this recursive by looping as long as it finds paths. But maybe that is as slow as glob()
then?
It's also mentioned at https://github.com/devopshq/artifactory/issues/423 that glob()
is slow, especially compared to the underlying API call. So maybe we could also fix it by fixing the original code inside dohq-artifactory
.
For the implementtion of
audbackend.Artifactory.ls()
we switched to useglob('*/**')
in thedev
branch, which is very slow. Inmain
we avoided usingglob()
inls()
, which we should also do indev
.