Closed yingca1 closed 11 months ago
Can ais start download only download public gs:// datasets?
As an aside, I wonder what "bad status" is?
The primary motivation behind the downloader was downloading raw HTTP sources. But here you have two regular buckets, where we can use our own or vendor-documented APIs. And so the first thing that comes to mind is something like:
ais cp gs://ais_video ais://ais_video --template="dataset1/{00000..20000}.tar"
See ais cp --help for details.
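To preview which object names a range template such as "dataset1/{00000..20000}.tar" covers, the expansion can be sketched in plain shell. This is only an illustration: expand_range is a hypothetical helper, not part of the ais CLI, and it assumes the range is inclusive and zero-padded to a fixed width (which is consistent with the download results later in this thread). A short range is used to keep the output small.

```shell
#!/bin/sh
# expand_range PREFIX FIRST LAST WIDTH SUFFIX
# Hypothetical helper (not an ais command): prints one name per line for an
# inclusive, zero-padded numeric range.
# Pass FIRST and LAST without leading zeros so the shell does not read them as octal.
expand_range() {
    er_fmt="%s%0${4}d%s\n"          # e.g. "%s%05d%s\n" for WIDTH=5
    er_i=$2
    while [ "$er_i" -le "$3" ]; do
        printf "$er_fmt" "$1" "$er_i" "$5"
        er_i=$((er_i + 1))
    done
}

# First four names covered by "dataset1/{00000..00003}.tar"
expand_range "dataset1/" 0 3 5 ".tar"
```

Piping the output through wc -l gives the number of objects the template covers, which is a quick sanity check before starting a large copy.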
Secondly, logs will provide more information (including "bad status"). Especially if the downloader (module) is configured for verbose logging:
$ ais config cluster log.modules downloader
$ ais cluster download-logs
PS. Generally, it'll help us if you use the latest bits from GitHub master and send us the logs. If not, then at least send the output of ais show cluster.
PPS: quick experiment with downloader and a google bucket:
$ ais start download "gs://abcdef-imagenet/imagenet-val-{000037..000045}.tar" ais://dst
Warning: destination bucket ais://dst doesn't exist. Bucket with default properties will be created.
Started download job dnl-KxNDoJt9L
To monitor the progress, run 'ais show job dnl-KxNDoJt9L --progress'
$ ais show job dnl-KxNDoJt9L --progress
Files downloaded: 0/9 [--------------------------------------------------------------] 0 %
imagenet-val-000043.tar 26.6MiB/132.8MiB [===========>--------------------------------------------------| 00:00:00 ] 0.0 b/s
imagenet-val-000038.tar 116.5MiB/129.6MiB [=======================================================>------| 00:00:00 ] 0.0 b/s
imagenet-val-000040.tar 87.3MiB/128.0MiB [=========================================>--------------------| 00:00:00 ] 0.0 b/s
...
imagenet-val-000043.tar 132.8MiB/132.8MiB [==============================================================| 00:00:00 ] 0.0 b/s
All files successfully downloaded.
addressed in 2628029e3cc014723769f7529b94ae0bc02fcfb6
Do you use aistorage/aisnode:3.18?
The ais cluster download-logs command does not work.
The aisnode docker image is usually somewhat behind. I triggered a rebuild and push; it'll show up in a few minutes.
ais cluster download-logs: assuming you cloned (or ran go get on) https://github.com/NVIDIA/aistore, run make cli from the root. It'll work.
ais bucket props set ais://lpr-vision-copy backend_bck=gcp://lpr-vision
ais start download gs://lpr-vision/dir/prefix- ais://lpr-vision-copy
After some testing, I found that an AIS bucket can download files from a cloud bucket only if that cloud bucket has been configured as its backend bucket.
reopening
No, it actually works as prescribed. Here's the full story; notice the templated source-to-download below.
$ ais ls
NAME PRESENT
ais://dst yes
NAME PRESENT
gs://imagenet yes
Total: [GCP bucket: 1] ========
$ ais ls ais://dst
NAME SIZE
and
$ ais show bucket ais://dst | grep backend
backend_bck.name
backend_bck.provider
and also
$ ais ls gs://imagenet
...
imagenet-val-000039.tar 23.87MiB no
imagenet-val-000040.tar 22.79MiB no
imagenet-val-000041.tar 22.40MiB no
imagenet-val-000042.tar 22.99MiB no
imagenet-val-000043.tar 24.11MiB no
imagenet-val-000044.tar 23.31MiB no
imagenet-val-000045.tar 24.03MiB no
imagenet-val-000046.tar 24.22MiB no
imagenet-val-000047.tar 23.12MiB no
...
Now do it:
$ ais start download "gs://imagenet/imagenet-val-{000042..000045}.tar" ais://dst
Started download job dnl-jkhnNM1xp
and done:
$ ais ls ais://dst
NAME SIZE
imagenet-val-000042.tar 22.99MiB
imagenet-val-000043.tar 24.11MiB
imagenet-val-000044.tar 23.31MiB
imagenet-val-000045.tar 24.03MiB
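For larger ranges, comparing the listing against the expected names by eye gets tedious. A minimal sketch of an automated check follows, assuming the listing has the object name in its first column as in the output above. missing_objects is a hypothetical helper, not an ais command, and the listing here is pasted text from this thread rather than live CLI output.

```shell
#!/bin/sh
# Names expected from the template {000042..000045}, written out by hand.
expected="imagenet-val-000042.tar
imagenet-val-000043.tar
imagenet-val-000044.tar
imagenet-val-000045.tar"

# Listing copied from the 'ais ls ais://dst' output above (NAME SIZE columns).
listing="NAME SIZE
imagenet-val-000042.tar 22.99MiB
imagenet-val-000043.tar 24.11MiB
imagenet-val-000044.tar 23.31MiB
imagenet-val-000045.tar 24.03MiB"

# missing_objects EXPECTED LISTING
# Hypothetical helper: prints every expected name that does not appear
# in the first column of LISTING.
missing_objects() {
    printf '%s\n' "$1" | while IFS= read -r mo_name; do
        printf '%s\n' "$2" |
            awk -v n="$mo_name" '$1 == n { found = 1 } END { exit !found }' ||
            printf '%s\n' "$mo_name"
    done
}

missing=$(missing_objects "$expected" "$listing")
if [ -z "$missing" ]; then
    echo "all expected objects present"
else
    printf 'missing:\n%s\n' "$missing"
fi
```

In practice the listing variable would be filled from the real command output; the exact column layout may differ between CLI versions, so the first-column assumption should be checked first.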
https://github.com/NVIDIA/aistore/blob/master/docs/cli/download.md#start-download-job
The download command, used as described in this document, basically gives this error.