BD2KGenomics / dcc-redwood-client

Apache License 2.0

Request for download output directory to be named as submitter sample id #10

Open wshands opened 7 years ago

wshands commented 7 years ago

From Katrina Learned: Could the Submitter Sample ID also become the name of the directory that the output files are placed in when we download them? Currently, when the output files are downloaded from Redwood, each sample's files are placed in a directory named with the "Bundle name UUID", which makes them difficult for us to navigate.

klearned commented 7 years ago

What's the time frame on this? With several hundred samples being uploaded and processed shortly, having UUID directory names could be daunting for us.

benjaminran commented 7 years ago

Sorry for the unresponsiveness on this. For future reference, downloads are performed by the underlying icgc-storage-client download command. Note the --output-layout option in the help text below. Using --output-layout=filename might make things easier for you.

To use this with the current prod system you'd have to edit the download script directly (or copy its invocation of icgc-storage-client.jar). But with redwood-client:1.0.0 and greater (pending the upcoming update of prod redwood) you'll be able to just run

icgc-storage-client download --manifest manifest.txt --output-dir output --output-layout filename

Help text:

root@c547a220dc63:/dcc# icgc-storage-client help download
Usage: icgc-storage-client download [options]
  Command:
    download   Retrieve file object(s) from the remote storage repository
  Options:
  * --output-dir
       Path to output directory
    --offset
       The byte position in source file to begin download from
       Default: 0
    --validate
       Perform check of MD5 checksum (if available)
       Default: true
    --length
       The number of bytes to download
       Default: -1
    --verify-connection
       Verify connection to repository
       Default: true
    --object-id
       Object id to download
       Default: []
    --index
       Download file index if available?
       Default: true
    --output-layout
       Layout of the output-dir. One of 'bundle' (saved according to filename under GNOS bundle id directory),
       'filename' (saved according to filename in output directory), or 'id' (saved according to object id in
       output directory)
       Default: filename
    --force
       Force re-download (override local file)
       Default: false
    --manifest
       Manifest id, url, or path to manifest file
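
For illustration, the three --output-layout values described in the help text above can be sketched as a small shell function. This is only a reading of the help text, not the client's actual implementation; the layout_path helper and the placeholder IDs and file name are hypothetical.

```shell
#!/bin/sh
# Illustrative sketch: where a downloaded file would land for each
# --output-layout value, as described in the help text.
# layout_path is a hypothetical helper, not part of icgc-storage-client.
layout_path() {
    output_dir=$1
    layout=$2
    bundle_id=$3
    file_name=$4
    object_id=$5
    case "$layout" in
        # 'bundle': saved by filename under a bundle id directory
        bundle)   printf '%s/%s/%s\n' "$output_dir" "$bundle_id" "$file_name" ;;
        # 'filename': saved by filename directly in the output directory
        filename) printf '%s/%s\n' "$output_dir" "$file_name" ;;
        # 'id': saved by object id directly in the output directory
        id)       printf '%s/%s\n' "$output_dir" "$object_id" ;;
        *)        echo "unknown layout: $layout" >&2; return 1 ;;
    esac
}

# Placeholder values, for demonstration only.
layout_path output bundle   BUNDLE-UUID sample.bam OBJECT-UUID
layout_path output filename BUNDLE-UUID sample.bam OBJECT-UUID
layout_path output id       BUNDLE-UUID sample.bam OBJECT-UUID
```

Read this way, only the 'bundle' layout introduces a per-bundle UUID directory; 'filename' and 'id' write straight into the output directory.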

klearned commented 7 years ago

@benjaminran Thanks for this info, and sorry for not getting back sooner! When will redwood-client:1.0.0 be in use? Which core-client version is associated with it? Thanks!

benjaminran commented 7 years ago

We're planning an upgrade of the prod system next week (by June 3), at which point you'll be able to use core-client:1.2.0 and up (which uses redwood-client:1.0.0+).

klearned commented 7 years ago

Sounds good, thanks!