refgenie / refgenconf

A Python object for standardized reference genome assets.
http://refgenie.databio.org
BSD 2-Clause "Simplified" License

seek remote assets return value #121

Closed stolarczyk closed 3 years ago

stolarczyk commented 3 years ago

Once we include unarchived assets on the server, we will be able to add a RefGenConf.seekr method that returns the remote location of the file a seek_key points to. Two questions:

  1. What about seek_keys that point to nonexistent files? These were introduced in the bowtie2_index recipe, e.g. hg38/bowtie2_index.
  2. Do we want to return the refgenieserver endpoint path, the S3 location of the file that the endpoint redirects to, or the S3 URI? Or are all of them options, with one as the default?
    • http://refgenomes.databio.org/v3/...
    • http://awspds.refgenie.databio.org/refgenomes.databio.org/....
    • s3://awspds.refgenie.databio.org/refgenomes.databio.org/...
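The last two candidates share the same host and path and differ only in scheme, so one could be derived from the other. A minimal sketch, assuming the bucket name equals the host name as the two forms above suggest; `http_to_s3_uri` is a hypothetical helper, not part of refgenconf:

```python
from urllib.parse import urlsplit


def http_to_s3_uri(url: str) -> str:
    """Rewrite an http(s) bucket URL as an s3:// URI with the same host and path.

    Assumption: the bucket is named after the host, so only the scheme changes.
    """
    parts = urlsplit(url)
    if parts.scheme not in ("http", "https"):
        raise ValueError(f"expected an http(s) URL, got: {url!r}")
    return f"s3://{parts.netloc}{parts.path}"
```

If this holds, seekr would only need to store one form and could offer the other as an option.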

nsheff commented 3 years ago

refgenie seek t7/bowtie2_index
/home/nsheff/code/refgenie_sandbox/alias/t7/bowtie2_index/default/t7

I think you'd point to:

http://awspds.refgenie.databio.org/refgenomes.databio.org/t7/bowtie2_index/default/t7

So if the seek key doesn't point to a file, the remote URL wouldn't point to one either.
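
The mapping in this example can be sketched as follows: take the part of the local seek path below the alias directory and append it to the remote base. This is only an illustration under that assumption; `local_seek_to_remote` and its parameters are hypothetical names, not refgenconf API.

```python
from pathlib import PurePosixPath

# Remote base taken from the example URL in this thread.
REMOTE_BASE = "http://awspds.refgenie.databio.org/refgenomes.databio.org"


def local_seek_to_remote(local_path: str, alias_dir: str,
                         remote_base: str = REMOTE_BASE) -> str:
    """Map a local seek path to a remote URL by reusing its alias-relative part.

    No existence check is made, so a seek key that is only a file prefix
    (as with bowtie2_index) maps to a remote prefix in the same way.
    """
    rel = PurePosixPath(local_path).relative_to(alias_dir)
    return f"{remote_base.rstrip('/')}/{rel.as_posix()}"


print(local_seek_to_remote(
    "/home/nsheff/code/refgenie_sandbox/alias/t7/bowtie2_index/default/t7",
    "/home/nsheff/code/refgenie_sandbox/alias",
))
```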

  2. I'd give an option to return either the S3 URL or the S3 URI. Maybe it's refgenie seekr and refgenie seekr -s3.

Or this could just be refgenie seek, if there's a config setting like mode: remote that turns all seeks into remote URLs?
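
A rough sketch of what such a mode switch might look like, assuming a plain dict config; the `mode`, `local_base` and `remote_base` keys and the `seek` function here are hypothetical, not the existing refgenconf interface:

```python
import posixpath


def seek(rel_path: str, config: dict) -> str:
    """Resolve an asset-relative path locally or remotely, depending on config mode."""
    mode = config.get("mode", "local")  # hypothetical key; defaults to local seeks
    if mode == "remote":
        return f"{config['remote_base'].rstrip('/')}/{rel_path}"
    if mode == "local":
        return posixpath.join(config["local_base"], rel_path)
    raise ValueError(f"unknown mode: {mode!r}")
```

With this design the caller's code stays the same and only the config decides whether a seek returns a local path or a remote URL.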