populationgenomics / analysis-runner

Generic files to release #630

Closed EddieLF closed 1 year ago

EddieLF commented 1 year ago

This script borrows heavily from the copy_sample_cram_to_release script. We need a new script because some collaborators aren't happy receiving only CRAMs; they want the original FASTQ files.

The idea is that you create a .txt file listing the URLs of all the files you want to copy to the release bucket, one per line. You then upload this file to a bucket and pass its path as an argument to the script.

The script then downloads the .txt file onto the Hail disk, reads the URLs from it, and runs a copy command for each file, placing the copies in a directory named with today's date in the release bucket.
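As a rough illustration of that flow (not the code in this PR), here is a minimal sketch of turning the URL list into copy commands. The function names, the `gsutil cp` command, the ISO date format for the directory, and the bucket names in the usage note are all assumptions made for the example.

```python
import shlex
from datetime import date
from typing import List, Optional


def parse_url_list(text: str) -> List[str]:
    """Return the non-empty lines of the URL list, stripped of whitespace."""
    return [line.strip() for line in text.splitlines() if line.strip()]


def build_copy_commands(
    urls: List[str], release_bucket: str, today: Optional[date] = None
) -> List[str]:
    """Build one copy command per URL, targeting a dated release directory.

    Assumption: files are copied with `gsutil cp` and the directory is named
    with the ISO date (YYYY-MM-DD). The real script may differ on both.
    """
    today = today or date.today()
    target = f"gs://{release_bucket}/{today.isoformat()}/"
    return [
        f"gsutil cp {shlex.quote(url)} {shlex.quote(target)}" for url in urls
    ]
```

For example, `build_copy_commands(parse_url_list(text), "my-release-bucket")` would yield one command string per listed file, which a batch job could then run in turn. Quoting each URL with `shlex.quote` keeps paths containing spaces or shell metacharacters from breaking the command.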

One thing I'm unsure about is the text file encoding. I created a test .txt file in VS Code and it defaulted to 'us-ascii' encoding, but I'm not sure whether that holds for files created in other editors. Is there a better practice?
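One possible way to sidestep the encoding question, sketched here as a suggestion rather than what the script currently does: decode the file as UTF-8. ASCII is a subset of UTF-8, so a 'us-ascii' file decodes identically, and Python's `utf-8-sig` codec also drops the byte order mark that some editors prepend. The function name `decode_url_list` is hypothetical.

```python
def decode_url_list(raw: bytes) -> str:
    """Decode the downloaded URL list as UTF-8, tolerating a leading BOM.

    Assumption: the file is ASCII or UTF-8. Any other encoding raises
    UnicodeDecodeError, which is preferable to silently copying from
    mangled paths.
    """
    return raw.decode("utf-8-sig")
```

Reading the file in binary mode and decoding explicitly means the result does not depend on the default locale of the machine running the job.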