gatk-workflows / five-dollar-genome-analysis-pipeline

Workflows used for WGS data processing -- replaced by https://github.com/gatk-workflows/gatk4-genome-processing-pipeline
https://gatk.broadinstitute.org/hc/en-us
BSD 3-Clause "New" or "Revised" License

Localization can be made optional for more stages. #23

Open darkskiez opened 5 years ago

darkskiez commented 5 years ago

For example, the SortSam task could be modified to read from I=/dev/stdin, with gsutil cat invoked to stream the input when the path is a Google Cloud Storage (gs://) path.
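
A rough sketch of that idea, written as a small Python helper that only builds the shell command a task could run. The function name `build_sort_sam_command`, the `picard.jar` location and the `SORT_ORDER=coordinate` choice are assumptions for illustration, not taken from the pipeline's WDL, and whether SortSam behaves well on a streamed input would need testing.

```python
import shlex


def build_sort_sam_command(input_path, output_bam, picard_jar="picard.jar"):
    """Build a shell command for Picard SortSam (hypothetical helper).

    If the input lives in a bucket (gs://...), stream it with `gsutil cat`
    into SortSam via /dev/stdin instead of localizing the file first.
    Local paths are passed to SortSam unchanged.
    """
    streamed = input_path.startswith("gs://")
    picard_input = "/dev/stdin" if streamed else input_path

    picard = " ".join(
        shlex.quote(arg)
        for arg in [
            "java", "-jar", picard_jar, "SortSam",
            f"I={picard_input}",
            f"O={output_bam}",
            "SORT_ORDER=coordinate",  # assumed sort order, adjust as needed
        ]
    )

    if streamed:
        # pipefail makes the task fail if the download side of the pipe fails.
        return f"set -o pipefail; gsutil cat {shlex.quote(input_path)} | {picard}"
    return picard


if __name__ == "__main__":
    print(build_sort_sam_command("gs://bucket/sample.bam", "sorted.bam"))
```

The download then overlaps with sorting instead of finishing before the task starts. SortSam still writes its own temporary files while sorting, so this only removes the up-front copy.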

Alternatively, the Google Cloud Storage bucket could be mounted and mapped into the container as a Docker volume.
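
One way this might look, assuming Cloud Storage FUSE (`gcsfuse`) is installed on the VM and the bucket is mounted on the host before the container starts. The helper name `build_mount_commands`, the mount point, the `/data` container directory and the image name are all placeholders, and this is not how the pipeline currently runs its tasks.

```python
def build_mount_commands(bucket, mount_point, image, container_dir="/data"):
    """Return (mount_cmd, run_cmd) argument lists (hypothetical helper).

    mount_cmd mounts the bucket read-only on the host with gcsfuse;
    run_cmd starts the container with that mount exposed as a read-only
    volume, so tools inside can open bucket objects as ordinary files.
    """
    mount_cmd = ["gcsfuse", "-o", "ro", bucket, mount_point]
    run_cmd = [
        "docker", "run", "--rm",
        "-v", f"{mount_point}:{container_dir}:ro",
        image,
    ]
    return mount_cmd, run_cmd


if __name__ == "__main__":
    mount_cmd, run_cmd = build_mount_commands(
        "my-bucket", "/mnt/my-bucket", "example/picard-image"
    )
    print(" ".join(mount_cmd))
    print(" ".join(run_cmd))
```

Reads through a FUSE mount are still network reads, so the gain here would come from skipping the separate copy step rather than from faster I/O.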

This could save a few more hours of processing time per run, since that time is currently spent copying data back and forth on the critical path.