Open fnothaft opened 8 years ago
In lieu of BD2KGenomics/toil#644, BD2KGenomics/toil#703, and BD2KGenomics/toil#706, we are pondering the alternative solution of running a single, separate, large Spark cluster, provisioned by cgcloud, that is scaled up on demand. Thoughts:
Thanks for writing this up @hannes-ucsc; I was stuck in meetings from our scrum until 5 PM and only just got home.
As an aside, are you/@benedictpaten working on https://github.com/BD2KGenomics/toil/issues/703?
@fnothaft #129
See #136.
@fnothaft #137
See #138.
As an aside, are you/@benedictpaten working on BD2KGenomics/toil#703?
Benedict will be, but he wants to do things right, which will take longer than the time frame of this paper allows.
I am now leaning towards the single cluster solution and removing the dependence on BD2KGenomics/toil#706 and BD2KGenomics/toil#703 from this issue. If you agree, I will remove them from the task list. I will work on testing spot instances with cgcloud Spark clusters.
Let's discuss this on scrum today. Can @benedictpaten join us for scrum?
@hannes-ucsc how much work is involved in changing the Spark version of cgcloud.spark?
Looks pretty straightforward, actually.
Related to #113, but all the samples, not just 10, and just ADAM, no GATK.