Closed ryan-williams closed 8 years ago
Wow, there's a lot here. Left some comments. My main question is how this affects runtimes on the cluster, since it involves a lot of Spark partitioning logic whose performance implications I don't understand well.
Also a style point, fwiw: I find it helpful for each method to have an intro describing what it does, in addition to the argument descriptions.
Cool, thanks for the replies, @ryan-williams. I took another pass, LGTM.
- push the `PartitionedRegions` abstraction out to callers, so that they can start to separate loci+read partitioning plumbing from application logic.
- recapitulates some of #502 but keying off `SampleId`s instead of `SampleName`s; still collapses the VAFHistogram mixture-model into one for all samples in an app, per https://github.com/hammerlab/guacamole/pull/502#discussion_r73574358