secastel / phaser

Phasing and allele-specific expression from RNA-seq
GNU General Public License v3.0

Phasing multiple samples in the same VCF file at the same time #9

Closed npklein closed 8 years ago

npklein commented 8 years ago

We have a VCF file with many thousands of samples that we would like to phase. With phASER we can currently phase only one sample at a time, which means that for every sample we need to write out a new VCF file and then merge them all once every run has finished.

Would it be possible to change it so that multiple samples in one VCF can be phased at the same time?

secastel commented 8 years ago

When multiple samples share a single VCF, the best option is to run phASER separately for each sample and then merge the resulting output VCFs, as you suggest. The benefit of this approach is that the samples can be run in parallel, rather than one after another within a single instance of phASER, which for thousands of samples would take quite a long time. There are various methods that allow the efficient merging of VCFs. As such, we don't think it would be in users' best interest to allow a single instance of phASER to run on multiple samples, so we will not be including this in a future update.
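A minimal sketch of that per-sample workflow, not a tested pipeline. It assumes `phaser.py` and `bcftools` are available, that each run writes its phased VCF to `<out_prefix>.vcf.gz`, and that each output holds only the phased sample and is indexed (index the outputs first if not). The helper names, paths, worker count, and the `--mapq`/`--baseq` values are illustrative placeholders and should be adapted to your data and aligner.

```python
import subprocess
from concurrent.futures import ThreadPoolExecutor


def build_phaser_cmd(vcf, bam, sample, out_prefix):
    """Build the phASER command line for one sample (hypothetical helper)."""
    return [
        "python", "phaser.py",
        "--vcf", vcf,
        "--bam", bam,
        "--sample", sample,
        "--paired_end", "1",
        "--mapq", "255",   # placeholder; depends on the aligner used
        "--baseq", "10",   # placeholder
        "--o", out_prefix,
    ]


def build_merge_cmd(per_sample_vcfs, merged_vcf):
    """Build a bcftools command that merges per-sample VCFs into one file."""
    return ["bcftools", "merge", "-O", "z", "-o", merged_vcf, *per_sample_vcfs]


def phase_all(vcf, bams, out_dir, merged_vcf, workers=4):
    """Run phASER once per sample in parallel, then merge the outputs.

    `bams` maps sample name -> BAM path. Threads are enough here because
    the real work happens in the child processes.
    """
    cmds = [build_phaser_cmd(vcf, bam, s, f"{out_dir}/{s}") for s, bam in bams.items()]
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # list() forces evaluation so a failed run raises here.
        list(pool.map(lambda cmd: subprocess.run(cmd, check=True), cmds))
    # Assumed output naming: <out_prefix>.vcf.gz
    outputs = [f"{out_dir}/{s}.vcf.gz" for s in bams]
    subprocess.run(build_merge_cmd(outputs, merged_vcf), check=True)
```

For example, `phase_all("all_samples.vcf.gz", {"S1": "S1.bam", "S2": "S2.bam"}, "phased", "merged.vcf.gz")` would launch one phASER process per sample and merge the results. On a cluster, submitting each command from `build_phaser_cmd` as its own job would serve the same purpose.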

Sorry that I can't be more helpful in this respect.