Hi,

I'm currently trying to call variants on a set of candidate genes (specified via variant_regions) for 63 WGS samples, and the 'callable regions' step is taking far too long.

Looking at the times in bcbio-nextgen-commands.log (attached), the main culprit seems to be Picard CollectSequencingArtifactMetrics. The command doesn't use the --INTERVALS option, so presumably it's running on the whole genome rather than the specified BED file. It also appears to be using only one core, so it's only processing one sample at a time on a 28-core node.

Is there any way we can get this running faster?
bcbio-nextgen-commands.log bcbio-nextgen-debug.log