Open allwu opened 8 years ago
When I add the -L argument in the BQSR stage, I get an error like the one below:
org.broadinstitute.gatk.utils.commandline.InvalidArgumentException: Argument with name 'L' isn't defined.
But on other stages it's okay.
Fixing in PR #55.
GATK stages such as BaseRecalibrator allow the input argument -L, which specifies the regions of interest for analysis. This is particularly useful for exome analysis, since the data is very sparse. Without it, recalibration for exome samples may be inaccurate, because the model accounts for many unrelated regions.

The -L option can be a file or a string specifying one or more intervals in the chromosome:position format.

One issue with supporting it is that it may affect our automatic parallelization, which is based on chromosomes. We may need to parse the intervals in the user input and do the separation ourselves. For example, the easiest way could be:
The first step is to enable this for BQSR, since the scatter function is handled automatically by GATK Queue. We can then see how we should enable it in later steps.
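As a rough illustration of the "parse the intervals and do the separation ourselves" idea, here is a minimal Python sketch that groups interval strings by chromosome, so each per-chromosome task could receive only its own intervals. It is not the example omitted from the post above. The function name `group_intervals_by_chromosome` is hypothetical, and the sketch assumes each interval is a string whose chromosome name is everything before the first colon, with a bare chromosome name also accepted.

```python
from collections import defaultdict


def group_intervals_by_chromosome(intervals):
    """Group interval strings by chromosome name.

    Hypothetical helper: assumes the chromosome is the text before the
    first ':' and that an entry without ':' names a whole chromosome.
    """
    groups = defaultdict(list)
    for raw in intervals:
        interval = raw.strip()
        if not interval:
            continue  # skip blank entries, e.g. empty lines read from a file
        chrom, _, _ = interval.partition(":")
        groups[chrom].append(interval)
    return dict(groups)


if __name__ == "__main__":
    user_intervals = ["chr1:100-200", "chr2:5-10", "chr1:500", "chr3"]
    for chrom, items in group_intervals_by_chromosome(user_intervals).items():
        # Each chromosome-level task would receive only its own intervals.
        print(chrom, items)
```

Chromosomes with no entry in the result would have nothing to process and could be skipped by the scheduler. Reference builds whose contig names themselves contain a colon would need a stricter parser than this.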