Closed deber1980 closed 5 years ago
Thanks for the discussion. I'm happy to help figure out the right settings for your analysis, and I have a couple of clarifying questions:
Is this a somatic sample where you want to identify low frequency variants? If so, you currently have a germline calling setup, which will only call homozygous and heterozygous mutations. To do somatic calling, add `phenotype: tumor` and swap the callers to use somatic callers. `vardict` is a good choice for panels. You could also use `freebayes`, and `mutect` or `mutect2` if you want a GATK-based caller.
Which sample pipeline are you comparing the outputs to? How are you determining concordance?
You mention a coverage issue and I'm not sure what you mean. Could you provide an example of an issue you're finding?
Hope this helps.
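For reference, the somatic setup described above might look roughly like the sample configuration below. This is a minimal sketch, not the poster's actual config: the sample name, file paths, genome build, and the `min_allele_fraction` value are placeholders chosen for illustration, and the right values depend on the panel and the expected variant frequencies.

```yaml
# Hypothetical bcbio sample configuration for somatic calling on a panel.
# Sample name, paths, and genome build are placeholders, not from the thread.
details:
  - analysis: variant2
    description: panel_tumor_1          # placeholder sample name
    genome_build: GRCh37                # assumption; match the build of the panel BED file
    files:
      - /path/to/sample_R1.fastq.gz
      - /path/to/sample_R2.fastq.gz
    metadata:
      phenotype: tumor                  # marks the sample as tumor so somatic calling applies
    algorithm:
      aligner: bwa
      variantcaller: vardict            # somatic caller suggested above for panels
      variant_regions: /path/to/panel_targets.bed
      min_allele_fraction: 2            # percent; illustrative only, bcbio defaults to 10
```

Restricting calls with `variant_regions` to the panel's target BED file keeps calling and coverage reporting focused on the captured regions, which is usually where a comparison against another pipeline should start.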
Hello Brad and thanks for the prompt reply.
Thanks again
I am working in a clinical setting. While whole genome sequencing may be the future for us in some years, we still mostly do gene panels and, to a lesser extent, exomes. I've been looking at Cpipe and trying to install it, but it is still very buggy and fails to install out of the box. The Cpipe team is not very active in developing the tool further (at least from what I see on GitHub), so I am rewriting the code for the installation and execution process while keeping the analysis steps intact. Cpipe also promises some nice features:
With that said, I imagine it may take some tweaking to adapt bcbio-nextgen for gene panels. I would prefer this route because I am more familiar with bcbio and it has many other features.
@chapmanb Brad and @deber1980 Which variant caller do you use for gene panel variant calling?
Vang -- thanks for the discussion. From a variant calling perspective, bcbio should handle panels. If these are pull downs, using GATK4 HaplotypeCaller for germline calling and VarDict for somatic calling would be my recommendation. Custom amplicon panels do take some adjustment, and we have an open issue to look into supporting TruSeq (#1837). What issues do you envision needing tweaks in bcbio?
Cpipe is a nice tool with some great features. We'd definitely be open to integrating bcbio output with external databases and also welcome recommendations on improving QC reporting. I see these as external projects that can interface with bcbio, and if you end up working in these directions I'm happy to discuss how best to do it. Thanks again for the discussion.
Brad, when I wrote "tweaking", I was thinking of this paper, which discusses optimizing the GATK run configuration for gene panels: https://doi.org/10.1186/s12859-017-1537-8 I also saw Pisces from Illumina (https://github.com/Illumina/Pisces), which supports somatic and targeted sequencing variant calling. So if Pisces is to be used, we need a wrapper and probably a bioconda package for it. Neither exists at the moment.
Regarding running Cpipe and bcbio in parallel, it would be hard work to maintain both pipelines, not to mention interfacing the two.
Vang -- we'd be happy to work on improving calling for panels if folks run into issues and want to improve it. Unfortunately, the paper you linked is for Ion Torrent data, so it is likely to be more specific to that technology than to panels in general. If there are other improvements we can add for panel cases, we're happy to consider those.
Pisces would be useful for tumor-only calling generally and I've looked at it in the past, but the biggest blocker is the requirement of having the .NET runtime available, which is a heavy dependency and doesn't currently have a conda package. So it's definitely possible to explore but would take some effort.
Happy to support you in any work you do in these directions and thank you again for the discussion.
You are right: I work with Ion Torrent data, and you are also right about the dotnet requirement. I will explore more and get back to you about this. The task of analyzing gene panels is still there, so I will definitely have to come up with a solution. Thanks for the discussion as well :)
Closing for now since there hasn't been any action on it and the suggestions were for Ion Torrent, which we don't support. We can open a separate issue if there are other improvements that could be made to panel sequencing.
Hello,
We have several samples that were prepared with the "Trusight RapidCapture Cancer Panel", on which I ran the default bcbio configuration for variant calling (see the example config below). This is a ~100-gene panel. When comparing the results to another pipeline (I believe Illumina's), the results were not concordant. For example, some regions were not covered properly although they should have been.
Are there any best practices on how to tweak bcbio to support high-depth panels? Any other recommendations?
Regards, Y.
details: