bcbio / bcbio-nextgen

Validated, scalable, community developed variant calling, RNA-seq and small RNA analysis
https://bcbio-nextgen.readthedocs.io
MIT License

bcbio for panels #1715

Closed deber1980 closed 5 years ago

deber1980 commented 7 years ago

Hello,

We have several samples that were prepared using the "TruSight RapidCapture Cancer Panel", on which I ran the default bcbio configuration for variant calling (see the example config below). This is a panel of ~100 genes. When comparing the results to those from another pipeline (I believe Illumina's), the results were not concordant. For example, some regions that should have been covered were not covered properly.
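One way to put a number on "not concordant" is to compare the two call sets by site and allele. The following is only a minimal sketch: the function names are made up for illustration, the VCF records are invented, and it assumes both pipelines write plain tab-separated VCF text. It does not normalize indel representation, so equivalent variants written differently by the two callers would be counted as discordant.

```python
from typing import Dict, Iterable, Set, Tuple

Variant = Tuple[str, int, str, str]  # (chrom, pos, ref, alt)


def read_variants(vcf_lines: Iterable[str]) -> Set[Variant]:
    """Collect one (chrom, pos, ref, alt) key per ALT allele from VCF text."""
    variants: Set[Variant] = set()
    for line in vcf_lines:
        if not line.strip() or line.startswith("#"):
            continue  # skip blank lines and header lines
        fields = line.rstrip("\n").split("\t")
        chrom, pos, ref, alts = fields[0], int(fields[1]), fields[3], fields[4]
        for alt in alts.split(","):
            if alt != ".":
                variants.add((chrom, pos, ref, alt))
    return variants


def concordance(a: Set[Variant], b: Set[Variant]) -> Dict[str, float]:
    """Count shared and private calls and compute the Jaccard index."""
    shared, union = a & b, a | b
    return {
        "shared": len(shared),
        "only_a": len(a - b),
        "only_b": len(b - a),
        "jaccard": len(shared) / len(union) if union else 1.0,
    }


# Invented records, standing in for the two pipelines' output files.
bcbio_lines = [
    "##fileformat=VCFv4.2",
    "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO",
    "17\t41245466\t.\tG\tA\t.\tPASS\t.",
    "13\t32906729\t.\tA\tC,T\t.\tPASS\t.",
]
other_lines = [
    "17\t41245466\t.\tG\tA\t.\tPASS\t.",
    "13\t32906729\t.\tA\tC\t.\tPASS\t.",
    "7\t55249071\t.\tC\tT\t.\tPASS\t.",
]

result = concordance(read_variants(bcbio_lines), read_variants(other_lines))
print(result)
```

Restricting both inputs to the panel's target regions before comparing would separate coverage differences from caller differences.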

Are there any best practices on how to tweak bcbio to support high-depth panels? Any other recommendations?

Regards, Y.

details:

chapmanb commented 7 years ago

Thanks for the discussion. I'm happy to help figure out the right settings for your analysis, and I have a couple of clarifying questions:

Hope this helps.

deber1980 commented 7 years ago

Hello Brad and thanks for the prompt reply.

Thanks again

biocyberman commented 6 years ago

I am working in a clinical setting. While whole genome sequencing may be the future for us in some years, we still mostly do gene panels and, to a lesser extent, exomes. I've been looking at Cpipe and trying to install it, but it is still very buggy and fails to install out of the box. The Cpipe team is not very active in developing the tool further (at least from what I see on GitHub), so I am rewriting the code for the installation and execution process while keeping the analysis steps intact. Cpipe also promises some nice features:

With that said, I imagine it may take some tweaking to adapt bcbio-nextgen for gene panels. I would prefer this route because I am more familiar with bcbio and it has many other features.

biocyberman commented 6 years ago

@chapmanb Brad and @deber1980: which variant caller do you use for gene panel variant calling?

chapmanb commented 6 years ago

Vang -- thanks for the discussion. From a variant calling perspective, bcbio should handle panels. If these are pull-downs, using GATK4 HaplotypeCaller for germline calling and VarDict for somatic calling would be my recommendation. Custom amplicon panels do take some adjustment, and we have an open issue to look into supporting TruSeq (#1837). What issues do you envision needing tweaks in bcbio?
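As a rough illustration of that recommendation, the sketch below builds the structure of a bcbio sample configuration for a pull-down panel, choosing `gatk-haplotype` for germline samples and `vardict` for somatic ones. The helper name, sample names, BED path, and genome build are all hypothetical, and the key names follow bcbio's sample configuration format as commonly documented, so they should be checked against the documentation for the installed version. Whether `gatk-haplotype` runs GATK4 specifically may also depend on the bcbio version and its settings.

```python
import json
from typing import Any, Dict, Optional


def panel_sample(description: str, regions_bed: str, somatic: bool,
                 batch: Optional[str] = None,
                 genome_build: str = "GRCh37") -> Dict[str, Any]:
    """Build one entry for the `details` list of a bcbio sample config.

    Hypothetical helper: germline samples get gatk-haplotype,
    somatic samples get vardict plus tumor metadata.
    """
    sample: Dict[str, Any] = {
        "analysis": "variant2",
        "description": description,
        "genome_build": genome_build,
        "algorithm": {
            "aligner": "bwa",
            "variantcaller": "vardict" if somatic else "gatk-haplotype",
            # BED file of the panel's target regions (path is made up).
            "variant_regions": regions_bed,
        },
    }
    if somatic:
        sample["metadata"] = {"phenotype": "tumor",
                              "batch": batch or description}
    return sample


config = {
    "details": [
        panel_sample("germline1", "/path/to/panel.bed", somatic=False),
        panel_sample("tumor1", "/path/to/panel.bed", somatic=True),
    ]
}

# Printed as JSON only to show the structure.
print(json.dumps(config, indent=2))
```

bcbio reads its sample configuration as YAML, so in practice the same structure would be written out with a YAML serializer (for example PyYAML's `yaml.safe_dump`) or typed by hand.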

Cpipe is a nice tool with some great features. We'd definitely be open to integrating bcbio output with external databases and also welcome recommendations on improving QC reporting. I see these as external projects that can interface with bcbio, and if you end up working in these directions, I'm happy to discuss how best to do that. Thanks again for the discussion.

biocyberman commented 6 years ago

Brad, when I wrote "tweaking", I was thinking of this paper, which discusses optimizing the GATK run configuration for gene panels: https://doi.org/10.1186/s12859-017-1537-8 I also saw Pisces from Illumina (https://github.com/Illumina/Pisces), which supports somatic and targeted sequencing variant calling. So if Pisces is to be used, we need a wrapper and probably a bioconda package for it. Neither exists at the moment.

Regarding running Cpipe and bcbio in parallel, maintaining both pipelines will be hard work, not to mention interfacing the two.

chapmanb commented 6 years ago

Vang -- we'd be happy to work on improving calling for panels if folks run into issues and want to improve things. Unfortunately, the paper you linked is for Ion Torrent data, so it is likely to be more specific to that technology than to panels in general, but if there are other improvements we can add for panel cases, we're happy to consider those.

Pisces would be useful for tumor-only calling generally, and I've looked at it in the past, but the biggest blocker is the requirement of having the .NET runtime available, which is a heavy dependency and doesn't currently have a conda package. So it's definitely possible to explore but would take some effort.

Happy to support you in any work you do in these directions and thank you again for the discussion.

biocyberman commented 6 years ago

You are right, both that I work with Ion Torrent data and about the dotnet requirement. I will explore more and get back to you about this. The task of analyzing gene panels is still there, so I will definitely have to come up with a solution. Thanks for the discussion as well :)

roryk commented 5 years ago

Closing for now since there hasn't been any action on it and the suggestions were for Ion Torrent, which we don't support. We can open up a separate issue if there are other improvements that could be made to panel sequencing.