etal / cnvkit

Copy number variant detection from targeted DNA sequencing
http://cnvkit.readthedocs.org

germline WES #628

Closed lmanchon closed 3 years ago

lmanchon commented 3 years ago

Hi,

Is it possible to use CNVkit to analyze germline WES data, and do I need to mark duplicate reads after the BWA mapping step? Maybe some people have already done this and can give me some pointers on the process. Thank you.

SouzaBB commented 3 years ago

Hi,

The answer is yes, you can use it for germline analysis. I use it for a small panel, so you will need to perform your own validation for your panel. I do mark duplicates and run BQSR.

Hope that helps.

lmanchon commented 3 years ago

But the Notes at the end of this page, https://cnvkit.readthedocs.io/en/stable/nonhybrid.html, say to "not mark duplicates in the BAM files". Here is the command line I used:

cnvkit.py batch my.bam -n -m amplicon --segment-method hmm-germline -f Homo_sapiens_hg38.fasta -p 0 --annotate refFlat.txt --short-names -t exome.bed -d out_cnvkit --diagram --scatter

Is this correct, or is something missing? My WES sequencing depth is on the order of ~200x. Thanks.

SouzaBB commented 3 years ago

I follow what's described here: https://cnvkit.readthedocs.io/en/stable/pipeline.html#bam-file-preparation

Also, I have validated the pipeline with commercial and patient samples, so I know it's working well!

tskir commented 3 years ago

Hi @lmanchon @SouzaBB, yes, CNVkit is perfectly capable of calling CNVs in germline WES data.

Regarding duplicates, as stated in the documentation, you should skip marking them only if the sequencing method is TAS (targeted amplicon sequencing), because marking them there would heavily skew the results.

For WES/WGS, removing PCR duplicates is in principle beneficial, but only as long as you don't have a large proportion of false-positive duplicate detections, which could again skew the results. So in general the answer is yes, mark them, but I recommend trying the workflow both with and without this step to see what works best for your particular panel.
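
One way to set up that with/without comparison is sketched below. It assumes samtools is used for duplicate marking and reuses the batch options from the command posted earlier in this thread, minus `-m amplicon`, so CNVkit's default `hybrid` method applies. The file and directory names are placeholders, and `run` is a hypothetical helper that only prints each command unless `DRY_RUN=0` is set. As far as I know, CNVkit skips reads flagged as duplicates when counting coverage, so flagging them (rather than removing them) should be enough, but please verify this for your version.

```shell
#!/usr/bin/env bash
# Sketch only: file names are placeholders and `run` is a hypothetical helper.
DRY_RUN=${DRY_RUN:-1}   # 1 = print the commands only, 0 = actually execute them

run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

bam=my.bam               # aligned BAM without duplicate flags
marked=my_markdup.bam    # same reads, with duplicates flagged

# Duplicate marking with samtools: collate -> fixmate -m -> sort -> markdup.
run samtools collate -o collated.bam "$bam"
run samtools fixmate -m collated.bam fixmate.bam
run samtools sort -o sorted.bam fixmate.bam
run samtools markdup sorted.bam "$marked"
run samtools index "$marked"

# Identical CNVkit run on each BAM, writing to separate output directories.
for input in "$bam" "$marked"; do
    outdir="out_cnvkit_$(basename "$input" .bam)"
    run cnvkit.py batch "$input" -n \
        --segment-method hmm-germline \
        -f Homo_sapiens_hg38.fasta -t exome.bed --annotate refFlat.txt \
        -p 0 -d "$outdir"
done
```

The .cnr and .cns files in the two output directories can then be compared side by side, ideally on samples with known CNVs.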

tskir commented 3 years ago

Hi @lmanchon, just wanted to confirm if this resolves your question or if you had any other comments or concerns?

lmanchon commented 3 years ago

Hi tskir,

Just one last question: is there a way to plot the diagram for only one chromosome rather than for all of them (the default)?

thx

tskir commented 3 years ago

@lmanchon I'm afraid there is no such functionality built in at the moment; however, you can achieve this by filtering your CNR/CNS files to contain only one chromosome and then feeding them into cnvkit.py diagram. Using the example files included with the CNVkit code, this would look something like this:

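# Copy the header line, then append only the rows whose first column is chr2
head -n1 test/formats/amplicon.cnr > amplicon_chr2.cnr
awk -F$'\t' '$1 == "chr2"' test/formats/amplicon.cnr >> amplicon_chr2.cnr
# Draw the diagram from the single-chromosome file
cnvkit.py diagram amplicon_chr2.cnr -o amplicon_chr2.pdf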

The result would then look like this: [image]
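
If you need this for several chromosomes, the same filter can be wrapped in a small shell function. This is a minimal sketch, assuming a tab-separated CNR/CNS file whose first line is a header and whose first column is the chromosome name. `filter_chrom` is a hypothetical helper (not part of CNVkit), and the demo file is made up purely for illustration; its column order may differ from your files.

```shell
#!/usr/bin/env bash
# filter_chrom is a hypothetical helper, not a CNVkit command.
# Usage: filter_chrom <input.cnr|input.cns> <chromosome> > <output>
filter_chrom() {
    local infile=$1 chrom=$2
    # Keep the header (first line) plus every row whose first column
    # matches the requested chromosome exactly.
    awk -F'\t' -v chrom="$chrom" 'NR == 1 || $1 == chrom' "$infile"
}

# Tiny made-up CNR-like file, only to demonstrate the filter.
demo_dir=$(mktemp -d)
printf 'chromosome\tstart\tend\tgene\tlog2\tdepth\tweight\n' > "$demo_dir/demo.cnr"
printf 'chr1\t100\t200\tGENE_A\t0.10\t150.0\t0.9\n' >> "$demo_dir/demo.cnr"
printf 'chr2\t300\t400\tGENE_B\t-0.50\t80.0\t0.8\n' >> "$demo_dir/demo.cnr"
printf 'chr2\t500\t600\tGENE_C\t0.02\t120.0\t0.9\n' >> "$demo_dir/demo.cnr"
printf 'chr20\t700\t800\tGENE_D\t0.30\t110.0\t0.7\n' >> "$demo_dir/demo.cnr"

# Exact matching keeps chr2 only; chr20 is not picked up by accident.
filter_chrom "$demo_dir/demo.cnr" chr2 > "$demo_dir/demo_chr2.cnr"
cat "$demo_dir/demo_chr2.cnr"
```

On a real CNR file, the filtered output can be passed to cnvkit.py diagram with -o, exactly as in the commands above.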