Closed lmanchon closed 3 years ago
Hi,
The answer is yes, you can use it for Germline analysis. I use it for a small panel so you need to perform your own validation for your panel. I do Mark Duplicates and BQSR.
Hope that helps.
But in the Notes (end of the page) here: https://cnvkit.readthedocs.io/en/stable/nonhybrid.html, it was mentioned to "not mark duplicates in the BAM files". Here is the command line I used: cnvkit.py batch my.bam -n -m amplicon --segment-method hmm-germline -f Homo_sapiens_hg38.fasta -p 0 --annotate refFlat.txt --short-names -t exome.bed -d out_cnvkit --diagram --scatter
Is this correct, or is something missing? My WES sequencing depth is on the order of ~200x. Thanks.
I follow what's in here https://cnvkit.readthedocs.io/en/stable/pipeline.html#bam-file-preparation
Also, I have validated the pipeline with commercial and patient samples, so I know it's working well!
Hi @lmanchon @SouzaBB, yes, CNVkit is perfectly capable of germline calling on WES data.
Regarding duplicates, as stated in the documentation, the only case where you should not mark them is when the sequencing method is TAS (targeted amplicon sequencing), because marking them there would heavily skew the results.
For WES/WGS, removing PCR duplicates is in principle beneficial, but only as long as you don't have a large proportion of false-positive duplicate detections, which could again skew the results. So in general the answer is yes, mark them, but I recommend trying the workflow both with and without this step to see what works best for your particular panel.
Hi @lmanchon, just wanted to confirm if this resolves your question or if you had any other comments or concerns?
Hi tskir,
Just one last question: is there a way to plot the diagram for only one chromosome rather than for all of them (the default)?
thx
@lmanchon I'm afraid there is no such functionality built in at the moment; however, you can achieve this by filtering your CNR/CNS files to contain only one chromosome and then feeding them into cnvkit.py diagram. Using the example files included with the CNVkit code, this would look something like this:
head -n1 test/formats/amplicon.cnr > amplicon_chr2.cnr
awk -F$'\t' '$1 == "chr2"' test/formats/amplicon.cnr >> amplicon_chr2.cnr
cnvkit.py diagram amplicon_chr2.cnr -o amplicon_chr2.pdf
The result would then look like this: [image not preserved]
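If you need this for several chromosomes, the two filtering commands above can be wrapped in a small shell function. This is only a sketch: `filter_chromosome` is a hypothetical helper name, not part of CNVkit, and it assumes the input is a tab-separated file with a single header line and the chromosome name in the first column, as in the example CNR file above.

```shell
# Hypothetical helper (not part of CNVkit): print the header line plus all
# rows whose first tab-separated column exactly equals the given chromosome.
# Assumes a single header line and the chromosome name in column 1.
filter_chromosome() {
    awk -F'\t' -v chrom="$1" 'NR == 1 || $1 == chrom' "$2"
}

# Example usage, with the same file names as the commands above:
#   filter_chromosome chr2 test/formats/amplicon.cnr > amplicon_chr2.cnr
#   cnvkit.py diagram amplicon_chr2.cnr -o amplicon_chr2.pdf
```

Because the comparison is an exact string match, asking for chr2 does not also pull in chr20 or chr21.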
Hi,
Is it possible to use CNVkit to analyze germline WES data, and do I need to mark duplicate reads after the BWA mapping step? Maybe some people have already done this and can give me some pointers on the process. Thank you.