Open pabloati opened 4 years ago
Sorry for the delay. You can limit the analysis to genic regions by creating a BED file of "accessible" regions:
cnvkit.py batch file.bam -n -m wgs -f softmasked.fa --access Genes.bed
Or more elaborately, exclude the intergenic regions as well as any other masked regions near or within genes:
# Streamlining...
bedtools slop -b 5000000 Genes.bed ... \
| bedtools merge ... \
| bedtools complement ... \
> intergenic.bed
cnvkit.py access softmasked.fa -x intergenic.bed -o access-genic.bed
cnvkit.py batch --access access-genic.bed ...
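To make the slop, merge, and complement steps above concrete, here is a minimal, self-contained sketch that reproduces the same logic on toy data, using only awk and sort instead of bedtools. The file names, coordinates, and flank size are all hypothetical. It is an illustration of the idea, not a replacement for the elided bedtools commands.

```shell
#!/bin/sh
# Toy inputs (hypothetical names and coordinates, for illustration only).
printf 'chr1\t1000\n' > toy.genome     # chromosome name and length
printf 'chr1\t100\t200\nchr1\t250\t300\nchr1\t600\t700\n' > toy_genes.bed

FLANK=50   # padding added on each side of every gene (assumed value)

# Step 1: pad each gene by FLANK, clipped to the chromosome bounds
#         (the role "bedtools slop" plays above).
# Step 2: merge overlapping or touching padded intervals
#         (the role "bedtools merge" plays above).
awk -v OFS='\t' -v flank="$FLANK" '
    NR == FNR { size[$1] = $2 + 0; next }
    {
        s = $2 - flank; if (s < 0) s = 0
        e = $3 + flank; if (e > size[$1]) e = size[$1]
        print $1, s, e
    }' toy.genome toy_genes.bed \
| sort -k1,1 -k2,2n \
| awk -v OFS='\t' '
    $1 != chrom || $2 + 0 > cur_end {
        if (chrom != "") print chrom, cur_start, cur_end
        chrom = $1; cur_start = $2 + 0; cur_end = $3 + 0; next
    }
    $3 + 0 > cur_end { cur_end = $3 + 0 }
    END { if (chrom != "") print chrom, cur_start, cur_end }' \
> toy_genic_padded.bed

# Step 3: emit the gaps between the merged intervals
#         (the role "bedtools complement" plays above).
# Simplification: chromosomes with no genes at all are skipped here.
awk -v OFS='\t' '
    NR == FNR { size[$1] = $2 + 0; next }
    $1 != chrom {
        if (chrom != "" && pos < size[chrom]) print chrom, pos, size[chrom]
        chrom = $1; pos = 0
    }
    { if ($2 + 0 > pos) print chrom, pos, $2; pos = $3 + 0 }
    END { if (chrom != "" && pos < size[chrom]) print chrom, pos, size[chrom] }
' toy.genome toy_genic_padded.bed > toy_intergenic.bed

cat toy_intergenic.bed
```

The resulting `toy_intergenic.bed` plays the same role as `intergenic.bed` in the commands above: it lists the regions to exclude with `cnvkit.py access -x`.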
That said, if you have a large reference genome and no control samples, and need to find precise structural variants smaller than 1 Mbp, another tool such as Manta may work better for your needs.
Hello,
I would like to call CNVs in maize WGS samples, restricted to gene positions. To do that, I wrote a script that passes the genes as targets to the batch command. However, the resulting segments are too large (over 1 Mb), and the calls are not specific to the gene positions.
We would like to call CNVs only in the target regions. The reference genome used is soft-masked.
Here is the command used for the workflow:
cnvkit.py batch file.bam -n -m wgs -f softmasked.fa --targets Genes.bed
Could you offer some recommendations for performing the calling to suit our needs?
Thank you very much in advance.