Open vmukhina opened 2 years ago
Yes, that's true. CNVkit doesn't tend to give useful calls on alternative contigs; read mapping is inconsistent.
Here's the filter applied to sequence names in the access and antitarget commands:
https://github.com/etal/cnvkit/blob/master/cnvlib/antitarget.py#L115-L122
You could turn off this behavior by calling access.do_access(..., skip_noncanonical=False) through cnvlib:
https://github.com/etal/cnvkit/blob/master/cnvlib/access.py#L15
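A minimal sketch of that call, assuming cnvkit is installed. The reply above elides the other arguments as "...", so passing the reference FASTA path as the first positional argument is an assumption here; check the linked signature before relying on it. The helper name is hypothetical.

```python
def accessible_regions_all_contigs(fasta_path):
    """Run CNVkit's access step without the sequence-name filter."""
    # Imported lazily so this sketch can be loaded without cnvkit present.
    from cnvlib import access

    # skip_noncanonical=False is the keyword named in the reply above.
    # Passing the FASTA path as the first positional argument is an assumption.
    return access.do_access(fasta_path, skip_noncanonical=False)
```

The returned regions would then need to be written out as a BED file yourself, since this bypasses the command-line wrapper.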
I ran cnvkit access (0.9.9) on two files containing the same FASTA sequence under different labels and got different results: NC001526.4 was skipped, whereas Nt_001526.4 was added to the output .bed file. Here are both logs.

Nt_001526.4: Scanning for accessible regions
Accessible region Nt_001526.4:0-7906 (size 7906)
Nt_001526.4: Joining over small gaps
Wrote test.bed with 1 regions

and

NC001526.4: Scanning for accessible regions
Accessible region NC001526.4:0-7906 (size 7906)
Wrote test.bed with 0 regions
Same here. CNVkit ignores all NC_ sequences in the RefSeq hg38 assembly, so regions from the primary assembly never appear in the .bed file and no CNVs are called for those regions.
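To illustrate why two labels for the same sequence can behave differently, here is a rough sketch of a name-based exclusion filter. The pattern is an assumption chosen to reproduce the behavior reported in this thread (names starting with "NC" are dropped). It is not a copy of CNVkit's code; the real filter is at the antitarget.py link above.

```python
import re

# Illustrative exclusion pattern only (an assumption, not CNVkit's source):
# names starting with "NC" or following common alt/unplaced naming
# conventions are treated as non-canonical.
NONCANONICAL = re.compile(r"^NC|_random$|Un_|_alt$|^HLA-")


def is_canonical(name):
    """Return True if the sequence name passes the illustrative filter."""
    return NONCANONICAL.search(name) is None


names = [
    "NC001526.4",
    "Nt_001526.4",
    "NC_000001.11",
    "chr1",
    "chr1_KI270706v1_random",
]
kept = [n for n in names if is_canonical(n)]
print(kept)
```

Under this sketch, a RefSeq-style name such as NC_000001.11 is dropped purely because of its prefix, which matches the two reports above.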