vplagnol / ExomeDepth

ExomeDepth R package for the detection of copy number variants in exomes and gene panels using high throughput DNA sequencing data.

Confusing warnings using getBamCounts (ExomeDepth v.1.1.15) #37

Open PaolaD opened 3 years ago

PaolaD commented 3 years ago

Hi @vplagnol, I switched to the new version of ExomeDepth and noticed that when getBamCounts finishes parsing each BAM file, I get a warning that was partially discussed in issue #22, but I think this specific problem got lost at some point in the thread.
What I would like to understand better is what 'x' in this warning refers to: is it the BED file? I'm using the GRCh38 reference with decoys, and my BED file contains contigs from chr1 to chrX and chrY, but the 'x' list in the warning stops at 22 (even though chrX and chrY are parsed by getBamCounts). However, the final read counts are formatted properly and are not empty, so can I ignore these warnings, or is it better to run some further tests to be sure?

Packages versions:

packageVersion("ExomeDepth")
[1] ‘1.1.15’
packageVersion("GenomicRanges")
[1] ‘1.40.0’

Command used:

my.bam = c(my.ctrl, my.cases)
my.counts <- getBamCounts(bed.file = bed_file,
                          bam.files = my.bam,
                          include.chr = FALSE,
                          referenceFasta = reference)

The warning I get:

50: In .Seqinfo.mergexy(x, y) : Each of the 2 combined objects has sequence levels not in the other:

  • in 'x': 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22
  • in 'y': chrM, chr1_KI270706v1_random, chr1_KI270707v1_random, chr1_KI270708v1_random, chr1_KI270709v1_random, chr1_KI270710v1_random, chr1_KI270711v1_random, chr1_KI270712v1_random, chr1_KI270713v1_random, chr1_KI270714v1_random, chr2_KI270715v1_random, chr2_KI270716v1_random, chr3_GL000221v1_random, chr4_GL000008v2_random, chr5_GL000208v1_random, chr9_KI270717v1_random, chr9_KI270718v1_random, chr9_KI270719v1_random, chr9_KI270720v1_random, chr11_KI270721v1_random, chr14_GL000009v2_random, chr14_GL000225v1_random, chr14_KI270722v1_random, chr14_GL000194v1_random, chr14_KI270723v1_random, chr14_KI270724v1_random, chr14_KI270725v1_random, chr14_KI270726v1_random, chr15_KI270727v1_random, chr16_KI270728v1_random, chr17_GL000205v2_random, chr17_KI270729v1_random, chr17_KI270730v1_random, chr22_KI270731v1_random, chr22_KI270732v1 [... truncated]

Many thanks, Paola

vplagnol commented 3 years ago

Paola,

So it appears that your chromosome labels are not consistent between the reference and the test samples. I suspect some piece of ExomeDepth code sorts out the issue for chr1-chr22 by removing the "chr" prefix. But you have all these other contigs that are not in the reference, and you get a warning for those. I don't think it's worrying, as long as your counts make sense, of course.
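
To illustrate the kind of mismatch the warning reports, here is a minimal base R sketch. It is not ExomeDepth's or GenomicRanges' actual code: `compare_seqlevels` is a hypothetical helper, and the input names are made up. It only shows how two sets of sequence names can each contain levels the other lacks, and how stripping a "chr" prefix resolves some of them but not the extra contigs.

```r
# Hypothetical helper, not part of ExomeDepth or GenomicRanges.
# Lists the sequence names found in only one of two sets,
# optionally after stripping a leading "chr" from both.
compare_seqlevels <- function(x, y, strip_chr = FALSE) {
  if (strip_chr) {
    x <- sub("^chr", "", x)
    y <- sub("^chr", "", y)
  }
  list(only_in_x = setdiff(x, y), only_in_y = setdiff(y, x))
}

# Toy inputs (assumed for illustration): one set without the "chr" prefix,
# one set with it plus contigs that have no counterpart.
x_names <- c("1", "2", "X")
y_names <- c("chr1", "chr2", "chrX", "chrM", "chr1_KI270706v1_random")

raw <- compare_seqlevels(x_names, y_names)
stripped <- compare_seqlevels(x_names, y_names, strip_chr = TRUE)

print(raw$only_in_x)       # names unmatched before prefix stripping
print(stripped$only_in_x)  # names still unmatched after stripping
print(stripped$only_in_y)  # leftover contigs with no counterpart
```

With real data, the two name vectors would come from the BED file and the BAM header. Which of them corresponds to 'x' and which to 'y' in the warning is not established in this thread.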

PaolaD commented 3 years ago

Hi @vplagnol, thanks for the reply! I'm benchmarking with 1000 Genomes BAM files and the reference that was used to re-align their reads, to test the tool before applying it to my dataset. There shouldn't be any inconsistency between those two...

Paola
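
For reference, one way to check that the counts "make sense" is to total them per chromosome and confirm that every expected chromosome, including X and Y, is present. This is only a sketch in base R: `counts_by_chromosome` is a hypothetical helper, the toy table is invented, and the `chromosome` column name and the sample column names are assumptions about the shape of the count table.

```r
# Hypothetical helper: sum read counts per chromosome for each sample column.
# Assumes a 'chromosome' column and numeric sample columns (assumed names).
counts_by_chromosome <- function(counts, sample_cols) {
  aggregate(counts[sample_cols],
            by = list(chromosome = counts$chromosome),
            FUN = sum)
}

# Toy count table, made up for illustration only.
toy <- data.frame(
  chromosome  = c("1", "1", "22", "X", "Y"),
  start       = c(100, 500, 100, 100, 100),
  end         = c(200, 600, 200, 200, 200),
  sampleA.bam = c(10, 20, 5, 7, 0),
  sampleB.bam = c(12, 18, 6, 14, 3)
)

totals <- counts_by_chromosome(toy, c("sampleA.bam", "sampleB.bam"))

# Chromosomes expected from the BED file but absent from the counts.
expected <- c("1", "22", "X", "Y")
missing_chr <- setdiff(expected, totals$chromosome)

# Chromosomes present but with zero reads in every sample.
empty_chr <- totals$chromosome[rowSums(totals[-1]) == 0]

print(totals)
print(missing_chr)
print(empty_chr)
```

On a real run, the data frame built from the getBamCounts output would take the place of `toy`, with the column names adjusted to match.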