vplagnol / ExomeDepth

ExomeDepth R package for the detection of copy number variants in exomes and gene panels using high throughput DNA sequencing data.
59 stars 26 forks source link

Facing some troubles while running ExomeDepth #47

Open frasca17 opened 2 years ago

frasca17 commented 2 years ago

Hi @vplagnol , I am a bioinformatician and I am facing some troubles while running ExomeDepth.

I am actually trying to use ExomeDepth to call CNVs on Panel data of 24 mitochondrial patients (our gene panels are very big, almost resuming a real Exome).

These are the commands I use, following ExomeDepth official R vignettes:

-library(ExomeDepth)

-my.counts <- getBamCounts(bed.file = "panel.bed", bam.files = c("MT1.bam", "MT2_S12.bam", "MT3.bam", "MT4.bam", "MT5.bam", "MT6.bam", "MT7.bam", "MT8.bam", "MT9.bam", "MT10.bam", "MT11.bam", "MT12.bam", "MT13.bam", "MT14.bam", "MT15.bam", "MT16.bam", "MT17.bam", "MT18.bam", "MT19.bam", "MT20.bam", "MT21.bam", "MT22.bam", "MT23.bam", "MT24.bam"), include.chr = TRUE, referenceFasta = "hg19.fa")

-ExomeCount.dafr <- as(my.counts, 'data.frame')

-ExomeCount.dafr$chromosome <- gsub(as.character(ExomeCount.dafr$chromosome), pattern = 'chr', replacement = '')

-my.test <- my.counts$MT1.bam

-my.ref.samples <- c("MT2_S12.bam", "MT3.bam", "MT4.bam", "MT5.bam", "MT6.bam", "MT7.bam", "MT8.bam", "MT9.bam", "MT10.bam", "MT11.bam", "MT12.bam", "MT13.bam", "MT14.bam", "MT15.bam", "MT16.bam", "MT17.bam", "MT18.bam", "MT19.bam", "MT20.bam", "MT21.bam", "MT22.bam", "MT23.bam", "MT24.bam")

-my.reference.set <- as.matrix(ExomeCount.dafr[, my.ref.samples])

-my.choice <- select.reference.set (test.counts = my.test, reference.counts = my.reference.set, bin.length = (ExomeCount.dafr$end -

-ExomeCount.dafr$start)/1000, n.bins.reduced = 10000)

-my.matrix <- as.matrix( ExomeCount.dafr[, my.choice$reference.choice])

-my.reference.selected <- apply(X = my.matrix, MAR = 1, FUN = sum)

-all.exons <- new('ExomeDepth', test = my.test, reference = my.reference.selected, formula = 'cbind(test, reference) ~ 1')

-all.exons <- CallCNVs(x = all.exons, transition.probability = 10^-4, chromosome = ExomeCount.dafr$chromosome, start = ExomeCount.dafr$start, end = ExomeCount.dafr$end, name = ExomeCount.dafr$exon)

Regarding this I have some questions:

  1. How is it possible that CallCNV() function call just 1 CNV always for the same chromosome? In my case, whatever test sample I use as input it always result with: "Number of calls for chromosome 16 : 1". Am I doing something wrong?
  2. I examinated the ExomeCount data.frame that you provide as data example in the vignettes. Why when I run the getBamCounts() function the resulting dataframe is different from the one thet you provide? For instance, seqnames, ranges and strand as well as names are not present in my.counts once I run the command with my input bam files. What am I missing to do?
  3. When I loop over multiple samples as suggested in the vignettes I use these commands: -ExomeCount.mat <- as.matrix(ExomeCount.dafr[, grep(names(ExomeCount.dafr), pattern = 'Exome.*')]) -nsamples <- ncol(ExomeCount.mat) My question is: why the ExomeCount.mat object is empty when I use the same commands?

Thank you in advance for the attention,

I hope you can help me soon