ftucos closed this issue 4 years ago
I figured out it was an overflow problem that R encounters both when generating the index file (fa.fai) and when parsing the genome.
I had to generate the index in bash with samtools and get the GC content of each exonic sequence with bedtools:
$ bedtools nuc -fi hg19.fa -bed S07604514_hs_hg19/S07604514_Covered_edit.bed | awk '{$4=""; print $0}' > AgilentV6_exons_GC_count.txt
Then I manually added the GC content column to the data frame with the exome reads.
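For anyone repeating this, here is a minimal sketch of one way to reduce the `bedtools nuc` output to a table that is easy to merge. It is not necessarily what was done above. The helper name `extract_gc` and the output column names are made up for illustration, and it assumes the raw tab-separated `bedtools nuc` output, whose header line has a column ending in `pct_gc`. It looks that column up by name rather than by position, so it does not depend on how many columns the BED file has.

```shell
# Hypothetical helper (not part of bedtools or ExomeDepth):
# keep chrom, start, end and the GC fraction from raw `bedtools nuc` output.
extract_gc() {
  awk 'BEGIN { FS = OFS = "\t" }
       NR == 1 {
         # Find the GC column from the header instead of hard-coding its index.
         for (i = 1; i <= NF; i++) if ($i ~ /pct_gc$/) gc = i
         if (!gc) { print "no pct_gc column found in header" > "/dev/stderr"; exit 1 }
         print "chromosome", "start", "end", "GC"
         next
       }
       { print $1, $2, $3, $gc }'
}

# Usage, with the same paths as the command above (commented out so the
# block can be sourced without bedtools installed):
# bedtools nuc -fi hg19.fa -bed S07604514_hs_hg19/S07604514_Covered_edit.bed | extract_gc > AgilentV6_exons_GC_count.txt
```

The resulting file can be read into R and matched to the counts data frame by chromosome, start and end. Check first that both tables use the same chromosome naming (with or without the "chr" prefix) and the same start-coordinate convention, since BED starts are 0-based.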
Thank you for figuring it out!
Sir,
I got the same problem. How did you manually add the GC content column after calculating the GC content with bedtools nuc -fi hg19.fa -bed S07604514_hs_hg19/S07604514_Covered_edit.bed | awk '{$4=""; print $0}' > AgilentV6_exons_GC_count.txt ?
my.counts = getBamCounts(bed.frame=segments, bam.files=BAMFiles, include.chr=FALSE, referenceFasta='hg38.fa')
Thanks in advance
Whenever I try to add a reference FASTA file for computing the GC content, I get an error like this:
Reference fasta file provided so ExomeDepth will compute the GC content in each window
Error in value[[3L]](cond) :
  record 38397 (chr11:134201957-134202041) failed
  file: data/hg19.fa
I have tried both the UCSC and Ensembl builds, but I get the same error, just on a different chromosome.