Closed by gevro 2 years ago
This type of question is better asked on the support site https://support.bioconductor.org.
Use the which= argument to ScanBamParam() to restrict the query to specific regions. Using the file indicated on ?scanBam, we might have
library(Rsamtools)
fl <- system.file("extdata", "ex1.bam", package="Rsamtools", mustWork=TRUE)
## count all records in the file
countBam(fl)
##   space start end width    file records nucleotides
## 1    NA    NA  NA    NA ex1.bam    3307      116551
## count only records overlapping seq1:1-1000
countBam(fl, param = ScanBamParam(which = GRanges("seq1:1-1000")))
##   space start  end width    file records nucleotides
## 1  seq1     1 1000  1000 ex1.bam     924       32529
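One way to turn which= into numbered, reproducible chunks is to tile the reference sequences into fixed-width ranges and have each process load tile i. This is only a sketch: the tile width of 500 and the variable names are assumptions for illustration, and it relies on the BAM file having an index. Note that a read spanning a tile boundary overlaps both tiles, so it would be returned in both chunks.

```r
library(Rsamtools)
library(GenomicRanges)

fl <- system.file("extdata", "ex1.bam", package="Rsamtools", mustWork=TRUE)

## Sequence lengths recorded in the BAM header
sl <- seqlengths(seqinfo(BamFile(fl)))

## Fixed-width tiles; tiles never cross a sequence boundary.
## The width (500) is an arbitrary choice for this small example file.
tiles <- tileGenome(sl, tilewidth = 500, cut.last.tile.in.chrom = TRUE)

## Each process picks its own tile index i (illustrative value here)
i <- 2
countBam(fl, param = ScanBamParam(which = tiles[i]))
```

Because the tiles depend only on the header and the chosen width, every process that computes them gets the same chunk definitions.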
See ?BamFile and this part of the example for iterating through a BAM file:
## Use 'yieldSize' to iterate through a file in chunks.
bf <- open(BamFile(fl, yieldSize=1000))
while (nrec <- length(scanBam(bf)[[1]][[1]]))
    cat("records:", nrec, "\n")
close(bf)
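If the chunks should be defined by record order rather than by genomic position, the yieldSize iteration can be wrapped to return only the k-th chunk. The sketch below uses a hypothetical helper, readChunk(), which is not part of Rsamtools. It still has to read through the earlier chunks to reach chunk k, so it limits memory use but not I/O.

```r
library(Rsamtools)

fl <- system.file("extdata", "ex1.bam", package="Rsamtools", mustWork=TRUE)

## readChunk() is a hypothetical helper, not an Rsamtools function.
## It returns the k-th block of 'yieldSize' records, in file order.
readChunk <- function(file, k, yieldSize = 1000L) {
    bf <- open(BamFile(file, yieldSize = yieldSize))
    on.exit(close(bf))
    ## Skip the first k - 1 chunks, reading only one field to keep it light
    skip <- ScanBamParam(what = "qname")
    for (i in seq_len(k - 1L))
        scanBam(bf, param = skip)
    ## Read the chunk of interest with the default fields
    scanBam(bf)[[1]]
}

chunk2 <- readChunk(fl, 2)
length(chunk2$qname)
```

As long as the file and yieldSize stay the same, chunk k always contains the same records, so separate processes can each be given their own k.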
Probably something like ?GenomicAlignments::readGAlignments is more 'user friendly', and operates in the same way (it accepts the same BamFile and ScanBamParam arguments).
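For instance, a minimal sketch of both approaches with readGAlignments(), reusing the example file and the region from above (the variable names are just for illustration):

```r
library(GenomicAlignments)

fl <- system.file("extdata", "ex1.bam", package="Rsamtools", mustWork=TRUE)

## Load only the alignments overlapping a region
param <- ScanBamParam(which = GRanges("seq1:1-1000"))
gal <- readGAlignments(fl, param = param)

## Or iterate through the file in chunks of up to 1000 records
bf <- open(BamFile(fl, yieldSize = 1000))
while (length(chunk <- readGAlignments(bf)))
    cat("alignments:", length(chunk), "\n")
close(bf)
```

The result is a GAlignments object rather than the nested list returned by scanBam(), which tends to be easier to work with downstream.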
?GenomicFiles::reduceByYield, reduceByRange and reduceRanges might be relevant.
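As a rough sketch of how reduceByYield() could fit here: it repeatedly calls a YIELD function on the file, applies MAP to each chunk, and combines the results with REDUCE, so only one chunk is in memory at a time. The YIELD and MAP functions below are assumptions chosen for illustration (counting alignments per chunk), not something prescribed by the package.

```r
library(GenomicFiles)
library(GenomicAlignments)

fl <- system.file("extdata", "ex1.bam", package="Rsamtools", mustWork=TRUE)

## yieldSize controls how many records each YIELD call reads
bf <- BamFile(fl, yieldSize = 1000)

## YIELD: read the next chunk; MAP: summarise it; REDUCE: combine summaries
YIELD <- function(x, ...) readGAlignments(x)
MAP <- function(value, ...) length(value)

total <- reduceByYield(bf, YIELD, MAP, REDUCE = `+`)
total
```

Replacing MAP with the per-chunk computation of interest gives a chunk-wise pass over the file without writing the loop by hand.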
Thank you!
Is there a method for loading a specific, reproducible chunk of a BAM/CRAM file? This would be useful for very large BAM/CRAM files, both to avoid loading the whole file into memory and to let separate processes each load a specific chunk (1 ... X, as the user defines).