Closed gacatag closed 2 years ago
The help page ?BamFile
says
yieldSize: Number of records to yield each time the file is read
from using 'scanBam' or, when 'length(bamWhich()) != 0', a
threshold which yields records in complete ranges whose sum
first exceeds 'yieldSize'.
so it will read in 100000 reads + the number to 'complete' the range. Remember to open()
the bam file. Here is an example (though without paired end reads)
> fl <- system.file("extdata", "ex1.bam", package="Rsamtools")
> bf = BamFile(fl, yieldSize = 3)
> param = ScanBamParam(which = GRanges(c("seq1:5", "seq2:1524")))
> open(bf)
> readGAlignments(bf, param = param) |> length()
[1] 3
> readGAlignments(bf, param = param) |> length()
[1] 43
> readGAlignments(bf, param = param) |> length()
[1] 0
Thank you very much for the details.
Is there any way to set the parameters so that it would read, at max, 3 reads only from the ranges defined by "which", every time the readGAlignments()
is called ?
No.
It is better to ask questions like this on the support site https://suppport.bioconductor.org
Dear developers, Hello.
I was wondering if the "yileldSize" in
BamFile()
is ignored when importing reads usingreadGAlignmentsList()
, if also "which" parameter is defined inscanBamFlag()
. In other words if we define "which" inscanBamFlag()
function,readGAlignmentsList()
will just import all the reads that have been mapped to the Genomic coordinates defined in which. Is there anyway to runreadGAlignmentsList()
so that it would import reads mapping to the coordinates defined using "which
" parameter ofscanBamFlag()
, and load at maximum only 100,000 of these reads (i.e. considering theyieldSize=100000
param setting inBamFile()
). So would this sequence of codes take the yieldSize=100000 into account and load only 100000 paired reads ?Cheers,
Ali