Closed GoogleCodeExporter closed 8 years ago
The pileup method seems to iterate through all base pairs covered by reads that
overlap the specified region.
Original comment by DarwinAw...@gmail.com
on 23 Feb 2011 at 5:29
I too have the same problem...hope I can trust pysam results..
Original comment by abhishen...@gmail.com
on 23 Mar 2011 at 5:42
You can trust the results. You just need to manually truncate the ends and fill
in zero-coverage areas.
Original comment by DarwinAw...@gmail.com
on 23 Mar 2011 at 6:28
Could you please an example how to anually truncate the ends and fill in
zero-coverage areas?
Original comment by mictadlo
on 26 Mar 2011 at 5:48
You can truncate the ends by simply skipping columns where pileupcolumn.pos is
out of your desired range. You can "fill in" zero-coverage areas by simply
reporting a zero for those positions Here is some (untested) example code:
import pysam
samfile = pysam.Samfile("../test/ex1.bam", "rb")
start = 100
end = 150
nextcol = start
for pileupcolumn in samfile.pileup( 'chr1', start, end):
# Skip columns outside desired range
if pileupcolumn.pos < start or pileupcolumn.pos > end:
continue
# Fill in zero-coverage areas
while nextcol < pileupcolumn.pos:
print
print 'coverage at base %s = %s' % (nextcol , 0)
nextcol += 1
print
print 'coverage at base %s = %s' % (pileupcolumn.pos , pileupcolumn.n)
for pileupread in pileupcolumn.pileups:
print '\tbase in read %s = %s' % (pileupread.alignment.qname, pileupread.alignment.seq[pileupread.qpos])
nextcol += 1
# Fill in zero-coverage area at the end of the region
while nextcol <= end:
print
print 'coverage at base %s = %s' % (nextcol , 0)
nextcol += 1
samfile.close()
Original comment by r...@thompsonclan.org
on 26 Mar 2011 at 6:14
Thanks for raising this issue and your answers.
This behaviour is intended and consistent with the samtools pileup command.
The pileup docstring explains this.
Bw,
Andreas
Original comment by andreas....@gmail.com
on 4 Jun 2011 at 8:24
This seems to be a feature of the c-api but surprising behaviour given the
command line samtool behaviour.
See this line in bam_plcmd.c:
if (conf->reg && (pos < beg0 || pos >= end0)) continue; // out of the region
requested
Doing this in python rather than cython code makes things unnecessary slow.
If this is not fixed, can we at least flag this issue prominently and add some
example code on how to (trivially) skip positions outside the specified
regions...
Original comment by bunbu...@gmail.com
on 3 Aug 2012 at 3:33
Original issue reported on code.google.com by
mictadlo
on 21 Feb 2011 at 12:13