Closed GoogleCodeExporter closed 9 years ago
Thanks for submitting the request. While this seems like a reasonable approach,
there are multiple issues that prevent this from working:
1. The alignments need to be sorted in genomic coordinate order, and there is
no guarantee that alignments arriving on a stream would be.
2. Some steps are done prior to generating the coverage, specifically
pre-counting valid alignments for RPM transformation and calculating 3' shift
values in ChIP-Seq data. Both need a full pass over the alignments before any
coverage can be written.
3. While Perl is a robust and flexible language, it is not particularly speedy,
so to eke out performance bam2wig can fork itself into separate processes, one
for each chromosome. True multi-threading on a stream would be a headache and
potentially not as fast.
4. The Perl API I am using, Bio::DB::Sam, doesn't natively work with streams.
Instead, it opens its own file handles using the bai index to work with
separate chromosomes.
5. Even working with text SAM files, something I would like to do, is a pain:
it is neither as convenient nor as efficient as working with BAM.
6. Some (most?) of these problems could be solved by slurping all the
alignments into memory, which is what some other programs do, but I haven't
favored that due to the complexity and extreme memory requirements.
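
To make points 1, 2 and 6 concrete, here is a minimal sketch in Python (not
Perl, and not bam2wig's actual code). It assumes alignments are simple
(chromosome, start, end) tuples with half-open coordinates; the function names
check_sorted and rpm_coverage are made up for illustration. It shows that the
sort order can only be verified as the data arrives, and that RPM scaling needs
the total count before the first coverage value can be emitted, which forces
either a second read of the input or buffering everything in memory.

```python
from collections import defaultdict


def check_sorted(alignments):
    """Return True if (chrom, start, end) records are coordinate-sorted.

    Each chromosome must appear as one contiguous block, and starts must be
    non-decreasing within a chromosome. The tuple layout is an assumption
    made for this sketch.
    """
    seen = set()
    prev_chrom, prev_start = None, -1
    for chrom, start, _end in alignments:
        if chrom != prev_chrom:
            if chrom in seen:
                return False  # chromosome reappeared later in the input
            seen.add(chrom)
            prev_chrom, prev_start = chrom, -1
        if start < prev_start:
            return False  # position went backwards within a chromosome
        prev_start = start
    return True


def rpm_coverage(alignments):
    """Per-base coverage scaled to reads per million (RPM).

    The scale factor depends on the total alignment count, so the input is
    buffered in a list here. With an indexed file it could be read twice
    instead; with a plain stream, buffering is the only option.
    """
    buffered = list(alignments)          # this is the memory cost of point 6
    total = len(buffered)                # pass 1: pre-count the alignments
    scale = 1_000_000 / total if total else 0.0
    coverage = defaultdict(float)
    for chrom, start, end in buffered:   # pass 2: accumulate scaled coverage
        for pos in range(start, end):    # half-open interval [start, end)
            coverage[(chrom, pos)] += scale
    return dict(coverage)
```

A real implementation would also have to filter for valid alignments and apply
the 3' shift, both of which are left out of this sketch.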
Original comment by parnell...@gmail.com
on 18 Dec 2014 at 4:33
Original issue reported on code.google.com by
balwi...@gmail.com
on 16 Dec 2014 at 6:53