Closed by GoogleCodeExporter 8 years ago
Hi Alex,
Thank you for your comment and for your interest in GASV and GASVPro.
The GC error is definitely related to the size of your BAM file and, in my
experience, is far more likely to occur with a cancer genome than a normal
genome.
We are presently working on improvements to our BAM file processor, but in the
meantime here are two suggested changes to your BAM file that should make
processing with BAMToGASV more efficient:
(1) Separate BAM File by Chromosome:
Split your BAM file by chromosome and run BAMToGASV on each per-chromosome
file.
(Note: To correctly identify translocations, you need to know which chromosome
each read of a pair maps to, and splitting by chromosome separates pairs whose
two reads map to different chromosomes. So you would need to sort the BAM file
by read name first and output the translocation pairs separately. If you are
not interested in translocations, don't worry about the pairing and simply
split the BAM file by chromosome.)
I would recommend running BAMToGASV on one chromosome first to obtain the
values for Lmin/Lmax, and then specifying those same Lmin/Lmax values in the
BAMToGASV command for each subsequent chromosome so that the results are
consistent across chromosomes.
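As a rough sketch of the splitting step, the Python below builds one
`samtools view` command per chromosome. It assumes samtools 1.x is installed
and that the BAM file is coordinate-sorted and indexed (`samtools index`),
since region queries need an index. The file name and chromosome names are
hypothetical placeholders, and the BAMToGASV call itself (including how to
pass Lmin/Lmax) is deliberately left out; please take that from the BAMToGASV
documentation.

```python
import subprocess
from typing import List


def split_commands(bam_path: str, chromosomes: List[str]) -> List[List[str]]:
    """Build one `samtools view` command per chromosome (nothing is run here)."""
    prefix = bam_path[:-4] if bam_path.endswith(".bam") else bam_path
    commands = []
    for chrom in chromosomes:
        out_path = f"{prefix}.{chrom}.bam"
        # -b writes BAM output; the trailing region name keeps only reads
        # aligned to that chromosome. Mates on other chromosomes are NOT kept.
        commands.append(["samtools", "view", "-b", "-o", out_path, bam_path, chrom])
    return commands


def run_all(commands: List[List[str]]) -> None:
    """Run each command in turn, stopping at the first failure."""
    for cmd in commands:
        subprocess.run(cmd, check=True)
```

For example, `split_commands("tumor.bam", ["chr1", "chr2"])` returns the two
argument lists that `run_all` would execute, so you can print and check them
before running anything.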
(2) Sorted BAM File:
I'm assuming that your BAM file contains only a single mapping for each read
and that your BAM file is possibly sorted by location.
If you sort your BAM file by read name instead, BAMToGASV will not need as
much memory to store reads, since the two reads of each pair will be adjacent
in the BAM file.
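To illustrate why read-name order helps, here is a small toy model in Python.
It is not BAMToGASV's actual code, only a sketch of the general idea: a
program that pairs reads has to hold each read until its mate shows up, so the
number of reads held at once depends on how far apart mates are in the file.
The read names are made up, and the `samtools sort -n` command assumes
samtools 1.x syntax.

```python
from typing import Iterable, List


def name_sort_command(bam_path: str, out_path: str) -> List[str]:
    # `samtools sort -n` orders records by read name instead of by coordinate.
    return ["samtools", "sort", "-n", "-o", out_path, bam_path]


def peak_waiting_reads(read_names: Iterable[str]) -> int:
    """Toy model: largest number of reads held while waiting for their mates."""
    waiting = set()
    peak = 0
    for name in read_names:
        if name in waiting:
            waiting.remove(name)  # mate found, the pair is complete
        else:
            waiting.add(name)     # first read of the pair, hold it for now
            peak = max(peak, len(waiting))
    return peak


# Hypothetical read names: three pairs in two different file orders.
coordinate_order = ["r1", "r2", "r3", "r1", "r2", "r3"]
name_order = ["r1", "r1", "r2", "r2", "r3", "r3"]
print(peak_waiting_reads(coordinate_order), peak_waiting_reads(name_order))
# prints: 3 1
```

In the name-sorted order at most one read is ever waiting, whereas in the
coordinate-sorted order the count grows with the number of pairs whose mates
are far apart, which is likely to be larger in a rearranged cancer genome.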
----
Please let me know if these options make sense and are helpful to you. I am
glad to help with any additional questions.
Cheers,
Suzanne
Original comment by sora...@gmail.com
on 3 Sep 2013 at 6:31
Original comment by sora...@gmail.com
on 27 Feb 2014 at 2:14
Original issue reported on code.google.com by
ale.gil...@gmail.com
on 3 Sep 2013 at 1:31