jiantao / Tangram

Fast Structural Variation Detection Toolbox
MIT License

tangram_bam Segmentation fault #2

Closed RPSeq closed 9 years ago

RPSeq commented 9 years ago

I am trying to use tangram with BAM files generated from bowtie alignments.

My general workflow: Original BAM files (without ZA tags) -> tangram_bam -> sort, index -> tangram_scan

Upon calling tangram_scan on my BAM files, a segmentation fault occurs, and I have been unable to track down the exact reason. The output directory and files are generated before the segfault, but they contain no data.

My BAM files seem correct, so I suspect this might be an issue with tangram_bam formatting the ZA-tagged BAM file, or a different issue in tangram_scan.

Does tangram_bam work fully yet? I recall in the Tangram publication it was described as still under development.

Thanks

jiantao commented 9 years ago

Hi Ryan,

Could you please provide a sample bam file so that I can debug this issue?

Thank you, Jiantao


RPSeq commented 9 years ago

Jiantao,

Thanks for the response.

Here's the alignment I was starting with:

http://colbychiang.com/hall/cshl_sv_2014/data/NA12878.20.bam http://colbychiang.com/hall/cshl_sv_2014/data/NA12878.20.bam.bai

My call to tangram_bam (paths edited for clarity):

tangram_bam -i /path_to_BWA_alignment/NA12878.20.bam -r path_to_special_reference/moblist_19Feb2010_sequence_length60.fa -o test-chr20-ZAtag.bam    

I used the special reference from the example data containing mobile element sequences.

My subsequent call to tangram_scan (input.txt contains only test-chr20-ZAtag.bam):

tangram_scan -in input.txt -dir chr20_scan

The directory chr20_scan/ is created along with the two .dat files, both of which are empty. The program then crashes with a segmentation fault. In the ZA-tagged BAM file, most of the ZA fields are essentially empty (no alignments found?). They look like this: ZA:Z:&;0;;;1;;><@;0;;;1;;

Thanks for the help! Let me know if I should provide more files, etc.

Ryan
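For anyone reproducing this, the workflow above can be sketched as a short Python helper that only assembles the command lines. The tangram_bam flags (-i, -r, -o) and tangram_scan flags (-in, -dir) are the ones quoted in this comment; the samtools calls assume samtools 1.3 or later, and the output file names and the build_commands helper itself are hypothetical.

```python
from pathlib import Path


def build_commands(bam, special_ref, workdir):
    """Return argv lists for: tangram_bam -> sort -> index -> tangram_scan.

    Tangram flags are copied from the commands quoted in this thread.
    The samtools syntax assumes samtools >= 1.3. File names are made up.
    """
    workdir = Path(workdir)
    tagged = workdir / "tagged.bam"              # hypothetical name
    sorted_bam = workdir / "tagged.sorted.bam"   # hypothetical name
    bam_list = workdir / "input.txt"             # text file listing BAMs
    scan_dir = workdir / "scan"                  # hypothetical name
    commands = [
        ["tangram_bam", "-i", str(bam), "-r", str(special_ref),
         "-o", str(tagged)],
        ["samtools", "sort", "-o", str(sorted_bam), str(tagged)],
        ["samtools", "index", str(sorted_bam)],
        ["tangram_scan", "-in", str(bam_list), "-dir", str(scan_dir)],
    ]
    return commands, bam_list, sorted_bam
```

To actually run it, write the path of sorted_bam into bam_list, then pass each command to subprocess.run(cmd, check=True) so the script stops at the first failing step.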

jiantao commented 9 years ago

Hi Ryan,

Thank you for your feedback. I will look into this bug.

The ZA tag contains information for the special reference (the mobile element reference).

An empty ZA string means there is no hit on the mobile element references.

Best, Jiantao

On Wed, Jan 7, 2015 at 3:53 PM, Ryan Smith notifications@github.com wrote:

Also I noticed that most of the ZA flags in the tangram_bam output file look like this:

ZA:Z:&;0;;;1;;><@;0;;;1;;

I assume this indicates no match to any of the reference sequences?

Thanks!

Ryan
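To check how many reads carry the "no hit" pattern quoted above, a small Python sketch can tally the ZA values in SAM text. It relies only on the general SAM layout (tab-separated records, optional TAG:TYPE:VALUE fields from column 12 onward) and does not interpret the fields inside the ZA string, whose meaning is described in the Tangram readme. The function names are hypothetical.

```python
from collections import Counter


def za_value(sam_line):
    """Return the ZA:Z value of one SAM record, or None if it has none."""
    fields = sam_line.rstrip("\n").split("\t")
    for field in fields[11:]:  # optional fields start at column 12
        if field.startswith("ZA:Z:"):
            return field[len("ZA:Z:"):]
    return None


def count_za_values(sam_lines):
    """Tally ZA values over SAM records, skipping header and blank lines."""
    counts = Counter()
    for line in sam_lines:
        if line.startswith("@") or not line.strip():
            continue
        counts[za_value(line)] += 1
    return counts
```

One way to use it is to pipe the output of samtools view on the ZA-tagged BAM into a script that calls count_za_values(sys.stdin).most_common(5); if a single value dominates, almost no reads hit the special reference.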


RPSeq commented 9 years ago

Jiantao,

Could you send a template for the ZA tag format? I'd like to try doing the alignments to known MEIs myself and generate the ZA tag independently for my existing BAM files.

Thanks,

Ryan

dcow commented 9 years ago

+1

jiantao commented 9 years ago

I have updated the readme file with ZA tag information.

jiantao commented 9 years ago

Hi Ryan,

I figured out the segfault issue. Tangram needs an MD5 sum in the BAM header for its consistency check.

If your header does not have one, just add a fake MD5 string to it.
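One possible way to do this is sketched below in Python. It assumes the MD5 sum Tangram looks for is the standard M5 tag on the @SQ lines of the SAM header; the thread does not confirm which field is read, so treat that as a guess. The function name and the all-zero placeholder are hypothetical.

```python
def add_placeholder_md5(header_text, placeholder="0" * 32):
    """Append an M5 tag to every @SQ header line that lacks one.

    Assumption: Tangram reads the MD5 from the @SQ M5 tag. The placeholder
    is a fake 32-character value, not a real checksum of the sequence.
    """
    out = []
    for line in header_text.splitlines():
        if line.startswith("@SQ"):
            tags = line.split("\t")[1:]
            if not any(tag.startswith("M5:") for tag in tags):
                line = line + "\tM5:" + placeholder
        out.append(line)
    return "\n".join(out) + "\n"
```

The header text can be dumped with samtools view -H, passed through this function, and written back with samtools reheader. Existing M5 values are left untouched, so the function is safe to run on headers that are already partly annotated.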

frl1 commented 9 years ago

Hi,

I was able to run tangram_bam without problems, but I am getting a segfault in tangram_scan. What is the easiest way to add the MD5 string?

Thanks,

Fritjof