jiantao / Tangram

Fast Structural Variation Detection Toolbox
MIT License

TANGRAM_BAM Paired End Calls #8

Open RPSeq opened 9 years ago

RPSeq commented 9 years ago

In looking at some of my datasets, I found that Tangram calls using BWA-MEM alignments passed through the TANGRAM_BAM tool ONLY show split-read calls. Not a single paired-end read was used for any MEI calls. Is this a known issue, or perhaps functionality that has not been added to TANGRAM_BAM?

I can provide some more information if you want, but I am working with WGS alignments so can't feasibly share my input files. If you would like, I can make some sample files illustrating the issue.

Thanks,

Ryan Smith
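
One quick sanity check for this kind of report is to confirm that the reads going into Tangram are actually flagged as paired, and that split alignments are present. Below is a minimal sketch, not part of Tangram, that tallies SAM FLAG bits from text SAM records. The bit values (0x1 paired, 0x2 proper pair, 0x800 supplementary) and the `SA:Z:` tag come from the SAM specification; the function name `summarize_flags` is made up for illustration, and the sketch says nothing about how TANGRAM_BAM itself interprets these fields.

```python
from collections import Counter

# FLAG bits as defined in the SAM specification.
PAIRED = 0x1
PROPER_PAIR = 0x2
SUPPLEMENTARY = 0x800


def summarize_flags(sam_lines):
    """Count paired, properly paired, and split-alignment records.

    `sam_lines` is any iterable of text SAM lines (header lines allowed).
    `summarize_flags` is a hypothetical helper, not a Tangram function.
    """
    counts = Counter()
    for line in sam_lines:
        # Skip blank lines and header lines, which start with '@'.
        if not line.strip() or line.startswith("@"):
            continue
        fields = line.rstrip("\n").split("\t")
        flag = int(fields[1])  # column 2 is the bitwise FLAG
        counts["total"] += 1
        if flag & PAIRED:
            counts["paired"] += 1
        if flag & PROPER_PAIR:
            counts["proper_pair"] += 1
        if flag & SUPPLEMENTARY:
            counts["supplementary"] += 1
        # Optional tags start at column 12; SA marks a chimeric/split alignment.
        if any(tag.startswith("SA:Z:") for tag in fields[11:]):
            counts["has_SA_tag"] += 1
    return counts
```

To run it on a real file, the records can be streamed as text, for example by piping `samtools view input.bam` into a small script that calls `summarize_flags(sys.stdin)`. If the paired counts are zero or far lower than expected, the problem lies upstream of Tangram; if they look normal, the paired-end evidence is being lost somewhere later.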

fa8sanger commented 8 years ago

I would like to know if this issue has been solved. I am about to start an analysis with BAM files aligned with BWA-MEM, and it would be very useful for Tangram to support this aligner.

Thanks, Federico

AlistairNWard commented 8 years ago

I have also observed segfaults when running on BWA-aligned data and am not sure what the cause of the problem is. If you don't have massive amounts of data, I would recommend aligning with Mosaik, since this is what Tangram was designed to work with. If you need any assistance, please let me know (AlistairNWard@gmail.com) and I can help with getting Mosaik alignments and Tangram running. In particular, we have a pipeline system (gkno) that helps with running larger pipelines and also makes it possible to build your own pipelines for repeated or similar analyses.


fa8sanger commented 8 years ago

Thanks a lot, Alistair

The problem is that I have massive amounts of data, and these data have already been aligned. It is not only a matter of computing time: the data already occupy a lot of space as BAM files, so going back to even larger FASTQ files would be a nightmare. I am going to run a test using bwa-mem (my data were aligned with another BWA algorithm) and see if this solves the problem (I got only MEI calls supported by split reads, as reported in the original post). I will let you know, and may ask for your help with the gkno pipeline system (thanks!).

fa8sanger commented 8 years ago

I would like to report something that may be a problem with Tangram. Running the example that comes with Tangram, I got some strange results. Several MEI calls have a long DNA sequence in the ALT column instead of INS:ME:AL/L1 (see the attached image). This sequence is similar to one of the sequences in moblist_19Feb2010_sequence_length60.fa, the file that comes with the example.

[Attached image: slide1]
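
For anyone who wants to see how many records are affected, here is a minimal sketch that scans a VCF for ALT alleles that are literal sequences rather than symbolic alleles. It assumes the output is a standard tab-delimited VCF with ALT in column 5 and symbolic alleles written in angle brackets (the VCF convention); whether Tangram's output follows that convention exactly is an assumption here. The function name `find_literal_alt_records` and the `min_length` cutoff are made up for illustration.

```python
def find_literal_alt_records(vcf_lines, min_length=50):
    """Return (chrom, pos, allele_length) for long literal ALT alleles.

    `vcf_lines` is any iterable of text VCF lines. `min_length` is an
    arbitrary, hypothetical cutoff so that short ordinary alleles are ignored.
    """
    flagged = []
    for line in vcf_lines:
        # Skip blank lines, meta-information lines, and the column header.
        if not line.strip() or line.startswith("#"):
            continue
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 5:
            continue  # not a complete VCF record
        chrom, pos, alt = fields[0], fields[1], fields[4]
        for allele in alt.split(","):
            # Symbolic alleles are wrapped in angle brackets, e.g. <INS:ME:L1>.
            is_symbolic = allele.startswith("<") and allele.endswith(">")
            if not is_symbolic and len(allele) >= min_length:
                flagged.append((chrom, int(pos), len(allele)))
    return flagged
```

Calling it as `find_literal_alt_records(open("calls.vcf"))`, where `calls.vcf` is a placeholder file name, lists the positions to compare against the sequences in the mobile element FASTA.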