jiantao / Tangram

Fast Structural Variation Detection Toolbox
MIT License

tangram_bam issue #4

Closed: tk2 closed this issue 9 years ago

tk2 commented 9 years ago

Hi Jiantao,

I was hoping to use Tangram to generate multiple TE callsets for a set of mouse strains, and combine the results.

Anyway, I got as far as tangram_bam and got this error:

~/bin/Tangram/bin/tangram_bam -i BALB_cJ.bam -r mus_elements.fa -o BALB_cJ.tangram.bam
terminate called after throwing an instance of 'std::out_of_range'
  what(): basic_string::substr
Aborted

The BAM I'm using is from bwa-mem, mapped to the GRCm38 reference genome, and is here:

ftp://ftp-mouse.sanger.ac.uk/REL-1410-BAM/BALB_cJ.bam

Thanks, Thomas

kspham commented 9 years ago

Hi Thomas and Jiantao,

I skimmed the TangramBam code and found the source of the problem. The substr exception is most likely raised at line 668 of https://github.com/jiantao/Tangram/blob/master/src/TangramBam/tangram_bam.cpp:

al.ins_prefix = (index == -1) ? "" : s_ref.ref_names[index].substr(8,2);
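For anyone hitting the same abort: std::string::substr(pos, len) throws std::out_of_range whenever pos is greater than the string's length, so any reference name shorter than 8 characters makes substr(8,2) throw. Below is a minimal sketch of a bounds-checked alternative. The helper name safe_prefix is hypothetical and is not part of Tangram; this is not necessarily how the fix was implemented upstream.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper (not Tangram code): return up to `len` characters of
// `name` starting at `pos`, or an empty string when `pos` is at or past the end.
std::string safe_prefix(const std::string& name, std::size_t pos, std::size_t len) {
    // substr throws std::out_of_range when pos > name.size(), so check the
    // bound first instead of letting the exception terminate the program.
    if (pos >= name.size()) {
        return "";
    }
    // substr clamps `len` to the characters that remain, so this cannot throw.
    return name.substr(pos, len);
}
```

With this guard, a short name such as "ALU" yields an empty prefix instead of aborting, while "12345678ALU.ALUY" still yields "AL" for pos 8 and len 2.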

That line of tangram_bam.cpp parses the name of the TE from the reference FASTA file, and substr(8,2) throws when a name is shorter than 8 characters. I think there are two solutions.

  1. Change your TE FASTA file: prepend 8 characters to the name of each TE. For example, >ALU.ALUY becomes >12345678ALU.ALUY.
  2. Edit the code: change the 8 to 0 and recompile.

Hope it will work :) -Son.

jiantao commented 9 years ago

Hi Son,

Thank you so much for pointing out the bug.

I have updated the code. This issue should be fixed now.

Jiantao