bioinformatics-centre / bayesembler

A Bayesian method for doing transcriptome assembly from RNA-seq data
MIT License

CIGAR String error #13

Open ashokrags opened 9 years ago

ashokrags commented 9 years ago

Hello, I am trying to run bayesembler on a BAM file generated by GSNAP and have run into a CIGAR string error, shown below:

bayesembler -s first -b ../GsnapAlignments/Gsnap.T1990_ctx.33080.cmu059.19.Aug.2014/T1990_ctx.all.mm.rg.dup.srtd.bam

You are using the Bayesembler v1.2.0. For more information go to bayesembler.binf.ku.dk

bam_nd_pe_plus_file_nameT1990_ctxallmmrgdupsrtd_nd_plus.bam
[24/06/2015 16:27:56] Removing duplicate reads
ERROR: Unhandled cigar string symbol 'S'!

I would be much obliged for any insight into this issue.

Cheers,
Ashok

lassemaretty commented 9 years ago

Hi Ashok,

Thank you for posting. The Bayesembler currently only supports alignments generated by TopHat. However, the next release (expected to appear soon) will also provide support for the STAR and HISAT aligners. If you need it, we will also try to include support for GSNAP.

Best regards,

Lasse

ashokrags commented 9 years ago

Hi Lasse, thanks so much for your quick reply. It would be great to have GSNAP support included. However, I just wanted to check with you why soft-clipped reads are not recognized by the program. I can probably just write a wrapper that modifies the BAM so that soft clips are converted to hard clips, if you think that will work.

ashokrags commented 9 years ago

Hi Lasse, FYI: I removed the soft-clipped reads from the BAM file I was working on. Now it appears that even hard-clipped reads pose a problem. Just wanted to let you know.

[ar474@cn022 bayesembler]$ bayesembler -b T1990_ctx.all.mm.rg.dup.srtd.nosoft.bam -s first -f 4

You are using the Bayesembler v1.2.0. For more information go to bayesembler.binf.ku.dk

bam_nd_pe_plus_file_nameT1990_ctxallmmrgdupsrtdnosoft_nd_plus.bam
[01/07/2015 13:25:09] Removing duplicate reads
ERROR: Unhandled cigar string symbol 'H'!

lassemaretty commented 9 years ago

Hi,

Sorry for not getting back to you before now. Neither soft nor hard clips are currently supported. My best advice is to wait for the next release or use TopHat2 for mapping. I'm sorry for the inconvenience.

Best,

Lasse
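
Since neither soft (S) nor hard (H) clips are accepted, it can help to know up front how many records in an alignment file carry them. The following is a minimal Python sketch, not part of the Bayesembler; the function names `clip_ops_in_cigar` and `count_clipped_records` are hypothetical, and it assumes the alignments are available as SAM text lines with the CIGAR string in the sixth tab-separated column.

```python
import re

# One CIGAR element: a length followed by an operation code from the SAM spec.
CIGAR_OP = re.compile(r"(\d+)([MIDNSHP=X])")
CLIP_OPS = {"S", "H"}


def clip_ops_in_cigar(cigar):
    """Return the set of clipping operations (S and/or H) found in a CIGAR string."""
    if cigar == "*":  # CIGAR unavailable, e.g. for an unmapped read
        return set()
    return {op for _, op in CIGAR_OP.findall(cigar)} & CLIP_OPS


def count_clipped_records(sam_lines):
    """Count SAM records whose CIGAR contains soft (S) or hard (H) clips.

    A record with both kinds of clip is counted once under each key.
    """
    counts = {"S": 0, "H": 0}
    for line in sam_lines:
        if line.startswith("@"):  # header line
            continue
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 11:  # not a complete alignment record
            continue
        for op in clip_ops_in_cigar(fields[5]):
            counts[op] += 1
    return counts


if __name__ == "__main__":
    import sys

    # Reads SAM text from standard input and prints the two counts.
    print(count_clipped_records(sys.stdin))
```

One way to use it, assuming samtools is installed and the sketch is saved under a name of your choosing, is to pipe the output of `samtools view` on the BAM file into the script.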

ashokrags commented 9 years ago

Hi Lasse, not an issue. I already had a script that removes soft clips from the CIGAR string and corrects the read sequence and qualities in the BAM accordingly. I didn't realize that hard clips were an issue as well. I am currently removing the hard-clipped reads and will then try to run bayesembler again.
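
The script mentioned above is not shown in the thread. As an illustration only, here is a minimal Python sketch of the same idea: drop the soft-clip operations from a CIGAR string and trim the read sequence and quality string to match. The function name `strip_soft_clips` is hypothetical. The sketch leaves hard clips in place and does not touch optional tags or mate fields, which a real script may also need to update.

```python
import re

# One CIGAR element: a length followed by an operation code from the SAM spec.
CIGAR_OP = re.compile(r"(\d+)([MIDNSHP=X])")


def strip_soft_clips(cigar, seq, qual):
    """Remove soft clips from a CIGAR string and trim SEQ and QUAL accordingly.

    POS does not need to change: in SAM it refers to the first aligned base,
    which already excludes any leading soft clip.
    """
    if cigar == "*":
        return cigar, seq, qual
    ops = [(int(n), op) for n, op in CIGAR_OP.findall(cigar)]

    # Soft clips may only sit at the ends of the read (possibly inside hard clips).
    left = 0
    for n, op in ops:
        if op not in "SH":
            break
        if op == "S":
            left += n
    right = 0
    for n, op in reversed(ops):
        if op not in "SH":
            break
        if op == "S":
            right += n

    def trim(s):
        # "*" means the field is not stored, so there is nothing to trim.
        return s if s == "*" else s[left:len(s) - right]

    new_cigar = "".join(f"{n}{op}" for n, op in ops if op != "S")
    return new_cigar, trim(seq), trim(qual)
```

For example, `strip_soft_clips("3S5M2S", "AAACCCCCGG", "!!!#####$$")` returns `("5M", "CCCCC", "#####")`.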

lassemaretty commented 9 years ago

Sounds great, let us know if you run into problems again.

ashokrags commented 9 years ago

Hi Lasse, sorry to be a bother, but does the program not support multiply mapped reads? I don't have this issue with Cufflinks when I use the same BAM file, after fixing the soft-clip issue. Also, is there an option to not remove duplicate reads?