Closed: insectopalo closed this issue 2 years ago
Dear Gon,
I apologize for the late reply.
Thank you for pointing this problem out. I've fixed this error. Please check the latest version.
Yours,
Mengyao
On Wed, Sep 14, 2016 at 8:19 AM, Gon S. Nido notifications@github.com wrote:
When running the C program to output a SAM file,
ssw_test -r region_of_interest.fa -c -s -h 3553-CT_goldenreads.fastq > alignment.sam
I've noticed that the SAM file does not comply with the SAM format specification:
"Sum of lengths of the M/I/S/=/X operations shall equal the length of SEQ" [1].
Example from actual output:
HWI-ST1309F:275:C8E2LANXX:3:1101:10013:85607 16 chrRCRS:6500-14600 2688 4 74=4I1X4=1I2X1=1X2=1D2=3I4=1X2=1I2=18S * 0 0 TACCTGCACGACAACACATAATGACCCACCAATCACATGCCTATCATATAGTAAAACCCAGCCCATGACCCCTATGCCTCAGGATACTCTTCAATAGCCATCGCT F7<</<B7<<<FF/FBFB/FFFB/FFFFFFFFFBFF7<F/FBFFF<BBFFFFFFFFBFFFFBFBBFFFFFFFFFFFFFFFFFFFFFFFFFFBFFFFFBFFBBBBB AS:i:152 NM:i:124 ZS:i:142
The length of the sequence reported in that entry is 105:
len(TACCTGCACGACAACACATAATGACCCACCAATCACATGCCTATCATATAGTAAAACCCAGCCCATGACCCCTATGCCTCAGGATACTCTTCAATAGCCATCGCT) = 105
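The CIGAR string is 74=4I1X4=1I2X1=1X2=1D2=3I4=1X2=1I2=18S. Summing its =/I/X/S operations (the 1D does not count towards SEQ) gives 74+4+1+4+1+2+1+1+2+2+3+4+1+2+1+2+18 = 123, which is 18 more than the 105 bases in SEQ. It seems that the 18 soft-clipped residues are not being reported in the SEQ field.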
Cheers, Gon
[1] https://samtools.github.io/hts-specs/SAMv1.pdf
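The arithmetic in the report can be reproduced with a short script. This is only a minimal sketch for checking the quoted record, not part of ssw_test or the SSW library: the function name `cigar_query_length` and its `include_soft_clips` parameter are made up for illustration, and the SEQ length of 105 is taken from the report rather than recomputed.

```python
import re

# Per the SAM specification, these CIGAR operations consume bases of SEQ.
QUERY_CONSUMING = set("MIS=X")

def cigar_query_length(cigar, include_soft_clips=True):
    """Sum the lengths of the CIGAR operations that consume SEQ bases.

    Hypothetical helper for illustration; not an SSW library function.
    """
    total = 0
    for count, op in re.findall(r"(\d+)([MIDNSHP=X])", cigar):
        if op in QUERY_CONSUMING and (include_soft_clips or op != "S"):
            total += int(count)
    return total

cigar = "74=4I1X4=1I2X1=1X2=1D2=3I4=1X2=1I2=18S"
reported_seq_length = 105  # length of SEQ as stated in the report

with_clips = cigar_query_length(cigar)
without_clips = cigar_query_length(cigar, include_soft_clips=False)

print(with_clips)     # 123
print(without_clips)  # 105
print(with_clips - reported_seq_length)  # 18, the size of the soft clip
```

The total without the soft clip equals the reported SEQ length, which is consistent with the 18 soft-clipped bases being the ones left out of SEQ.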