Closed Ahhgust closed 5 years ago
On second thought, it's (theoretically) possible to have both hard and soft clipping on the same end of the read. I don't know why you'd do that, but perhaps: if ( (bam_cigar_opchr(c[i]) == 'S') || (bam_cigar_opchr(c[i]) == 'H')) p += bam_cigar_oplen(c[i]);
should be
if ( (bam_cigar_opchr(c[i]) == 'S') p += bam_cigar_oplen(c[i]); else if ( bam_cigar_opchr(c[i]) != 'H') break;
Good catch. I agree that this is the right solution -- just pushed the commit.
Thanks! It looks like different flavors of the same bug are found elsewhere. Similar patches are needed at lines: 612 627 656 Thanks! -August
There's a bug in the AlignmentPosition code in the BamRecord object (code below). Hard clipping doesn't impact the start position in the read (removing the other half of the or statement should make the code correct)-- those bases are just missing from the reported read.
e.g., I have a read that looks like: 8:32868238-32874773_1151_1695_2:0:0_3:0:0_3cb 2048 1 1151 0 90H60M 0 -1 0 ACAGCTTAAAACTCAAAGGACTTGGCAGTGGTTTATATCCCTCTAGAGGAGCCTGTTCTA
For which the bamrecord AlignmentPosition() and AlignmentPositionEnd() is 90 and 60, respectively.
/* Get the start of the alignment on the read, by removing soft-clips / inline int32_t AlignmentPosition() const { uint32_t* c = bam_get_cigar(b); int32_t p = 0; for (size_t i = 0; i < b->core.n_cigar; ++i) { // Remove the code after the || if ( (bam_cigar_opchr(c[i]) == 'S') || (bam_cigar_opchr(c[i]) == 'H')) p += bam_cigar_oplen(c[i]); else // not a clip, so stop counting break; } return p; }